Models Integration¶
This guide explains how VT.ai integrates with different AI model providers and how to work with the models API.
Architecture Overview¶
VT.ai uses LiteLLM as a unified interface to multiple AI providers, which provides:
- A consistent API across different models
- Automatic fallbacks and retries
- Standardized error handling
- Easy switching between models
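For example, retries and fallback models can be requested directly on a single call. The following is a minimal sketch of LiteLLM's completion API (VT.ai itself uses the async variants shown later); the model names are illustrative:

# Example: retries and fallbacks on a single LiteLLM call
from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    num_retries=2,  # retry transient failures automatically
    fallbacks=["claude-3-haiku-20240307"],  # try another provider if the call fails
)
print(response.choices[0].message.content)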
Provider Integration¶
Built-in Providers¶
VT.ai comes with built-in support for several providers:
- OpenAI (o1, o3, GPT-4o)
- Anthropic (Claude models)
- Google (Gemini models)
- Local models via Ollama
- And others (DeepSeek, Cohere, etc.)
Provider Configuration¶
Provider configuration is managed in vtai/utils/llm_providers_config.py, where:
- Model-to-provider mappings are defined
- Environment variable names for API keys are specified
- Default parameters for each model are set
- Icons and display names are configured
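One way to picture this mapping is the sketch below. The structure and field names are illustrative, not the literal contents of llm_providers_config.py:

# Hypothetical provider mapping (illustrative field names)
MODEL_PROVIDERS = {
    "o3-mini": {
        "provider": "openai",
        "api_key_env": "OPENAI_API_KEY",  # environment variable holding the key
        "display_name": "OpenAI o3 Mini",
        "icon": "openai.svg",
        "defaults": {"temperature": 0.7},
    },
    "claude-3-7-sonnet-20250219": {
        "provider": "anthropic",
        "api_key_env": "ANTHROPIC_API_KEY",
        "display_name": "Claude 3.7 Sonnet",
        "icon": "anthropic.svg",
    },
}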
Working with Models¶
Model Selection¶
Models are selected through several mechanisms:
- User Selection: Via UI or command line
- Semantic Router: Automatically based on query
- Specialized Handlers: For specific tasks (vision, image generation)
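As a rough illustration of how these mechanisms combine, the sketch below maps a router-classified intent to a model, with an explicit user selection taking precedence. The route names and mapping are hypothetical:

# Hypothetical route-to-model mapping (illustrative)
ROUTE_MODELS = {
    "vision": "gpt-4o",        # image understanding
    "casual-chat": "o3-mini",  # general conversation
    "image-gen": "gpt-image-1",
}

def select_model(route_name: str, user_choice: str | None = None) -> str:
    # An explicit user selection overrides the router's suggestion
    if user_choice:
        return user_choice
    return ROUTE_MODELS.get(route_name, "o3-mini")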
Model Configuration¶
Models can be configured with parameters like:
# Example configuration
model_params = {
    "model": "o3-mini",   # OpenAI o3-mini model
    "temperature": 0.7,   # Controls randomness
    "top_p": 0.9,         # Controls diversity
    "max_tokens": 1000,   # Maximum output length
}
Calling Models¶
VT.ai uses LiteLLM's completion interface for consistency:
# Example asynchronous call using LiteLLM
from litellm import acompletion

async def call_model(messages, model="o3-mini", **kwargs):
    try:
        response = await acompletion(
            model=model,
            messages=messages,
            **kwargs,
        )
        return response
    except Exception as e:
        # Log and re-raise so callers can apply centralized error handling
        raise
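A minimal usage sketch (inside an async context; the message shape follows the OpenAI chat format):

# Example usage
messages = [{"role": "user", "content": "Summarize LiteLLM in one sentence."}]
response = await call_model(messages)
print(response.choices[0].message.content)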
Specialized Model Usage¶
Vision Models¶
Vision models require special handling for image inputs:
# Example vision model call
async def call_vision_model(image_url, prompt, model="gpt-4o", **kwargs):
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        },
    ]
    response = await acompletion(
        model=model,
        messages=messages,
        **kwargs,
    )
    return response
TTS Models¶
Text-to-speech models handle audio generation:
# Example TTS call using LiteLLM's speech API
from litellm import aspeech

async def generate_speech(text, model="tts-1", voice="alloy"):
    try:
        # aspeech takes the text to synthesize via the `input` parameter
        response = await aspeech(
            model=model,
            voice=voice,
            input=text,
        )
        return response
    except Exception as e:
        # Log and re-raise for centralized error handling
        raise
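The returned object can be written to a file, following LiteLLM's documented text-to-speech usage (inside an async context):

# Example usage: write the generated audio to disk
from pathlib import Path

audio = await generate_speech("Hello from VT.ai")
audio.stream_to_file(Path("speech.mp3"))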
Image Generation Models¶
VT.ai supports advanced image generation capabilities with multiple models:
# Example image generation call
async def generate_image(prompt, **kwargs):
    # Read generation settings from the app's settings store
    image_size = settings.get(SETTINGS_IMAGE_GEN_IMAGE_SIZE)           # 1024x1024, 1536x1024, etc.
    image_quality = settings.get(SETTINGS_IMAGE_GEN_IMAGE_QUALITY)     # "standard", "high"
    background = settings.get(SETTINGS_IMAGE_GEN_BACKGROUND)           # "auto", "transparent"
    output_format = settings.get(SETTINGS_IMAGE_GEN_OUTPUT_FORMAT)     # "png", "jpeg"
    compression = settings.get(SETTINGS_IMAGE_GEN_OUTPUT_COMPRESSION)  # 0-100

    # GPT-Image-1 is now the default image generation model
    response = await client.images.generate(
        model="gpt-image-1",
        prompt=prompt,
        n=1,
        size=image_size,
        quality=image_quality,
        background=background,
        output_format=output_format,
        output_compression=compression,
        **kwargs,
    )
    return response
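GPT-Image-1 returns image data as base64. A sketch of decoding the first result (inside an async context; the prompt and filename are illustrative):

# Example: decode the returned image data
import base64

response = await generate_image("A watercolor fox in a forest")
image_bytes = base64.b64decode(response.data[0].b64_json)
with open("fox.png", "wb") as f:
    f.write(image_bytes)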
GPT-Image-1 Configuration¶
GPT-Image-1 supports several configuration options in VT.ai:
- Image Size: Control the dimensions of generated images
  - 1024x1024 (square, default)
  - 1536x1024 (landscape)
  - 1024x1536 (portrait)
- Image Quality: Control the rendering quality
  - standard - Regular quality (default)
  - high - Enhanced quality for detailed images
- Background Type: Control transparency
  - auto - Let the model decide (default)
  - transparent - Create images with transparent backgrounds (for PNG format)
  - opaque - Force an opaque background
- Output Format: Select the image format
  - jpeg - Good for photographs (default)
  - png - Best for images needing transparency
  - webp - Optimized for web use with good compression
- Moderation Level: Content filtering level
  - auto - Standard moderation (default)
  - low - Less restrictive moderation
- Compression Quality: For JPEG and WebP formats
  - Values from 0-100 (75 is default)
  - Higher values produce better quality but larger files
All these settings can be configured through environment variables:
# Example configuration
export VT_SETTINGS_IMAGE_GEN_IMAGE_SIZE="1536x1024"
export VT_SETTINGS_IMAGE_GEN_IMAGE_QUALITY="high"
export VT_SETTINGS_IMAGE_GEN_BACKGROUND="transparent"
export VT_SETTINGS_IMAGE_GEN_OUTPUT_FORMAT="png"
export VT_SETTINGS_IMAGE_GEN_OUTPUT_COMPRESSION="90"
export VT_SETTINGS_IMAGE_GEN_MODERATION="auto"
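A minimal sketch of how such variables can be read at startup; the helper name and defaults here are illustrative, not VT.ai's actual implementation:

# Hypothetical settings reader (illustrative)
import os

def get_setting(name: str, default: str) -> str:
    # Settings are namespaced under the VT_SETTINGS_ prefix
    return os.environ.get(f"VT_SETTINGS_{name}", default)

image_size = get_setting("IMAGE_GEN_IMAGE_SIZE", "1024x1024")
image_quality = get_setting("IMAGE_GEN_IMAGE_QUALITY", "standard")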
Error Handling¶
VT.ai implements robust error handling for model calls:
- API rate limiting errors
- Authentication errors
- Model-specific errors
- Network errors
The main error handling is centralized in vtai/utils/error_handlers.py.
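LiteLLM maps provider-specific errors to a common exception hierarchy, so the categories above can be handled uniformly. A minimal sketch, not VT.ai's actual error_handlers.py implementation:

# Example: handling common LiteLLM exception types
from litellm import acompletion
from litellm.exceptions import APIConnectionError, AuthenticationError, RateLimitError

async def safe_call(messages, model="o3-mini"):
    try:
        return await acompletion(model=model, messages=messages)
    except RateLimitError:
        # Back off and retry, or surface a "try again later" message
        raise
    except AuthenticationError:
        # The API key for this provider is missing or invalid
        raise
    except APIConnectionError:
        # Network-level failure; usually safe to retry
        raise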
Model Performance¶
Streaming Responses¶
VT.ai supports streaming responses for a better user experience:
# Example streaming call
from litellm import acompletion

async def stream_model_response(messages, model="o3-mini", **kwargs):
    response_stream = await acompletion(
        model=model,
        messages=messages,
        stream=True,
        **kwargs,
    )
    collected_content = ""
    async for chunk in response_stream:
        content = chunk.choices[0].delta.content
        if content:
            collected_content += content
            # Handle chunk processing (e.g. forward the token to the UI)
    return collected_content
Caching¶
VT.ai implements caching for model responses to improve performance and reduce API costs:
- In-memory cache for short-term use
- Disk-based cache for persistent storage
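As an illustration of these two tiers, the sketch below caches completion text in memory and mirrors it to disk. The key scheme and file layout are hypothetical, not VT.ai's actual cache:

# Hypothetical two-tier response cache (illustrative)
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path(".vtai_cache")
CACHE_DIR.mkdir(exist_ok=True)
_memory_cache: dict[str, str] = {}

def _cache_key(model: str, messages: list) -> str:
    # Identical (model, messages) pairs map to the same key
    raw = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(raw.encode()).hexdigest()

def get_cached(model: str, messages: list) -> str | None:
    key = _cache_key(model, messages)
    if key in _memory_cache:  # short-term, in-memory tier
        return _memory_cache[key]
    path = CACHE_DIR / f"{key}.json"  # persistent, disk-based tier
    if path.exists():
        return json.loads(path.read_text())["content"]
    return None

def store_cached(model: str, messages: list, content: str) -> None:
    key = _cache_key(model, messages)
    _memory_cache[key] = content
    (CACHE_DIR / f"{key}.json").write_text(json.dumps({"content": content}))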
Adding New Model Providers¶
To add support for a new model provider:
- Update the provider configuration in llm_providers_config.py (a sketch follows this list)
- Add the appropriate API key handling
- Test compatibility with the semantic router
- Implement any specialized handling if needed
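As a sketch, a new entry might look like the following; the field names mirror the hypothetical mapping shown earlier and are illustrative:

# Hypothetical new provider entry (illustrative field names)
MODEL_PROVIDERS["mistral-large-latest"] = {
    "provider": "mistral",
    "api_key_env": "MISTRAL_API_KEY",
    "display_name": "Mistral Large",
    "icon": "mistral.svg",
}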
See the Extending VT.ai guide for more details.
This page is under construction. More detailed information about model integration will be added soon.