Models

This page provides information about the AI models supported by VT.ai and how to use them effectively.

Supported Models

VT.ai integrates with multiple AI providers and supports a wide range of models:

OpenAI Models

  • o1 (o1): High-performance general purpose model
  • o1 Mini (o1-mini): Compact version of o1
  • o1 Pro (o1-pro): Enhanced version with advanced capabilities
  • o3 Mini (o3-mini): Compact, efficient model for everyday tasks
  • GPT-4.5 Preview (gpt-4.5-preview): Preview of next-generation capabilities
  • GPT-4o (4o): Advanced vision and multimodal capabilities

Anthropic Models

  • Claude 3.5 Sonnet (c3.5-sonnet): Balanced performance and efficiency
  • Claude 3.7 Sonnet (sonnet): Advanced reasoning capabilities
  • Claude 3.5 Haiku (c3.5-haiku): Fast, efficient model for common tasks
  • Claude 3 Opus (opus): Highest capability model for complex tasks

Google Models

  • Gemini 1.5 Pro (gemini-1.5-pro): Advanced multimodal capabilities
  • Gemini 1.5 Flash (gemini-1.5-flash): Fast, efficient model
  • Gemini 2.0 (gemini-2.0): Advanced reasoning model
  • Gemini 2.5 Pro (gemini-2.5-pro): Latest Google model with enhanced capabilities
  • Gemini 2.5 Flash (gemini-2.5-flash): Fast version of Gemini 2.5

DeepSeek Models

  • DeepSeek-Coder (deepseek-coder): Specialized for coding tasks
  • DeepSeek Chat (deepseek-chat): General conversation model
  • DeepSeek R1 Series (deepseek-r1): Next-generation reasoning models (multiple sizes)

Groq Models

  • Llama 4 Scout 17b Instruct: Fast inference of Llama 4 Scout via Groq
  • Llama 3 8b/70b: Optimized versions of Llama 3 on Groq's infrastructure
  • Mixtral 8x7b: Fast inference of Mixtral model

Cohere Models

  • Command: General purpose instruction model
  • Command-R: Enhanced reasoning capabilities
  • Command-Light: Lightweight, efficient model
  • Command-R-Plus: Advanced reasoning with extended capabilities

OpenRouter Integration

VT.ai supports many models through OpenRouter, including:

  • Qwen Models: Qwen 2.5 VL 32B, Qwen 2.5 Coder 32B, etc.
  • Mistral Models: Mistral Small 3.1 24B and others
  • Additional proprietary and open models

Local Models (via Ollama)

Run models locally for privacy and offline use:

  • Llama 3: Multiple sizes (8B, 70B)
  • DeepSeek R1: Various sizes (1.5B, 7B, 8B, 14B, 32B, 70B)
  • Qwen2.5-coder: Multiple versions (7b, 14b, 32b)
  • Mistral: Various versions
  • Many other open source models
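
Before selecting a local model in VT.ai, make sure it has been pulled and that the Ollama server is running. A minimal setup sketch using standard Ollama commands; the model tags below are only examples:

# Download model weights locally
ollama pull llama3:8b
ollama pull deepseek-r1:7b

# Start the Ollama server if it is not already running
ollama serve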

Model Selection

You can select models in several ways:

  1. Command Line: pass a model alias with the --model flag (more examples follow this list):

     vtai --model sonnet

  2. UI Settings:
     ◦ Use the model selector in the settings menu
     ◦ Change models during a conversation

  3. Dynamic Routing:
     ◦ Allow VT.ai to automatically select the best model for your query
     ◦ Enable in settings with "Use Dynamic Conversation Routing"
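
A few more command-line examples, using alias codes from the model lists above (a sketch; each alias needs the matching provider API key, as described under API Key Configuration):

# Any alias shown in parentheses on this page can be passed to --model
vtai --model o3-mini
vtai --model gemini-1.5-pro
vtai --model deepseek-chat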

Model Capabilities

Different models have different capabilities:

Vision-Capable Models

For analyzing images and visual content:

  • GPT-4o
  • Gemini 1.5 Pro/Flash
  • Gemini 2.5 Pro/Flash
  • Claude 3 Sonnet/Opus
  • Llama 3.2 Vision
  • Qwen 2.5 VL

TTS-Capable Models

For text-to-speech generation:

  • GPT-4o mini TTS
  • TTS-1
  • TTS-1-HD
  • Various voice options: alloy, echo, fable, onyx, nova, shimmer

Image Generation Models

For creating images from text descriptions:

  • DALL-E 3: OpenAI's image generation model
  • GPT-Image-1: Advanced image generation model with customizable settings:
     ◦ Transparent backgrounds
     ◦ Multiple output formats (PNG, JPEG, WEBP)
     ◦ Customizable dimensions (square, landscape, portrait)
     ◦ Quality settings (standard, high)
     ◦ Advanced compression options (0-100 for WEBP/JPEG)
     ◦ HD options for higher quality outputs

Reasoning-Enhanced Models

VT.ai supports a special "thinking mode" with these models:

  • DeepSeek Reasoner
  • DeepSeek R1 series
  • Qwen 2.5 models
  • Claude 3 models
  • GPT-4o

Performance Considerations

When choosing models, consider these factors:

  • Speed: Models like o3 Mini, Groq-accelerated models, and Claude 3.5 Haiku are faster
  • Quality: Models like o1, GPT-4o, and Claude 3 Opus offer higher quality
  • Cost: Smaller models generally cost less to use
  • Multimodal Needs: Only some models support image analysis
  • Local Computation: Ollama models run locally but require more resources
  • API Availability: Some models may require specific API keys

API Key Configuration

For most models, you'll need to configure the appropriate API keys:

# OpenAI models (o1, o3-mini, 4o, etc.)
export OPENAI_API_KEY="sk-your-key-here"

# Anthropic models (sonnet, opus, etc.)
export ANTHROPIC_API_KEY="sk-ant-your-key-here"

# Google models (gemini series)
export GEMINI_API_KEY="your-key-here"

# DeepSeek models
export DEEPSEEK_API_KEY="your-key-here"

# Groq models
export GROQ_API_KEY="your-key-here"

# Cohere models
export COHERE_API_KEY="your-key-here"

# OpenRouter (for access to multiple providers)
export OPENROUTER_API_KEY="your-key-here"

You can also set these keys when starting VT.ai:

vtai --api-key openai=sk-your-key-here
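
To avoid re-exporting keys in every terminal session, you can persist them in your shell profile. A minimal sketch assuming a bash setup; adapt the file name for zsh or another shell:

# Append the keys you need to your shell profile (bash example)
cat >> ~/.bashrc <<'EOF'
export OPENAI_API_KEY="sk-your-key-here"
export ANTHROPIC_API_KEY="sk-ant-your-key-here"
EOF

# Reload the profile in the current session
source ~/.bashrc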

For more details on configuration, see the Configuration page.