# VT.ai Architecture
This document provides an overview of VT.ai's architecture, explaining how its components work together to create a powerful multimodal AI chat application.
## Implementation Options
VT.ai is available in two implementations:
- Python Implementation: The original implementation built with Python, Chainlit, and LiteLLM
- Rust Implementation: A high-performance port focused on efficiency and reliability
This document covers both implementations, highlighting their architectural differences and similarities.
## Python Implementation
### High-Level Architecture
The Python implementation follows a modular architecture designed for flexibility and extensibility:
```
┌─────────────────┐     ┌───────────────┐     ┌────────────────┐
│  Web Interface  │     │   Semantic    │     │     Model      │
│   (Chainlit)    │────▶│    Router     │────▶│   Providers    │
└─────────────────┘     └───────────────┘     └────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐     ┌───────────────┐     ┌────────────────┐
│   File/Media    │     │   Assistant   │     │    API Key     │
│   Processing    │     │     Tools     │     │   Management   │
└─────────────────┘     └───────────────┘     └────────────────┘
```
### Core Components
#### 1. Entry Point (`vtai/app.py`)
The main application entry point handles initialization, manages the Chainlit web interface, and coordinates the conversation flow (see the sketch after this list). It:
- Initializes the application configuration
- Sets up chat profiles and user sessions
- Manages message handling and routing
- Processes assistant runs and tool calls
- Handles user settings and text-to-speech (TTS) features
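The sketch below shows the general shape of a Chainlit entry point: a session-start hook plus a message handler. It is a minimal illustration, not VT.ai's actual `app.py`; `route_query` is a hypothetical stand-in for the routing layer described next.

```python
# Minimal sketch of a Chainlit entry point (illustrative, not VT.ai's actual app.py).
import chainlit as cl


async def route_query(text: str) -> str:
    # Hypothetical stand-in for the semantic routing layer described below.
    return f"(echo) {text}"


@cl.on_chat_start
async def start() -> None:
    # Initialize per-session state, such as the selected model.
    cl.user_session.set("model", "gpt-4o")


@cl.on_message
async def on_message(message: cl.Message) -> None:
    # Classify and dispatch the user's text, then send the reply back.
    reply = await route_query(message.content)
    await cl.Message(content=reply).send()
```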
#### 2. Routing Layer (`vtai/router/`)
The semantic routing system classifies user queries and directs them to appropriate handlers based on intent (a classification sketch follows this list):
- Encoder: Uses FastEmbed with the BAAI/bge-small-en-v1.5 embedding model
- Classification: Performs vector similarity matching against predefined intents
- Dynamic Dispatch: Routes queries to specialized conversation handlers
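As a rough illustration of this approach, the sketch below embeds a query and a set of example utterances with FastEmbed, then picks the intent with the highest cosine similarity. The intent names and utterances are invented for the example; VT.ai's actual route definitions live in `vtai/router/`.

```python
# Sketch of embedding-based intent routing (intents and utterances are illustrative).
import numpy as np
from fastembed import TextEmbedding

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")

INTENTS = {
    "image-generation": ["draw a picture of", "generate an image"],
    "text-processing": ["summarize this", "explain the following"],
}


def classify(query: str) -> str:
    # Embed the query once, then compare against each intent's utterances.
    q = next(iter(model.embed([query])))
    best_intent, best_score = "", -1.0
    for intent, utterances in INTENTS.items():
        for emb in model.embed(utterances):
            score = float(np.dot(q, emb) / (np.linalg.norm(q) * np.linalg.norm(emb)))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent


print(classify("please generate an image of a cat"))  # -> "image-generation"
```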
#### 3. Model Management
The Python implementation uses LiteLLM as a unified interface to multiple AI providers (a usage sketch follows this list):
- Provider Abstraction: Standardizes API calls across different model providers
- Model Switching: Allows seamless switching between models based on query needs
- Error Handling: Provides consistent error handling across providers
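A minimal sketch of what the unified interface buys you: the same call shape works across providers, with only the model string changing. The model names here are examples, not VT.ai's configured defaults.

```python
# Sketch of provider-agnostic calls through LiteLLM; model names are examples.
from litellm import completion

messages = [{"role": "user", "content": "Explain semantic routing in one sentence."}]

# The same call shape works across providers; only the model string changes.
for model in ("openai/gpt-4o", "anthropic/claude-3-5-sonnet-20240620"):
    try:
        response = completion(model=model, messages=messages)
        print(model, "->", response.choices[0].message.content)
    except Exception as err:  # LiteLLM normalizes provider-specific errors
        print(model, "failed:", err)
```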
#### 4. Utility Modules (`vtai/utils/`)
Various utility modules provide supporting functionality (one recurring pattern is sketched after this list):
- Configuration Management: Handles API keys, settings, and environment variables
- Conversation Handlers: Processes different types of conversations (standard, thinking mode)
- Media Processors: Handles image analysis, generation, and TTS features
- File Handlers: Manages file uploads and processing
- Error Handlers: Provides consistent error handling and reporting
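As one example of the pattern these modules encapsulate, here is a hedged sketch of a uniform error-handling wrapper; the decorator and handler names are hypothetical, not VT.ai's actual helpers.

```python
# Hypothetical sketch of a uniform error-handling wrapper for async handlers.
import functools
import logging

logger = logging.getLogger("vtai")


def handle_errors(fallback: str):
    """Wrap an async handler so failures are logged once and surface a uniform message."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            try:
                return await fn(*args, **kwargs)
            except Exception:
                logger.exception("handler %s failed", fn.__name__)
                return fallback
        return wrapper
    return decorator


@handle_errors(fallback="Sorry, something went wrong with that request.")
async def analyze_image(path: str) -> str:
    raise RuntimeError("demo failure")  # stands in for a real media processor
```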
#### 5. Assistant Tools (`vtai/assistants/`)
Implements specialized assistant capabilities (see the sketch after this list):
- Code Interpreter: Executes Python code for data analysis
- Thread Management: Handles persistent conversation threads
- Tool Processing: Manages function calls and tool outputs
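The sketch below shows the kind of calls this module wraps, using the OpenAI Assistants API's code-interpreter tool. It is a generic example of that (beta) API rather than VT.ai's exact code.

```python
# Sketch of running the code-interpreter tool via the OpenAI Assistants API (beta).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

assistant = client.beta.assistants.create(
    name="data-analyst",
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}],
)
thread = client.beta.threads.create()  # persistent conversation thread
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Compute the mean of [3, 1, 4, 1, 5] using Python.",
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)
if run.status == "completed":
    for msg in client.beta.threads.messages.list(thread_id=thread.id):
        print(msg.role, ":", msg.content[0].text.value)
```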
## Rust Implementation
### High-Level Architecture
The Rust implementation follows a similar architecture but with performance-focused components:
```
┌─────────────────┐     ┌───────────────┐     ┌────────────────┐
│   Web Server    │     │   Semantic    │     │     Model      │
│     (Axum)      │────▶│    Router     │────▶│   Providers    │
└─────────────────┘     └───────────────┘     └────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐     ┌───────────────┐     ┌────────────────┐
│    WebSocket    │     │     Tool      │     │ Configuration  │
│    Handlers     │     │   Registry    │     │   Management   │
└─────────────────┘     └───────────────┘     └────────────────┘
```
### Core Components
#### 1. Web Server (`src/app/`)
The entry point for the Rust implementation is based on the Axum web framework:
- HTTP Server: Handles REST API endpoints
- WebSocket Server: Manages real-time chat communication
- Static Files: Serves the web interface assets
- API Endpoints: Exposes model selection and configuration endpoints
#### 2. Routing System (`src/router/`)
The Rust implementation includes a semantic routing system similar to the Python version:
- Intent Classification: Uses embeddings for query classification
- Router Registry: Manages available routers and their capabilities
- Dynamic Dispatch: Routes requests to appropriate handlers
#### 3. Tool Integration (`src/tools/`)
The Rust implementation provides a robust tool integration system (the registry pattern is sketched after this list):
- Tool Registry: Central registry for available tools
- Code Execution: Safely runs code in isolated environments
- File Operations: Handles file uploads and processing
- Search Tools: Provides search capabilities within conversations
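In Rust this is expressed as a `Tool` trait plus a registry type. To keep this page's examples in a single language, the same registry pattern is sketched below in Python; all names are illustrative, not the Rust implementation's actual types.

```python
# Language-agnostic sketch of the tool-registry pattern (names illustrative).
from typing import Callable, Dict


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, tool: Callable[[str], str]) -> None:
        # Central registration point: tools are looked up by name at call time.
        self._tools[name] = tool

    def invoke(self, name: str, payload: str) -> str:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](payload)


registry = ToolRegistry()
registry.register("echo", lambda payload: payload)
print(registry.invoke("echo", "hello"))  # -> "hello"
```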
#### 4. Assistant Integration (`src/assistants/`)
Implements the OpenAI Assistant API integration:
- Assistant Management: Creates and manages assistants
- Thread Handling: Manages conversation threads
- Tool Invocation: Handles tool calls from the assistant
#### 5. Utilities (`src/utils/`)
Common utilities shared across the application:
- Error Handling: Comprehensive error types and handling
- Configuration: Environment-based configuration
- Model Management: Model alias resolution and provider abstraction
- Authentication: API key management and validation
## Data Flow Comparison
Both implementations follow a similar data flow pattern with some implementation-specific differences:
### Python Implementation
1. User Input: The user submits a message through the Chainlit web interface
2. Message Processing: The message is processed to extract text and media
3. Intent Classification: The semantic router classifies the query intent
4. Model Selection: A specific model is selected based on the classification
5. Query Execution: The query is sent to the appropriate model provider via LiteLLM
6. Response Processing: The response is processed (and may include media generation)
7. UI Rendering: The formatted response is displayed in the web interface
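Read end to end, steps 3 through 6 amount to a small pipeline. The sketch below reuses the hypothetical `classify` helper from the routing sketch earlier on this page; the intent-to-model mapping is invented for illustration.

```python
# Hypothetical end-to-end sketch of steps 3-6 (reuses `classify` from the
# routing sketch in the "Routing Layer" section above).
from litellm import completion

MODEL_FOR_INTENT = {
    "image-generation": "openai/gpt-4o",       # illustrative mapping only
    "text-processing": "openai/gpt-4o-mini",
}


def answer(query: str) -> str:
    intent = classify(query)                                # step 3: intent classification
    model = MODEL_FOR_INTENT.get(intent, "openai/gpt-4o")   # step 4: model selection
    response = completion(                                  # step 5: query execution
        model=model,
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content              # step 6: response processing
```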
### Rust Implementation
1. User Input: The user submits a message through the web interface over WebSocket
2. Message Processing: The WebSocket handler processes the incoming message
3. Intent Classification: The router classifies the message intent
4. Model Selection: The appropriate model is selected based on the classification
5. Query Execution: The query is sent directly to the model provider API
6. Response Streaming: The response is streamed back to the client via WebSocket
7. UI Rendering: The client-side JavaScript renders the response in the interface
## Configuration System
Both implementations use a similar configuration approach with some implementation differences:
### Python Implementation
Uses a layered configuration system with the following priority order (a resolution sketch follows this list):
1. Command-line arguments: Highest priority
2. Environment variables: Secondary priority
3. Configuration files: `~/.config/vtai/.env` for persistence
4. Default values: Used when no specific settings are provided
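A compact sketch of that resolution order; the flag and variable names are illustrative rather than VT.ai's actual ones.

```python
# Sketch of layered configuration resolution (flag/variable names illustrative).
import argparse
import os
from pathlib import Path

from dotenv import load_dotenv

# 3. Configuration file: lowest-priority source that still persists across runs.
load_dotenv(Path.home() / ".config" / "vtai" / ".env")

parser = argparse.ArgumentParser()
parser.add_argument("--model", default=None)
args = parser.parse_args()

# 1. CLI flag beats 2. environment variable beats 4. built-in default.
model = args.model or os.environ.get("VTAI_DEFAULT_MODEL") or "gpt-4o"
print("using model:", model)
```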
### Rust Implementation
Uses a Rust-native configuration approach:
1. Command-line arguments: Processed using the `clap` crate
2. Environment variables: Read using the `dotenv` crate
3. Configuration files: Similar structure to the Python implementation
4. Default values: Defined using Rust's `Option` type and `Default` trait
## Extension Points
Both implementations provide extension points for customization:
### Python Implementation
- New Model Providers: Add support for additional AI providers
- Custom Intents: Extend the semantic router with new intent classifications (see the sketch after this list)
- Specialized Handlers: Create custom handlers for specific conversation types
- New Assistant Tools: Add new tools for specialized tasks
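Building on the routing sketch earlier on this page, adding a custom intent comes down to registering new utterances and mapping the intent to a handler; everything below is illustrative rather than VT.ai's actual extension API.

```python
# Illustrative: extending the earlier routing sketch with a new intent
# (reuses the INTENTS dict from the "Routing Layer" section above).
from litellm import completion

INTENTS["code-review"] = [
    "review this code",
    "find bugs in the following function",
]


def handle_code_review(query: str) -> str:
    # A specialized handler might choose a code-oriented model or system prompt.
    response = completion(
        model="openai/gpt-4o",
        messages=[
            {"role": "system", "content": "You are a careful code reviewer."},
            {"role": "user", "content": query},
        ],
    )
    return response.choices[0].message.content


HANDLERS = {"code-review": handle_code_review}
```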
### Rust Implementation
- New Model Providers: Implement new provider traits
- Custom Routers: Add new router implementations to the registry
- Tool Extensions: Implement the Tool trait for new functionality
- Middleware Components: Add new middleware for request/response processing
For details on extending either implementation, see the Extending VT.ai guide.