Provider Comparison
**Referenced Files in This Document**
- [main.rs](file://src/main.rs)
- [free_models.json](file://openrouter_models/free_models.json)
- [all_models.json](file://openrouter_models/all_models.json)
- [get_free_models.py](file://bin/get_free_models.py)
- [readme.md](file://readme.md)

Table of Contents
- Introduction
- Cost Analysis
- Latency and Performance
- Privacy Considerations
- Model Quality and Reliability
- Ease of Setup and Configuration
- Hybrid Strategies and Fallback Mechanisms
- Decision-Making Guidance
- Configuration Patterns
Introduction
This document compares the LLM providers supported by the aicommit tool: OpenRouter, Ollama, and OpenAI-compatible endpoints. It covers cost, latency, privacy, model quality, reliability, and ease of setup, with particular attention to the hybrid strategy of using Ollama by default with OpenRouter as fallback and to recommendations for different user profiles.
Cost Analysis
The cost structure varies significantly between provider types:
Free vs. Paid Models
OpenRouter offers both free and paid models, with free models identified by “:free” in their ID or zero pricing. The system maintains a curated list of preferred free models in order of preference, prioritizing larger parameter models like “meta-llama/llama-4-maverick:free” and “nvidia/llama-3.1-nemotron-ultra-253b-v1:free”. Paid models on OpenRouter have transparent pricing per 1k tokens for both prompt and completion.
Ollama and OpenAI-compatible local instances are effectively free after initial setup costs, as they run on local hardware without per-token charges. However, cloud-based OpenAI-compatible services may have their own pricing structures.
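The free-model rule described above (a ":free" marker in the ID, or zero pricing) can be sketched as a small predicate. This is a minimal illustration, not the tool's actual code: the `ModelPricing` struct and the `is_free_model` function are hypothetical names introduced here.

```rust
/// Prices for one model in USD per 1k tokens.
/// Hypothetical struct for illustration; not taken from the aicommit source.
#[derive(Debug, Clone, PartialEq)]
struct ModelPricing {
    prompt_per_1k: f64,
    completion_per_1k: f64,
}

/// A model counts as free if its ID contains ":free"
/// or if both of its prices are zero.
fn is_free_model(id: &str, pricing: &ModelPricing) -> bool {
    let zero_priced = pricing.prompt_per_1k == 0.0 && pricing.completion_per_1k == 0.0;
    id.contains(":free") || zero_priced
}
```

For example, `is_free_model("openai/gpt-4-turbo", &pricing)` is false for any non-zero pricing, while an ID such as "meta-llama/llama-4-maverick:free" matches on the suffix alone.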
Token Pricing Structure
OpenRouter’s pricing is dynamically fetched from their API, with costs varying by model capability. For example:
- High-end models like “openai/gpt-4-turbo” cost $0.01 per 1k prompt tokens and $0.03 per 1k completion tokens
- Mid-tier models like “mistralai/mistral-small-3.1-24b-instruct:free” are available at no cost
- Entry-level models have minimal costs, often fractions of a cent per 1k tokens
The system tracks usage costs and displays them after each commit generation, providing transparency into expenses.
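Per-1k-token pricing translates into a request cost with simple arithmetic. The sketch below assumes the token counts are known after the request; `Usage` and `estimate_cost_usd` are hypothetical names, and the tool's own cost tracking may be structured differently.

```rust
/// Token counts for one request (hypothetical type for illustration).
struct Usage {
    prompt_tokens: u32,
    completion_tokens: u32,
}

/// Estimate the cost of a request in USD from per-1k-token prices.
fn estimate_cost_usd(usage: &Usage, prompt_per_1k: f64, completion_per_1k: f64) -> f64 {
    let prompt = f64::from(usage.prompt_tokens) / 1000.0 * prompt_per_1k;
    let completion = f64::from(usage.completion_tokens) / 1000.0 * completion_per_1k;
    prompt + completion
}
```

With the "openai/gpt-4-turbo" prices quoted above, 1,500 prompt tokens and 500 completion tokens come to roughly 1.5 × $0.01 + 0.5 × $0.03 = $0.03.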
```mermaid
flowchart TD
    A[Start] --> B{Provider Type}
    B --> |OpenRouter| C[Check Pricing Model]
    C --> D{Free Model?}
    D --> |Yes| E[Zero Cost]
    D --> |No| F[Calculate based on token usage]
    B --> |Ollama| G[Local Execution - No Direct Cost]
    B --> |OpenAI-Compatible| H{Cloud or Local?}
    H --> |Cloud| I[Follow Provider Pricing]
    H --> |Local| J[Hardware Cost Only]
```
Latency and Performance
Latency characteristics differ substantially between cloud and local execution models:
Cloud-Based Services (OpenRouter)
Cloud-based services typically have higher latency due to network round-trip times, API processing, and potential queueing during peak usage. The system applies a 30-second timeout to OpenRouter requests, which caps how long a slow request can block rather than describing a typical response time. Performance can vary based on:
- Network connectivity quality
- Server load at the provider
- Model complexity and size
- Request queue position
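The effect of a request timeout can be illustrated with the standard library alone. This is only a sketch of the idea: the real tool most likely sets the timeout on its HTTP client, and `run_with_timeout` is a hypothetical helper introduced here.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `work` on a helper thread and stop waiting after `timeout`.
/// Returns `None` if the deadline passes (or if `work` panics).
/// The helper thread is not cancelled; its late result is simply dropped.
fn run_with_timeout<T, F>(timeout: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(work());
    });
    rx.recv_timeout(timeout).ok()
}
```

A caller mirroring the 30-second limit above would pass `Duration::from_secs(30)` and treat `None` as a failed attempt.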
Local Models (Ollama)
Local models eliminate network transmission time, so they can respond with lower latency when running on capable hardware. Response times are primarily determined by:
- Local hardware specifications (CPU/GPU)
- Model size and complexity
- System resource availability
- Ollama server optimization
The trade-off is that high-performance local inference requires substantial computational resources, particularly for larger models.
Performance Monitoring
The system includes built-in performance tracking through model statistics that record success/failure rates and timestamps. This data informs the model selection algorithm, favoring consistently responsive models while temporarily avoiding those with repeated failures.
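A minimal sketch of such per-model statistics is shown below. The struct and field names are assumptions made for illustration and may not match what the tool stores.

```rust
use std::time::SystemTime;

/// Per-model counters and timestamps (hypothetical field names).
#[derive(Debug, Default)]
struct ModelStats {
    success_count: u32,
    failure_count: u32,
    last_success: Option<SystemTime>,
    last_failure: Option<SystemTime>,
}

impl ModelStats {
    /// Record the outcome of one request at time `now`.
    fn record(&mut self, succeeded: bool, now: SystemTime) {
        if succeeded {
            self.success_count += 1;
            self.last_success = Some(now);
        } else {
            self.failure_count += 1;
            self.last_failure = Some(now);
        }
    }

    /// Fraction of requests that succeeded, or `None` before any request.
    fn success_rate(&self) -> Option<f64> {
        let total = self.success_count + self.failure_count;
        if total == 0 {
            None
        } else {
            Some(f64::from(self.success_count) / f64::from(total))
        }
    }
}
```

Passing the timestamp in as a parameter, rather than calling `SystemTime::now()` inside `record`, keeps the bookkeeping easy to test.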
```mermaid
flowchart LR
    A[Request Initiated] --> B{Execution Location}
    B --> |Cloud| C[Network Transmission]
    C --> D[Remote Processing]
    D --> E[Network Return]
    E --> F[Response Received]
    B --> |Local| G[Local Processing]
    G --> H[Response Generated]
    style C stroke:#ff6b6b,stroke-width:2px
    style D stroke:#4ecdc4,stroke-width:2px
    style E stroke:#ff6b6b,stroke-width:2px
    style G stroke:#4ecdc4,stroke-width:2px
    classDef network fill:#ffe6e6,stroke:#ff6b6b;
    classDef processing fill:#e6f7f7,stroke:#4ecdc4;
    class C,E network
    class D,G processing
```
Privacy Considerations
Privacy implications vary significantly between provider types, representing a critical decision factor:
Data Transmission Risks
Cloud-based providers (OpenRouter, cloud OpenAI-compatible) require transmitting code changes over the internet, potentially exposing sensitive information:
- Git diffs containing code changes are sent to external servers
- Contextual information about project structure and functionality is shared
- Authentication tokens must be trusted to the provider
The system mitigates risks by:
- Using secure HTTPS connections
- Supporting API key management
- Providing clear indication of data transmission
Local Execution Benefits
Ollama and local OpenAI-compatible servers offer superior privacy:
- All code analysis occurs on the local machine
- No data leaves the user’s environment
- Complete control over model inputs and outputs
- No dependency on external service availability
This makes local execution ideal for projects with sensitive codebases, proprietary algorithms, or strict compliance requirements.
Hybrid Privacy Strategy
The recommended configuration uses Ollama as the primary provider with OpenRouter as fallback, balancing privacy and reliability:
- Default to local processing for maximum privacy
- Fall back to cloud services only when local models fail
- Maintain offline capability for sensitive environments
Model Quality and Reliability
Model quality and reliability are assessed through multiple dimensions:
Quality Assessment
Model quality is primarily determined by:
- Parameter count and model architecture
- Training data quality and recency
- Specialization (general vs. coding-specific)
- Context window size
The system prioritizes higher-parameter models in its selection algorithm, with preferences ordered from largest to smallest within categories. Models like “meta-llama/llama-4-scout:free” (512K context) and “nvidia/llama-3.1-nemotron-ultra-253b-v1:free” (253B parameters) represent the high end of available free models.
Reliability Mechanisms
The system implements several reliability features:
- Retry Logic: Up to 3 retry attempts by default when generation fails
- Model Jail System: Temporarily disables models with consecutive failures
- Blacklisting: Excludes consistently problematic models for a longer period, until a manual unjail or a retry period elapses
- Fallback Selection: Chooses models based on historical performance
The jail system applies escalating penalties:
- Initial failure: Monitor closely
- Consecutive failures: Temporary jail (hours)
- Repeated issues: Blacklisting (days)
This keeps the tool usable even when individual models experience issues.
```mermaid
stateDiagram-v2
    [*] --> Active
    Active --> Jailed : Consecutive failures
    Jailed --> Active : Jail period expired
    Jailed --> Blacklisted : Repeated offenses
    Blacklisted --> Active : Manual unjail or retry period
    Active --> Active : Successful requests
    Active --> Failed : Request failure
    Failed --> Active : Isolated incident
    Failed --> Jailed : Consecutive pattern
```
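The escalation from monitoring to jail to blacklist can be sketched as a small state machine. The thresholds and the doubling of jail terms below are assumptions chosen for illustration, as are all type and constant names; the tool's actual values are not stated in this document.

```rust
/// Status of a model in the jail system.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ModelStatus {
    Active,
    /// Temporarily disabled for this many hours.
    Jailed { hours: u32 },
    /// Excluded until a manual unjail or a retry period measured in days.
    Blacklisted,
}

// Assumed thresholds, for illustration only.
const MAX_CONSECUTIVE_FAILURES: u32 = 3;
const JAILS_BEFORE_BLACKLIST: u32 = 3;
const BASE_JAIL_HOURS: u32 = 24;

#[derive(Debug)]
struct ModelRecord {
    status: ModelStatus,
    consecutive_failures: u32,
    jail_count: u32,
}

impl ModelRecord {
    fn new() -> Self {
        ModelRecord { status: ModelStatus::Active, consecutive_failures: 0, jail_count: 0 }
    }

    /// A success clears the run of consecutive failures.
    fn on_success(&mut self) {
        self.consecutive_failures = 0;
    }

    fn on_failure(&mut self) {
        self.consecutive_failures += 1;
        if self.consecutive_failures < MAX_CONSECUTIVE_FAILURES {
            return; // isolated incident: keep monitoring
        }
        self.consecutive_failures = 0;
        self.jail_count += 1;
        self.status = if self.jail_count >= JAILS_BEFORE_BLACKLIST {
            ModelStatus::Blacklisted
        } else {
            // Escalate: each jail term is twice as long as the previous one.
            ModelStatus::Jailed { hours: BASE_JAIL_HOURS << (self.jail_count - 1) }
        };
    }

    /// Called when a jail term expires or the user unjails the model.
    fn release(&mut self) {
        self.status = ModelStatus::Active;
    }
}
```

Under these assumed thresholds, a model is jailed for 24 hours after its first run of three failures, 48 hours after the second, and blacklisted after the third.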
Ease of Setup and Configuration
The configuration system supports multiple provider types with flexible setup options:
Configuration Structure
The system uses a JSON configuration file (~/.aicommit.json) that supports multiple providers simultaneously:
```json
{
  "providers": [{
    "id": "provider-id",
    "provider": "provider-type",
    "configuration": { /* provider-specific settings */ }
  }],
  "active_provider": "provider-id",
  "retry_attempts": 3
}
```
Each provider type has specific configuration requirements:
- OpenRouter: API key, model name, URL
- Ollama: API URL, model name
- OpenAI-Compatible: API key, URL, model name
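The per-provider requirements listed above map naturally onto an enum with one variant per provider type. This is a sketch under assumptions: the variant and field names are hypothetical and the tool's real configuration types may look different.

```rust
/// Provider-specific settings, one variant per supported provider type.
/// Names are assumptions for illustration; the real config keys may differ.
#[derive(Debug, Clone, PartialEq)]
enum ProviderSettings {
    OpenRouter { api_key: String, model: String, url: String },
    Ollama { url: String, model: String },
    OpenAiCompatible { api_key: String, url: String, model: String },
}

impl ProviderSettings {
    /// Ollama is the only provider type in the list without an API key.
    fn needs_api_key(&self) -> bool {
        !matches!(self, ProviderSettings::Ollama { .. })
    }

    /// Every provider type carries a model name.
    fn model(&self) -> &str {
        match self {
            ProviderSettings::OpenRouter { model, .. }
            | ProviderSettings::Ollama { model, .. }
            | ProviderSettings::OpenAiCompatible { model, .. } => model.as_str(),
        }
    }
}
```

Encoding the requirements in the type means a provider entry without its mandatory fields cannot be constructed at all, instead of failing later at request time.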
Setup Methods
Multiple setup methods are available:
- Interactive: Guided setup with prompts
- Non-interactive: Command-line arguments for automation
- Manual editing: Direct JSON configuration
The system supports seamless switching between providers by changing the active_provider field, enabling easy experimentation and fallback configurations.
Hybrid Strategies and Fallback Mechanisms
The system supports hybrid strategies that combine local and cloud providers:
Default Fallback Strategy
The recommended approach uses Ollama by default with OpenRouter as fallback:
- Primary: Ollama (local, private, no cost)
- Secondary: OpenRouter free models (cloud, reliable, no cost)
- Tertiary: OpenRouter paid models (cloud, high quality, paid)
This strategy maximizes privacy while ensuring reliability through cloud fallbacks.
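The primary, secondary, tertiary ordering amounts to trying providers in sequence until one succeeds. The sketch below models each provider as a plain function; the `Generate` alias and `generate_with_fallback` are hypothetical names, and how the tool itself wires providers together is not shown in this document.

```rust
/// A provider is modelled as a function that tries to turn a diff into a
/// commit message (hypothetical signature for illustration).
type Generate = fn(&str) -> Result<String, String>;

/// Try each provider in order. On success, return the provider name and the
/// message; if all fail, return every error that was collected.
fn generate_with_fallback(
    providers: &[(&str, Generate)],
    diff: &str,
) -> Result<(String, String), Vec<String>> {
    let mut errors = Vec::new();
    for &(name, generate) in providers {
        match generate(diff) {
            Ok(message) => return Ok((name.to_string(), message)),
            Err(e) => errors.push(format!("{name}: {e}")),
        }
    }
    Err(errors)
}
```

Because the list is ordered, the diff is only sent to a cloud provider after every earlier (local) entry has failed, which is the privacy property the strategy relies on.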
Simple Free OpenRouter Intelligence
The SimpleFreeOpenRouter provider implements advanced model selection:
- Preferred Model List: Uses curated PREFERRED_FREE_MODELS ordering
- Dynamic Selection: Queries OpenRouter API for currently available models
- Performance Tracking: Maintains statistics on model success/failure rates
- Intelligent Fallback: Falls back to predefined list if API unavailable
The selection algorithm follows this priority:
- Previously successful model (if available)
- Preferred models in order
- Largest available model by parameter count
- Least recently jailed model
- Any available model as last resort
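The priority list above can be sketched as a single selection function. This is one possible reading, not the tool's implementation: the `Candidate` fields are hypothetical, models of unknown size are treated as size 0, and the "least recently jailed" step is assumed to apply only when no unjailed model remains.

```rust
/// Minimal view of a candidate model (hypothetical fields for illustration).
#[derive(Debug, Clone)]
struct Candidate {
    id: String,
    /// Parameter count in billions, if known.
    params_b: Option<u32>,
    jailed: bool,
    /// Hours since the model was last jailed; `None` if never jailed.
    hours_since_jailed: Option<u64>,
}

/// Find a model by ID, skipping jailed ones.
fn find_usable<'a>(available: &'a [Candidate], id: &str) -> Option<&'a Candidate> {
    available.iter().find(|c| !c.jailed && c.id == id)
}

fn select_model<'a>(
    available: &'a [Candidate],
    preferred: &[&str],
    last_success: Option<&str>,
) -> Option<&'a str> {
    // 1. Previously successful model, if it is still usable.
    if let Some(c) = last_success.and_then(|id| find_usable(available, id)) {
        return Some(c.id.as_str());
    }
    // 2. First usable entry of the preferred list, in order.
    if let Some(c) = preferred.iter().find_map(|id| find_usable(available, id)) {
        return Some(c.id.as_str());
    }
    // 3. Largest usable model; unknown sizes count as 0.
    let largest = available
        .iter()
        .filter(|c| !c.jailed)
        .max_by_key(|c| c.params_b.unwrap_or(0));
    if let Some(c) = largest {
        return Some(c.id.as_str());
    }
    // 4. Everything is jailed: take the model jailed longest ago.
    let oldest_jail = available
        .iter()
        .filter(|c| c.hours_since_jailed.is_some())
        .max_by_key(|c| c.hours_since_jailed);
    if let Some(c) = oldest_jail {
        return Some(c.id.as_str());
    }
    // 5. Last resort: any model at all.
    available.first().map(|c| c.id.as_str())
}
```

Each step only runs when the previous one found nothing, so a model that worked last time is reused even if a larger one has since become available.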
Retry and Recovery
The system implements robust retry logic:
- Configurable retry attempts (default: 3)
- 5-second delay between attempts
- Comprehensive error reporting
- Final failure with detailed diagnostics
```mermaid
flowchart TD
    A[Generate Message] --> B{Success?}
    B --> |Yes| C[Return Result]
    B --> |No| D{Retry Limit Reached?}
    D --> |No| E[Wait 5 Seconds]
    E --> A
    D --> |Yes| F[Return Error]
    style A stroke:#4ecdc4,stroke-width:2px
    style C stroke:#2ecc71,stroke-width:2px
    style F stroke:#e74c3c,stroke-width:2px
    classDef success fill:#d5f5e3,stroke:#2ecc71;
    classDef failure fill:#fadbd8,stroke:#e74c3c;
    classDef process fill:#ebf5fb,stroke:#4ecdc4;
    class C success
    class F failure
    class A,D,E process
```
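The retry loop in the flowchart can be sketched as a generic helper. The function name `retry` and its signature are assumptions for illustration; only the defaults (3 attempts, 5-second delay) come from the description above.

```rust
use std::thread;
use std::time::Duration;

/// Call `attempt` up to `max_attempts` times, sleeping `delay` between
/// failed attempts. Returns the first success, or all collected errors.
/// The closure receives the 1-based attempt number.
fn retry<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut attempt: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, Vec<E>> {
    let mut errors = Vec::new();
    for n in 1..=max_attempts {
        match attempt(n) {
            Ok(value) => return Ok(value),
            Err(e) => errors.push(e),
        }
        if n < max_attempts {
            thread::sleep(delay); // no sleep after the final attempt
        }
    }
    Err(errors)
}
```

A call matching the documented defaults would be `retry(3, Duration::from_secs(5), |_| generate())`, where `generate` stands for whatever produces the commit message. Returning every error, rather than only the last, supports the detailed diagnostics on final failure.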
Decision-Making Guidance
Recommendations vary by user profile and requirements:
Individual Developers
For individual developers, prioritize:
- Privacy: Use Ollama with local models
- Cost: Leverage free OpenRouter models
- Convenience: Simple setup with minimal configuration
Recommended configuration: Ollama as primary, SimpleFreeOpenRouter as backup.
Teams
For development teams, consider:
- Consistency: Standardized configurations across team members
- Reliability: Multiple fallback options
- Cost Management: Track and limit expenses
Recommended approach: Centralized configuration with OpenRouter paid models for consistency, supplemented by local Ollama instances for privacy-sensitive work.
Enterprise
For enterprise environments:
- Security: On-premises Ollama instances
- Compliance: Avoid transmission of sensitive code
- Scalability: Dedicated inference servers
- Support: Commercially supported models
Enterprise strategy should emphasize air-gapped operation with periodic updates, avoiding external API dependencies for production systems.
Configuration Patterns
The system enables seamless switching between providers through configuration:
Configuration File Structure
The ~/.aicommit.json file supports multiple providers with an active provider selector:
```json
{
  "providers": [
    {
      "id": "uuid-1",
      "provider": "ollama",
      "url": "http://localhost:11434",
      "model": "llama2"
    },
    {
      "id": "uuid-2",
      "provider": "openrouter",
      "api_key": "your-key",
      "model": "mistralai/mistral-tiny"
    }
  ],
  "active_provider": "uuid-1"
}
```
Switching Providers
Providers can be switched via:
- Command line: `aicommit --set <provider-id>`
- Configuration edit: Manual JSON modification
- Interactive setup: Guided provider addition
The system validates the selected provider exists before updating the configuration.
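The validation step can be sketched as follows. The `Config` and `Provider` types are simplified stand-ins introduced for this example, and the error text is hypothetical.

```rust
/// Simplified provider entry; other fields are omitted in this sketch.
#[derive(Debug)]
struct Provider {
    id: String,
}

/// Simplified view of the configuration file (hypothetical type).
#[derive(Debug)]
struct Config {
    providers: Vec<Provider>,
    active_provider: String,
}

/// Switch the active provider, refusing IDs that are not configured.
/// On error the configuration is left unchanged.
fn set_active_provider(config: &mut Config, id: &str) -> Result<(), String> {
    if config.providers.iter().any(|p| p.id == id) {
        config.active_provider = id.to_string();
        Ok(())
    } else {
        Err(format!("provider '{id}' not found in configuration"))
    }
}
```

Checking before assigning means a mistyped ID cannot leave the config pointing at a provider that does not exist.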
Environment Variables
Configuration can be overridden with environment variables:
- `AICOMMIT_PROVIDER`: Set active provider
- `AICOMMIT_MODEL`: Override model selection
- `AICOMMIT_TEMPERATURE`: Adjust creativity level
This enables dynamic configuration in CI/CD pipelines and automated environments.
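An override of this kind usually means "use the environment variable if present, otherwise the configured value". The sketch below takes that reading; `resolve_setting` and `from_process_env` are hypothetical helpers, and ignoring blank values is an assumption of this example rather than documented behavior.

```rust
/// Resolve a setting: an environment variable, when set and non-blank,
/// takes precedence over the value from the config file.
/// Ignoring blank values is an assumption made for this sketch.
fn resolve_setting(
    env_name: &str,
    configured: &str,
    lookup: impl Fn(&str) -> Option<String>,
) -> String {
    match lookup(env_name) {
        Some(value) if !value.trim().is_empty() => value,
        _ => configured.to_string(),
    }
}

/// Lookup backed by the real process environment.
fn from_process_env(name: &str) -> Option<String> {
    std::env::var(name).ok()
}
```

In normal use the call would be `resolve_setting("AICOMMIT_MODEL", "llama2", from_process_env)`; injecting the lookup keeps the precedence rule testable without touching the process environment.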