Provider Comparison

**Referenced Files in This Document**

- [main.rs](file://src/main.rs)
- [free_models.json](file://openrouter_models/free_models.json)
- [all_models.json](file://openrouter_models/all_models.json)
- [get_free_models.py](file://bin/get_free_models.py)
- [readme.md](file://readme.md)

Table of Contents

  1. Introduction
  2. Cost Analysis
  3. Latency and Performance
  4. Privacy Considerations
  5. Model Quality and Reliability
  6. Ease of Setup and Configuration
  7. Hybrid Strategies and Fallback Mechanisms
  8. Decision-Making Guidance
  9. Configuration Patterns

Introduction

This document provides a comprehensive comparison of the supported LLM providers in the aicommit tool: OpenRouter, Ollama, and OpenAI-compatible endpoints. The analysis covers key dimensions including cost, latency, privacy, model quality, reliability, and ease of setup. Special attention is given to the hybrid strategy of using Ollama by default with OpenRouter as fallback, and the decision-making process for different user profiles.

Cost Analysis

The cost structure varies significantly between provider types:

Free vs. Paid Models

OpenRouter offers both free and paid models, with free models identified by “:free” in their ID or zero pricing. The system maintains a curated list of preferred free models in order of preference, prioritizing larger parameter models like “meta-llama/llama-4-maverick:free” and “nvidia/llama-3.1-nemotron-ultra-253b-v1:free”. Paid models on OpenRouter have transparent pricing per 1k tokens for both prompt and completion.

Ollama and OpenAI-compatible local instances are effectively free after initial setup costs, as they run on local hardware without per-token charges. However, cloud-based OpenAI-compatible services may have their own pricing structures.
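The free-model check described above can be sketched as follows. The `pricing` shape (string-valued prices keyed by `prompt` and `completion`) mirrors what the OpenRouter models API returns, but the helper itself is illustrative rather than the tool's actual code:

```python
def is_free_model(model: dict) -> bool:
    """Heuristic from the text: a model is free if its ID ends in
    ":free" or both its prompt and completion prices are zero."""
    if model["id"].endswith(":free"):
        return True
    pricing = model.get("pricing", {})
    # OpenRouter returns prices as strings, e.g. "0" or "0.00000025"
    return all(float(pricing.get(k, "1")) == 0 for k in ("prompt", "completion"))

models = [
    {"id": "meta-llama/llama-4-maverick:free",
     "pricing": {"prompt": "0", "completion": "0"}},
    {"id": "mistralai/mistral-tiny",
     "pricing": {"prompt": "0.00000025", "completion": "0.00000025"}},
]
free = [m["id"] for m in models if is_free_model(m)]  # → ['meta-llama/llama-4-maverick:free']
```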

Token Pricing Structure

OpenRouter’s pricing is dynamically fetched from their API, with costs varying by model capability: larger, more capable models generally cost more per token than smaller ones.

The system tracks usage costs and displays them after each commit generation, providing transparency into expenses.
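As an illustration of the per-1k-token pricing model, a minimal cost estimate might look like the sketch below. The prices used are made-up placeholders, not real OpenRouter rates:

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_price_per_1k: float, completion_price_per_1k: float) -> float:
    """Cost = tokens/1000 * price-per-1k, summed over prompt and completion."""
    return (prompt_tokens / 1000) * prompt_price_per_1k \
         + (completion_tokens / 1000) * completion_price_per_1k

# Hypothetical prices: $0.00025 per 1k tokens for both prompt and completion.
cost = estimate_cost(1200, 150, 0.00025, 0.00025)
```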

```mermaid
flowchart TD
A[Start] --> B{Provider Type}
B --> |OpenRouter| C[Check Pricing Model]
C --> D{Free Model?}
D --> |Yes| E[Zero Cost]
D --> |No| F[Calculate based on token usage]
B --> |Ollama| G[Local Execution - No Direct Cost]
B --> |OpenAI-Compatible| H{Cloud or Local?}
H --> |Cloud| I[Follow Provider Pricing]
H --> |Local| J[Hardware Cost Only]
```

Latency and Performance

Latency characteristics differ substantially between cloud and local execution models:

Cloud-Based Services (OpenRouter)

Cloud-based services typically have higher latency due to network round-trip times, API processing, and potential queueing during peak usage. The system implements a 30-second timeout for requests to OpenRouter, reflecting typical response times. Performance can also vary with network conditions, current provider load, and the specific model selected.

Local Models (Ollama)

Local models offer significantly lower latency when running on capable hardware, as they eliminate network transmission time. Response times are primarily determined by local hardware capability (CPU/GPU and available memory) and the size of the selected model.

The trade-off is that high-performance local inference requires substantial computational resources, particularly for larger models.

Performance Monitoring

The system includes built-in performance tracking through model statistics that record success/failure rates and timestamps. This data informs the model selection algorithm, favoring consistently responsive models while temporarily avoiding those with repeated failures.
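A minimal sketch of this kind of tracking, with illustrative field names rather than the tool's actual schema:

```python
import time

class ModelStats:
    """Per-model success/failure counts plus a last-used timestamp,
    mirroring the statistics described above (names are illustrative)."""
    def __init__(self) -> None:
        self.stats: dict[str, dict] = {}

    def record(self, model_id: str, success: bool) -> None:
        s = self.stats.setdefault(
            model_id, {"success": 0, "failure": 0, "last_used": None})
        s["success" if success else "failure"] += 1
        s["last_used"] = time.time()

    def success_rate(self, model_id: str) -> float:
        s = self.stats.get(model_id)
        if not s or (s["success"] + s["failure"]) == 0:
            return 0.0
        return s["success"] / (s["success"] + s["failure"])
```

A selection algorithm can then favor models with high success rates and recent activity.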

```mermaid
flowchart LR
A[Request Initiated] --> B{Execution Location}
B --> |Cloud| C[Network Transmission]
C --> D[Remote Processing]
D --> E[Network Return]
E --> F[Response Received]
B --> |Local| G[Local Processing]
G --> H[Response Generated]
style C stroke:#ff6b6b,stroke-width:2px
style D stroke:#4ecdc4,stroke-width:2px
style E stroke:#ff6b6b,stroke-width:2px
style G stroke:#4ecdc4,stroke-width:2px
classDef network fill:#ffe6e6,stroke:#ff6b6b;
classDef processing fill:#e6f7f7,stroke:#4ecdc4;
class C,E network
class D,G processing
```

Privacy Considerations

Privacy implications vary significantly between provider types, representing a critical decision factor:

Data Transmission Risks

Cloud-based providers (OpenRouter and cloud OpenAI-compatible services) require transmitting code changes over the internet, which can expose sensitive information such as proprietary logic or credentials embedded in diffs.

This risk can be mitigated by reviewing staged changes before generation and by preferring local providers for sensitive repositories.

Local Execution Benefits

Ollama and local OpenAI-compatible servers offer superior privacy: code and diffs never leave the local machine, and no third party processes the data.

This makes local execution ideal for projects with sensitive codebases, proprietary algorithms, or strict compliance requirements.

Hybrid Privacy Strategy

The recommended configuration uses Ollama as the primary provider with OpenRouter as fallback, balancing privacy and reliability: routine commits stay entirely local, while the cloud fallback keeps the tool usable when the local model is unavailable.

Model Quality and Reliability

Model quality and reliability are assessed through multiple dimensions:

Quality Assessment

Model quality is primarily determined by parameter count and context window size.

The system prioritizes higher-parameter models in its selection algorithm, with preferences ordered from largest to smallest within categories. Models like “meta-llama/llama-4-scout:free” (512K context) and “nvidia/llama-3.1-nemotron-ultra-253b-v1:free” (253B parameters) represent the high end of available free models.
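The size preference can be approximated by parsing the parameter count that many model IDs embed (e.g. `253b`). This heuristic is a sketch, not the tool's actual implementation:

```python
import re

def parameter_count_billions(model_id: str) -> int:
    """Extract a parameter-count hint like '253b' from a model ID.
    Returns 0 when the ID carries no size hint (illustrative heuristic)."""
    match = re.search(r"(\d+)b\b", model_id)
    return int(match.group(1)) if match else 0

ids = [
    "meta-llama/llama-4-scout:free",
    "nvidia/llama-3.1-nemotron-ultra-253b-v1:free",
]
# Sort largest first; IDs without a size hint sink to the end.
ids.sort(key=parameter_count_billions, reverse=True)
```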

Reliability Mechanisms

The system implements reliability features including per-model success/failure tracking and a “jail” system that temporarily removes failing models from rotation.

The jail system implements escalating penalties: an isolated failure is tolerated, consecutive failures jail the model for a cooling-off period, and repeated offenses can escalate to blacklisting until a manual unjail or a retry period elapses.

This ensures reliable operation even when individual models experience issues.
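A toy version of an escalating penalty schedule, assuming a doubling jail time; the base duration and growth factor here are illustrative, not the tool's actual values:

```python
def jail_duration_minutes(offense_count: int, base: int = 5) -> int:
    """Escalating penalty sketch: each repeat offense doubles the jail
    time (base=5 minutes and the doubling factor are assumptions)."""
    if offense_count < 1:
        raise ValueError("offense_count must be >= 1")
    return base * (2 ** (offense_count - 1))
```

Under this schedule a model would sit out 5, 10, then 20 minutes for its first three offenses.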

```mermaid
stateDiagram-v2
[*] --> Active
Active --> Jailed : Consecutive failures
Jailed --> Active : Jail period expired
Jailed --> Blacklisted : Repeated offenses
Blacklisted --> Active : Manual unjail or retry period
Active --> Active : Successful requests
Active --> Failed : Request failure
Failed --> Active : Isolated incident
Failed --> Jailed : Consecutive pattern
```

Ease of Setup and Configuration

The configuration system supports multiple provider types with flexible setup options:

Configuration Structure

The system uses a JSON configuration file (~/.aicommit.json) that supports multiple providers simultaneously:

```json
{
  "providers": [{
    "id": "provider-id",
    "provider": "provider-type",
    "configuration": { /* provider-specific settings */ }
  }],
  "active_provider": "provider-id",
  "retry_attempts": 3
}
```

Each provider type has specific configuration requirements: Ollama needs a reachable server URL and model name, OpenRouter needs an API key and model ID, and OpenAI-compatible endpoints additionally need a base URL.

Setup Methods

Multiple setup methods are available, from editing ~/.aicommit.json directly to the options described in the project readme.

The system supports seamless switching between providers by changing the active_provider field, enabling easy experimentation and fallback configurations.

Hybrid Strategies and Fallback Mechanisms

The system implements sophisticated hybrid strategies for optimal performance:

Default Fallback Strategy

The recommended approach uses Ollama by default with OpenRouter as fallback: requests go to the local Ollama instance first, and the tool falls back to OpenRouter only when the local model fails or is unreachable.

This strategy maximizes privacy while ensuring reliability through cloud fallbacks.

Simple Free OpenRouter Intelligence

The SimpleFreeOpenRouter provider implements advanced model selection:

  1. Preferred Model List: Uses curated PREFERRED_FREE_MODELS ordering
  2. Dynamic Selection: Queries OpenRouter API for currently available models
  3. Performance Tracking: Maintains statistics on model success/failure rates
  4. Intelligent Fallback: Falls back to predefined list if API unavailable

The selection algorithm follows this priority:

  1. Previously successful model (if available)
  2. Preferred models in order
  3. Largest available model by parameter count
  4. Least recently jailed model
  5. Any available model as last resort

Retry and Recovery

The system implements robust retry logic:

```mermaid
flowchart TD
A[Generate Message] --> B{Success?}
B --> |Yes| C[Return Result]
B --> |No| D{Retry Limit Reached?}
D --> |No| E[Wait 5 Seconds]
E --> A
D --> |Yes| F[Return Error]
style A stroke:#4ecdc4,stroke-width:2px
style C stroke:#2ecc71,stroke-width:2px
style F stroke:#e74c3c,stroke-width:2px
classDef success fill:#d5f5e3,stroke:#2ecc71;
classDef failure fill:#fadbd8,stroke:#e74c3c;
classDef process fill:#ebf5fb,stroke:#4ecdc4;
class C success
class F failure
class A,D,E process
```
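The loop in the flowchart reduces to a small retry helper. The `retry_attempts` parameter matches the configuration field shown earlier, and the five-second wait comes from the diagram; the helper itself is an illustrative sketch:

```python
import time

def generate_with_retry(generate, retry_attempts: int = 3, wait_seconds: float = 5):
    """Call generate(); on failure wait and retry, up to retry_attempts
    total attempts. Re-raises the last error when all attempts fail."""
    last_error = None
    for attempt in range(retry_attempts):
        try:
            return generate()
        except Exception as exc:
            last_error = exc
            if attempt < retry_attempts - 1:
                time.sleep(wait_seconds)
    raise last_error
```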

Decision-Making Guidance

Recommendations vary by user profile and requirements:

Individual Developers

For individual developers, prioritize cost (free local or free OpenRouter models), privacy, and minimal setup effort.

Recommended configuration: Ollama as primary, SimpleFreeOpenRouter as backup.

Teams

For development teams, consider output consistency across machines, shared configuration management, and predictable costs.

Recommended approach: Centralized configuration with OpenRouter paid models for consistency, supplemented by local Ollama instances for privacy-sensitive work.

Enterprise

For enterprise environments, data governance and compliance requirements typically rule out sending source code to external APIs.

Enterprise strategy should emphasize air-gapped operation with periodic updates, avoiding external API dependencies for production systems.

Configuration Patterns

The system enables seamless switching between providers through configuration:

Configuration File Structure

The ~/.aicommit.json file supports multiple providers with an active provider selector:

```json
{
  "providers": [
    {
      "id": "uuid-1",
      "provider": "ollama",
      "url": "http://localhost:11434",
      "model": "llama2"
    },
    {
      "id": "uuid-2",
      "provider": "openrouter",
      "api_key": "your-key",
      "model": "mistralai/mistral-tiny"
    }
  ],
  "active_provider": "uuid-1"
}
```

Switching Providers

Providers can be switched by changing the active_provider field in the configuration file.

The system validates the selected provider exists before updating the configuration.
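That validation step can be sketched as follows; the field names follow the configuration example above, but the function itself is illustrative:

```python
def switch_provider(config: dict, provider_id: str) -> dict:
    """Validate that the requested provider exists before changing
    active_provider, as described above (a minimal sketch)."""
    known = {p["id"] for p in config.get("providers", [])}
    if provider_id not in known:
        raise ValueError(f"unknown provider: {provider_id}")
    config["active_provider"] = provider_id
    return config
```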

Environment Variables

Configuration can be overridden with environment variables, which take precedence over values in ~/.aicommit.json.

This enables dynamic configuration in CI/CD pipelines and automated environments.
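The usual override pattern looks like the sketch below. Note that `AICOMMIT_MODEL` is a hypothetical variable name used only for illustration, not necessarily one the tool actually reads:

```python
import os

def effective_value(env_var: str, config_value):
    """Generic override pattern: an environment variable, when set,
    takes precedence over the file-based value. The variable name in
    the example below is hypothetical."""
    return os.environ.get(env_var, config_value)

model = effective_value("AICOMMIT_MODEL", "llama2")
```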
