What is AI API pricing?
AI API pricing is the cost of using model APIs for inference, usually normalized by input and output tokens, requests, images, video, or other usage units.
This cheapest AI API comparison sorts approved public-source listings by low input price, then output price, so buyers can quickly shortlist lower-cost inference options.
This cheapest AI API comparison sorts approved public-source listings by low input price, then output price, so buyers can quickly shortlist lower-cost inference options.
Teams comparing public pricing before choosing a model API, marketplace route, or provider shortlist.
Public comparison pages only show approved price listings with source links, verification status, and last checked dates.
| Model | Provider | Region | Input / 1M | Output / 1M | Latency | Uptime | Source | Verification | Checked | Updated | View |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Groq Llama 3.1 8B Instant 128k | GroqVerified | Global | $0.05 | $0.08 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Groq GPT OSS 20B 128k | GroqVerified | Global | $0.075 | $0.30 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepInfra Qwen3-32B | DeepInfraVerified | Global | $0.08 | $0.28 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepInfra Llama-3.3-70B-Instruct-Turbo | DeepInfraVerified | Global | $0.10 | $0.32 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Gemini 2.5 Flash-Lite | Google AI Gemini APIVerified | Global | $0.10 | $0.40 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Groq Llama 4 Scout 17Bx16E 128k | GroqVerified | Global | $0.11 | $0.34 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Groq GPT OSS 120B 128k | GroqVerified | Global | $0.15 | $0.60 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Together AI gpt-oss-120B | Together AIVerified | Global | $0.15 | $0.60 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| OpenAI: GPT-4o-mini | OpenRouterVerified | Global | $0.15 | $0.60 | N/A | N/A | marketplaceSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Gemini 2.5 Flash | Google AI Gemini APIVerified | Global | $0.15 | $1.25 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepInfra DeepSeek-V3.1 | DeepInfraVerified | Global | $0.21 | $0.79 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Gemini 3.1 Flash-Lite Preview | Google AI Gemini APIVerified | Global | $0.25 | $1.50 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepInfra DeepSeek-V3.2 | DeepInfraVerified | Global | $0.26 | $0.38 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepSeek: DeepSeek V4 Pro | OpenRouterVerified | Global | $0.435 | $0.87 | N/A | N/A | marketplaceSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Together AI Qwen3.6-Plus | Together AIVerified | Global | $0.50 | $3.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Together AI DeepSeek-V3.1 | Together AIVerified | Global | $0.60 | $1.70 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| GPT-5.4 mini | OpenAIVerified | Global | $0.75 | $4.50 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Haiku 3.5 | AnthropicVerified | Global | $0.80 | $4.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Fireworks Kimi K2.6 | Fireworks AIVerified | Global | $0.95 | $4.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Haiku 4.5 | AnthropicVerified | Global | $1.00 | $5.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Gemini 2.5 Pro | Google AI Gemini APIVerified | Global | $1.25 | $10.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Fireworks DeepSeek V4 Pro | Fireworks AIVerified | Global | $1.74 | $3.48 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Together AI DeepSeek V4 Pro | Together AIVerified | Global | $2.10 | $4.40 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| OpenAI: GPT-4o | OpenRouterVerified | Global | $2.50 | $10.00 | N/A | N/A | marketplaceSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| GPT-5.4 | OpenAIVerified | Global | $2.50 | $15.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Sonnet 4.6 | AnthropicVerified | Global | $3.00 | $15.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| GPT-Realtime-2 text | OpenAIVerified | Global | $4.00 | $24.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Opus 4.6 | AnthropicVerified | Global | $5.00 | $25.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Opus 4.7 | AnthropicVerified | Global | $5.00 | $25.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| GPT-5.5 | OpenAIVerified | Global | $5.00 | $30.00 | N/A | N/A | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
Verification indicates how confidently the listed price matches its public source.
Prices are collected from public provider pages and may change over time. Use the source links to confirm the latest provider pricing.
Source note: Source links open public provider pages. Prices may change over time.
cheapest AI API
AI API pricing is the cost of using model APIs for inference, usually normalized by input and output tokens, requests, images, video, or other usage units.
Most LLM APIs charge separately for input tokens and output tokens. Some providers also offer batch, cached input, marketplace, or enterprise tiers.
The cheapest option depends on the model, provider, source type, region, usage volume, and whether you optimize for input price, output price, latency, or reliability.
Output tokens usually cost more because generating responses consumes more inference resources than reading prompt input.