What is AI API pricing?
AI API pricing is the cost of using model APIs for inference, usually normalized by input and output tokens, requests, images, video, or other usage units.
This AI API pricing comparison brings approved public-source listings into one table so buyers can compare model costs, providers, source links, and verification status before starting procurement.
Need a provider shortlist? Submit your model, usage volume, region, and budget. Inferras helps organize provider options before you contact vendors.
This page is for teams comparing public pricing before choosing a model API, marketplace route, or provider shortlist.
Compare AI API reseller options, pricing models and risk factors on Inferras before choosing which providers to contact.
Public comparison pages only show approved price listings with source links, verification status, and last checked dates.
| Model | Provider | Provider category | Region | Input / 1M | Output / 1M | Source | Verification | Checked | Updated | View |
|---|---|---|---|---|---|---|---|---|---|---|
| gpt-5.2 | AICopy API | Submitted provider | China / Global | CN¥0.25 | CN¥2.00 | resellerSource | Source listed | May 15, 2026 | May 15, 2026 | View |
| gpt-5.4 | AICopy API | Submitted provider | China / Global | CN¥0.50 | CN¥3.00 | resellerSource | Source listed | May 15, 2026 | May 15, 2026 | View |
| gpt-5.5 | AICopy API | Submitted provider | China / Global | CN¥0.50 | CN¥3.00 | resellerSource | Source listed | May 15, 2026 | May 15, 2026 | View |
| gpt-5.4-mini | AICopy API | Submitted provider | China / Global | CN¥0.75 | CN¥4.50 | resellerSource | Source listed | May 15, 2026 | May 15, 2026 | View |
| gpt-5.3-codex | AICopy API | Submitted provider | China / Global | CN¥1.75 | CN¥14.00 | resellerSource | Source listed | May 15, 2026 | May 15, 2026 | View |
| GPT-5.5-对话 (Chat route / Variant) | AICopy API | Submitted provider | China / Global | CN¥5.00 | CN¥30.00 | resellerSource | Source listed | May 15, 2026 | May 15, 2026 | View |
| Groq Llama 3.1 8B Instant 128k | Groq (Official source) | Official model owner | Global | $0.05 | $0.08 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Groq GPT OSS 20B 128k | Groq (Official source) | Official model owner | Global | $0.075 | $0.30 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepInfra Qwen3-32B | DeepInfra (Official source) | Official model owner | Global | $0.08 | $0.28 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepInfra Llama-3.3-70B-Instruct-Turbo | DeepInfra (Official source) | Official model owner | Global | $0.10 | $0.32 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Gemini 2.5 Flash-Lite | Google AI Gemini API (Official source) | Official model owner | Global | $0.10 | $0.40 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Groq Llama 4 Scout 17Bx16E 128k | Groq (Official source) | Official model owner | Global | $0.11 | $0.34 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Groq GPT OSS 120B 128k | Groq (Official source) | Official model owner | Global | $0.15 | $0.60 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| OpenAI: GPT-4o-mini | OpenRouter (Official source) | Official model owner | Global | $0.15 | $0.60 | marketplaceSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Together AI gpt-oss-120B | Together AI (Official source) | Official model owner | Global | $0.15 | $0.60 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Gemini 2.5 Flash | Google AI Gemini API (Official source) | Official model owner | Global | $0.15 | $1.25 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepInfra DeepSeek-V3.1 | DeepInfra (Official source) | Official model owner | Global | $0.21 | $0.79 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Gemini 3.1 Flash-Lite Preview | Google AI Gemini API (Official source) | Official model owner | Global | $0.25 | $1.50 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepInfra DeepSeek-V3.2 | DeepInfra (Official source) | Official model owner | Global | $0.26 | $0.38 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| DeepSeek: DeepSeek V4 Pro | OpenRouter (Official source) | Official model owner | Global | $0.435 | $0.87 | marketplaceSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Together AI Qwen3.6-Plus | Together AI (Official source) | Official model owner | Global | $0.50 | $3.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Together AI DeepSeek-V3.1 | Together AI (Official source) | Official model owner | Global | $0.60 | $1.70 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| GPT-5.4 mini | OpenAI (Official source) | Official model owner | Global | $0.75 | $4.50 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Haiku 3.5 | Anthropic (Official source) | Official model owner | Global | $0.80 | $4.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Fireworks Kimi K2.6 | Fireworks AI (Official source) | Official model owner | Global | $0.95 | $4.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Haiku 4.5 | Anthropic (Official source) | Official model owner | Global | $1.00 | $5.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Gemini 2.5 Pro | Google AI Gemini API (Official source) | Official model owner | Global | $1.25 | $10.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Fireworks DeepSeek V4 Pro | Fireworks AI (Official source) | Official model owner | Global | $1.74 | $3.48 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Together AI DeepSeek V4 Pro | Together AI (Official source) | Official model owner | Global | $2.10 | $4.40 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| OpenAI: GPT-4o | OpenRouter (Official source) | Official model owner | Global | $2.50 | $10.00 | marketplaceSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| GPT-5.4 | OpenAI (Official source) | Official model owner | Global | $2.50 | $15.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Sonnet 4.6 | Anthropic (Official source) | Official model owner | Global | $3.00 | $15.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| GPT-Realtime-2 text | OpenAI (Official source) | Official model owner | Global | $4.00 | $24.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Opus 4.6 | Anthropic (Official source) | Official model owner | Global | $5.00 | $25.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| Claude Opus 4.7 | Anthropic (Official source) | Official model owner | Global | $5.00 | $25.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
| GPT-5.5 | OpenAI (Official source) | Official model owner | Global | $5.00 | $30.00 | officialSource | Source listed | May 10, 2026 | May 10, 2026 | View |
Verification indicates how confidently the listed price matches its public source.
Prices are collected from public provider pages and may change over time. Source links open those pages, so verify pricing, billing units, rate limits, and terms directly on the source page before purchase.
AI API pricing
Most LLM APIs charge separately for input tokens and output tokens. Some providers also offer batch, cached input, marketplace, or enterprise tiers.
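As a rough illustration of separate input and output billing, the sketch below estimates the cost of one request from per-1M-token prices. The function name `request_cost` and the token counts are hypothetical. The prices are the Claude Haiku 4.5 row from the table above, and the sketch ignores batch, cached-input, and enterprise discounts.

```python
from decimal import Decimal

# Listed prices are quoted per one million tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1m: str, output_price_per_1m: str) -> Decimal:
    """Estimate the cost of one request in the listing's currency.

    Prices are passed as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_UNIT * Decimal(input_price_per_1m)
    output_cost = Decimal(output_tokens) / TOKENS_PER_UNIT * Decimal(output_price_per_1m)
    return input_cost + output_cost


# Hypothetical request: 2,000 prompt tokens and 500 generated tokens,
# priced at $1.00 input / $5.00 output per 1M tokens.
cost = request_cost(2_000, 500, "1.00", "5.00")
print(f"Estimated cost: ${cost}")
```

Multiplying the per-request figure by expected monthly request volume gives a first-pass budget estimate. Actual invoices depend on the provider's billing units and terms.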
The cheapest option depends on the model, provider, source type, region, usage volume, and whether you optimize for input price, output price, latency, or reliability.
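To show how the answer shifts with the workload, here is a minimal sketch that ranks a few USD listings from the table above by total cost for a given token mix. The `Listing` class, the helper names, and the token volumes are assumptions made for illustration. The sketch compares price only and says nothing about quality, latency, or reliability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    model: str
    input_per_1m: float   # USD per 1M input tokens
    output_per_1m: float  # USD per 1M output tokens


# Three rows copied from the comparison table (USD listings only).
LISTINGS = [
    Listing("Gemini 2.5 Flash", 0.15, 1.25),
    Listing("DeepInfra DeepSeek-V3.2", 0.26, 0.38),
    Listing("GPT-5.4 mini", 0.75, 4.50),
]


def workload_cost(listing: Listing, input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for the given token volumes."""
    return (input_tokens * listing.input_per_1m
            + output_tokens * listing.output_per_1m) / 1_000_000


def cheapest(listings: list[Listing], input_tokens: int, output_tokens: int) -> Listing:
    """Return the listing with the lowest total cost for this workload."""
    return min(listings, key=lambda l: workload_cost(l, input_tokens, output_tokens))


# Hypothetical workloads: one dominated by prompts, one by generated text.
input_heavy = cheapest(LISTINGS, input_tokens=20_000_000, output_tokens=1_000_000)
output_heavy = cheapest(LISTINGS, input_tokens=1_000_000, output_tokens=20_000_000)
print(input_heavy.model, "|", output_heavy.model)
```

With these numbers the listing with the lowest input price wins the prompt-heavy workload, while the listing with the lowest output price wins the generation-heavy one. That is why a single "cheapest" label rarely holds across use cases.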
Output tokens usually cost more because generating responses consumes more inference resources than reading prompt input.
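The size of that gap varies by listing. The short sketch below computes the output-to-input price ratio for three rows taken from the table above. The helper name `output_to_input_ratio` is hypothetical, and the selection of rows is illustrative rather than representative.

```python
# (input price, output price) in USD per 1M tokens, copied from the table.
PRICES = {
    "Groq Llama 3.1 8B Instant 128k": (0.05, 0.08),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.5": (5.00, 30.00),
}


def output_to_input_ratio(input_price: float, output_price: float) -> float:
    """How many times more an output token costs than an input token."""
    return output_price / input_price


ratios = {
    model: round(output_to_input_ratio(inp, out), 2)
    for model, (inp, out) in PRICES.items()
}

for model, ratio in ratios.items():
    print(f"{model}: output costs {ratio}x input")
```

A higher ratio means that trimming response length saves more than trimming prompt length for that listing.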