LLM API budget planning
LLM API Cost Calculator
Estimate monthly API spend from input and output token volume, then compare popular and low-cost models using normalized USD prices per 1M tokens.
Start with your workload
Change token volume once; every model estimate updates.
Use this when budgeting a chatbot, RAG pipeline, coding assistant, or batch analysis workload before choosing a provider.
Estimate your workload cost
This estimate uses available model metadata. Actual provider invoices may differ because of routing, caching, discounts, minimums, or account-specific terms.
Popular Model Cost Estimates
| Model | Provider | Input / 1M | Output / 1M | Your Cost | Context | Rank |
|---|---|---|---|---|---|---|
| 🔥DeepSeek V4 Flash | DeepSeek | $0.112 | $0.224 | $0.22 | 1.05M | #1 |
| 🔥Hy3 preview | Tencent | $0.066 | $0.26 | $0.2 | 262.14K | #2 |
| 🔥Claude Opus 4.7 | Anthropic | $5 | $25 | $17.5 | 1M | #3 |
| 🔥Claude Sonnet 4.6 | Anthropic | $3 | $15 | $10.5 | 1M | #4 |
| 🔥Owl Alpha | OpenRouter | $0 | $0 | $0 | 1.05M | #5 |
| 🔥Gemini 3 Flash Preview | Google | $0.5 | $3 | $2 | 1.05M | #6 |
| 🔥DeepSeek V3.2 | DeepSeek | $0.252 | $0.378 | $0.44 | 131.07K | #7 |
| 🔥DeepSeek V4 Pro | DeepSeek | $0.435 | $0.87 | $0.87 | 1.05M | #8 |
| 🔥Kimi K2.6 | MoonshotAI | $0.73 | $3.49 | $2.48 | 262.14K | #9 |
| 🔥Step 3.5 Flash | StepFun | $0.1 | $0.3 | $0.25 | 262.14K | #10 |
| 🔥Claude Opus 4.6 | Anthropic | $5 | $25 | $17.5 | 1M | #11 |
| 🔥MiniMax M2.7 | MiniMax | $0.279 | $1.2 | $0.88 | 204.8K | #12 |
Low-Cost Shortlist
| Model | Provider | Your Cost | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|---|
| 🔥Owl Alpha | OpenRouter | $0 | $0 | $0 | 1.05M |
| 🔥Nemotron 3 Super (free) | NVIDIA | $0 | $0 | $0 | 1M |
| CoBuddy (free) | Baidu Qianfan | $0 | $0 | $0 | 131.07K |
| Nemotron 3 Nano Omni (free) | NVIDIA | $0 | $0 | $0 | 256K |
| Laguna XS.2 (free) | Poolside | $0 | $0 | $0 | 131.07K |
| Laguna M.1 (free) | Poolside | $0 | $0 | $0 | 131.07K |
| DeepSeek V4 Flash (free) | DeepSeek | $0 | $0 | $0 | 1.05M |
| Gemma 4 26B A4B (free) | Google | $0 | $0 | $0 | 262.14K |
| Gemma 4 31B (free) | Google | $0 | $0 | $0 | 262.14K |
| Trinity Large Thinking (free) | Arcee AI | $0 | $0 | $0 | 262.14K |
| Lyria 3 Pro Preview | Google | $0 | $0 | $0 | 1.05M |
| Lyria 3 Clip Preview | Google | $0 | $0 | $0 | 1.05M |
Cost Planning FAQ
How does the calculator estimate cost?
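For each model, it divides your monthly input tokens by 1,000,000 and multiplies the result by the model's input price per 1M tokens, then adds your monthly output tokens divided by 1,000,000 and multiplied by the model's output price per 1M tokens.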
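The calculation can be sketched in a few lines of Python. The function name `estimate_monthly_cost` is hypothetical, and the workload of 1M input and 500K output tokens is an assumption: it is not stated on the page, but it reproduces the "Your Cost" values in the tables above (for example, $10.5 for Claude Sonnet 4.6).

```python
def estimate_monthly_cost(input_tokens, output_tokens,
                          input_price_per_1m, output_price_per_1m):
    """Estimate monthly spend in USD from token volume and per-1M prices.

    Hypothetical helper: prices are USD per 1M tokens, as in the tables.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_1m
    output_cost = output_tokens / 1_000_000 * output_price_per_1m
    return input_cost + output_cost


# Assumed workload: 1M input tokens and 500K output tokens per month.
# Prices for Claude Sonnet 4.6 are taken from the table: $3 in, $15 out.
cost = estimate_monthly_cost(1_000_000, 500_000, 3.0, 15.0)
print(f"${cost:.2f}")  # prints $10.50
```

This covers only the per-token price. Caching, discounts, and minimums are not modeled here.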
Why separate input and output tokens?
Chatbots, agents, and code assistants often spend more on output tokens, while retrieval and classification workloads may be input-heavy. Separating them prevents a cheap-looking model from winning the wrong workload.
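As a rough illustration, the sketch below compares two models from the table under two workload mixes. The prices are copied from the table. The two workloads (10M in / 1M out and 1M in / 10M out) and the helper names `cost` and `cheapest` are assumptions chosen for the example.

```python
# (input price, output price) in USD per 1M tokens, copied from the table.
PRICES = {
    "DeepSeek V4 Flash": (0.112, 0.224),
    "Hy3 preview": (0.066, 0.26),
}


def cost(model, input_tokens, output_tokens):
    """Monthly cost in USD for one model (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens / 1_000_000 * input_price
            + output_tokens / 1_000_000 * output_price)


def cheapest(input_tokens, output_tokens):
    """Return the name of the lowest-cost model for this workload."""
    return min(PRICES, key=lambda m: cost(m, input_tokens, output_tokens))


# Assumed example workloads: (monthly input tokens, monthly output tokens).
workloads = {
    "input-heavy": (10_000_000, 1_000_000),
    "output-heavy": (1_000_000, 10_000_000),
}

for label, (inp, out) in workloads.items():
    print(label, "->", cheapest(inp, out))
# input-heavy -> Hy3 preview
# output-heavy -> DeepSeek V4 Flash
```

Hy3 preview has the lower input price and DeepSeek V4 Flash the lower output price, so the cheaper model flips when the mix changes. A single blended price would hide that.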
What should I do after estimating cost?
Open the model page or compare it against a close alternative, then verify current provider limits, discounts, and availability before production use.