LLM API budget planning
LLM API Cost Calculator
Estimate monthly API spend from input and output token volume, then compare popular and low-cost models using normalized USD prices per 1M tokens.
Start with your workload
Change token volume once; every model estimate updates.
Use this when budgeting a chatbot, RAG pipeline, coding assistant, or batch analysis workload before choosing a provider.
This estimate uses available model metadata. Provider invoices may include routing, caching, discounts, minimums, or account-specific terms.
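Monthly input and output token volumes can be approximated from expected traffic. The sketch below assumes a steady workload. The function name `monthly_tokens`, the 30-day month, and the example traffic figures are illustrative assumptions, not values from this page.

```python
def monthly_tokens(requests_per_day, input_tokens_per_request,
                   output_tokens_per_request, days_per_month=30):
    """Return (input_tokens, output_tokens) per month for a steady workload."""
    requests = requests_per_day * days_per_month
    return (requests * input_tokens_per_request,
            requests * output_tokens_per_request)


# Hypothetical chatbot: 2,000 requests per day, each with roughly
# 1,500 prompt tokens and 400 completion tokens.
input_tokens, output_tokens = monthly_tokens(2_000, 1_500, 400)
print(f"{input_tokens:,} input / {output_tokens:,} output tokens per month")
```

Real traffic is rarely uniform, so it may be worth running the estimate for a typical month and a peak month.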
Popular Model Cost Estimates
| Model | Provider | Input / 1M | Output / 1M | Your Cost | Context | Popularity |
|---|---|---|---|---|---|---|
| 🔥 Claude Opus 4.7 | Anthropic | $5 | $25 | $17.5 | 1M | #1 |
| 🔥 DeepSeek V4 Flash | DeepSeek | $0.1 | $0.2 | $0.2 | 1.05M | #2 |
| 🔥 Hy3 preview | Tencent | $0.066 | $0.26 | $0.2 | 262.14K | #3 |
| 🔥 Claude Sonnet 4.6 | Anthropic | $3 | $15 | $10.5 | 1M | #4 |
| 🔥 Owl Alpha (new) | OpenRouter | $0 | $0 | $0 | 1.05M | #5 |
| 🔥 DeepSeek V4 Pro | DeepSeek | $0.435 | $0.87 | $0.87 | 1.05M | #6 |
| 🔥 DeepSeek V3.2 | DeepSeek | $0.252 | $0.378 | $0.44 | 131.07K | #7 |
| 🔥 MiMo-V2.5-Pro | Xiaomi | $0.435 | $0.87 | $0.87 | 1.05M | #8 |
| 🔥 Gemini 3 Flash Preview | Google | $0.5 | $3 | $2 | 1.05M | #9 |
| 🔥 Claude Opus 4.6 | Anthropic | $5 | $25 | $17.5 | 1M | #10 |
| 🔥 Nemotron 3 Super (free) | NVIDIA | $0 | $0 | $0 | 1M | #11 |
| 🔥 Gemini 2.5 Flash Lite | Google | $0.1 | $0.4 | $0.3 | 1.05M | #12 |
Low-Cost Shortlist
| Model | Provider | Your Cost | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|---|
| 🔥 Owl Alpha (new) | OpenRouter | $0 | $0 | $0 | 1.05M |
| 🔥 Nemotron 3 Super (free) | NVIDIA | $0 | $0 | $0 | 1M |
| 🔥 Laguna M.1 (free) (new) | Poolside | $0 | $0 | $0 | 262.14K |
| gpt-oss-120b (free) | OpenAI | $0 | $0 | $0 | 131.07K |
| GLM 4.5 Air (free) | Z.ai | $0 | $0 | $0 | 131.07K |
| Laguna XS.2 (free) (new) | Poolside | $0 | $0 | $0 | 262.14K |
| gpt-oss-20b (free) | OpenAI | $0 | $0 | $0 | 131.07K |
| Nemotron 3 Nano 30B A3B (free) | NVIDIA | $0 | $0 | $0 | 256K |
| Nemotron 3 Nano Omni (free) (new) | NVIDIA | $0 | $0 | $0 | 256K |
| Gemma 4 31B (free) | Google | $0 | $0 | $0 | 262.14K |
| Nemotron Nano 9B V2 (free) | NVIDIA | $0 | $0 | $0 | 128K |
| Nemotron Nano 12B 2 VL (free) | NVIDIA | $0 | $0 | $0 | 128K |
Cost Planning FAQ
How does the calculator estimate cost?
It divides each token count by 1M and multiplies the result by the model's matching price per 1M tokens: input tokens at the input price, plus output tokens at the output price.
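The calculation can be sketched in a few lines. The function name `estimate_cost` is hypothetical, and the workload of 1M input and 500K output tokens is an assumption: the page does not state a volume, but this one is consistent with the "Your Cost" figures listed in the tables above.

```python
TOKENS_PER_PRICE_UNIT = 1_000_000  # prices are quoted per 1M tokens


def estimate_cost(input_tokens, output_tokens,
                  input_price_per_1m, output_price_per_1m):
    """Estimate cost in USD from token counts and per-1M-token prices."""
    input_cost = input_tokens / TOKENS_PER_PRICE_UNIT * input_price_per_1m
    output_cost = output_tokens / TOKENS_PER_PRICE_UNIT * output_price_per_1m
    return input_cost + output_cost


# Assumed workload: 1M input tokens and 500K output tokens per month,
# priced at $5 input and $25 output per 1M tokens (Claude Opus 4.7 above).
cost = estimate_cost(1_000_000, 500_000, 5.0, 25.0)
print(f"${cost:.2f}")
```

This covers only the base per-token charge. Caching, discounts, and minimums are not modeled here.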
Why separate input and output tokens?
Chatbots, agents, and code assistants often spend more on output tokens, while retrieval and classification workloads may be input-heavy. Separating them prevents a cheap-looking model from winning the wrong workload.
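A short comparison shows how the cheaper model can change with the input/output mix. The prices come from the popular models table. The two workloads and the helper names (`estimate_cost`, `cheapest`) are hypothetical, chosen only to illustrate the effect.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_1m, output_price_per_1m):
    """Estimate cost in USD from token counts and per-1M-token prices."""
    total = (input_tokens * input_price_per_1m
             + output_tokens * output_price_per_1m)
    return total / 1_000_000


# (input price, output price) in USD per 1M tokens, from the table above.
PRICES = {
    "Hy3 preview": (0.066, 0.26),
    "DeepSeek V4 Flash": (0.1, 0.2),
}

# Hypothetical monthly workloads as (input tokens, output tokens).
WORKLOADS = {
    "input-heavy": (10_000_000, 1_000_000),
    "output-heavy": (1_000_000, 10_000_000),
}


def cheapest(workload_name):
    """Return the model with the lowest estimated cost for a workload."""
    input_tokens, output_tokens = WORKLOADS[workload_name]
    return min(
        PRICES,
        key=lambda model: estimate_cost(input_tokens, output_tokens,
                                        *PRICES[model]),
    )


for name in WORKLOADS:
    print(f"{name}: {cheapest(name)}")
```

Under these assumptions, Hy3 preview is cheaper for the input-heavy workload (about $0.92 versus $1.20), while DeepSeek V4 Flash is cheaper for the output-heavy one (about $2.10 versus $2.67).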
What should I do after estimating cost?
Open the model page or compare it against a close alternative, then verify current provider limits, discounts, and availability before production use.