LLM API budget planning

LLM API Cost Calculator

Estimate monthly API spend from input and output token volume, then compare popular and low-cost models using normalized USD prices per 1M tokens.

Pricing data updated:
Prices normalized to USD per 1M tokens
Calculator estimates planning cost, not final billing

Start with your workload

Change token volume once; every model estimate updates.

Use this when budgeting a chatbot, RAG pipeline, coding assistant, or batch analysis workload before choosing a provider.

Default scenario: 1M input + 500K output
Popular models: 12
Low-cost shortlist: 12
Comparable unit: USD / 1M tokens

Estimate your workload cost

Monthly token volume

Enter input and output tokens separately; an output-heavy workload can change which model is cheapest.

This estimate uses available model metadata. Provider invoices may include routing, caching, discounts, minimums, or account-specific terms.

Popular Model Cost Estimates

Model | Provider | Input / 1M | Output / 1M | Your Cost | Context | Rank
🔥 DeepSeek V4 Flash | DeepSeek | $0.112 | $0.224 | $0.22 | 1.05M | #1
🔥 Hy3 preview | Tencent | $0.066 | $0.26 | $0.2 | 262.14K | #2
🔥 Claude Opus 4.7 | Anthropic | $5 | $25 | $17.5 | 1M | #3
🔥 Claude Sonnet 4.6 | Anthropic | $3 | $15 | $10.5 | 1M | #4
🔥 Owl Alpha | OpenRouter | $0 | $0 | $0 | 1.05M | #5
🔥 Gemini 3 Flash Preview | Google | $0.5 | $3 | $2 | 1.05M | #6
🔥 DeepSeek V3.2 | DeepSeek | $0.252 | $0.378 | $0.44 | 131.07K | #7
🔥 DeepSeek V4 Pro | DeepSeek | $0.435 | $0.87 | $0.87 | 1.05M | #8
🔥 Kimi K2.6 | MoonshotAI | $0.73 | $3.49 | $2.48 | 262.14K | #9
🔥 Step 3.5 Flash | StepFun | $0.1 | $0.3 | $0.25 | 262.14K | #10
🔥 Claude Opus 4.6 | Anthropic | $5 | $25 | $17.5 | 1M | #11
🔥 MiniMax M2.7 | MiniMax | $0.279 | $1.2 | $0.88 | 204.8K | #12

Low-Cost Shortlist

Model | Provider | Your Cost | Input / 1M | Output / 1M | Context
🔥 Owl Alpha | OpenRouter | $0 | $0 | $0 | 1.05M
🔥 Nemotron 3 Super (free) | NVIDIA | $0 | $0 | $0 | 1M
CoBuddy (free) | Baidu Qianfan | $0 | $0 | $0 | 131.07K
Nemotron 3 Nano Omni (free) | NVIDIA | $0 | $0 | $0 | 256K
Laguna XS.2 (free) | Poolside | $0 | $0 | $0 | 131.07K
Laguna M.1 (free) | Poolside | $0 | $0 | $0 | 131.07K
DeepSeek V4 Flash (free) | DeepSeek | $0 | $0 | $0 | 1.05M
Gemma 4 26B A4B (free) | Google | $0 | $0 | $0 | 262.14K
Gemma 4 31B (free) | Google | $0 | $0 | $0 | 262.14K
Trinity Large Thinking (free) | Arcee AI | $0 | $0 | $0 | 262.14K
Lyria 3 Pro Preview | Google | $0 | $0 | $0 | 1.05M
Lyria 3 Clip Preview | Google | $0 | $0 | $0 | 1.05M

Cost Planning FAQ

How does the calculator estimate cost?

It multiplies input tokens by the model's input price per 1M tokens, then adds output tokens multiplied by the model's output price per 1M tokens.
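The calculation described above can be sketched in a few lines of Python. The function name `estimate_cost` and its parameter names are illustrative assumptions, not the calculator's actual code; the prices come from the Claude Opus 4.7 row listed on this page.

```python
TOKENS_PER_MILLION = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate USD cost from token counts and per-1M-token prices.

    Hypothetical helper: a planning estimate only, ignoring caching,
    discounts, minimums, and other account-specific billing terms.
    """
    input_cost = input_tokens / TOKENS_PER_MILLION * input_price_per_m
    output_cost = output_tokens / TOKENS_PER_MILLION * output_price_per_m
    return input_cost + output_cost


# Default scenario from this page: 1M input + 500K output tokens,
# priced at $5 input and $25 output per 1M tokens.
opus = estimate_cost(1_000_000, 500_000, 5.0, 25.0)
print(f"${opus:.2f}")  # prints "$17.50"
```

The result matches the $17.5 shown for that model under the default scenario.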

Why separate input and output tokens?

Chatbots, agents, and code assistants often spend more on output tokens, while retrieval and classification workloads may be input-heavy. Separating them prevents a cheap-looking model from winning the wrong workload.
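As an illustration, the sketch below takes two price pairs listed on this page (DeepSeek V4 Flash and Hy3 preview) and compares them under two token mixes. The helper names and the output-heavy mix of 1M input + 2M output are hypothetical choices made for this example, not part of the calculator.

```python
# (input price, output price) in USD per 1M tokens, as listed on this page.
PRICES = {
    "DeepSeek V4 Flash": (0.112, 0.224),
    "Hy3 preview": (0.066, 0.26),
}


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Planning estimate in USD; prices are per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens, output_tokens):
    """Return the name of the model with the lowest estimated cost."""
    return min(
        PRICES,
        key=lambda name: estimate_cost(input_tokens, output_tokens, *PRICES[name]),
    )


# Input-heavy mix (the page's default scenario): lower input price wins.
print(cheapest(1_000_000, 500_000))    # prints "Hy3 preview"

# Output-heavy mix (assumed for illustration): lower output price wins.
print(cheapest(1_000_000, 2_000_000))  # prints "DeepSeek V4 Flash"
```

For these two price pairs the ranking flips at roughly 1.28 output tokens per input token, which is why a single blended token count can pick the wrong model.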

What should I do after estimating cost?

Open the model page or compare it against a close alternative, then verify current provider limits, discounts, and availability before production use.