Sort by Your cost when your monthly bill is the first constraint. Check both input and output prices before choosing a chatbot, agent, or batch workload model.
Workload-first model discovery
LLM API Model Finder
Filter the model catalog by token volume, use case, provider, and minimum context window before opening model pages or comparison search.
Shortlist by constraints
Start from workload shape instead of a model name.
Use this finder when you know the rough monthly token volume, context requirement, provider preference, or workload type but still need a practical shortlist.
The finder is a discovery aid, not a benchmark. Compare the shortlisted models and verify current provider terms before production use.
Estimate your workload cost
Monthly token volume
Costs use the selected token volume and normalized model prices. Missing price fields appear as N/A and rank behind priced models for cost-first views.
Recommended shortlist
| Model | Provider | Input / 1M | Output / 1M | Your Cost | Context | Rank | Next Step |
|---|---|---|---|---|---|---|---|
| 🔥DeepSeek V4 Flash | DeepSeek | $0.112 | $0.224 | $0.22 | 1.05M | #1 | Open model |
| 🔥Hy3 preview | Tencent | $0.066 | $0.26 | $0.2 | 262.14K | #2 | Open model |
| 🔥Claude Sonnet 4.6 | Anthropic | $3 | $15 | $10.5 | 1M | #3 | Open model |
| 🔥Owl Alpha | OpenRouter | $0 | $0 | $0 | 1.05M | #4 | Open model |
| 🔥DeepSeek V3.2 | DeepSeek | $0.252 | $0.378 | $0.44 | 131.07K | #5 | Open model |
| 🔥Gemini 3 Flash Preview | $0.5 | $3 | $2 | 1.05M | #6 | Open model | |
| 🔥Claude Opus 4.7 | Anthropic | $5 | $25 | $17.5 | 1M | #7 | Open model |
| 🔥DeepSeek V4 Pro | DeepSeek | $0.435 | $0.87 | $0.87 | 1.05M | #8 | Open model |
| 🔥Step 3.5 Flash | StepFun | $0.1 | $0.3 | $0.25 | 262.14K | #9 | Open model |
| 🔥MiniMax M2.7 | MiniMax | $0.279 | $1.2 | $0.88 | 204.8K | #10 | Open model |
| 🔥Nemotron 3 Super (free) | NVIDIA | $0 | $0 | $0 | 1M | #11 | Open model |
| 🔥Kimi K2.6 | MoonshotAI | $0.73 | $3.49 | $2.48 | 262.14K | #12 | Open model |
How to use the shortlist
Set a minimum context window before comparing RAG, long-document, or codebase workflows. A cheaper model may still be the wrong fit when context is tight.
Use provider filtering when procurement, region, account limits, or existing integrations matter. Then compare close alternatives inside the provider hub.