LLM API Pricing Guide

This guide explains practical LLM API pricing trade-offs and ranks models by a standardized workload estimate so teams can start from budget constraints.

Models listed: 50

Cost example tokens: 1M input + 500K output

Normalized prices: USD / 1M

Quick shortlist

Start with Owl Alpha.

The ranking is sorted by standard workload cost, so the first rows are the cheapest candidates to take into model-quality testing.

Lead model: 🔥 Owl Alpha

Provider: OpenRouter

Sample cost: $0

Context: 1.05M

The ranking is a discovery aid, not a final recommendation. Always compare the model against your workload and verify provider pricing before production use.

LLM API Price Bands

Free or zero-price models: 34

Under $1 sample workload: 155

$1 to $5 sample workload: 100

$5+ sample workload: 87

Use price bands before shortlisting

Start with the lowest viable price band for your workload, then compare context window, provider fit, popularity signal, and output-token cost before choosing an API.
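The band assignment can be sketched as a small classifier. This is a minimal illustration, not the guide's own implementation: the function name `price_band` is hypothetical, and the treatment of the exact boundaries (a cost of exactly $1 or $5 falling into the higher band) is an assumption.

```python
def price_band(sample_cost_usd: float) -> str:
    """Map a sample workload cost in USD to one of the four price bands.

    Boundary handling is an assumption: exactly $1 counts as "$1 to $5"
    and exactly $5 counts as "$5+".
    """
    if sample_cost_usd < 0:
        raise ValueError("cost cannot be negative")
    if sample_cost_usd == 0:
        return "free"
    if sample_cost_usd < 1:
        return "under $1"
    if sample_cost_usd < 5:
        return "$1 to $5"
    return "$5+"


# Sample costs taken from the ranking table.
for name, cost in [("Owl Alpha", 0.0), ("Mistral Nemo", 0.04), ("Trinity Mini", 0.12)]:
    print(f"{name}: {price_band(cost)}")
```

Filtering to a band first keeps the later comparison of context window and output-token cost limited to models that already fit the budget.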

Watch output-heavy workloads

Assistant and chatbot products often spend more on output than input. Use the calculator when output tokens are the main driver of monthly cost.
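One way to check whether output tokens dominate a bill is to compute the share of total cost they account for. The sketch below assumes prices in USD per 1M tokens, as in the ranking table; the function name `output_share` is hypothetical.

```python
def output_share(prompt_price: float, output_price: float,
                 input_tokens: int, output_tokens: int) -> float:
    """Return the fraction of total cost attributable to output tokens.

    Prices are USD per 1M tokens. Returns 0.0 for a zero-cost workload.
    """
    input_cost = prompt_price * input_tokens / 1_000_000
    output_cost = output_price * output_tokens / 1_000_000
    total = input_cost + output_cost
    return output_cost / total if total > 0 else 0.0


# gpt-oss-120b from the table: $0.03 prompt, $0.15 output, on the sample workload.
share = output_share(0.03, 0.15, 1_000_000, 500_000)
print(f"Output share of cost: {share:.0%}")
```

With these prices, output accounts for most of the cost even though the workload has half as many output tokens as input tokens.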

How to read this ranking

Models are sorted by estimated cost for 1,000,000 input tokens and 500,000 output tokens. Use this page when your first constraint is API spend.
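The sort described above can be reproduced in a few lines. This is a sketch under stated assumptions: the `Model` type and `sample_cost` function are hypothetical names, the prices are copied from the ranking table, and the guide's tie-breaking rule is not documented, so ties here simply keep their input order.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    prompt_price: float  # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


def sample_cost(model: Model, input_tokens: int = 1_000_000,
                output_tokens: int = 500_000) -> float:
    """Estimated USD cost of the standard workload for one model."""
    return (model.prompt_price * input_tokens
            + model.output_price * output_tokens) / 1_000_000


models = [
    Model("Trinity Mini", 0.045, 0.15),
    Model("Mistral Nemo", 0.02, 0.03),
    Model("Owl Alpha", 0.0, 0.0),
    Model("Granite 4.0 Micro", 0.017, 0.112),
]

# sorted() is stable, so models with equal cost keep their original order.
ranked = sorted(models, key=sample_cost)
for m in ranked:
    print(f"{m.name}: ${sample_cost(m):.3f}")
```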

Estimate your workload cost

Customize guide costs

Prices are normalized to USD per 1M tokens.

Inputs: monthly input tokens and monthly output tokens.

This estimate uses normalized public API pricing per 1M tokens. It is a planning aid, not a billing quote. Verify provider pricing, limits, and terms before production use.

Model Ranking

Model	Provider	Prompt	Output	Sample cost	Your Cost	Context	Popularity	Release
🔥 Owl Alpha	OpenRouter	$0	$0	$0	$0	1.05M	#7	2026-04-28
[New] 🔥 Nemotron 3 Ultra (free)	NVIDIA	$0	$0	$0	$0	1M	#12	2026-06-04
🔥 Laguna M.1 (free)	Poolside	$0	$0	$0	$0	262.14K	#14	2026-04-28
Nemotron 3 Super (free)	NVIDIA	$0	$0	$0	$0	1M	#23	2026-03-11
gpt-oss-120b (free)	OpenAI	$0	$0	$0	$0	131.07K	#33	2025-08-05
Laguna XS.2 (free)	Poolside	$0	$0	$0	$0	262.14K	#47	2026-04-28
GLM 4.5 Air (free)	Z.ai	$0	$0	$0	$0	131.07K	#50	2025-07-25
gpt-oss-20b (free)	OpenAI	$0	$0	$0	$0	131.07K	#67	2025-08-05
Gemma 4 31B (free)	Google	$0	$0	$0	$0	262.14K	#68	2026-04-02
Nemotron 3 Nano 30B A3B (free)	NVIDIA	$0	$0	$0	$0	256K	#75	2025-12-14
Kimi K2.6 (free)	MoonshotAI	$0	$0	$0	$0	262.14K	#83	2026-04-20
Nemotron 3 Nano Omni (free)	NVIDIA	$0	$0	$0	$0	256K	#94	2026-04-28
Nemotron Nano 9B V2 (free)	NVIDIA	$0	$0	$0	$0	128K	#105	2025-09-05
Nemotron Nano 12B 2 VL (free)	NVIDIA	$0	$0	$0	$0	128K	#107	2025-10-28
Gemma 4 26B A4B (free)	Google	$0	$0	$0	$0	262.14K	#140	2026-04-03
[New] Nemotron 3.5 Content Safety (free)	NVIDIA	$0	$0	$0	$0	128K	#181	2026-06-04
LFM2.5-1.2B-Thinking (free)	LiquidAI	$0	$0	$0	$0	32.77K	#184	2026-01-20
LFM2.5-1.2B-Instruct (free)	LiquidAI	$0	$0	$0	$0	32.77K	#195	2026-01-20
Qwen3 Next 80B A3B Instruct (free)	Qwen	$0	$0	$0	$0	262.14K	#210	2025-09-11
Llama 3.3 70B Instruct (free)	Meta	$0	$0	$0	$0	131.07K	#213	2024-12-06
Uncensored (free)	Venice	$0	$0	$0	$0	32.77K	#242	2025-07-09
Hermes 3 405B Instruct (free)	Nous	$0	$0	$0	$0	131.07K	#257	2024-08-16
Llama 3.2 3B Instruct (free)	Meta	$0	$0	$0	$0	131.07K	#258	2024-09-25
Lyria 3 Pro Preview	Google	$0	$0	$0	$0	1.05M	#283	2026-03-30
Lyria 3 Clip Preview	Google	$0	$0	$0	$0	1.05M	#291	2026-03-30
[New] North Mini Code (free)	Cohere	$0	$0	$0	$0	256K		2026-06-17
[New] Kimi K2.7 Code (free)	MoonshotAI	$0	$0	$0	$0	262.14K		2026-06-12
[New] Nex-N2-Pro (free)	Nex AGI	$0	$0	$0	$0	262.14K		2026-06-08
CoBuddy (free)	Baidu Qianfan	$0	$0	$0	$0	131.07K		2026-05-06
DeepSeek V4 Flash (free)	DeepSeek	$0	$0	$0	$0	1.05M		2026-04-24
Trinity Large Thinking (free)	Arcee AI	$0	$0	$0	$0	262.14K		2026-04-01
MiniMax M2.5 (free)	MiniMax	$0	$0	$0	$0	204.8K		2026-02-12
Free Models Router	OpenRouter	$0	$0	$0	$0	200K		2026-02-01
Qwen3 Coder 480B A35B (free)	Qwen	$0	$0	$0	$0	1.05M		2025-07-23
Ling-2.6-flash	inclusionAI	$0.01	$0.03	$0.03	$0.03	262.14K	#43	2026-04-21
Mistral Nemo	Mistral	$0.02	$0.03	$0.04	$0.04	131.07K	#39	2024-07-19
Llama 3.1 8B Instruct	Meta	$0.02	$0.03	$0.04	$0.04	131.07K	#44	2024-07-23
Llama 3 8B Lunaris	Sao10K	$0.04	$0.05	$0.07	$0.07	8.19K	#127	2024-08-13
Granite 4.0 Micro	IBM	$0.017	$0.112	$0.07	$0.07	131K	#225	2025-10-20
Qwen2.5 7B Instruct	Qwen	$0.04	$0.1	$0.09	$0.09	131.07K	#106	2024-10-16
LFM2-24B-A2B	LiquidAI	$0.03	$0.12	$0.09	$0.09	128K	#119	2026-02-25
Mistral Small 3	Mistral	$0.05	$0.08	$0.09	$0.09	32.77K	#138	2025-01-30
MythoMax 13B	gryphe	$0.06	$0.06	$0.09	$0.09	4.1K	#193	2023-07-02
gpt-oss-20b	OpenAI	$0.029	$0.14	$0.1	$0.1	131.07K	#61	2025-08-05
Granite 4.1 8B	IBM	$0.05	$0.1	$0.1	$0.1	131.07K	#156	2026-04-30
Gemma 3 4B	Google	$0.05	$0.1	$0.1	$0.1	131.07K	#158	2025-03-13
gpt-oss-120b	OpenAI	$0.03	$0.15	$0.1	$0.1	131.07K	#22	2025-08-05
Nova Micro 1.0	Amazon	$0.035	$0.14	$0.11	$0.11	128K	#110	2024-12-05
Command R7B (12-2024)	Cohere	$0.0375	$0.15	$0.11	$0.11	128K	#228	2024-12-14
Trinity Mini	Arcee AI	$0.045	$0.15	$0.12	$0.12	131.07K	#179	2025-12-01

Pricing FAQ

How is the sample workload cost calculated?

The sample workload uses 1,000,000 input tokens plus 500,000 output tokens, then applies each model's normalized USD price per 1 million tokens.
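Written as a formula, with $p_{\text{in}}$ and $p_{\text{out}}$ denoting the normalized prompt and output prices in USD per 1M tokens (symbols introduced here for illustration), the calculation reduces to:

```latex
\[
C_{\text{sample}}
  = p_{\text{in}} \cdot \frac{1{,}000{,}000}{10^{6}}
  + p_{\text{out}} \cdot \frac{500{,}000}{10^{6}}
  = p_{\text{in}} + 0.5\, p_{\text{out}}
\]

% Worked example using the Granite 4.0 Micro row of the ranking table:
\[
C_{\text{sample}} = 0.017 + 0.5 \times 0.112 = 0.073 \approx \$0.07
\]
```

The final rounding step is an assumption: the table appears to display costs rounded to two decimal places.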

Why do input and output token prices matter separately?

Many applications are output-token heavy, while retrieval and classification workloads may be input-token heavy. Comparing both prices helps avoid picking a model that is cheap for the wrong workload shape.
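A short comparison shows how the cheaper model can flip with workload shape. The prices below come from the ranking table; the two workload shapes and the helper names `workload_cost` and `cheapest` are hypothetical, chosen only to illustrate the effect.

```python
def workload_cost(prompt_price: float, output_price: float,
                  input_tokens: int, output_tokens: int) -> float:
    """USD cost of a workload, given prices in USD per 1M tokens."""
    return (prompt_price * input_tokens + output_price * output_tokens) / 1_000_000


# (prompt price, output price) per 1M tokens, from the ranking table.
PRICES = {
    "Granite 4.0 Micro": (0.017, 0.112),
    "Mistral Small 3": (0.05, 0.08),
}

# Hypothetical monthly workload shapes: (input tokens, output tokens).
SHAPES = {
    "input-heavy": (10_000_000, 1_000_000),
    "output-heavy": (1_000_000, 10_000_000),
}


def cheapest(shape: str) -> str:
    """Name of the lowest-cost model for the given workload shape."""
    tokens = SHAPES[shape]
    return min(PRICES, key=lambda name: workload_cost(*PRICES[name], *tokens))


for shape in SHAPES:
    print(f"{shape}: {cheapest(shape)}")
```

Here the model with the lower prompt price wins the input-heavy shape, while the model with the lower output price wins the output-heavy one, even though both show similar sample costs in the table.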

Should I verify prices before production use?

Yes. AI Model Matrix normalizes public pricing metadata for comparison, but provider availability, limits, and prices can change. Always verify the final contract or provider dashboard before production use.