Llama 3.3 Nemotron Super 49B V1.5

NVIDIA model details for pricing, context, and release tracking.

Prices normalized to USD per 1M tokens. Sample workload: 1M input + 500K output.
Model Specs
Provider: NVIDIA
Model ID: nvidia/llama-3.3-nemotron-super-49b-v1.5
Prompt Price: $0.1 per 1M tokens
Completion Price: $0.4 per 1M tokens
Sample Workload Cost (1M input + 500K output): $0.3
Context Window: 131.07K tokens
Release Date: 2025-10-10
Popularity Rank: Unranked
Daily Demand: N/A

Estimate your workload cost

Unit prices: $0.1 input / $0.4 output, per 1M tokens.

This estimate uses normalized public API pricing per 1M tokens. It is a planning aid, not a billing quote. Verify provider pricing, limits, and terms before production use.

Model Introduction

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

Best Fit

Llama 3.3 Nemotron Super 49B V1.5 is best suited for cost-sensitive production traffic.

Cost Example

At $0.1 per 1M input tokens and $0.4 per 1M output tokens, a workload of 1M input tokens plus 500K output tokens is estimated at $0.3 (1 × $0.1 + 0.5 × $0.4).
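
The same arithmetic can be sketched in a few lines of Python for other workload sizes. This is a minimal illustration, not a billing tool: the function name `estimate_cost` and the constant names are hypothetical, and the prices are the normalized values listed on this page, which may differ from what a provider actually charges.

```python
from decimal import Decimal

# Normalized prices from this page, in USD per 1M tokens (assumed current).
PROMPT_PRICE_PER_M = Decimal("0.1")
COMPLETION_PRICE_PER_M = Decimal("0.4")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated cost in USD for the given token counts."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_UNIT * PROMPT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_UNIT * COMPLETION_PRICE_PER_M
    return input_cost + output_cost


# The sample workload used on this page: 1M input + 500K output tokens.
sample = estimate_cost(1_000_000, 500_000)
print(f"Sample workload: ${sample:.2f}")
```

`Decimal` is used instead of floats so that small prices such as 0.1 and 0.2 add up exactly rather than picking up binary rounding error.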

Decision Shortcuts

Compare this model

Search head-to-head pages that include Llama 3.3 Nemotron Super 49B V1.5 and review input price, output price, context, and sample workload cost.

NVIDIA catalog

See other NVIDIA models before narrowing your shortlist.

Cheaper alternatives

Start from models ranked by a standard cost estimate when budget is the first constraint.
