[ECHO] June 4, 2026 · 4 min read

The Open-Weights Question

NVIDIA used its June 1 GTC Taipei keynote to announce Nemotron 3 Ultra, a 550-billion-parameter open-weights model scoring 48 on Artificial Analysis's Intelligence Index, the strongest open-weights model a US lab has shipped and the clearest signal yet that the open-weights category is a real procurement alternative. The buyer-side question has quietly changed shape: not which closed-source frontier lab to standardize on, but whether a team has tested an open-weights baseline against its actual workloads in the last six months. The company that sells compute to every closed-source lab just shipped the evidence that the alternative has gotten close, and most enterprise procurement teams have tested only one of the three categories now on the table.

NVIDIA used the GTC Taipei keynote on June 1 to announce Nemotron 3 Ultra, a 550-billion-parameter open-weights model with 55 billion active parameters via a hybrid mixture-of-experts and state-space design. The full release lands on June 4 across Hugging Face, ModelScope, OpenRouter, and NVIDIA's own NIM microservice catalog, with weights, training recipes, and a substantial portion of the training data published together.

On Artificial Analysis's Intelligence Index, Nemotron 3 Ultra scores 48. That puts it well ahead of the next strongest US open-weights options: Google's Gemma 4 31B at 39, NVIDIA's own Nemotron 3 Super at 36, and OpenAI's gpt-oss-120B at 33. It is the strongest open-weights model an American lab has shipped, nine points clear of the runner-up. It still trails the leading Chinese open-weights models, which top the open-weights leaderboard outright and, as Decrypt put it, cost roughly thirty times less than the closed-source frontier models most enterprise buyers have standardized on.

The asymmetry is the point of the announcement. NVIDIA is the company that sells compute to every closed-source frontier lab. When it ships its best open-weights model, the framing isn't "we're catching up to Anthropic." It's that the open-weights category itself is now a real strategic alternative for any enterprise that has been running every workload on closed-source frontier without testing the substitution.

Three buyer-side categories converged this week. Most enterprise procurement teams have only tested one.

The buyer-side data point that makes this concrete sits in OpenRouter's public rankings: Chinese AI providers accounted for under two percent of OpenRouter's weekly token volume a year ago. Their combined share now sits above forty-five percent, with Xiaomi alone running ahead of OpenAI on weekly token throughput. DeepSeek's V3.2, a 685-billion-parameter mixture-of-experts model released in late 2025, prices at roughly $0.28 per million input tokens and $0.42 per million output, which is not thirty percent cheaper than Claude Sonnet or GPT-5.5 but roughly thirty times cheaper.
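To make the pricing gap concrete, here is a minimal sketch of the per-task arithmetic. The DeepSeek V3.2 prices are the ones quoted above; the task size (2,000 input tokens, 500 output tokens) is a made-up illustration, and the closed-source figure is only an estimate derived from the rough "thirty times" ratio, not from any vendor's price list.

```python
# Prices quoted in the text for DeepSeek V3.2 (USD per million tokens).
DEEPSEEK_INPUT_PER_M = 0.28
DEEPSEEK_OUTPUT_PER_M = 0.42


def cost_per_task(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one task at per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical task size, chosen only for illustration.
open_cost = cost_per_task(2_000, 500, DEEPSEEK_INPUT_PER_M, DEEPSEEK_OUTPUT_PER_M)

# Assumption: scale by the article's rough "thirty times" ratio
# instead of looking up a real closed-source price.
closed_cost_estimate = open_cost * 30

print(f"open-weights: ${open_cost:.5f} per task")
print(f"closed-source (estimated): ${closed_cost_estimate:.5f} per task")
```

At that assumed task size, one task costs about $0.00077 at DeepSeek's quoted prices. Swap in your own token counts and your vendor's actual rates before drawing conclusions.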

Imagine a 200-engineer firm that licensed Claude Enterprise eighteen months ago because the buyer-side guidance at the time was clear: closed-source frontier, top of the leaderboard, lowest operational friction. That decision would have aged into a multi-million-dollar annual line item. The CTO would know the per-seat cost. The CTO wouldn't have run the same workload, with the same prompts, against an open-weights model self-hosted or routed through OpenRouter in over a year. The firm wouldn't know what it's actually paying for, because it wouldn't have a recent baseline.

The steelman is that operational complexity still favors closed-source: self-hosting open weights at production scale requires inference infrastructure, evaluation tooling, security review, and an MLOps team that most enterprises don't have. True for self-hosting. The argument doesn't hold for hosted open-weights inference through OpenRouter, Hugging Face, or DeepInfra, which comes with thirty-tokens-per-second throughput floors and pay-per-token pricing the procurement team already knows how to handle. The "operational complexity" objection was load-bearing in 2024. In 2026 it gates only the self-hosting path, not the open-weights path itself.

The buyer-side procurement question has quietly changed shape. Two years ago it was which closed-source frontier lab to standardize on; now it's whether you've tested an open-weights baseline against your actual workloads in the last six months. If you have, you know what closed-source is buying you. If you haven't, you're paying a premium of unknown size for a margin of capability you can't quantify, and the strongest evidence yet for how close the alternative has gotten just landed from the company that sells the compute to both sides.

What to Do With This

Pick one workload your team currently runs on a closed-source frontier model. This week, run the same prompts on Nemotron 3 Ultra via OpenRouter or NVIDIA's NIM catalog, and on DeepSeek V3.2 through any hosted endpoint. Use your real prompts, not a benchmark. Compare three numbers: quality (how often the output is usable without rework), latency, and cost per task.

If your closed-source vendor still wins on all three for that workload, you have evidence for the premium you're paying. If they don't, you have a procurement decision the rest of the org should see.
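One way to keep the comparison honest is to record every run and compute the three numbers the same way for each model. The sketch below assumes you have already collected the results by hand or from your own harness; the `Run` record, the function names, and the sample figures are all hypothetical, and "usable" is whatever your reviewers decide it means.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class Run:
    usable: bool       # was the output usable without rework?
    latency_s: float   # wall-clock seconds for the request
    cost_usd: float    # cost of this one task


def summarize(runs: list[Run]) -> dict[str, float]:
    """Reduce a list of runs to the three comparison numbers."""
    return {
        "usable_rate": mean([1.0 if r.usable else 0.0 for r in runs]),
        "median_latency_s": median([r.latency_s for r in runs]),
        "mean_cost_usd": mean([r.cost_usd for r in runs]),
    }


def wins_on_all_three(a: dict[str, float], b: dict[str, float]) -> bool:
    """True if model a is at least as good as model b on quality, latency, and cost."""
    return (a["usable_rate"] >= b["usable_rate"]
            and a["median_latency_s"] <= b["median_latency_s"]
            and a["mean_cost_usd"] <= b["mean_cost_usd"])


# Made-up sample data, for illustration only.
closed_runs = [Run(True, 1.2, 0.030), Run(True, 1.0, 0.028), Run(False, 1.4, 0.031)]
open_runs = [Run(True, 1.5, 0.001), Run(True, 1.3, 0.001), Run(False, 1.6, 0.001)]

closed_summary = summarize(closed_runs)
open_summary = summarize(open_runs)
print("closed wins on all three:", wins_on_all_three(closed_summary, open_summary))
```

In this invented sample the closed-source model is faster but no more usable and far more expensive, so it does not win on all three. A real test needs far more than three runs per model.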

Also on the Radar

Vera Rubin Enters Full Production With OpenAI, Anthropic, and SpaceX as First Customers

NVIDIA confirmed on June 1 that the Vera Rubin platform is now in full production, with OpenAI, Anthropic, and SpaceX as the first named customers. Huang cited a supply chain twice the size of the one assembled for Grace Blackwell, with rack assembly time down from roughly two hours to five minutes. The buyer-side read pairs with the lead: every frontier closed-source lab and the largest compute landlord behind them are all now on the same NVIDIA platform cadence, which means their roadmaps move together.

Ai2 Ships MolmoAct 2, an Open Robotics Control Model

Allen Institute for AI released MolmoAct 2 on June 1 alongside the GTC Taipei news cycle, positioning it as an open model for real-world robot control. Parallel signal to the Nemotron 3 Ultra story: another US institution shipping open weights into a domain (physical AI) where the closed-source moat is even thinner than in language. Worth tracking for any buyer running automation or robotics pilots through a closed vendor.


