Nemotron 3 Ultra is now available on Vercel AI Gateway. It is an open mixture-of-experts model with a 1M-token context window. The model supports multi-turn agent workflows, including planning, tool use, sub-agent delegation, and error recovery. Throughput reaches 350 tokens per second, with up to 30% lower cost on agentic tasks. To access it, set model to nvidia/nemotron-3-ultra-550b-a55b in the AI SDK.
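As a rough illustration of what setting the model looks like, here is a minimal TypeScript sketch. Only the model ID comes from the announcement; the helper name askNemotron and the local types GenerateOptions, GenerateResult, and GenerateFn are hypothetical, and the text-generation function is passed in as a parameter so the sketch does not depend on any particular AI SDK version.

```typescript
// Model ID as given in the announcement.
const MODEL_ID = "nvidia/nemotron-3-ultra-550b-a55b";

// Hypothetical local types: a small subset of the options and result
// that a text-generation call is assumed to accept and return.
interface GenerateOptions {
  model: string;
  prompt: string;
}

interface GenerateResult {
  text: string;
}

type GenerateFn = (options: GenerateOptions) => Promise<GenerateResult>;

// Hypothetical helper: sends a single prompt to Nemotron 3 Ultra through
// whatever generate function the caller supplies and returns the text.
async function askNemotron(generate: GenerateFn, prompt: string): Promise<string> {
  const { text } = await generate({ model: MODEL_ID, prompt });
  return text;
}
```

In an application, the generate argument would typically be a thin wrapper around the AI SDK's generateText function, for example (opts) => generateText(opts). This assumes an AI SDK version that routes plain string model IDs through AI Gateway and that gateway credentials are already configured in the environment.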
[AINews] NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark
Summary by ByteBrief