
NVIDIA’s Nemotron 3 Ultra enables long-running AI agents with efficient reasoning, context retention, and tool use across extended interactions.
Agent orchestration
Handles the hardest calls in agent workflows, such as sustaining architectural decisions across coding sessions.
Long-horizon planning
Manages complex, multi-step tasks with extended planning horizons, as shown in EnterpriseOps-Gym benchmarks.
Coding and terminal tasks
Handles terminal-based coding tasks for automated development workflows, as measured on benchmarks such as Terminal-Bench 2.0.
Instruction following
Maintains high accuracy on complex instruction-following tasks (IFBench: 82%).
Knowledge work
Excels at professional work tasks, including search-based knowledge work (ProfBench Search: 56%).
Long-context processing
Handles context windows of up to 1 million tokens (RULER @1M: 95%), enabling analysis of extensive documents or research sources.
Hybrid Mamba-Transformer layers
Combines state-space model and transformer architectures for efficient long-context handling across extended agent interactions.
NVFP4 quantization
Enables deployment across multiple GPU architectures with up to 5x higher throughput compared to standard precision.
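To give a feel for what block-scaled 4-bit quantization does to a tensor, here is a minimal sketch in plain Python. It assumes the commonly described NVFP4 layout of 4-bit E2M1 values sharing one scale per block of 16 elements. The scale is kept as an ordinary float here rather than in a low-precision format, and the function names are hypothetical, so treat this as an illustration of the idea rather than NVIDIA's implementation.

```python
# Magnitudes representable by a 4-bit E2M1 float (the sign is handled separately).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumed number of elements that share one scale


def quantize_block(block):
    """Return (scale, codes), where each code is a signed value from the grid."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 1.0, [0.0] * len(block)
    # Map the largest magnitude in the block onto the largest grid value.
    scale = max_abs / E2M1_GRID[-1]
    codes = []
    for x in block:
        target = abs(x) / scale
        nearest = min(E2M1_GRID, key=lambda g: abs(g - target))
        codes.append(nearest if x >= 0 else -nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate values from a block's scale and 4-bit codes."""
    return [scale * c for c in codes]


def fake_quantize(values):
    """Quantize and immediately dequantize, block by block, to expose the error."""
    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        scale, codes = quantize_block(values[start:start + BLOCK_SIZE])
        out.extend(dequantize_block(scale, codes))
    return out
```

Comparing `fake_quantize(weights)` with the original `weights` shows the rounding error introduced per block. The throughput gain comes from hardware that operates on the 4-bit codes directly, which this sketch does not model.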
LatentMoE expert routing
Optimizes which expert sub-models handle each input, improving efficiency in Mixture-of-Experts inference.
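As background for expert routing in general, the sketch below shows ordinary top-k gating in a Mixture-of-Experts layer: a router scores every expert, only the k best are run, and their outputs are mixed by renormalized weights. This is the generic technique, not LatentMoE itself, whose details are not described here. The function names and the scalar toy experts are assumptions for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of router scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


def moe_forward(x, router_logits, experts, k=2):
    """Weighted sum of the chosen experts' outputs for a scalar input x."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))
```

Because only k experts run per input, compute per token stays roughly constant even as the total number of experts grows.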
Multi-token prediction
Increases generative speed for multi-turn tasks by predicting multiple tokens simultaneously.
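One common way to turn multi-token prediction into a speedup is draft-and-verify decoding: extra prediction heads propose several tokens at once, and the main model keeps only the prefix it agrees with. Whether Nemotron 3 Ultra uses exactly this scheme is an assumption. The sketch below uses a hypothetical `next_token` callable as a stand-in for the main model and checks drafts one at a time for clarity, whereas real systems verify them in a single batched pass.

```python
def accept_draft(context, draft_tokens, next_token):
    """Keep drafted tokens while they match the main model's greedy choice.

    next_token(context) -> token is a hypothetical stand-in for the main model.
    Returns the tokens to append: the accepted prefix plus one token from the
    main model (the correction, or the token following a fully accepted draft).
    """
    accepted = []
    for tok in draft_tokens:
        expected = next_token(context + accepted)
        if tok != expected:
            # First disagreement: take the main model's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
    # Every drafted token matched, so the main model adds one more.
    accepted.append(next_token(context + accepted))
    return accepted
```

Each call appends at least one token and at most `len(draft_tokens) + 1`, so output matches plain greedy decoding while needing fewer sequential steps when drafts are accurate.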
Multi-Teacher On-Policy Distillation
Continuously improves domain specialization by training with dense feedback from over ten domain-specific teacher models.
Open recipes, weights, and licensing
Provides fully open model weights, training recipes, and licensing for broad adoption and fine-tuning by developers.
Transparent pretraining and RL data pipeline
Offers a fully documented data pipeline for pretraining and reinforcement learning, enabling reproducibility and customization.
Category: Agents
Tags: NVIDIA Nemotron, AI agents, long-context reasoning, tool use, efficient AI