NVIDIA GB300 NVL72 runs up to 20x more agents per megawatt than HGX H200 in AgentPerf

by TechDefused Newsroom

NVIDIA's GB300 NVL72 rack-scale Blackwell Ultra platform delivered the highest performance in the first AgentPerf agentic AI benchmark, running up to 20x more agents per megawatt than the NVIDIA HGX H200 system.

AgentPerf, published by Artificial Analysis as the industry's first benchmark for agentic AI, measures chained model calls and tool-invocation patterns rather than single-request inference, reflecting how agents break goals into many steps and sustain long, context-rich interactions.

The benchmark runs DeepSeek V4 Pro, a large mixture-of-experts model, and reports how many concurrent agentic coding trajectories a platform can support while meeting defined thresholds for responsiveness and output token rate. Tool calls are simulated as representative CPU processing time, so the results isolate accelerated compute performance.
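To make the metric concrete, here is a minimal Python sketch of how a "concurrent agents under thresholds" figure could be derived from a load sweep and then normalized per megawatt. It is not AgentPerf's actual harness: the `LoadPoint` fields, the threshold values, the sample measurements and the 128 kW power figure are all hypothetical, chosen only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class LoadPoint:
    """One step of a hypothetical load sweep (illustrative fields only)."""
    concurrent_agents: int
    time_to_first_token_s: float  # stand-in for the responsiveness threshold
    output_tokens_per_s: float    # per-agent output token rate


def max_supported_agents(points, max_ttft_s, min_tokens_per_s):
    """Return the highest tested concurrency that meets both thresholds, or 0."""
    passing = [
        p.concurrent_agents
        for p in points
        if p.time_to_first_token_s <= max_ttft_s
        and p.output_tokens_per_s >= min_tokens_per_s
    ]
    return max(passing, default=0)


def agents_per_megawatt(agents, power_kw):
    """Normalize an agent count by power draw, converting kW to MW."""
    return agents * 1000 / power_kw


# Hypothetical sweep: latency rises and per-agent rate falls as load grows.
sweep = [
    LoadPoint(64, 0.8, 45.0),
    LoadPoint(128, 1.4, 38.0),
    LoadPoint(256, 2.6, 27.0),
    LoadPoint(512, 5.9, 14.0),
]

supported = max_supported_agents(sweep, max_ttft_s=3.0, min_tokens_per_s=25.0)
efficiency = agents_per_megawatt(supported, power_kw=128.0)
print(f"{supported} agents supported, {efficiency} agents per megawatt")
```

Comparing two platforms on this basis means taking the ratio of their agents-per-megawatt figures, which is the form of the "up to 20x" claim reported for GB300 NVL72 against HGX H200.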

NVIDIA credits the GB300 NVL72 advantage to full-stack codesign: 72 GPUs in a single rack-scale system, CUDA kernels that overlap communication and compute so coordination cost is absorbed, and TensorRT LLM optimizations that separate input processing from output generation to sustain efficiency as concurrent sessions scale.

Inference providers Baseten, DeepInfra and Together AI are already serving agentic workloads on Blackwell, including Together's real-time Cursor coding agents and DeepInfra's Pam.ai dealership agents.

NVIDIA also notes that its Vera Rubin architecture is now in full production, adding infrastructure capacity aimed at growing agentic AI demand.
