NVIDIA Blackwell Takes No. 1 in the AI-Agent Hardware Benchmark
NVIDIA's Blackwell GB300 NVL72 platform took the top spot in "AA-AgentPerf," a new benchmark that measures hardware performance for AI agents. Introduced by Artificial Analysis in 2026, the benchmark replays real coding-agent workloads to evaluate throughput per unit of power. According to NVIDIA, Blackwell handled roughly 20 times more AI agents per megawatt than the previous-generation Hopper. The result suggests that power efficiency has become a key competitive metric in the age of AI agents.
What Was Announced
NVIDIA's Blackwell took the lead in the first agent-hardware benchmark. In the inaugural AA-AgentPerf results released by Artificial Analysis, the Blackwell GB300 NVL72 posted the top performance, beating AMD's MI355X in the same evaluation. The model used in the evaluation was DeepSeek V4 Pro, released around April 2026. NVIDIA stated that Blackwell handled about 20 times more agents per megawatt than the previous-generation Hopper.
What Does AA-AgentPerf Measure?
AA-AgentPerf is a benchmark that measures hardware throughput by reproducing real AI-agent workloads. Rather than one-off questions, it replays real coding-agent workflows spanning up to 200 turns and more than 100,000 tokens, and it measures how many agents a single system can handle simultaneously while meeting a service-level agreement (SLA). Its core metric is "Agents per Megawatt": what sets it apart is that it asks how many agents a system can sustain on a given amount of power.
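As a rough illustration of how a metric like Agents per Megawatt could be computed, the Python sketch below picks the highest concurrency level that still meets an SLA and divides it by the power drawn at that level. It is not Artificial Analysis's actual methodology: the `LoadPoint` fields, the use of p95 latency as the SLA signal, and all numbers are hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadPoint:
    """One hypothetical measurement at a fixed concurrency level."""
    concurrent_agents: int
    p95_latency_s: float  # assumed SLA signal; the real benchmark may use another
    power_kw: float       # system power draw at this load


def best_point_within_sla(points: list[LoadPoint],
                          sla_latency_s: float) -> Optional[LoadPoint]:
    """Return the highest-concurrency point that still meets the SLA."""
    passing = [p for p in points if p.p95_latency_s <= sla_latency_s]
    if not passing:
        return None
    return max(passing, key=lambda p: p.concurrent_agents)


def agents_per_megawatt(point: LoadPoint) -> float:
    """Concurrent agents divided by power in megawatts."""
    return point.concurrent_agents / (point.power_kw / 1000.0)


# Placeholder measurements, not benchmark results.
sweep = [
    LoadPoint(concurrent_agents=100, p95_latency_s=1.0, power_kw=100.0),
    LoadPoint(concurrent_agents=200, p95_latency_s=1.8, power_kw=125.0),
    LoadPoint(concurrent_agents=400, p95_latency_s=3.5, power_kw=130.0),
]

best = best_point_within_sla(sweep, sla_latency_s=2.0)
if best is not None:
    print(best.concurrent_agents, agents_per_megawatt(best))
```

With these placeholder numbers, the 400-agent point is excluded because it misses the assumed 2-second SLA, so the metric is computed from the 200-agent point. The SLA is what stops a system from scoring well simply by overloading itself.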
Why Power Efficiency Is Key
Power efficiency matters because, as the number of AI agents grows, data-center power becomes the biggest bottleneck. An agent performs not a single inference but a chain of dozens to hundreds of them, so a single task consumes far more compute and power than a one-off query. As a result, how many agents can run on the same power determines operating cost more than raw speed does. The adoption of Agents per Megawatt as the core metric signals that hardware competition has moved to a stage where performance and power efficiency are weighed together.
Why It Matters
These results suggest that the center of gravity in the AI-infrastructure race is shifting toward "agent-processing efficiency." The profitability of AI services depends not only on model performance but also on hardware that can run those models cheaply at scale. A platform that leads in power efficiency can serve more users on the same electricity, an important advantage in 2026, when competition over data-center investment and power supply is fierce.
Sources: Artificial Analysis · NVIDIA Blog · Crypto Briefing