NVIDIA Blackwell tops MLPerf benchmarks

NVIDIA’s Blackwell platform has taken pole position in the latest MLPerf Inference V5.0 benchmarks, setting a new standard in artificial intelligence (AI) inference performance.

These results highlight the platform’s ability to tackle some of the most challenging inference scenarios.

MLPerf benchmarks are widely regarded as the gold standard for evaluating AI inference performance. The inference benchmarks evaluate how quickly trained models can process new data to make predictions or decisions. These tests cover scenarios ranging from data centre operations to edge devices, ensuring relevance across various deployment environments.
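At their core, inference benchmarks like these reduce to two measurements: throughput (queries served per second) and per-query latency. The sketch below illustrates the idea with a stand-in `predict` function; it is a toy illustration only, not the actual MLPerf LoadGen harness.

```python
import time
import statistics

def predict(x):
    """Stand-in for a trained model's inference call (hypothetical)."""
    return x * 2

def benchmark(model_fn, queries):
    """Measure throughput (queries/sec) and median per-query latency."""
    latencies = []
    start = time.perf_counter()
    for q in queries:
        t0 = time.perf_counter()
        model_fn(q)                                   # one inference call
        latencies.append(time.perf_counter() - t0)    # record its latency
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed, statistics.median(latencies)

throughput, p50 = benchmark(predict, list(range(1000)))
print(f"{throughput:.0f} queries/sec, median latency {p50 * 1e6:.1f} us")
```

Real MLPerf scenarios vary how queries arrive (offline batches versus server-style random arrivals) and enforce latency limits, which is why the same hardware can post very different numbers across tests.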

For this round of benchmarks, NVIDIA submitted the GB200 NVL72 system, a rack-scale solution that connects 72 Blackwell GPUs to operate as a single, massive GPU.

The results were astounding — the architecture delivered up to 30 times higher throughput on the demanding Llama 3.1 405B benchmark compared to NVIDIA’s previous H200 NVL8 submission. This shows Blackwell is not just a minor upgrade; it’s a massive boost that empowers researchers to push the boundaries of AI by enabling faster execution, greater precision and cost-effective scaling.

The dramatic improvement was made possible by more than tripling the performance per GPU and expanding the NVIDIA NVLink interconnect domain by nine times to enable seamless communication between GPUs.

This year’s MLPerf benchmarks introduced new challenges, including the Llama 3.1 405B model and a stricter Llama 2 70B Interactive benchmark, which better reflect real-world production constraints. On the Llama 2 70B Interactive test, the NVIDIA DGX B200 system with eight Blackwell GPUs tripled its performance compared to systems using H200 GPUs, setting a new benchmark for low-latency, high-throughput AI inference.

Hopper continues to improve

In addition to Blackwell’s record-breaking performance, NVIDIA’s Hopper architecture also demonstrated continued value in AI inference and training.

Introduced in 2022, Hopper-powered GPUs have shown consistent improvements through software optimisations, achieving up to a 1.6x increase in throughput on benchmarks such as Llama 2 70B over the past year.

Fifteen NVIDIA partners, including Dell and Google Cloud, submitted results on NVIDIA platforms. The widespread adoption highlights NVIDIA’s dominance across cloud providers and server manufacturers worldwide.