NVIDIA Blackwell tops SemiAnalysis InferenceMAX benchmarks

NVIDIA’s Blackwell platform has swept the newly launched SemiAnalysis InferenceMAX v1 benchmarks for AI inference.

Unlike MLPerf, which has long been the industry’s reference for AI benchmarking, InferenceMAX v1 evaluates both raw throughput and economic metrics such as total cost of ownership and token generation for real-world enterprise AI factories.

While MLPerf focuses primarily on standardised throughput and latency for AI models, InferenceMAX v1 expands the evaluation to include return on investment (ROI), cost per token and power efficiency at scale, reflecting real-world economics and operational efficiency rather than theoretical peak speed alone.
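To make the economic metrics concrete, the sketch below shows how cost per token and a power-efficiency figure can be derived from throughput, hourly cost and power draw. All input values are hypothetical placeholders chosen for illustration; they are not InferenceMAX results or NVIDIA figures.

```python
# Illustrative only: combining throughput, cost and power into the kinds of
# economic metrics InferenceMAX v1 reports. All inputs are hypothetical.
hourly_cost_usd = 98.0       # hypothetical fully loaded server cost per hour
tokens_per_second = 250_000  # hypothetical aggregate inference throughput
power_kw = 120.0             # hypothetical system power draw in kilowatts

tokens_per_hour = tokens_per_second * 3600
cost_per_million_tokens = hourly_cost_usd / (tokens_per_hour / 1_000_000)
tokens_per_joule = tokens_per_second / (power_kw * 1000)  # 1 kW = 1000 J/s

print(f"cost per 1M tokens: ${cost_per_million_tokens:.3f}")
print(f"tokens per joule: {tokens_per_joule:.2f}")
```

The point of the sketch is that the same raw throughput number yields very different economics depending on cost and power assumptions, which is the gap InferenceMAX v1 is designed to measure.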

NVIDIA’s Blackwell GB200 NVL72 system achieved a 15x ROI, with a US$5 million investment estimated to generate US$75 million in token revenue.
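The 15x figure follows directly from the two dollar amounts reported above; the only inputs are the stated US$5 million investment and US$75 million estimated token revenue.

```python
# ROI check using the figures reported for the GB200 NVL72 system.
investment_usd = 5_000_000       # reported system investment
token_revenue_usd = 75_000_000   # estimated token-generation revenue

roi = token_revenue_usd / investment_usd
print(f"ROI: {roi:.0f}x")  # → ROI: 15x
```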

“Inference is where AI delivers value every day. These results show that NVIDIA’s full-stack approach gives customers the performance and efficiency they need to deploy AI at scale,” said Ian Buck, Vice President of Hyperscale and High-Performance Computing at NVIDIA.

With ever-increasing demands for complex AI reasoning and multistep queries, benchmarks such as InferenceMAX v1 give enterprises the information they need to select hardware that delivers both speed and cost efficiency at scale.