MLCommons, a consortium dedicated to benchmarking Artificial Intelligence infrastructure, released the results of the MLPerf Training v5.0 benchmark on June 4, 2025. This industry standard measures the training performance of hardware used for large-scale Artificial Intelligence workloads. Companies such as NVIDIA, AMD, and Intel develop specialized chips for these tasks, and major vendors, including Dell and Oracle, deploy that hardware in their infrastructure offerings. The MLPerf benchmarks gauge real-world training performance and, in this round, shift focus to the time required to train or adapt state-of-the-art foundation models: a new Llama 3.1 405B pretraining test replaces the older GPT-3 benchmark.
Results released in this benchmark cycle show substantial progress over the previous round, MLPerf Training v4.1, published roughly six months earlier. Performance gains were particularly striking: the best time to train Stable Diffusion improved by a factor of 2.28, and Llama 2 70B fine-tuning times improved by a factor of 2.10 relative to v4.1. Among the standouts, NVIDIA's new Blackwell-generation chips more than doubled the training speed of the previous Hopper generation, according to company-published comparison data. Notably, NVIDIA was the sole submitter to provide results across all MLPerf v5.0 testing categories, underscoring the company's broad market penetration and extensive portfolio.
Meanwhile, AMD reported strong results for its Instinct MI325X accelerator: up to 8% faster performance than NVIDIA's H200 chip on the Llama 2 70B LoRA fine-tuning benchmark. AMD also showed the MI325X surpassing its own previous-generation Instinct MI300X by as much as 30%. In addition, AMD presented consistent results across submissions from multiple partner vendors, arguing that its solution delivers uniformly high performance regardless of who deploys it. This round of benchmarks highlights escalating competition, with NVIDIA and AMD each claiming leadership on different metrics, and signals a period of rapid and diverse hardware innovation for Artificial Intelligence infrastructure. Full results and data are available on the MLCommons website, providing a resource for industry players comparing hardware capabilities.
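For readers unfamiliar with the LoRA workload the vendors competed on, the sketch below shows roughly what LoRA fine-tuning of Llama 2 70B looks like using the Hugging Face PEFT library. This is a minimal illustration, not the MLPerf reference implementation; the rank, scaling, and dropout values are assumed for the example, and the actual benchmark fixes its own configuration and dataset.

```python
# Minimal sketch of LoRA fine-tuning setup with Hugging Face PEFT.
# Hyperparameters below are illustrative assumptions, not the MLPerf
# reference configuration.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-70b-hf"  # the benchmark's base model; gated access
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA freezes the 70B base weights and trains only small low-rank
# adapter matrices injected into selected attention projections.
lora_config = LoraConfig(
    r=16,                                 # adapter rank (assumed value)
    lora_alpha=32,                        # scaling factor (assumed value)
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,                    # assumed value
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of weights train
```

Because only the adapter matrices receive gradients, the workload stresses hardware differently than full pretraining, which is part of why MLPerf reports it as a separate fine-tuning benchmark.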