Microsoft challenges hyperscalers with Maia 200 artificial intelligence chip

Microsoft has introduced its Maia 200 artificial intelligence accelerator chip, positioning it as the most performant first-party silicon among hyperscalers and a direct challenger to Amazon Web Services and Google. The company is targeting reduced dependence on Nvidia, Intel and AMD while powering services such as Microsoft Copilot and advanced OpenAI models.

Microsoft has launched its Maia 200 artificial intelligence accelerator chip and is positioning the processor as “the most performant, first-party silicon from any hyperscaler” as it seeks to reduce its reliance on third-party silicon vendors such as Intel, AMD and Nvidia. The Redmond-based company said Maia 200 is focused on inference workloads and is designed to outperform custom artificial intelligence chips from Amazon Web Services and Google on several key measures, particularly in low-precision numerical formats used for a growing number of artificial intelligence inference tasks.

The company has already deployed Maia 200 systems in its U.S. Central region near Des Moines, Iowa, with U.S. West 3 near Phoenix, Ariz., planned as the next available region and more regions expected to follow. These systems are currently powering Microsoft Copilot and Microsoft Foundry workloads and are also being used to run advanced artificial intelligence models, including OpenAI’s latest GPT-5.2 models and models under development by Microsoft’s Superintelligence team led by Microsoft artificial intelligence CEO Mustafa Suleyman. Scott Guthrie, executive vice president of Microsoft’s cloud and artificial intelligence group, said the Maia 200 has enabled “higher utilization, faster time to production and sustained improvements in performance-per-dollar and per-watt at cloud scale,” attributing these gains to Microsoft’s silicon development programs that validate as much of the end-to-end system as possible before final silicon availability.

Microsoft claimed that the Maia 200 can achieve nearly 10,200 teraflops of 4-bit floating-point (FP4) performance, which the company said makes the chip four times more powerful than Amazon Web Services’ Trainium3 chip. The company also said Maia 200 can reach just over 5,000 teraflops of 8-bit floating-point (FP8) performance, which it said gives the chip a 9 percent advantage over Google’s seventh-generation TPU (TPU v7) and more than double the FP8 performance of Trainium3.

Using HBM3E high-bandwidth memory, the Maia 200 comes with 216 GB of memory and 7 TBps of memory bandwidth, compared with 144 GB and 4.9 TBps for Trainium3 and 192 GB and 7.4 TBps for TPU v7. The chip also supports a scale-up bandwidth of 2.8 TBps, versus a maximum of 2.56 TBps for Trainium3 and 1.2 TBps for TPU v7.

Microsoft did not disclose total performance or power details for full Maia 200 server racks. By comparison, Amazon Web Services has said its Trn3 UltraServers can pack up to 144 Trainium3 chips to deliver up to 362 petaflops of FP8 performance, and Google has said its TPU v7 pod combines 9,216 seventh-generation TPUs to deliver 42.5 exaflops of FP8.

While a Microsoft spokesperson did not provide similar rack-level details or competitive comparisons on energy use or cost, the company said Maia 200 provides 30 percent more performance-per-dollar than the first-generation Maia 100. The new chip operates at a 750-watt thermal design power, only 50 watts higher than its predecessor’s 700-watt maximum power envelope, which Microsoft has provisioned for 500 watts. Customers will ultimately compare Maia 200 against Trainium3 and TPU v7 based on workload costs and the effectiveness of each provider’s software stack.
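For readers who want to reconcile the per-chip and rack-level claims, a quick back-of-the-envelope check is sketched below. It uses only the figures quoted above; the derived per-chip numbers are estimates computed from the vendors’ rack-level disclosures, not published per-chip specifications.

```python
# Rough sanity check on the per-chip FP8 figures implied by the
# rack-level numbers quoted in the article. Inputs are the article's
# reported values; everything else is derived, not vendor-reported.

maia200_fp8_tflops = 5_000  # "just over 5,000 teraflops" of FP8

# AWS Trn3 UltraServer: up to 144 Trainium3 chips, up to 362 petaflops FP8.
trainium3_fp8_tflops = 362 * 1_000 / 144  # ~2,514 TFLOPS per chip

# Google TPU v7 pod: 9,216 chips delivering 42.5 exaflops FP8.
tpu_v7_fp8_tflops = 42.5 * 1_000_000 / 9_216  # ~4,611 TFLOPS per chip

print(f"Trainium3 FP8 per chip: ~{trainium3_fp8_tflops:,.0f} TFLOPS")
print(f"TPU v7 FP8 per chip:    ~{tpu_v7_fp8_tflops:,.0f} TFLOPS")

# ~1.99x: consistent with "more than double," given that Maia 200's
# FP8 figure is "just over" 5,000 teraflops.
print(f"Maia 200 vs Trainium3: {maia200_fp8_tflops / trainium3_fp8_tflops:.2f}x")

# ~8.4 percent: in line with the claimed 9 percent edge over TPU v7.
print(f"Maia 200 vs TPU v7: +{(maia200_fp8_tflops / tpu_v7_fp8_tflops - 1) * 100:.1f}%")
```

Run as written, the script confirms that Microsoft’s comparative claims are internally consistent with the rack-level figures AWS and Google have published.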
