DeepSeek is gearing up to launch its highly anticipated R2 model, signaling a major advance in China's domestic artificial intelligence infrastructure. According to industry insiders, DeepSeek R2 will be powered by clusters of Huawei's Ascend 910B chips, possibly on Huawei's Atlas 900 platform, all orchestrated by DeepSeek's proprietary distributed training framework. This configuration reportedly achieves 82% accelerator utilization, delivering 512 PetaFLOPS of FP16 processing power, or just over half an exaFLOP. Lab data from Huawei puts this at about 91% of the performance of established NVIDIA A100 clusters, while DeepSeek claims a dramatic reduction in per-unit training costs of up to 97.3%.
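The relationship between these figures is easy to check. The sketch below is a minimal back-of-the-envelope calculation; the per-chip FP16 peak for the Ascend 910B (the roughly 376 TFLOPS figure often cited publicly) is an assumption here, not a number from this report:

```python
# Back-of-the-envelope check of the reported DeepSeek R2 cluster figures.
# Assumption: per-chip FP16 peak of ~376 TFLOPS for the Ascend 910B
# (a commonly cited public figure, not confirmed by this report).

PEAK_PER_CHIP_TFLOPS = 376   # assumed Ascend 910B FP16 peak
UTILIZATION = 0.82           # reported accelerator utilization
EFFECTIVE_PFLOPS = 512       # reported sustained FP16 throughput

# Implied cluster size, from: effective = chips * peak * utilization
implied_chips = EFFECTIVE_PFLOPS * 1000 / (PEAK_PER_CHIP_TFLOPS * UTILIZATION)
print(f"Implied cluster size: ~{implied_chips:,.0f} chips")  # ~1,661

# 512 PetaFLOPS expressed in exaFLOPS
print(f"Sustained throughput: {EFFECTIVE_PFLOPS / 1000:.3f} exaFLOPS")  # 0.512
```

Under that assumed per-chip peak, the reported numbers would correspond to a cluster on the order of 1,600 to 1,700 accelerators; a different peak figure would shift the implied count proportionally.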
The hardware backbone behind DeepSeek R2 reflects a strategic partnership ecosystem. Tuowei Information, a leading original equipment manufacturer for Huawei's Ascend line, handles more than half of DeepSeek's supercomputing hardware procurement. Sugon, another partner, supplies advanced liquid-cooled server racks capable of sustaining up to 40 kW per unit, and Innolight's silicon-photonics transceivers further improve power efficiency, reportedly cutting energy consumption by 35% compared to conventional optical modules. These collaborations have enabled DeepSeek to build a scalable, cost-effective infrastructure ready to support the country's growing artificial intelligence ambitions.
Operationally, DeepSeek has distributed its resources across several geographic hubs. The sprawling South China supercomputing center, overseen by Runjian Shares, is backed by annual contracts exceeding ¥5 billion, while Zhongbei Communications maintains 1,500 PetaFLOPS of capacity in Northwest China for on-demand scaling. In North China, Hongbo Shares' Yingbo Digital supervises a node providing a substantial 3,000 PetaFLOPS. On the software front, DeepSeek R2 is already in use for private deployment and fine-tuning, supporting smart-city projects across 15 provinces via the Yun Sai Zhilian platform.

In scenarios of computational scarcity, Huawei's CloudMatrix 384 system stands ready as a domestic counterpart to NVIDIA's GB200 NVL72. CloudMatrix reportedly delivers 1.7 times the total PetaFLOPS and 3.6 times the memory bandwidth of NVIDIA's solution, albeit with higher power consumption and lower per-chip performance. With the R2 launch imminent, the community awaits official benchmarks to assess its real-world impact on artificial intelligence development within China.
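To make that system-level trade-off concrete, the sketch below scales the reported ratios against a reference baseline for the GB200 NVL72. Both the ~180 dense BF16 PFLOPS baseline and the ~3.9x power ratio are outside assumptions drawn from public estimates, not figures given in this report:

```python
# Scaling the reported CloudMatrix 384 vs. GB200 NVL72 ratios into absolute
# numbers. The NVL72 baseline (~180 dense BF16 PFLOPS) and the ~3.9x power
# ratio are outside assumptions from public estimates, not from this report.

NVL72_BF16_PFLOPS = 180   # assumed GB200 NVL72 dense BF16 baseline
FLOPS_RATIO = 1.7         # reported: CloudMatrix total compute vs. NVL72
BANDWIDTH_RATIO = 3.6     # reported: CloudMatrix memory bandwidth vs. NVL72
POWER_RATIO = 3.9         # assumed: CloudMatrix power draw vs. NVL72

cloudmatrix_pflops = NVL72_BF16_PFLOPS * FLOPS_RATIO
print(f"Implied CloudMatrix compute: ~{cloudmatrix_pflops:.0f} PFLOPS")  # ~306

# More total throughput at much higher power means lower per-watt efficiency,
# matching the caveat about power consumption above.
perf_per_watt_ratio = FLOPS_RATIO / POWER_RATIO
print(f"Relative performance per watt: ~{perf_per_watt_ratio:.2f}x NVL72")  # ~0.44x
```

The direction of the trade-off is the point: roughly 70% more aggregate compute, but at well under half the per-watt efficiency under these assumptions, which is consistent with positioning CloudMatrix as a fallback for scarcity rather than a drop-in replacement.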