Huawei has announced advancements in its Artificial Intelligence hardware lineup with the upcoming Ascend 920C accelerator, aimed at closing the efficiency gap with rivals such as NVIDIA. The new accelerator is part of the Ascend 920 family and is manufactured on SMIC's 6 nm process node. Reports indicate that each Ascend 920C card will surpass 900 TeraFLOPS of BF16 (bfloat16) compute, a significant step forward compared to the current Ascend 910C model.
The memory subsystem also sees a major upgrade: the 920C will be equipped with next-generation HBM3 modules, raising total bandwidth to 4,000 GB/s from the 3,200 GB/s of the 910C's HBM2E configuration. Huawei is retaining the chiplet-based architecture but refining the internal tensor acceleration engines to better serve demanding Transformer and Mixture-of-Experts models, which are widely used in large-scale Artificial Intelligence training. Alongside this, chip-to-chip interconnect and system support will advance to PCIe 5.0 and new high-throughput interconnect protocols, further boosting the node-to-node communication crucial for dense cluster deployments.
Internal projections at Huawei estimate that training efficiency with the Ascend 920C could improve by 30 to 40 percent over the previous 910C, which peaks at 780 TeraFLOPS. This leap is expected to narrow the performance-per-watt gap versus competing solutions. While a firm release date for the new accelerator was not disclosed, sources suggest that mass production will commence in the second half of 2025, positioning Huawei to challenge rivals in the Artificial Intelligence infrastructure market in the near future.
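For perspective, the headline figures reported here imply the following raw generational gains. This is a quick arithmetic sketch using only the numbers in this article; the 30 to 40 percent training-efficiency projection would stem from architectural, memory, and interconnect improvements combined, not raw FLOPS alone:

```python
# Illustrative arithmetic only, using the figures reported in the article.
compute_910c_tflops = 780   # 910C peak BF16 throughput (TeraFLOPS)
compute_920c_tflops = 900   # 920C reported BF16 throughput (stated as ">900")
bw_910c_gbps = 3200         # 910C HBM2E total bandwidth (GB/s)
bw_920c_gbps = 4000         # 920C HBM3 total bandwidth (GB/s)

# Relative generational gains, as percentages
compute_gain = (compute_920c_tflops / compute_910c_tflops - 1) * 100
bandwidth_gain = (bw_920c_gbps / bw_910c_gbps - 1) * 100

print(f"Raw BF16 compute gain: {compute_gain:.1f}%")    # ~15.4%
print(f"Memory bandwidth gain: {bandwidth_gain:.1f}%")  # 25.0%
```

The gap between the roughly 15 percent raw compute uplift and the projected 30 to 40 percent training-efficiency gain is consistent with the article's emphasis on the refined tensor engines and faster interconnect.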