AMD announced that Zyphra has developed ZAYA1, the first large-scale mixture-of-experts foundation model trained on an AMD GPU and networking platform. The milestone is detailed in a Zyphra technical report published today and marks a production-scale training effort built on AMD hardware and software. The announcement frames the work as an achievement in scaling large foundation models on an alternative accelerator and networking stack.
The training platform combined AMD Instinct MI300X GPUs with AMD Pensando networking and was enabled by the AMD ROCm open software stack. Zyphra credits the integrated stack with supporting the distributed-training requirements of the mixture-of-experts architecture. The technical report ties the system's performance to the specific combination of MI300X compute, Pensando networking, and ROCm software used during model development and training.
According to Zyphra’s reported results, ZAYA1 delivers performance competitive with or superior to leading open models on a set of benchmarks covering reasoning, mathematics, and coding. Zyphra positions these results as evidence of both the scalability and the efficiency of AMD Instinct GPUs for production-scale artificial intelligence workloads. The company’s technical report is the source for the performance claims and implementation details presented in the announcement.
