AMD has launched ROCm 6.4, the latest release of its open-source GPU compute stack for high-performance computing (HPC) and AI workloads. The update improves compatibility between ROCm’s user-space libraries and the AMDKFD kernel driver, making ROCm more portable across Linux kernel versions. AMD has also expanded its internal testing to cover more pairings of user-space and kernel versions, which is expected to ease integration in the varied deployment environments typical of HPC and deep learning.
Developers get out-of-the-box support for PyTorch 2.5 and 2.6, which removes the need to build framework support from source and speeds adoption of new deep-learning capabilities. Performance improvements also arrive via Megatron-LM integration, which brings three new fused GPU kernels: Attention (QKV), Layer Norm, and RoPE. Each combines multiple computational steps to speed up transformer model training. ROCm’s video processing toolset is enhanced as well: both rocDecode and rocPyDecode gain VP9 decoding support and a streamlined bitstream reader module, broadening media pipeline functionality.
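To illustrate what "fusing" computational steps means, the following is a minimal CPU-side sketch of layer normalization written two ways. It is not the Megatron-LM or ROCm kernel; the function names are hypothetical, and Welford's single-pass method is used here only as one common way to merge the mean and variance steps. The unfused version walks the data once per step and materializes an intermediate list, while the fused version gathers statistics in one pass and writes the output in a second.

```python
import math


def layer_norm_unfused(x, gamma, beta, eps=1e-5):
    """Layer norm as four separate passes, the way independent kernels would run."""
    n = len(x)
    mean = sum(x) / n                                  # pass 1: mean
    var = sum((v - mean) ** 2 for v in x) / n          # pass 2: variance
    inv_std = 1.0 / math.sqrt(var + eps)
    normed = [(v - mean) * inv_std for v in x]         # pass 3: normalize (intermediate list)
    return [g * v + b for g, v, b in zip(gamma, normed, beta)]  # pass 4: scale and shift


def layer_norm_fused(x, gamma, beta, eps=1e-5):
    """Layer norm with one statistics pass (Welford) and one output pass."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for i, v in enumerate(x, start=1):
        delta = v - mean
        mean += delta / i
        m2 += delta * (v - mean)
    inv_std = 1.0 / math.sqrt(m2 / len(x) + eps)
    # Normalize, scale, and shift in a single expression, with no intermediate list.
    return [g * (v - mean) * inv_std + b for g, v, b in zip(gamma, x, beta)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    ones, zeros = [1.0] * 4, [0.0] * 4
    print(layer_norm_unfused(x, ones, zeros))
    print(layer_norm_fused(x, ones, zeros))
```

On a GPU the benefit comes mainly from fewer kernel launches and fewer round trips to device memory; this sketch only shows the restructuring, not the speedup.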
Platform compatibility grows with official support for Oracle Linux 9 and validation of the Radeon PRO W7800 48 GB workstation GPU within the ROCm environment. Memory performance options expand with the enablement of CPX mode and NPS4 memory configurations for Instinct MI-series accelerators, targeting high-bandwidth scenarios. However, ROCm 6.4 still does not offer official support for RDNA 4 GPUs such as the RX 9070 series. Although community reports suggest partial, unofficial functionality on these cards, the absence of formal support means that RDNA 4’s advanced features, including doubled FP16 throughput, INT4 sparsity acceleration, and FP8 support, remain largely inaccessible for production AI workflows in ROCm. On Linux, consumer Radeon support remains limited, even though Windows compatibility has broadened since 2022 for RDNA 2 and RDNA 3 GPUs. As anticipation builds toward AMD’s “Advancing AI” event in June, developers are eager for news on official RDNA 4 integration into ROCm. In the meantime, those who prioritize guaranteed GPU compatibility for AI may continue to evaluate alternative platforms.
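For readers who want to see whether a particular card is visible to a ROCm build of PyTorch, a quick check can be sketched as follows. The helper name `describe_pytorch_backend` is hypothetical. The sketch relies on `torch.version.hip`, which holds a version string on ROCm builds and `None` otherwise, and on the `torch.cuda` namespace, which ROCm builds reuse for AMD devices. A device appearing in this list only shows that the runtime can see it; it says nothing about official support.

```python
def describe_pytorch_backend():
    """Return a small dict describing the installed PyTorch build, if any."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is nothing to inspect.
        return {"torch": None, "hip": None, "gpu_available": False, "devices": []}

    hip = getattr(torch.version, "hip", None)  # version string on ROCm builds, else None
    available = torch.cuda.is_available()      # ROCm builds answer through torch.cuda
    devices = []
    if available:
        devices = [torch.cuda.get_device_name(i)
                   for i in range(torch.cuda.device_count())]
    return {
        "torch": torch.__version__,
        "hip": hip,
        "gpu_available": available,
        "devices": devices,
    }


if __name__ == "__main__":
    for key, value in describe_pytorch_backend().items():
        print(f"{key}: {value}")
```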