Tech companies are exploring a shift in high-bandwidth memory (HBM) design that would embed GPU cores directly into the base die of next-generation HBM stacks. Reports name Meta and NVIDIA as evaluating such ‘custom HBM’ architectures, with SK Hynix and Samsung involved in early discussions. HBM stacks multiple DRAM dies atop a base die that handles external I/O, and HBM4 is expected to reach mass production next year with an onboard controller to improve bandwidth and efficiency. Placing compute inside the memory aims to reduce data movement and cut power consumption by shortening the path between compute and memory for artificial intelligence (AI) workloads.
The approach promises performance and energy-efficiency gains for AI processing but faces significant technical challenges. Sources cite limited die area in Through-Silicon Via (TSV)-based stacks, power-delivery constraints, and the difficulty of cooling compute-heavy GPU logic embedded in the base die. Kim Joung-ho, a professor in the School of Electrical Engineering at KAIST, said, ‘The speed of technological transition where the boundary between memory and system semiconductors collapses for AI advancement will accelerate,’ and added, ‘Domestic companies must expand their ecosystem beyond memory into the logic sector to preempt the next-generation HBM market.’ The quote underscores industry pressure on memory vendors to broaden their capabilities into packaging and logic.
Design choices by major accelerator makers illustrate divergent strategies. AMD’s Instinct MI430X accelerator, built on the next-generation AMD CDNA architecture, supports 432 GB of HBM4 memory and 19.6 TB/s of memory bandwidth. NVIDIA’s ‘Vera Rubin’ Superchip takes a different route: each Rubin GPU integrates two reticle-sized compute chiplets paired with eight HBM4 stacks, delivering around 288 GB of HBM4 per GPU and roughly 576 GB of HBM4 across the Superchip’s two GPUs. Market implications are clear: firms with strong packaging and logic capabilities stand to benefit, while pure memory vendors may need to expand into system-level semiconductor technologies to remain competitive.
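As a sanity check on the figures above, the per-stack capacity and Superchip total follow directly from the reported numbers. The sketch below assumes capacity is split evenly across a GPU’s HBM4 stacks (an assumption for illustration, not a detail from the reports):

```python
# Derive per-stack HBM4 capacity and Superchip totals from the cited figures.
# Assumption: memory capacity is divided evenly across the stacks on a GPU.

def per_stack_capacity_gb(gpu_capacity_gb: float, stacks_per_gpu: int) -> float:
    """Capacity contributed by each HBM4 stack, assuming an even split."""
    return gpu_capacity_gb / stacks_per_gpu

# NVIDIA 'Vera Rubin': ~288 GB of HBM4 per GPU across eight stacks,
# and two GPUs per Superchip.
per_stack = per_stack_capacity_gb(288, 8)  # 36.0 GB per stack
superchip_total = 288 * 2                  # 576 GB across the Superchip

print(f"{per_stack} GB per stack, {superchip_total} GB per Superchip")
```

The 36 GB-per-stack figure is consistent with the cited totals: eight stacks of that size yield 288 GB per GPU, and two such GPUs give the roughly 576 GB quoted for the full Superchip.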
