Next-gen HBM eyes embedded GPU cores for Artificial Intelligence

Companies including Meta and NVIDIA are reported to be evaluating custom HBM that embeds GPU cores into the base die to push compute closer to memory for Artificial Intelligence workloads.

Tech companies are exploring a shift in high-bandwidth memory design that would embed GPU cores directly into the base die of next-generation HBM stacks. Reports name Meta and NVIDIA as evaluating so-called custom HBM architectures, with SK Hynix and Samsung involved in early discussions. HBM stacks multiple DRAM dies atop a base die that handles external I/O, and HBM4 is expected to reach mass production next year with an onboard controller that improves bandwidth and efficiency. Placing compute inside the memory stack aims to reduce data movement and cut power by shortening the path between processor and memory for Artificial Intelligence workloads.
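A back-of-the-envelope sketch shows why data movement is the target. The constants below (per-stack bandwidth and per-bit transfer energy) are commonly cited ballpark values, not figures from this article, and the shorter in-stack path is a hypothetical assumption:

```python
# Estimate power spent purely on moving data between compute and memory.
# All constants are illustrative ballpark values, not article figures.

def io_power_watts(bandwidth_tb_s: float, energy_pj_per_bit: float) -> float:
    """Power (W) consumed by data movement at a given bandwidth."""
    bits_per_second = bandwidth_tb_s * 1e12 * 8  # TB/s -> bits/s
    return bits_per_second * energy_pj_per_bit * 1e-12  # pJ/bit -> J/bit

if __name__ == "__main__":
    # ~2 TB/s per HBM4 stack; ~5 pJ/bit for an off-die HBM transfer
    # versus a hypothetical shorter in-stack path at ~1 pJ/bit.
    print(f"off-die:  {io_power_watts(2.0, 5.0):.0f} W")  # ~80 W
    print(f"in-stack: {io_power_watts(2.0, 1.0):.0f} W")  # ~16 W
```

Even at a single stack's bandwidth, shaving a few picojoules per bit saves tens of watts, which is the core argument for pushing compute into the stack.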

The approach promises performance and energy-efficiency gains for Artificial Intelligence processing but faces significant technical challenges. Sources cite the limited die area available in through-silicon via (TSV) based stacks, power-delivery constraints, and the difficulty of cooling compute-heavy GPU logic embedded in the base die. Kim Joung-ho, a professor in the School of Electrical Engineering at KAIST, said, ‘The speed of technological transition where the boundary between memory and system semiconductors collapses for Artificial Intelligence advancement will accelerate,’ and added, ‘Domestic companies must expand their ecosystem beyond memory into the logic sector to preempt the next-generation HBM market.’ The quote underscores industry pressure on memory vendors to broaden their capabilities into packaging and logic.

Design choices by major accelerator makers illustrate divergent strategies. AMD’s Instinct MI430X accelerator, built on the next-generation AMD CDNA architecture, supports 432 GB of HBM4 and 19.6 TB/s of memory bandwidth. NVIDIA’s ‘Vera Rubin’ Superchip takes a different route: each GPU integrates two reticle-sized compute chiplets paired with eight HBM4 stacks for around 288 GB of HBM4, and the full Superchip carries roughly 576 GB across its two GPUs. The market implications are clear: firms with strong packaging and logic capabilities stand to benefit, while pure memory vendors may need to expand into system-level semiconductor technologies to remain competitive.
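For context, those capacity figures are internally consistent. A minimal sanity-check sketch; the 36 GB per-stack value is derived here, not stated in the article:

```python
# Cross-check the HBM4 capacity figures quoted above.
total_superchip_gb = 576  # full 'Vera Rubin' Superchip (from the article)
per_gpu_gb = 288          # per GPU (from the article)
stacks_per_gpu = 8        # HBM4 stacks per GPU (from the article)

assert total_superchip_gb == 2 * per_gpu_gb, "two GPUs per Superchip"

per_stack_gb = per_gpu_gb / stacks_per_gpu
print(f"{per_stack_gb:.0f} GB per HBM4 stack")  # -> 36 GB per stack
```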

Impact Score: 60

Rapidus plans second Hokkaido plant, targets 1.4 nm production in early 2029

Rapidus reportedly plans to begin construction of a second factory in Hokkaido in 2027 and aims to start production of 1.4 nm chips in early 2029 as part of a trillion-yen initiative. A Rapidus spokesperson said the recent reports are speculation and that any roadmap updates will come directly from the company.

Artificial Intelligence for the real world

Leaders in construction, logistics, and energy are moving Artificial Intelligence out of cloud dashboards and onto job sites, deploying sensors, computer vision, and IoT to cut waste, speed schedules, and lower emissions.
