NVIDIA Rubin Ultra reportedly hits packaging limits at TSMC

NVIDIA is reportedly running into manufacturing problems with Rubin Ultra as its planned package pushes beyond current TSMC capabilities. The issue centers on CoWoS-L packaging for a much larger multi-die, high-bandwidth memory design.

NVIDIA is reportedly facing manufacturing issues with its next-generation Rubin Ultra GPU design due to the limits of current packaging technology. The company is already shipping customer samples of the standard Rubin GPUs, with mass shipments set to begin this summer. However, the roadmap for Rubin Ultra may be running into technical constraints because its design targets appear too aggressive for TSMC’s packaging capabilities.

NVIDIA reportedly plans to expand the regular Rubin two-die package with 8 HBM4 modules into a Rubin Ultra package combining four silicon dies and 16 HBM4E modules in a single package. This configuration is scheduled for 2027, but the total amount of silicon may exceed what TSMC's packaging technology can handle, according to Global Semi Research. In a typical CoWoS package, TSMC combines multiple smaller dies and multiple HBM memory stacks into a unified package, an approach that underpins much of the current Artificial Intelligence build-out.

For Rubin Ultra, NVIDIA had planned to use CoWoS-L, which was expected to support the design concept behind the chip. In the 2+2 die configuration that gives the architecture its four compute dies, TSMC is reportedly encountering warpage issues. The package, including the substrate, is said to bow in multiple directions, preventing Rubin Ultra's compute dies from making full contact with the underlying substrate.

That instability is pushing TSMC to consider other packaging options within its portfolio. One alternative under consideration is CoPoS, short for Chip-on-Panel-on-Substrate. The reported shift highlights how packaging has become a central constraint in advanced processor development, especially for increasingly large multi-die configurations aimed at high-end Artificial Intelligence workloads.

Impact Score: 72

ARC-AGI-3 exposes limits in Artificial Intelligence reasoning

ARC-AGI-3 introduces interactive, instruction-free environments designed to test whether frontier Artificial Intelligence systems can adapt to genuinely novel situations. Early results show top models performing near zero, highlighting a sharp gap between pattern recognition and open-ended exploration.

Intel BOT reshapes code execution through vectorization

Intel’s Binary Optimization Tool is changing how executable applications run on Arrow Lake Refresh systems, with measurable gains in some workloads. Primate Labs found that the tool cuts instruction counts and aggressively shifts execution from scalar code to vector instructions, prompting Geekbench to label BOT-enhanced results.

Replication studies challenge quantum computing claims

Physicists reviewing prominent topological quantum computing results found that signals described as breakthroughs could also be explained by simpler alternatives. Their effort also exposed how hard it can be to publish replication work in high-profile science journals.

Compression and voice models reshape Artificial Intelligence efficiency

Recent releases focused on infrastructure rather than headline model breakthroughs, with gains in compression and voice systems pointing to lower inference costs and broader deployment. Google and Mistral highlighted two distinct paths for real-time audio, while TurboQuant targeted one of the most expensive bottlenecks in long-context inference.
