Nvidia plans stronger FP64 performance for next-gen high performance computing GPUs

Nvidia is reaffirming its commitment to 64-bit floating point performance in high performance computing, signaling that upcoming architectures will restore and enhance FP64 capabilities after recent generations prioritized lower precision throughput.

Nvidia is pushing back against the perception that it is moving away from high performance computing and 64-bit precision, clarifying that recent product choices do not signal an exit from the space. The company told HPCwire that 64-bit floating point remains central to its roadmap, even as recent architectures such as Hopper and Blackwell have emphasized lower precision formats better suited to accelerating artificial intelligence workloads.

Dion Harris, senior director of high performance computing and artificial intelligence hyperscale infrastructure solutions at Nvidia, said the company is “definitely looking to bring some additional [FP64] capabilities in our future gen architectures” and stressed that Nvidia is “very serious about making sure that we can deliver the required performance to power those simulation workloads.” The comments are aimed at users who rely on sustained double precision throughput, particularly in scientific and engineering domains, and who have been concerned by stagnating FP64 metrics in newer flagship accelerators.

Accelerating 64-bit floating-point data paths is described as crucial for the high performance computing community, with the life sciences called out as a key beneficiary. Users have noted that Nvidia’s recent generations fall short of expectations when a workload demands sustained high-precision throughput. For comparison, Nvidia’s current most powerful accelerator, the B300 “Blackwell Ultra”, achieves only 1.2 teraFLOPS of peak FP64 performance, while the older H200 “Hopper” reaches 34 teraFLOPS. In low-precision FP8, the ordering is reversed: the B300 delivers 9 petaFLOPS, while the H200 provides 3.958 petaFLOPS. These figures highlight how Nvidia has so far optimized its newest platforms for lower precision formats, even as it now publicly commits to improving double precision capabilities in its next generation designs.
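To put the quoted peak figures side by side, the short Python sketch below computes the generation-to-generation ratios. It uses only the four numbers cited above; the dictionary layout and the `ratio` helper are illustrative names, not anything published by Nvidia, and peak figures say nothing about sustained performance on real workloads.

```python
# Peak throughput figures as quoted in the article, converted to teraFLOPS.
# The structure and names below are illustrative assumptions.
PEAK_TFLOPS = {
    "B300": {"FP64": 1.2, "FP8": 9_000.0},   # 9 petaFLOPS FP8
    "H200": {"FP64": 34.0, "FP8": 3_958.0},  # 3.958 petaFLOPS FP8
}


def ratio(metric: str, numerator: str, denominator: str) -> float:
    """Return how many times larger one chip's quoted peak is than the other's."""
    return PEAK_TFLOPS[numerator][metric] / PEAK_TFLOPS[denominator][metric]


fp64_advantage_h200 = ratio("FP64", "H200", "B300")
fp8_advantage_b300 = ratio("FP8", "B300", "H200")

print(f"H200 over B300 at FP64: {fp64_advantage_h200:.1f}x")  # 28.3x
print(f"B300 over H200 at FP8:  {fp8_advantage_b300:.2f}x")   # 2.27x
```

On these quoted peaks, the newer part is roughly 2.3 times faster at FP8 but about 28 times slower at FP64, which is the gap the HPC users in the article are pointing to.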
