Nvidia Blackwell Ultra GB300 NVL72 targets massive gains in agentic artificial intelligence inference

Nvidia’s GB300 NVL72 system, built on the Blackwell Ultra GPU and a codesigned software stack, sharply improves performance and cost for agentic artificial intelligence workloads, from low-latency assistants to long-context coding tools. New data highlights throughput-per-megawatt and token-cost advantages over both Hopper and prior Blackwell platforms.

The Nvidia Blackwell platform is seeing broad adoption among inference providers such as Baseten, DeepInfra, Fireworks AI and Together AI, with deployments already reducing cost per token by up to 10x compared with earlier generations. Agentic Artificial Intelligence use cases and coding assistants are driving rapid growth in software-programming-related Artificial Intelligence queries, whose share rose from 11% to about 50% over the past year according to OpenRouter’s State of Inference report. These workloads demand both low latency across multistep workflows and long context to reason over entire codebases. New SemiAnalysis InferenceX performance data indicates that the combination of Nvidia’s software optimizations and the next-generation Blackwell Ultra platform pushes GB300 NVL72 systems to up to 50x higher throughput per megawatt, translating into 35x lower cost per token compared with the Nvidia Hopper platform.

Earlier analysis from Signal65 found that Nvidia GB200 NVL72 with tightly codesigned hardware and software delivers more than 10x more tokens per watt, which results in one-tenth the cost per token compared with the Nvidia Hopper platform, and these gains have been expanding as the stack improves. Continuous optimizations from teams behind Nvidia TensorRT-LLM, Nvidia Dynamo, Mooncake and SGLang are significantly boosting Blackwell NVL72 throughput for mixture-of-experts inference at all latency targets, and Nvidia TensorRT-LLM library changes alone have delivered up to 5x better performance on GB200 for low-latency workloads compared with just four months ago. Building on these advances, GB300 NVL72 with the Blackwell Ultra GPU extends throughput-per-megawatt to 50x compared with Hopper, and this translates into up to 35x lower cost per million tokens at low latency where agentic applications operate, enabling real-time interactive assistants to scale to many more users.
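The relationship between throughput per megawatt and cost per token can be sketched with some rough arithmetic. In the snippet below, only the up-to-50x throughput-per-megawatt figure comes from the article; the electricity price, baseline throughput and function names are hypothetical placeholders.

```python
# Illustrative only: relates a throughput-per-megawatt gain to a
# cost-per-token reduction. The 50x ratio is from the article; every
# other number here is a hypothetical placeholder.

def cost_per_million_tokens(power_cost_per_mwh, tokens_per_sec_per_mw):
    """Energy cost (in dollars) to serve one million tokens.

    power_cost_per_mwh: electricity price in dollars per megawatt-hour.
    tokens_per_sec_per_mw: sustained throughput per megawatt of power.
    """
    tokens_per_mwh = tokens_per_sec_per_mw * 3600  # tokens per MW-hour
    return power_cost_per_mwh / tokens_per_mwh * 1_000_000

# Hypothetical Hopper-class baseline: 1M tokens/s per MW at $100/MWh.
hopper = cost_per_million_tokens(100.0, 1_000_000)

# GB300 NVL72 at the article's up-to-50x throughput per megawatt.
blackwell_ultra = cost_per_million_tokens(100.0, 50_000_000)

print(f"{hopper / blackwell_ultra:.0f}x")  # → 50x if power were the only cost
```

If electricity were the only cost, a 50x throughput-per-megawatt gain would yield a 50x lower cost per token; costs that do not scale with power (capex, networking, cooling) dilute the ratio, which is consistent with the article's realized 35x cost-per-token figure sitting below the 50x throughput figure.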

The benefits of GB300 NVL72 are particularly pronounced in long-context scenarios, such as Artificial Intelligence coding assistants that must reason across entire repositories. For workloads with 128,000-token inputs and 8,000-token outputs, GB300 NVL72 delivers up to 1.5x lower cost per token compared with GB200 NVL72, helped by Blackwell Ultra’s 1.5x higher NVFP4 compute performance and 2x faster attention processing, which together allow efficient understanding of entire codebases. Major cloud providers including Microsoft, CoreWeave and Oracle Cloud Infrastructure are deploying GB300 NVL72 for low-latency and long-context use cases, with CoreWeave emphasizing that Grace Blackwell NVL72 improves token economics and makes large-scale inference more usable for customers. Looking ahead, the Nvidia Rubin platform, which combines six new chips into a single Artificial Intelligence supercomputer, is positioned to deliver further gains: up to 10x higher throughput per megawatt for mixture-of-experts inference compared with Blackwell, translating into one-tenth the cost per million tokens, and the ability to train large mixture-of-experts models using one-fourth the number of GPUs.
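To put the long-context numbers in concrete terms, here is a small sketch of per-request cost for the 128,000-token-input, 8,000-token-output workload described above. The token counts and the up-to-1.5x ratio come from the article; the dollar prices are hypothetical placeholders.

```python
# Illustrative cost arithmetic for a long-context coding-assistant request.
# Token counts and the 1.5x ratio are from the article; prices are
# hypothetical placeholders, not published rates.

def request_cost(input_tokens, output_tokens, price_per_million_tokens):
    """Cost of one request at a flat blended per-token price."""
    total_tokens = input_tokens + output_tokens
    return total_tokens * price_per_million_tokens / 1_000_000

INPUT_TOKENS, OUTPUT_TOKENS = 128_000, 8_000

gb200_price = 3.00               # hypothetical $ per million tokens on GB200
gb300_price = gb200_price / 1.5  # article's up-to-1.5x lower cost per token

gb200_cost = request_cost(INPUT_TOKENS, OUTPUT_TOKENS, gb200_price)
gb300_cost = request_cost(INPUT_TOKENS, OUTPUT_TOKENS, gb300_price)

print(f"GB200: ${gb200_cost:.3f}  GB300: ${gb300_cost:.3f} per request")
# → GB200: $0.408  GB300: $0.272 per request
```

At scale the per-request difference compounds: every million such requests would cost roughly a third less on GB300 NVL72 than on GB200 NVL72 under these assumptions.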

Impact Score: 68

Anu Bradford on tech sovereignty and regulatory fragmentation

Anu Bradford argues that Europe is wavering in its role as the world’s digital rule-setter just as governments everywhere move toward more state control over technology. Global companies are being pushed to treat geopolitical risk, data sovereignty, and Artificial Intelligence governance as core strategic issues.

Mistral launches text-to-speech model

Mistral has expanded its Voxtral family with a text-to-speech system aimed at enterprise voice applications. The company is positioning the open-weights model as a flexible alternative for organizations that want more control over deployment, cost and customization.

UK Parliament opens workforce inquiry on Artificial Intelligence

A UK Parliament committee is examining how Artificial Intelligence is changing business and work, with a focus on both economic opportunity and labour disruption. The inquiry is seeking evidence on government priorities as adoption expands across the economy.

Windows 11 tightens kernel trust for older drivers

Microsoft is changing Windows 11 kernel policy so new drivers must be signed through the Windows Hardware Compatibility Program. Older trusted drivers will still be allowed in some cases to preserve compatibility during the transition.
