Hot Chips, running Aug. 24-26 at Stanford University, will spotlight inference, networking and AI reasoning for processor and system architects from industry and academia. NVIDIA will join Google and Microsoft in a tutorial on rack-scale architecture and will present four sessions and one tutorial covering technologies spanning data-center fabrics, desktop supercomputers and developer toolchains. NVIDIA presenters include Idan Burstein, Marc Blackstein, Gilad Shainer and Andi Skende.
NVIDIA will detail how networking and rack-scale designs deliver AI reasoning at scale. Burstein will discuss NVIDIA ConnectX-8 SuperNICs and the high-speed, low-latency multi-GPU communication they enable, complemented by NVLink, NVLink Switch and NVLink Fusion for scale-up connectivity. Spectrum-X Ethernet provides the scale-out fabric for cluster interconnects, while the new Spectrum-XGS Ethernet scale-across technology aims to link distributed data centers into giga-scale AI super-factories. Shainer will cover co-packaged optics (CPO) switches with integrated silicon photonics and how they push performance and efficiency for gigawatt-scale deployments. NVIDIA will also highlight the GB200 NVL72 rack, an exascale system that pairs 36 GB200 Superchips with an NVLink Switch system delivering 130 terabytes per second of low-latency GPU communication.
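The 130 TB/s figure is consistent with simple arithmetic, assuming two commonly cited specifications that are not stated in this article: two Blackwell GPUs per GB200 Superchip and 1.8 TB/s of NVLink bandwidth per GPU. A back-of-the-envelope check:

```python
# Back-of-the-envelope check of the GB200 NVL72 aggregate NVLink bandwidth.
# Assumptions (not stated in the article): 2 Blackwell GPUs per GB200
# Superchip, and 1.8 TB/s of NVLink bandwidth per GPU (fifth-gen NVLink).
superchips = 36
gpus_per_superchip = 2
nvlink_tb_per_s_per_gpu = 1.8

total_gpus = superchips * gpus_per_superchip                # 72 GPUs in the rack
aggregate_tb_per_s = total_gpus * nvlink_tb_per_s_per_gpu   # ~130 TB/s

print(total_gpus, round(aggregate_tb_per_s, 1))
```

Under those assumptions the rack holds 72 GPUs, and 72 × 1.8 TB/s ≈ 130 TB/s, matching the headline number.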
The company will also discuss the compute and software that bring AI to developers and end users. The Blackwell architecture powers the NVIDIA GeForce RTX 5090 GPU and drives neural-rendering and inference gains across graphics and simulation. Skende will present the NVIDIA GB10 Superchip and the DGX Spark desktop supercomputer, which supports NVFP4 for efficient agentic AI inference and LLM workloads. NVIDIA will emphasize CUDA as its broad developer platform and the open-source libraries and frameworks it accelerates, including TensorRT-LLM, Dynamo, TileIR, Cutlass, NCCL and NIX. NVIDIA NIM microservices and model optimizations for frameworks such as PyTorch and vLLM round out ways to deploy models like gpt-oss and Llama 4 on preferred infrastructure.
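NVFP4 itself is a hardware data format, but the idea behind block-scaled 4-bit floats can be sketched in plain Python. The snippet below is an illustrative approximation, assuming the E2M1 magnitude grid (0, 0.5, 1, 1.5, 2, 3, 4, 6) and one scale per 16-element block; the function names are hypothetical, and the real format encodes scales in FP8 and is applied by Tensor Core hardware, not software loops.

```python
# Illustrative sketch of block-scaled 4-bit float quantization in the spirit
# of NVFP4. Assumptions: values quantize to the E2M1 magnitude grid with one
# shared scale per 16-element block; the actual NVFP4 format stores FP8
# scales and runs entirely in hardware.

E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # representable magnitudes

def quantize_block(block):
    """Quantize one block of floats to (scale, signed grid values)."""
    max_abs = max(abs(x) for x in block) or 1.0
    scale = max_abs / 6.0                      # map the largest magnitude to 6
    codes = []
    for x in block:
        target = abs(x) / scale
        mag = min(E2M1_GRID, key=lambda g: abs(g - target))  # nearest grid point
        codes.append(mag if x >= 0 else -mag)
    return scale, codes

def dequantize_block(scale, codes):
    """Recover approximate floats from the scale and grid values."""
    return [scale * c for c in codes]

values = [0.03, -1.2, 0.7, 2.5, -0.1, 0.0, 3.1, -2.9,
          1.05, 0.4, -0.6, 1.9, -3.3, 0.9, 0.2, -1.5]
scale, codes = quantize_block(values)
recovered = dequantize_block(scale, codes)
```

Storing one scale per small block keeps the dynamic range local, which is why a 4-bit grid with only eight magnitudes can still track activations and weights closely enough for inference.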