UC San Diego lab accelerates generative Artificial Intelligence research with NVIDIA DGX B200

The Hao Artificial Intelligence Lab at UC San Diego is using a new NVIDIA DGX B200 system to push research on low-latency large language model serving, video generation, and benchmarking through games. Their DistServe work on disaggregated inference is already influencing production platforms like NVIDIA Dynamo.

The Hao Artificial Intelligence Lab at the University of California San Diego has received an NVIDIA DGX B200 system to advance its work on large language model inference. The system, hosted at the San Diego Supercomputer Center within the School of Computing, Information and Data Sciences, is fully accessible to the lab and the broader UC San Diego community. Assistant Professor Hao Zhang describes the DGX B200 as one of NVIDIA's most powerful Artificial Intelligence systems and says it lets the team prototype and experiment much faster than previous-generation hardware allowed.

The DGX B200 is already accelerating two flagship projects: FastVideo and the Lmgame benchmark. FastVideo is training a family of video generation models designed to produce a five-second video from a given text prompt in just five seconds; its research phase also taps NVIDIA H200 GPUs alongside the DGX B200 system. Lmgame-bench is a benchmarking suite that evaluates large language models through popular games such as Tetris and Super Mario Bros, allowing users to test one model at a time or pit two models against each other. Other ongoing projects at the lab target low-latency large language model serving for real-time responsiveness, with doctoral candidate Junda Chen noting that the lab is using the DGX B200's advanced hardware to explore the next frontier of low-latency serving.

The lab’s earlier DistServe work has shaped a disaggregated inference approach that is influencing platforms like NVIDIA Dynamo, an open-source framework for scaling generative Artificial Intelligence models efficiently. DistServe promotes the metric of “goodput,” which measures throughput while meeting user-specified latency service-level objectives, as a better indicator of system health than raw throughput alone. In large language model serving, the prefill and decode phases have historically run on the same GPU, but the DistServe researchers show that splitting them across different GPUs maximizes goodput. By separating compute-intensive prefill from memory-intensive decode onto two different sets of GPUs, they eliminate resource contention and make both jobs run faster, a process called prefill/decode disaggregation. This disaggregated inference method increases goodput, supports continuous workload scaling, and helps maintain low latency and high-quality responses. In parallel, cross-departmental collaborations in areas such as healthcare and biology are using the DGX B200 to advance diverse research projects as UC San Diego teams further explore how Artificial Intelligence platforms can accelerate innovation.
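The distinction between goodput and raw throughput can be made concrete with a small sketch. The following Python snippet is illustrative only: the SLO thresholds, request records, and function names are assumptions for the example, not values or APIs from DistServe itself. Goodput counts only requests that meet the latency objectives, while raw throughput counts every completed request.

```python
# Illustrative sketch of the "goodput" metric: requests per second
# that satisfy user-specified latency SLOs. The SLO values and the
# request data below are hypothetical, not taken from DistServe.
from dataclasses import dataclass

@dataclass
class Request:
    ttft_s: float  # time to first token (dominated by the prefill phase)
    tpot_s: float  # time per output token (dominated by the decode phase)

def goodput(requests: list[Request], window_s: float,
            ttft_slo_s: float = 0.4, tpot_slo_s: float = 0.04) -> float:
    """Completed requests per second that meet BOTH latency SLOs."""
    good = sum(1 for r in requests
               if r.ttft_s <= ttft_slo_s and r.tpot_s <= tpot_slo_s)
    return good / window_s

def throughput(requests: list[Request], window_s: float) -> float:
    """Raw completed requests per second, ignoring latency SLOs."""
    return len(requests) / window_s

# Ten requests completed in a 2-second window; eight meet both SLOs,
# one misses the time-to-first-token SLO, one misses the per-token SLO.
reqs = [Request(0.3, 0.03)] * 8 + [Request(0.9, 0.03), Request(0.3, 0.08)]
print(throughput(reqs, window_s=2.0))  # 5.0 requests/sec
print(goodput(reqs, window_s=2.0))     # 4.0 SLO-compliant requests/sec
```

A system can raise raw throughput by batching aggressively while quietly violating latency targets; goodput penalizes exactly that, which is why the DistServe authors argue it is the better health indicator for serving systems.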
