Nvidia introduces opt-in software for data center GPU fleet management

Nvidia is developing an optional, customer-installed software service that gives data center operators a unified view of their AI GPU fleets, with real-time telemetry to improve uptime and efficiency.

Nvidia is building an optional software service to help data center operators manage and monitor large fleets of Nvidia GPUs used in AI infrastructure. As AI systems grow in scale and complexity, operators need continuous visibility into performance, temperature and power usage across distributed environments. The service aims to provide a centralized dashboard where cloud providers and enterprises can track GPU health, check that systems are running efficiently and reliably, and maximize uptime.

The offering is an opt-in, customer-installed service that focuses on monitoring GPU usage, configuration and errors rather than controlling hardware. Each GPU system will communicate and share metrics with an external cloud service to enable real-time monitoring. Nvidia states that its GPUs do not include hardware tracking technology, kill switches or backdoors. The service will ship with an open-source client software agent, reflecting Nvidia’s commitment to open and transparent tooling and giving customers a reference implementation they can adapt to their own monitoring solutions.

Through a portal hosted on Nvidia NGC, customers will be able to stream node-level GPU telemetry and visualize fleet utilization globally or by compute zone, where a zone groups nodes by physical or cloud location. The dashboard will show utilization, memory bandwidth, interconnect health, power usage spikes, thermal hotspots, airflow issues and software configuration consistency, helping operators identify bottlenecks, failing parts and configuration drift. The agent provides read-only, customer-managed telemetry and cannot modify GPU configurations or underlying operations. It also supports customizable reporting on GPU fleet information. Nvidia positions the software as a tool to keep AI data centers running at peak health as AI applications grow in number and complexity, and points readers to the upcoming Nvidia GTC event in San Jose, California, for more details.
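To make the per-zone view concrete, here is a minimal Python sketch of how read-only telemetry samples might be rolled up by compute zone and checked for thermal hotspots. It is an illustration only, not Nvidia's agent or portal API: the `GpuSample` schema, the `summarize_by_zone` function, the zone names and the 85 °C threshold are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class GpuSample:
    """One read-only telemetry sample (hypothetical schema, not Nvidia's)."""
    node: str
    zone: str
    utilization_pct: float
    power_w: float
    temp_c: float


def summarize_by_zone(samples, temp_limit_c=85.0):
    """Group samples by compute zone; report averages and nodes above the limit.

    The temperature limit is an arbitrary illustrative default.
    """
    by_zone = defaultdict(list)
    for sample in samples:
        by_zone[sample.zone].append(sample)

    summary = {}
    for zone, group in by_zone.items():
        summary[zone] = {
            "gpus": len(group),
            "avg_utilization_pct": mean(s.utilization_pct for s in group),
            "total_power_w": sum(s.power_w for s in group),
            # Nodes with at least one GPU hotter than the limit.
            "hot_nodes": sorted({s.node for s in group if s.temp_c > temp_limit_c}),
        }
    return summary


# Made-up sample data for illustration.
samples = [
    GpuSample("node-a", "us-west", 90.0, 600.0, 88.0),
    GpuSample("node-b", "us-west", 70.0, 500.0, 70.0),
    GpuSample("node-c", "eu-central", 40.0, 300.0, 65.0),
]
report = summarize_by_zone(samples)
```

The function only reads and aggregates the samples it is given, which mirrors the read-only role the article describes for the agent. A real deployment would take its metrics from whatever the open-source client agent actually exposes.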

Indiana launches AI business portal

Indiana is rolling out IN AI, a statewide portal meant to help employers adopt AI with practical guidance, workshops and peer support. State leaders and business groups are positioning the effort as a way to raise productivity, wages and job growth while keeping workers at the center.

Goodfire launches model debugging tool for large language models

Goodfire has introduced Silico, a mechanistic interpretability platform designed to let developers inspect and adjust model behavior during development. The company is positioning it as a way to give smaller teams deeper control over open-source models and more trustworthy outputs.

Nvidia launches Nemotron 3 Nano Omni for enterprise agents

Nvidia has introduced Nemotron 3 Nano Omni, a multimodal open model designed to support enterprise agents that reason across vision, speech and language. The launch extends Nvidia’s push beyond hardware into models and services while targeting more efficient agentic workflows.

Intel 18A-P node improves performance and efficiency

Intel plans to present new results for its 18A-P process at the VLSI 2026 Symposium, highlighting gains in performance, power efficiency, and manufacturing predictability. The updated node is positioned as a stronger option for customers seeking 18A density with better operating characteristics.

EA CEO defends broader AI use in game development

EA CEO Andrew Wilson defended the company’s internal use of AI after employees claimed that the tools were slowing work rather than helping. He framed the technology as an aid for repetitive quality assurance tasks, even as concerns persist over its broader impact on development.
