Red Hat AI 3 tackles inference complexity

Red Hat introduced Red Hat AI 3 to help enterprises move models from pilots to production, with a strong focus on scalable inference on Kubernetes. The release adds llm-d, a unified API based on Llama Stack, and tooling for Model-as-a-Service delivery.

Red Hat is launching a revamped platform, Red Hat AI 3, to help organizations move AI workloads from proof of concept to production, with a focus on inference. The company frames the release as an answer to stalled enterprise efforts, citing research from the Massachusetts Institute of Technology that found roughly 95 percent of organizations see no measurable financial return on enterprise AI applications despite about €34.4 billion in spending. The suite includes Red Hat AI Inference Server, RHEL AI, and Red Hat OpenShift AI, and builds on the vLLM and llm-d community projects to deliver a consistent, enterprise-grade experience.

At its core is inference at scale. Red Hat OpenShift AI 3.0 introduces llm-d to run large language models natively on Kubernetes, combining intelligent distributed inference with Kubernetes orchestration. To get the most out of hardware accelerators, the platform draws on open source components such as the Kubernetes Gateway API Inference Extension, the Nvidia Dynamo (NIXL) KV Transfer Library, and the DeepEP Mixture of Experts communication library. Red Hat says this cuts costs and speeds up responses through smart model scheduling and disaggregated serving, which splits the prefill and decode phases of inference across separate workers. Operational guidance comes through prescriptive “Well-lit Paths” for rolling out models at scale, and cross-platform support spans hardware accelerators from Nvidia and AMD.
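For application developers, the practical upshot is that a model served this way is reachable over a standard HTTP endpoint. Below is a minimal client sketch, assuming a vLLM-backed deployment exposing the usual OpenAI-compatible API; the gateway URL and model ID are placeholders, not Red Hat defaults.

```python
# Minimal client sketch: querying a vLLM/llm-d-served model over its
# OpenAI-compatible HTTP API. Endpoint and model name are hypothetical.
from openai import OpenAI

client = OpenAI(
    base_url="http://llm-gateway.example.internal/v1",  # hypothetical cluster gateway
    api_key="EMPTY",  # vLLM's OpenAI-compatible server accepts a dummy key by default
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model ID
    messages=[{"role": "user", "content": "Summarize this quarter's incidents."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```

Because the interface is the standard OpenAI-compatible one, the same client code works whether the model runs on a single node or is scheduled across the cluster by llm-d.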

The release also adds collaboration and delivery features for teams building generative applications. A Model-as-a-Service approach, built on distributed inference, lets IT teams act as their own MaaS providers by centrally serving common models. A new AI hub gives platform engineers a curated catalog to discover, deploy, and manage foundational assets, including validated and optimized generative models. For AI engineers, a Gen AI studio provides a hands-on environment for interacting with models, rapidly prototyping applications, and experimenting in a built-in, stateless playground.
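One way to picture the MaaS pattern is a thin internal catalog that maps team-facing model aliases onto centrally operated endpoints, so application teams never deal with accelerators or model versions directly. The sketch below is purely illustrative; the aliases, URLs, and routing table are hypothetical and not part of OpenShift AI.

```python
# Illustrative Model-as-a-Service routing sketch (hypothetical, not an
# OpenShift AI API): a central catalog maps stable aliases used by app
# teams onto the OpenAI-compatible endpoints the platform team operates.
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelRoute:
    endpoint: str   # OpenAI-compatible base URL served by the platform team
    model_id: str   # concrete model behind the alias

# Central catalog owned by the platform/IT team; teams only see aliases.
CATALOG: dict[str, ModelRoute] = {
    "chat-default": ModelRoute("http://inference.example.internal/v1",
                               "meta-llama/Llama-3.1-8B-Instruct"),
    "code-assist": ModelRoute("http://inference.example.internal/v1",
                              "mistralai/Codestral-22B-v0.1"),
}

def resolve(alias: str) -> ModelRoute:
    """Resolve a team-facing alias to its centrally managed route."""
    try:
        return CATALOG[alias]
    except KeyError:
        raise ValueError(f"Unknown model alias: {alias!r}") from None

route = resolve("chat-default")
print(route.endpoint, route.model_id)
```

The design point is indirection: the platform team can swap, upgrade, or rebalance the models behind an alias without any change to consuming applications.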

Looking ahead, Red Hat positions the platform for the rise of agentic systems, which will put heavy demands on inference. A unified API layer based on Llama Stack aims to align with industry standards, including OpenAI-compatible large language model interface protocols. The company is also adopting the Model Context Protocol (MCP) to streamline how models interact with external tools. For customization, a modular, extensible toolkit built on InstructLab supplies specialized Python libraries that give developers more flexibility and control. Together, these capabilities are intended to move enterprise AI initiatives out of the experimental phase and into scalable production.
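What “OpenAI-compatible” means on the wire is concrete: a JSON POST to a /v1/chat/completions path with model and messages fields. Here is a hedged sketch of that protocol using plain HTTP against a hypothetical endpoint behind the unified API layer; the URL and model ID are assumptions for illustration.

```python
# Sketch of the OpenAI-compatible wire protocol the unified API layer
# targets: a JSON POST to /v1/chat/completions. URL and model are placeholders.
import requests

resp = requests.post(
    "http://llama-stack.example.internal/v1/chat/completions",  # hypothetical endpoint
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",  # placeholder model ID
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": "What is disaggregated serving?"},
        ],
        "max_tokens": 128,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Standardizing on this shape is what lets agent frameworks and existing OpenAI-client tooling target the platform without bespoke adapters.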

Impact Score: 55

Nvidia DGX Spark arrives for the world's AI developers

Nvidia is shipping DGX Spark, a compact desktop system that delivers a petaflop of AI performance with unified memory, bringing large-model development and agent workflows on premises. Partner systems from major PC makers and channel partners broaden availability starting Oct. 15.

EU regulatory developments on the Artificial Intelligence Act

The European Commission finalized a General-Purpose AI Code of Practice and signaled phased enforcement of the Artificial Intelligence Act. Companies gain transitional breathing room but should use it to align with new transparency, copyright, and safety expectations.
