Red Hat AI 3 tackles inference complexity

Red Hat introduced Red Hat AI 3 to move enterprise models from pilots to production, with a strong focus on scalable inference on Kubernetes. The release adds llm-d, a unified API based on Llama Stack, and tools for Model-as-a-Service delivery.

Red Hat is launching a revamped platform, Red Hat AI 3, to help organizations move AI workloads from proof of concept to production, with a focus on inference. The company frames the release as an answer to stalled enterprise efforts, citing research from the Massachusetts Institute of Technology that found roughly 95 percent of organizations see no measurable financial return on enterprise AI applications despite about €34.4 billion in spending. The suite includes Red Hat AI Inference Server, RHEL AI, and Red Hat OpenShift AI, and builds on the vLLM and llm-d community projects to deliver a consistent, enterprise-grade experience.

At the core is inference at scale. Red Hat OpenShift AI 3.0 introduces llm-d to run large language models natively on Kubernetes, combining intelligent distributed inference with Kubernetes orchestration. To maximize hardware acceleration, the platform leverages open source components such as the Kubernetes Gateway API Inference Extension, Nvidia Dynamo (NIXL) KV Transfer Library, and the DeepEP Mixture of Experts communication library. Red Hat says this enables cost reduction and faster responses via smart model scheduling and disaggregated serving. Operational guidance comes through prescribed “Well-lit Paths” for rolling out models at scale, and cross-platform support spans hardware accelerators from Nvidia and AMD.

The release also adds collaboration and delivery features for teams building generative applications. A Model-as-a-Service approach, built on distributed inference, lets IT teams operate as their own MaaS providers by centrally offering common models. A new AI hub gives platform engineers a curated catalog to discover, deploy, and manage foundational assets, including validated and optimized generative models. For AI engineers, a Gen AI studio provides a hands-on environment for interacting with models, rapidly prototyping applications, and experimenting in a built-in, stateless playground.

Looking ahead, Red Hat positions the platform for the rise of agentic systems that will put heavy demands on inference. A unified API layer based on Llama Stack aims to align with industry standards, including OpenAI-compatible large language model interface protocols. The company is also embracing the Model Context Protocol to streamline how models interact with external tools. For customization, a modular, extensible toolkit built on InstructLab supplies specialized Python libraries to give developers more flexibility and control. Together, these capabilities are intended to move enterprise AI initiatives out of the experimental phase and into scalable production.
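To make the OpenAI-compatible protocol concrete, here is a minimal sketch of how a client might talk to such an endpoint. The base URL and model name are placeholders, not real Red Hat endpoints, and the request shape follows the widely used `/v1/chat/completions` convention rather than any Red Hat-specific API.

```python
# Hypothetical sketch: consuming a centrally hosted model through an
# OpenAI-compatible /v1/chat/completions endpoint, as a MaaS consumer might.
# BASE_URL and MODEL are illustrative placeholders.
import json
import urllib.request

BASE_URL = "http://models.example.internal/v1"  # hypothetical MaaS gateway
MODEL = "granite-3-8b-instruct"                 # placeholder model name


def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Assemble the JSON body expected by an OpenAI-compatible
    chat-completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }


def chat(prompt: str) -> str:
    """Send the request and return the model's reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Inspect the payload without hitting the network:
payload = build_chat_request("Summarize our deployment options.")
print(json.dumps(payload, indent=2))
```

Because the wire format is standardized, the same client code works whether the backend is vLLM, llm-d behind a gateway, or any other OpenAI-compatible server; only `BASE_URL` and the model name change.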

Impact Score: 55

What businesses need to know about the EU Cyber Resilience Act

The EU Cyber Resilience Act is turning product cybersecurity into a legal requirement for companies that sell digital products into the European Union. A key compliance milestone arrives in September 2026, well before the full regulation takes effect in 2027.

Claude Mythos and cyber insurance’s next inflection point

Claude Mythos is being treated by governments and regulators as a potential systemic cyber risk with implications for financial stability and insurance markets. Its emergence is intensifying pressure on insurers to clarify whether AI-enabled cyber losses are covered, excluded, or require new stand-alone products.

OpenAI expands ChatGPT ads with self-serve manager

OpenAI is widening its ChatGPT ads pilot with a beta self-serve Ads Manager, new bidding options and broader measurement tools. The push signals a deeper move into advertising as the company expands the program into several international markets.

OpenAI launches AI deployment consulting unit

OpenAI has created a new consulting and deployment business aimed at helping enterprises build and roll out AI systems. The move mirrors a similar push by Anthropic and signals a broader effort by model providers to capture more of the enterprise services market.
