F5 unveils BIG-IP Next enhancements for Kubernetes and generative Artificial Intelligence workloads

F5 expands BIG-IP Next for Kubernetes with new features powered by NVIDIA BlueField-3 DPUs, targeting efficient and secure management of large-scale Artificial Intelligence deployments.

F5 has introduced new capabilities for its BIG-IP Next for Kubernetes solution, leveraging the power of NVIDIA BlueField-3 data processing units (DPUs) and the NVIDIA DOCA software framework. The enhancements are validated by Sesterce, a European leader in next-generation infrastructure and sovereign Artificial Intelligence, and are designed to address the growing demand for accelerated computing, high-performance artificial intelligence applications, and secure, scalable infrastructure.

Built-in support for NVIDIA BlueField-3 DPUs enables BIG-IP Next for Kubernetes to provide advanced traffic management and security specifically tuned for large-scale Artificial Intelligence workloads. Sesterce’s validation highlighted a 20% improvement in GPU utilization through enhanced multi-tenancy and security features. Integration with NVIDIA Dynamo and KV Cache Manager further optimizes GPU and memory throughput by reducing latency for large language model inference and supporting memory-efficient generative Artificial Intelligence use cases. The solution also features smart large language model routing via DPUs, coordinated with NVIDIA NIM microservices, enabling dynamic handling of workloads that span multiple models and ensuring each query is directed to the model best suited to answer it.
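To illustrate the latency benefit the article attributes to KV cache management, the following sketch models a cache keyed by prompt prefix: a repeated prefix is served from the cache instead of being recomputed. This is an illustrative simplification, not F5 or NVIDIA code; the class and the stand-in "KV state" string are hypothetical.

```python
class PrefixKVCache:
    """Toy model of KV-cache reuse for LLM inference (illustrative only)."""

    def __init__(self):
        self._cache = {}   # prompt prefix -> simulated attention KV state
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, prefix: str) -> str:
        # A cache hit skips the (expensive) attention computation entirely,
        # which is where the inference-latency savings come from.
        if prefix in self._cache:
            self.hits += 1
            return self._cache[prefix]
        self.misses += 1
        state = f"kv({prefix})"   # stand-in for real attention key/value tensors
        self._cache[prefix] = state
        return state


cache = PrefixKVCache()
cache.get_or_compute("You are a helpful assistant.")
cache.get_or_compute("You are a helpful assistant.")  # second call hits the cache
```

In a real serving stack the cached state would be GPU-resident key/value tensors shared across requests with a common system prompt, which is why cache management directly affects both memory footprint and throughput.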

Additional updates focus on scalability and security through Model Context Protocol (MCP) support. BIG-IP Next acts as a reverse proxy, protecting MCP servers and facilitating more secure, flexible deployment of large language models. Robust programmability is available via F5 iRules, empowering organizations to quickly adjust data policies and security postures to evolving Artificial Intelligence application requirements. According to F5’s chief innovation officer Kunal Anand, programming routing logic on DPUs not only maximizes delivery efficiency and security for language model traffic but also lays a foundation for future co-innovation as enterprise Artificial Intelligence adoption grows.
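The routing logic described above (directing each query to the most suitable model) can be sketched in a few lines. The tiers, thresholds, and function below are hypothetical examples of the kind of policy an operator might program, not F5's actual iRules implementation.

```python
# Hypothetical model-routing policy: cheap queries go to a small, fast model;
# long or reasoning-heavy queries go to a larger one. Tier names and the
# 500-word threshold are illustrative assumptions, not F5 defaults.

MODEL_TIERS = {
    "small": {"context_window": 4096},    # low latency, low cost
    "large": {"context_window": 32768},   # higher capability, higher cost
}


def route_query(prompt: str, needs_reasoning: bool = False) -> str:
    """Return the model tier a query should be routed to."""
    # Route to the larger model when the request demands more capability;
    # otherwise serve it from the cheaper, faster tier.
    if needs_reasoning or len(prompt.split()) > 500:
        return "large"
    return "small"
```

In production this decision would run on the DPU in the data path, so routing adds no load to the host CPUs serving the models themselves.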

By combining advanced load balancing, smart LLM routing, and GPU resource optimization within Kubernetes environments, F5’s expanded capabilities aim to streamline the deployment of multi-model Artificial Intelligence systems. This positions enterprise customers to innovate faster while maintaining tight control over security, performance, and cost across rapidly evolving Artificial Intelligence infrastructures.
