Technology
F5 Expands AI Infrastructure Capabilities with NVIDIA, Enhancing Performance, Security, and Multi-Tenancy for Next-Gen Workloads
F5 (NASDAQ: FFIV), a global leader in application delivery and security, has announced a major expansion of its AI infrastructure capabilities in collaboration with NVIDIA. The new solution integrates F5 BIG-IP Next for Kubernetes with NVIDIA BlueField-3 DPUs and the DOCA software framework, providing customers with advanced performance, multi-tenancy, and security for complex AI applications.
The announcement follows successful deployment validation by Sesterce, a European operator specializing in sovereign AI and high-performance infrastructure. The collaboration between F5 and NVIDIA is designed to meet the increasing demands of AI-first application delivery, particularly in managing large-scale LLM inference systems.
Key Capabilities and Innovations
Sesterce validated the F5 and NVIDIA integration across several core capabilities:
- 20% improvement in GPU utilization, enhancing AI processing efficiency
- Integration with NVIDIA Dynamo and KV Cache Manager to minimize latency and optimize GPU and memory usage
- Smart LLM routing via BlueField DPUs, enabling seamless workload management across multiple AI models
- Secured Model Context Protocol (MCP) operations with reverse proxy functionality for safer, more scalable LLM applications
- Advanced programmability with F5 iRules for dynamic traffic management and AI security customization
“F5’s dynamic load balancing and Kubernetes ingress support allow us to distribute traffic efficiently and bring added value to our customers,” said Youssef El Manssouri, CEO and Co-Founder of Sesterce. “We’re excited by F5’s growing support for NVIDIA AI use cases and look forward to future innovation.”
Redefining LLM Traffic Management
One of the standout features is LLM traffic routing within BIG-IP Next for Kubernetes. Simple AI queries can be directed to lightweight models while more complex tasks are sent to advanced LLMs. This intelligent traffic routing ensures reduced latency, improved time-to-first-token, and domain-specific optimization, elevating the user experience and system performance.
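The routing idea above can be sketched in a few lines. This is a hypothetical illustration only, not F5's implementation: the model endpoint names and the complexity heuristic are assumptions chosen for clarity.

```python
# Hypothetical sketch of complexity-based LLM routing: short, simple
# queries go to a lightweight model, while longer prompts or those with
# reasoning keywords go to a larger one. Names are illustrative.

LIGHT_MODEL = "small-llm"   # hypothetical lightweight endpoint
HEAVY_MODEL = "large-llm"   # hypothetical advanced endpoint

def complexity_score(prompt: str) -> int:
    """Crude proxy for query complexity: word count plus reasoning keywords."""
    score = len(prompt.split())
    for kw in ("explain", "analyze", "compare", "derive"):
        if kw in prompt.lower():
            score += 50
    return score

def route(prompt: str, threshold: int = 60) -> str:
    """Pick a model endpoint based on the complexity score."""
    return HEAVY_MODEL if complexity_score(prompt) >= threshold else LIGHT_MODEL
```

In a production system this classification would run on the DPU in the data path; the point here is only that a cheap per-request score can decide which backend pool receives the query.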
“Routing and classifying LLM traffic is compute-intensive,” noted Kunal Anand, Chief Innovation Officer at F5. “By embedding routing logic directly on BlueField-3 DPUs, we’re redefining how AI workloads are delivered securely and efficiently at scale.”
Enhancing Distributed AI Inference with NVIDIA Dynamo
The integration with NVIDIA Dynamo facilitates AI model orchestration, memory management, and real-time inference. The use of KV caching reduces recomputation overhead, significantly cutting infrastructure costs and increasing the efficiency of distributed inference environments.
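The recomputation savings from KV caching come from prefix reuse: if a new prompt shares a prefix with an earlier one, the attention keys and values for that prefix can be fetched from cache and only the suffix needs fresh computation. A minimal sketch of the lookup logic, under simplified assumptions (string tokens, an opaque cache handle standing in for the stored tensors):

```python
# Illustrative prefix-based KV cache: store a handle per token prefix,
# then find the longest cached prefix of an incoming prompt so only the
# remaining suffix must be recomputed. Simplified for clarity.

class PrefixKVCache:
    def __init__(self):
        # Maps a token-tuple prefix to an opaque handle for its KV tensors.
        self._cache: dict[tuple, str] = {}

    def store(self, tokens: list[str], handle: str) -> None:
        self._cache[tuple(tokens)] = handle

    def longest_prefix(self, tokens: list[str]):
        """Return (matched_length, handle) for the longest cached prefix,
        or (0, None) if nothing matches."""
        for end in range(len(tokens), 0, -1):
            handle = self._cache.get(tuple(tokens[:end]))
            if handle is not None:
                return end, handle
        return 0, None
```

With a cached five-token system prompt, a six-token request would match five tokens and recompute only one, which is where the infrastructure-cost reduction in distributed inference comes from.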
“BIG-IP Next with BlueField-3 provides a unified control point for AI traffic, training, inference, and agentic AI,” added Ash Bhalgat, Senior Director of AI Networking and Security at NVIDIA. “The platform supports advanced features like KV caching while maintaining top-tier security and programmability.”
Strengthening Security for MCP and AI Workloads
With the adoption of Model Context Protocol (MCP) by organizations leveraging agentic AI, F5’s reverse proxy capabilities provide enhanced cybersecurity protections and protocol resilience. The iRules engine enables rapid response to changing threat landscapes and evolving AI architecture standards.
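The reverse-proxy role described above amounts to inspecting MCP traffic before it reaches the model server. A minimal sketch of that gating step, assuming MCP's JSON-RPC framing; the allowlist policy shown here is an illustrative assumption, not F5's product behavior:

```python
# Hedged sketch of a reverse-proxy-style gate for MCP traffic: incoming
# JSON-RPC requests are parsed and checked against a method allowlist
# before being forwarded upstream. The policy is illustrative.

import json

ALLOWED_METHODS = {"tools/list", "tools/call", "resources/read"}

def gate(raw_request: bytes) -> tuple[bool, str]:
    """Return (forward, reason): reject malformed or disallowed requests."""
    try:
        msg = json.loads(raw_request)
    except ValueError:
        return False, "malformed JSON"
    method = msg.get("method")
    if method not in ALLOWED_METHODS:
        return False, f"method not allowed: {method}"
    return True, "ok"
```

Because the proxy terminates the connection, policies like this can be updated independently of the agents and model servers on either side, which is the resilience property the article attributes to the iRules engine.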
“F5 and NVIDIA are delivering integrated AI feature sets and automation capabilities we’re not seeing elsewhere in the market,” said Greg Schoeny, SVP at World Wide Technology.
Looking Ahead
F5’s collaboration with NVIDIA positions both companies at the forefront of scalable, secure, and high-performance AI infrastructure. As AI adoption accelerates across industries, their combined technologies aim to address the full spectrum of enterprise needs—from LLM inference to traffic management, security, and programmable automation.