By Kaushik Ghosh, Director, Product Management, Nutanix
Alex Almeida, Sr. Product Marketing Manager, Nutanix
The era of simple chatbots is over. Enterprise AI is rapidly evolving beyond model training and basic inferencing toward Agentic AI: autonomous systems capable of complex reasoning, long-running workflows, persistent memory, and real-time decision-making.
But as AI agents evolve from short prompts to hours-long reasoning sessions operating on continuously changing “live” enterprise data, a major infrastructure challenge surfaces: traditional storage was never designed to be the “living memory” of AI.
As enterprises scale from experimentation to production-grade AI factories, they hit two major walls: context memory that outgrows local GPU and node resources, and a data path that cannot keep agents grounded in fresh enterprise data.
To address these challenges, Nutanix is evolving Nutanix Unified Storage into the data fabric of the Nutanix Agentic AI stack. Rather than acting as passive capacity, Nutanix Unified Storage becomes the high-speed data engine of the AI Factory.
Since Large Language Model (LLM) context memory can grow very large, it is organized hierarchically in tiers for optimal performance and economics. Tiers 1-3 are local to the node, stored in GPU VRAM, system memory, and local NVMe drives. Tier 4 is the foundational shared storage layer, representing the “living memory” of the AI Factory.
Nutanix is operationalizing this fourth tier by providing an RDMA-enabled, high-performance, low-latency data layer capable of supporting thousands of GPUs. By integrating LMCache – specialized cache-tiering orchestration software – with Nutanix Unified Storage, AI memory is seamlessly offloaded from expensive, capacity-constrained local nodes to resilient data-center shared storage.
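To make the tiering idea concrete, here is a minimal, illustrative sketch of a four-tier context-memory cache: lookups walk from the fastest tier to shared storage, promote entries toward the GPU on a hit, and can offload entries from local tiers to shared storage. All class and method names are hypothetical; this is not the Nutanix or LMCache API.

```python
from collections import OrderedDict

class TieredKVCache:
    """Illustrative four-tier context-memory cache:
    GPU VRAM -> system RAM -> local NVMe -> shared storage
    (each tier simulated here as a plain dict)."""

    def __init__(self, tier_names=("vram", "ram", "nvme", "shared")):
        # Ordered fastest-to-slowest; each tier maps a prompt hash -> KV blob.
        self.tiers = OrderedDict((name, {}) for name in tier_names)

    def put(self, key, kv_blob, tier="vram"):
        self.tiers[tier][key] = kv_blob

    def get(self, key):
        # Search the fastest tier first; on a hit in a slower tier,
        # promote the entry so the next lookup is cheaper.
        for name, store in self.tiers.items():
            blob = store.get(key)
            if blob is not None:
                if name != "vram":
                    store.pop(key)
                    self.tiers["vram"][key] = blob
                return blob
        return None  # cache miss: the KV state must be recomputed

    def evict_to_shared(self, key):
        # Offload from any local tier to shared storage, freeing
        # capacity-constrained node resources (the LMCache idea).
        for name in ("vram", "ram", "nvme"):
            blob = self.tiers[name].pop(key, None)
            if blob is not None:
                self.tiers["shared"][key] = blob
                return True
        return False
```

In practice the shared tier would be an RDMA-attached Nutanix Unified Storage volume rather than a dict, and eviction policy would weigh recency, session lifetime, and blob size; the sketch only shows the lookup-and-promote flow.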
This tiered approach to context memory helps ensure enterprises can:
- Run massive context windows without exhausting local GPU and node resources
- Maximize GPU utilization
- Significantly reduce the “cost per token”
Built on the NVIDIA AI Data Platform reference design, the Nutanix AI Data Platform provides capabilities that enable AI agents to reason over enterprise data the moment it is created. By integrating NVIDIA AI Enterprise software and the Milvus vector database directly with Nutanix Unified Storage, organizations can build continuous data pipelines that ingest, transform, and vectorize raw data in real time. Uniquely, Nutanix allows mixing of GPU-enabled and CPU-only dense storage nodes within a single storage cluster. This “compute-adjacent” architecture brings AI to the data, ensuring AI agents are always grounded in the freshest proprietary intelligence while significantly reducing the latency and friction of traditional data movement.
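The shape of such a pipeline can be sketched in a few lines. The example below is a self-contained stand-in, not the Nutanix or Milvus API: the toy `embed` function substitutes for a real embedding model, and `VectorIndex` substitutes for a vector database such as Milvus; only the ingest-chunk-vectorize-search flow is the point.

```python
import hashlib
import math

def embed(text, dim=8):
    """Toy deterministic embedding (stand-in for a real embedding model)."""
    digest = hashlib.sha256(text.encode()).digest()
    vec = [b / 255.0 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class VectorIndex:
    """In-memory stand-in for a vector database such as Milvus."""
    def __init__(self):
        self.rows = []  # (doc_id, vector, text)

    def upsert(self, doc_id, text):
        self.rows.append((doc_id, embed(text), text))

    def search(self, query, top_k=1):
        # Brute-force similarity search by dot product (vectors are normalized).
        q = embed(query)
        scored = sorted(self.rows,
                        key=lambda r: -sum(a * b for a, b in zip(q, r[1])))
        return [(doc_id, text) for doc_id, _, text in scored[:top_k]]

def ingest(index, doc_id, raw_text, chunk_size=200):
    """Continuous-pipeline step: as data lands on shared storage,
    chunk it and index each chunk so agents can retrieve it immediately."""
    for i in range(0, len(raw_text), chunk_size):
        index.upsert(f"{doc_id}#{i}", raw_text[i:i + chunk_size])
```

In a production pipeline the ingest step would be triggered by new data arriving on Nutanix Unified Storage, and embedding would run on the GPU-enabled nodes co-located with that data, which is exactly the latency advantage the compute-adjacent design targets.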
To keep pace with high-speed AI compute, Nutanix Unified Storage aims to deliver a low-latency, RDMA-enabled data path between GPUs and storage memory. As a validated NVIDIA Magnum IO GPUDirect Storage solution, Nutanix Unified Storage allows AI workloads to bypass the CPU entirely for I/O, reducing CPU overhead on both client and storage nodes while maximizing GPU utilization and lowering cost per token. Today, NFS over RDMA is supported for high-performance file access, and support is planned to extend this capability to S3 over RDMA for object storage. This combination brings the massive scalability of object stores together with ultra-low-latency direct GPU access, making Nutanix Unified Storage object stores an ideal data foundation for large-scale AI workloads and modern AI Factories.
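For a sense of what the client side of this data path looks like, a standard Linux NFS-over-RDMA mount is shown below. The server name, export path, and mount point are placeholders, and 20049 is the conventional NFS/RDMA port; consult Nutanix documentation for the options supported on a given release.

```shell
# Mount an NFS export over RDMA (RoCE or InfiniBand).
# Hostname, export, and mount point are illustrative placeholders.
sudo mount -t nfs -o vers=4.1,proto=rdma,port=20049 \
    files.example.local:/ai-datasets /mnt/ai-datasets

# Confirm the transport negotiated for the mount.
mount | grep ai-datasets
```

With GPUDirect Storage layered on top, reads from such a mount can move data directly between the storage network and GPU memory, which is where the CPU-bypass and per-token cost benefits come from.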
AI is only as trustworthy as the data it is grounded in. Nutanix Data Lens (NDL) provides the essential security and governance for the data being fed into the AI Factory, delivering proactive auditing, ransomware protection, and secure data isolation. Through a single SaaS-based portal—or running directly on a Nutanix storage cluster—NDL enables organizations to monitor, secure, and govern datasets across multiple Nutanix Unified Storage clusters, whether within a single data center or globally distributed environments. This helps ensure enterprise data remains protected as it moves through the AI lifecycle. With planned capabilities such as automated data classification and metadata tagging, sensitive information can be intelligently identified, protected, and governed end-to-end, helping organizations support their compliance efforts while safely powering agentic AI workloads.
Nutanix is proud to be a design partner for NVIDIA STX, a modular storage reference architecture engineered for the AI Factory. By co-designing on the NVIDIA Vera Rubin NVL72 architecture and leveraging NVIDIA BlueField-4 data processing units (DPUs), Nutanix will be centralizing intelligent data handling directly into the storage layer. This helps ensure that GPUs, vector databases, and RAG pipelines operate as a cohesive, rack-scale system rather than disconnected components.
As a design partner for NVIDIA CMX, built on the NVIDIA STX reference architecture, Nutanix plans to build support for a new G3.5 pod-shared cache layer, delivering scalable capacity with ultra-high performance and seamless data sharing across GPU pods.
The Nutanix Agentic AI stack helps enterprises scale from experimentation to production-grade AI Factories by delivering:
Nutanix Unified Storage is a core component of the Nutanix Agentic AI stack and the data foundation of the modern AI factory. By bringing AI closer to data and enabling scalable AI “living memory,” Nutanix is transforming storage from passive capacity into an intelligent, high-speed data engine built for the Agentic AI era.
In the race to operationalize agentic systems, the bottleneck is no longer just silicon—it’s the data path. The real question for modern enterprises is no longer how many GPUs they have, but whether their data foundation can keep pace with Agentic AI at scale.
With Nutanix Unified Storage, it can.
©2026 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo, and all Nutanix product and service names mentioned herein are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. Kubernetes is a registered trademark of The Linux Foundation. NVIDIA and the NVIDIA products mentioned are registered trademarks or trademarks of NVIDIA Corporation. All other brand names mentioned herein are for identification purposes only and may be the trademarks of their respective holder(s). This content may contain express and implied forward-looking statements, which are not historical facts and are instead based on our current expectations, estimates, and beliefs. The accuracy of such statements involves risks and uncertainties and depends upon future events, including those that may be beyond our control, and actual results may differ materially and adversely from those anticipated or implied by such statements. Any forward-looking statements included speak only as of the date hereof and, except as required by law, we assume no obligation to update or otherwise revise any such forward-looking statements to reflect subsequent events or circumstances.