Runtime: 5 minute read
Audience: AI & HPC leaders, architects, DevOps
Primary themes: Performance · Economics · Simplicity
AI factories and Neocloud operators are deploying GPU infrastructure at unprecedented scale, but storage remains the overlooked bottleneck preventing optimal performance. If storage cannot feed GPUs fast enough, training stalls, checkpoints burn compute dollars, and inference latency spikes. Storage is also the single largest blast radius in any GPU cluster: SemiAnalysis estimates that a 5,000-GPU cluster running at just 98% storage availability bleeds roughly 876,000 GPU-hours per year, approximately $2.6 million in idle compute. Meanwhile, flash prices have surged as much as 472% in a single year, exposing organizations locked into all-flash architectures to volatile and unpredictable economics. Storage, once a quiet 10% line item, is now trending toward 20–30% of cluster spend in all-flash deployments; every dollar overspent on flash is a GPU you cannot deploy.
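The arithmetic behind those idle-compute figures is easy to verify. A minimal sketch, assuming the roughly $3/GPU-hour rate implied by the quoted numbers:

```python
# Back-of-envelope check of the idle-compute figures quoted above
# (5,000 GPUs at 98% storage availability). The $3/GPU-hour rate is an
# assumption inferred from the ~$2.6M figure, not a quoted price.
HOURS_PER_YEAR = 24 * 365  # 8,760

def idle_gpu_hours(gpus: int, storage_availability: float) -> float:
    """GPU-hours stranded per year while storage is unavailable."""
    return gpus * HOURS_PER_YEAR * (1.0 - storage_availability)

lost = idle_gpu_hours(5_000, 0.98)
print(f"{lost:,.0f} GPU-hours/year")           # 876,000
print(f"${lost * 3.00:,.0f} of idle compute")  # ~$2.6M at $3/GPU-hour
```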
The VDURA® Data Platform V12 is modern data storage infrastructure software purpose-built for these challenges. Built on the HYDRA architecture, VDURA combines the performance of a true parallel file system with the resilience and cost-efficiency of object storage in a unified, software-defined system. By pairing NVMe flash for performance-critical workloads with high-capacity HDD for data retention, VDURA delivers the same mixed-fleet storage model to AI factories, Neoclouds, and enterprises that hyperscalers like Google, Meta, and Microsoft already deploy in production. All of this is backed by at least six nines of availability and up to 12 nines of durability, proven across more than 1,000 production deployments.
V12 also introduces the Elastic Metadata Engine with up to 20× acceleration in metadata operations, native snapshot support for AI pipeline checkpoints, SMR HDD optimization unlocking 25–30% more capacity per rack, RDMA support for GPU-native data paths that bypass the CPU entirely, Context-Aware Tiering, end-to-end encryption, and a native CSI plug-in for Kubernetes. Built on over 25 years of innovation in high-performance computing, VDURA delivers unprecedented parallel throughput, ultra-low latency metadata operations, and superior data protection across the entire AI pipeline.
With linear scalability to thousands of nodes, integrated value-tier storage, and total cost of ownership more than 60% lower than all-flash architectures at comparable performance levels, VDURA eliminates the traditional compromises between performance, durability, and cost. This document outlines the architectural components, performance capabilities, and deployment options of the VDURA Data Platform and V5000™ system, showcasing why leading AI factories, Neocloud operators, and enterprises trust VDURA to power their most demanding workloads.
AI factories and Neocloud operators are building the infrastructure that will define the next decade of computing. Purpose-built GPU clouds represent an estimated $35 billion market today, projected to reach $236 billion by 2031. Yet the storage layer powering these environments remains the single largest performance bottleneck and the single largest blast radius in any GPU cluster. Storage, once a quiet 10% line item, is trending toward 20–30% or more in all-flash deployments. As NAND flash prices surge to an order of magnitude above HDD costs per TB, every dollar overspent on flash is a GPU you cannot deploy.
The hyperscalers figured this out years ago; Google, Meta, and Microsoft do not run all-flash storage. They deploy mixed-fleet architectures: intelligent tiering with just enough NVMe flash to saturate GPU throughput, then drain colder data to high-density HDDs. Flash is a performance medium, not a capacity medium. VDURA brings that same model to every AI factory, Neocloud, and enterprise.
The VDURA Data Platform is modern data storage infrastructure software built for performance, durability, and scalability.
VDURA combines true parallel performance, hyperscale durability, and effortless scalability in a single, software-defined platform. It delivers the performance AI training and inference demand on NVMe flash while leveraging high-capacity HDD for cost-efficient data retention, all under one control plane, one data plane, and one global namespace. Total cost of ownership is more than 60% lower than all-flash architectures at comparable performance levels.
We’re not new to this space; we helped define it.
Before the rise of AI, cloud-native workloads, and modern data infrastructure, there was Panasas®, the company that reshaped the high-performance computing landscape with the industry’s first true parallel file system. For more than 20 years, Panasas PanFS® set the bar for scalable performance, mixed-workload efficiency, and enterprise-grade reliability in environments where data is everything. Built on that core architecture, VDURA is the modern evolution of a platform trusted by the world’s most data-intensive environments: over 1,000 production deployments in more than 50 countries, tens of millions of cumulative runtime hours, and exabytes of data under management.
VDURA is where velocity meets durability.
The name says it all: lightning-fast NVMe flash throughput and petabyte-scale HDD capacity meet industry-leading durability in a platform that scales linearly to thousands of nodes. VDURA combines the scalable, linear performance of a true parallel file system with a cost-efficient, resilient object store, unifying active and bulk storage under one architecture. The VeLO™ metadata engine powers intelligent data flow and fast namespace operations, delivering a software-defined platform built for AI factories, Neoclouds, and HPC environments that is simple to deploy and effortless to scale.
Storage architecture for the modern AI era was not designed in a vacuum. It was forged by the hyperscalers, the first organizations to confront true exabyte-scale workloads, billion-file namespaces, and unforgiving GPU economics, who had no choice but to invent something new. AI factories, Neoclouds, and enterprises are now arriving at the same crossroads. VDURA is the platform that brings that proven hyperscaler architecture to them.
A decade and a half ago, Google, Meta, Microsoft, and Amazon each ran into the same wall: storage architectures built for traditional enterprise applications could not survive search-, cloud-, social-, and AI-scale workloads. Centralized controllers serialized metadata. Static tiering ignored changing access patterns. Flash alone could not deliver capacity economics at hyperscale. RAID-protected arrays could not maintain durability efficiently at exabyte scale. Each hyperscaler responded by rebuilding storage around software, scale-out metadata, automated data placement, and distributed durability.
Google Colossus evolved beyond GFS with a distributed metadata architecture and software-driven placement across SSD and HDD. Meta’s Tectonic applied a similar lesson at social scale, disaggregating metadata and storage layers while using software-defined replication and erasure coding for durability. AWS S3 and Microsoft Azure Storage productized the same core operating principle for cloud-scale storage: use software to manage placement, durability, and economics across tiers instead of forcing every byte onto the most expensive media.
The pattern is clear: hyperscalers do not treat all-flash as the default architecture for massive data growth. They use flash where performance matters most, dense capacity media where economics matter most, and software to decide where data belongs.
That lesson now applies to every organization deploying GPUs at scale. Flash is a performance medium, not a capacity medium. The economically viable architecture for AI is mixed-fleet storage, governed by intelligent software, under a single namespace.
AI workloads break the assumptions that traditional enterprise storage was built on. Training reads massive datasets in randomized batches across thousands of GPUs simultaneously. Checkpointing demands burst-write throughput at the rate of an entire model state every few minutes. Inference produces high-concurrency random reads against weights, embeddings, and growing RAG corpora, with first-token latency budgets measured in single-digit milliseconds. Metadata operations explode into the billions per second. Tier policies change by the hour, not the quarter.
Storage that was perfectly adequate for relational databases, virtual machines, or shared user drives collapses under these conditions. Centralized controllers become bottlenecks. NFS gateways serialize what must be parallel. RAID groups cannot rebuild fast enough to hold their durability promise at petabyte scale. All-flash arrays drain capital budgets and consume rack power that should be feeding GPUs. The architecture that worked yesterday is the architecture that strands GPUs today.
VDURA is the realization of the hyperscaler playbook for organizations that do not have large teams of distributed-systems engineers. VDURA’s HYDRA architecture separates control plane from data plane the way Colossus and Tectonic do. The VeLO metadata engine distributes namespace operations across stateless Director Nodes, eliminating the single-controller bottleneck. Storage Nodes pair NVMe flash for hot data with high-capacity HDDs for retention, governed by Dynamic Data Acceleration™ and, in V12, Context-Aware Tiering. Multi-Level Erasure Coding™ delivers software-defined durability up to 12 nines, well past what hardware RAID can guarantee. Self-healing, automated rebalancing, and continuous data scrubbing match what hyperscale operators built internally over a decade of trial and error.
What the hyperscalers proved, VDURA productizes for everyone else. AI factories, Neoclouds, and enterprises get the same hyperscale-grade storage technology the four hyperscalers invented for themselves, packaged as a software-defined platform that runs on commodity Dell, Supermicro, AIC, or any roadmap-certified server, with one control plane, one data plane, and one global namespace. The result is the storage substrate the AI era requires, finally available to every organization that needs it, not just the four companies that invented it.

The storage industry is experiencing unprecedented flash price volatility. SSD prices climbed nearly 24% in just three weeks in early 2026, and enterprise-grade 30 TB TLC SSDs surged 472% between Q2 2025 and Q1 2026. The QLC-to-HDD price multiple expanded from 4.9× to 22.6× over the same period, making all-flash architectures increasingly expensive and unpredictable for large-scale AI deployments.
This volatility creates real financial exposure for organizations locked into all-flash storage strategies. A 25 PB all-flash deployment delivering 1,000 GB/s sustained performance saw its three-year total cost increase approximately 397% over one year. By contrast, a mixed-fleet architecture delivering identical performance and capacity costs significantly less, with substantially lower exposure to flash price swings.
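The exposure gap can be sketched with a simple media-cost model. The 22.6× QLC-to-HDD multiple comes from the figures above; the absolute $/TB prices and the 10% flash fraction are hypothetical placeholders, not VDURA pricing:

```python
# Illustrative raw-media cost model. HYPOTHETICAL inputs: $12/TB for HDD
# and a 10% flash landing zone. Only the 22.6x multiple is from the text.
HDD_PER_TB = 12.0                 # assumed HDD street price, $/TB
QLC_PER_TB = HDD_PER_TB * 22.6    # flash at the quoted Q1-2026 multiple

def media_cost(capacity_tb: float, flash_fraction: float) -> float:
    """Raw media cost of a fleet with `flash_fraction` of bytes on QLC
    flash and the remainder on HDD."""
    flash_tb = capacity_tb * flash_fraction
    return flash_tb * QLC_PER_TB + (capacity_tb - flash_tb) * HDD_PER_TB

all_flash = media_cost(25_000, 1.00)  # 25 PB, all-flash
mixed     = media_cost(25_000, 0.10)  # 25 PB, 10% flash (assumed)
print(f"mixed fleet is {all_flash / mixed:.1f}x cheaper on raw media")
```

Under these assumed inputs the mixed fleet carries a small fraction of the flash exposure while holding the same capacity; the tool linked below lets teams substitute live prices.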
VDURA’s architecture is uniquely positioned to address this challenge. By combining NVMe flash for performance-critical workloads with high-capacity HDD for data retention, VDURA delivers the same throughput and IOPS without requiring organizations to absorb the full impact of flash market volatility. This is the same model the hyperscalers deploy in production; VDURA makes it available to the enterprise.
VDURA has developed the only tool of its kind on the market: the Flash Volatility Index and Storage Economics Optimizer. This interactive index tracks how flash media volatility translates into real-world cost exposure and compares it to the current HDD market, allowing organizations to model total system costs across different architectures and media configurations. Infrastructure teams use it to make data-driven decisions about storage architecture and to quantify the financial implications of choosing between all-flash and mixed-fleet approaches as pricing conditions continue to evolve. The tool is available at https://www.vdura.com/flash-volatility-index-and-storage-economics-optimizer-tool/.
“As pricing conditions continue to evolve, infrastructure teams must plan for greater variability in cost dynamics. The architectures that succeed will be the ones that can adapt without compromising performance.”
—Erik Salo, SVP, VDURA
Through its sophisticated software, the VDURA Data Platform can transfer terabytes of data per second to and from your compute cluster. VDURA manages this orchestration without manual intervention: it continuously balances load across its nodes, automatically ensures resilience, scrubs stored data for the highest levels of data protection, and encrypts it to safeguard against unwanted exposure.
VDURA bypasses bottlenecks with direct, parallel data transfers from NVMe flash Storage Nodes to the client. Unlike NFS or "sort-of-parallel" systems, VDURA’s shared-nothing architecture and separate metadata plane eliminate contention, delivering maximum throughput, lowest latency, and the consistent performance AI workloads demand at scale.
VDURA’s VeLO metadata engine delivers ultra-low latency for billions of file operations. V12’s Elastic Metadata Engine dynamically scales across nodes, delivering up to 20× improvement in metadata operations. Built for AI, it accelerates metadata-heavy tasks like model staging, small-file access, and checkpointing.
VDURA natively integrates high-capacity extensions as a value tier, combining NVMe flash and HDD storage within a single platform. This eliminates siloed object stores and delivers cost-efficient, long-term storage under the same namespace, making VDURA ideal for AI data lakes, model checkpoints, and archival workflows.
VDURA runs on commodity-agnostic storage with AI-grade speed and cost efficiency. The shared-nothing architecture eliminates the need for specialized HA-pair servers, firmware-based RAID controllers, or dual-ported drives, enabling the use of standard, off-the-shelf commodity servers and storage devices.
VDURA provides industry-leading end-to-end encryption. AES-256 protects data from the moment it leaves the client, in transit, and at rest, with transparent, tenant-per-volume encryption and KMIP-based key management.
New in V12, RDMA support enables GPU-to-storage data transfers that bypass the CPU entirely. Direct memory access between GPU server nodes and the VDURA Data Platform eliminates CPU bottlenecks for low-latency, high-throughput data paths critical to AI training and inference workloads.
Reliability improves with scale through client-side, file-level erasure coding that protects each file individually, eliminating the need for legacy RAID or costly HA hardware. VDURA’s patented Multi-Level Erasure Coding (MLEC) provides superior data protection, delivering up to 12 nines of durability in all-flash configurations.
One vendor, one stack, one upgrade path. The HYDRA architecture seamlessly expands NVMe flash and HDD capacity, automatically balances workloads, and self-heals from failures, with non-disruptive upgrades and zero day-two complexity.
VDURA’s native Container Storage Interface (CSI) plug-in simplifies multi-tenant, Kubernetes-based deployments with zero-script persistent-volume provisioning and management. Cloud-native simplicity meets enterprise-grade storage.
One simple contract covering hardware, software, and support. VDURACare Premier™ includes 10-year, no-cost replacement of drives and 24×7 expert response, delivering comprehensive, risk-free coverage that keeps your AI factory running.
AI workloads have redefined what modern data infrastructure must deliver: speed, scale, and precision under constant I/O pressure. Most storage architectures are not built for this.
Each stage of the AI pipeline is unique, with different storage requirements that must be met to keep the factory running efficiently and ahead of the competition. The usual approach has been to pick a different system for each stage, or to default to data infrastructure that meets some requirements but is not ideal for all of them.

The following table summarizes the complexities of AI data infrastructure. Each stage in the AI pipeline has different read, write, throughput, capacity, and IOPS requirements that must be optimized.
| Stage | Read | Write | Data Size | AI Workload Insights |
|---|---|---|---|---|
| Data Ingest | Low | High | TBs to PBs | Bulk writes require fast speeds. Data retention requires high capacity. |
| Model Load | High | — | GBs to TBs | High throughput required. Any delay holds back the entire pipeline. |
| Training | High | Low | TBs to PBs | Fast I/O crucial to saturating GPUs. |
| Checkpoint (Train) | — | Very High | GBs to TBs | GPUs are idle during checkpointing. Must be fast to prevent burning GPU dollars. |
| Fine-Tune | Low | Low | GBs | Smaller datasets than training. Typically, lighter on reads/writes. |
| Checkpoint (Fine-Tune) | — | Very High | GBs | High-speed write requirements similar to training checkpoints. |
| Inference | High | Low | GBs | High-concurrency random reads across model weights, embeddings, and RAG corpora. KV-cache persistence and multi-tenant isolation add write and metadata complexity beyond simple reads. |
| AI Archive / Data Retention | — | — | PBs | Long-term, cost-efficient storage for raw or processed datasets. |
Designing an AI infrastructure requires more than just performance at a single stage. It demands modern data storage infrastructure software that can handle the full pipeline.
AI and HPC pipelines demand fast writes during ingest, training, and fine-tuning checkpoints; high-throughput reads during model loading and inference; and scalable, cost-effective storage for AI data retention and reuse. Most vendors force tradeoffs. Shared-everything architectures rely on centralized head nodes to handle all I/O, introducing performance chokepoints. Writes slow dramatically during cache flushes due to compression and deduplication. Bolt-on, third-party object stores for data lake functionality add latency, break the namespace, and shift complexity to the user.
These disjointed approaches cannot keep pace with modern AI workloads.
VDURA eliminates these limitations with a true parallel file system and software-defined architecture that separates the control plane from the data plane. This shared-nothing design enables scalable, high-performance throughput with no single-node bottlenecks. AI training data flows directly from NVMe flash to clients. AI archive and retention data lives cost-efficiently in high-capacity mixed-fleet nodes, all under a single global namespace.
Every stage of the AI pipeline is covered:
Intelligent orchestration automates tiering, eliminating the need for manual tuning, extra software layers, or external storage systems. VDURA is the software-defined modern data storage infrastructure platform that is purpose-built to power every stage of the AI pipeline. We combine the scalable, linear performance of a true parallel file system with the resilience and cost efficiency of object storage.
One data plane, one control plane, one namespace. Simple to deploy, operate, and grow.
The VDURA Data Platform V12 is built on a fully software-defined, microservices architecture that combines the speed and efficiency of a true parallel file system with the durability and cost-effectiveness of resilient object storage. This is HYDRA: High-Performance, Yield-Optimized, Distributed, Resilient Architecture.

This unified design ensures high performance and simplicity for active and bulk data storage and is designed specifically to address the complexities and requirements of the AI pipeline. The VDURA Data Platform explicitly separates the control plane handling metadata operations from the data plane, which is dedicated exclusively to user data storage.
Three key components work together to power the VDURA Data Platform:
Director Nodes are the core of the control plane. They orchestrate and manage all metadata operations, coordinate the actions of Storage Nodes and DirectFlow Client drivers for file access, maintain the health and membership status within the storage cluster, and oversee all recovery and reliability functions. These nodes are simple, powerful compute servers featuring high-speed networking, substantial DRAM, and NVMe flash optimized for metadata transaction logs.
The VDURA VeLO metadata engine runs on each Director Node. VeLO is distributed and flash-optimized, designed specifically for high-speed parallel metadata operations. V12’s Elastic Metadata Engine dynamically scales across nodes, delivering up to 20× improvement in metadata operations and supporting billions of files and objects under active use. This integration ensures ultra-low latency, efficient handling of billions of file operations, and consistent metadata performance at scale.
Storage Nodes form the foundation of the data plane, dedicated exclusively to storing and managing user data. Available in configurations of either all-NVMe flash for peak performance or NVMe flash with HDD capacity expansion for high-performance and economical bulk storage, Storage Nodes deliver versatile and optimized infrastructure. Each node hosts multiple Virtualized Protected Object Device (VPOD) instances, enabling granular, scalable data management and enhanced reliability through Multi-Level Erasure Coding. VPOD architecture ensures linear scalability and consistent parallel performance, accommodating thousands of nodes seamlessly within a single cluster.
The VDURA DirectFlow Client is a high-performance parallel file system driver specifically engineered for Linux-based compute environments. Deployed directly on compute servers, DirectFlow seamlessly integrates with existing Linux applications, presenting itself like any conventional file system. It provides fully POSIX-compliant, cache-coherent file operations across a unified global namespace, tightly collaborating with Director and Storage Nodes. By enabling direct, parallel I/O paths from compute servers to Storage Nodes, DirectFlow eliminates traditional bottlenecks and intermediary processing overhead found in NFS or legacy storage solutions. V12 adds RDMA support for GPU-native data paths that bypass the CPU entirely.

The VDURA Data Platform is built as a true parallel file system, engineered to handle the intense I/O demands of modern AI and HPC workloads. Each file stored by the VDURA Data Platform is individually striped across many Storage Nodes, allowing each component piece of a file to be read and written in parallel, increasing the performance of accessing every file.
VDURA’s parallel architecture dramatically accelerates data access, significantly boosting performance and throughput.
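The striping idea can be illustrated with a toy layout function. The 1 MiB stripe unit and round-robin placement here are assumptions for illustration, not VDURA's actual layout:

```python
# Toy file-striping layout: a byte offset maps to (node, stripe index,
# offset-within-stripe), so one contiguous read fans out across many
# Storage Nodes in parallel. Stripe unit and round-robin placement are
# assumed for illustration.
STRIPE_UNIT = 1 * 2**20  # 1 MiB per stripe unit (illustrative)

def locate(offset: int, num_nodes: int) -> tuple[int, int, int]:
    """Map a file byte offset to (node, stripe_index, intra_stripe_offset)."""
    stripe_index = offset // STRIPE_UNIT
    node = stripe_index % num_nodes  # round-robin across nodes
    return node, stripe_index, offset % STRIPE_UNIT

# A 4 MiB sequential read across 4 nodes touches every node exactly once,
# so all four can serve their stripe unit concurrently:
nodes = {locate(i * STRIPE_UNIT, 4)[0] for i in range(4)}
```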
Unlike other enterprise systems, which route data through limited head nodes, causing potential bottlenecks and requiring additional backend network infrastructure, VDURA’s DirectFlow Client communicates directly with all relevant Storage Nodes. Each compute server directly accesses the nodes holding the data, bypassing intermediary bottlenecks. Director Nodes manage metadata and coordinate system activity out-of-band, ensuring efficient data flow without interference or congestion.
The DirectFlow Client is lightweight, consuming approximately 191 MB of DRAM per compute node, requires zero dedicated CPU cores, and uses the standard Linux page cache rather than the pinned HugePages required by kernel-bypass data paths. CPU cycles are borrowed opportunistically during active I/O and returned immediately to the application. This efficiency matters at fleet scale: a 500-node GPU cluster running VDURA commits roughly 93 GB of DRAM to storage clients, while architectures that require kernel-bypass modes for peak performance reserve 2.5 TB of DRAM and permanently lock 500 to 2,000 CPU cores across the same fleet. Every core and every gigabyte VDURA does not claim stays available to the applications. V12 takes this further with RDMA and NVIDIA® GPUDirect Storage (GDS) support, enabling direct DMA transfers between storage and GPU HBM that bypass host DRAM and the CPU entirely.
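The fleet-scale footprint claims reduce to simple arithmetic. The per-node pinned figure is inferred from the 2.5 TB / 500-node comparison above, not a measured competitor value:

```python
# Fleet-scale client footprint using the figures quoted above:
# ~191 MB of DRAM per node for DirectFlow vs ~5 GB of pinned HugePages
# per node (inferred from 2.5 TB across 500 nodes) for kernel-bypass paths.
NODES = 500

vdura_gb  = NODES * 191 / 1024  # MB -> GiB
pinned_tb = NODES * 5 / 1024    # GB -> TiB
print(f"DirectFlow clients:   ~{vdura_gb:.0f} GiB across the fleet")
print(f"kernel-bypass pinning: ~{pinned_tb:.1f} TiB across the fleet")
```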
This direct and parallel design eliminates traditional NAS hotspots, ensures predictable and scalable performance, and simplifies infrastructure by removing the need for a separate, costly backend network. VDURA architecture delivers seamless scalability, consistently high performance, and exceptional efficiency across every stage of the AI pipeline, from ingest and training to inference and long-term data retention.
The VDURA Data Platform delivers true linear scalability across both metadata and data services without compromise or complexity. AI workloads evolve fast, from early experimentation to scaled production across global clusters. Add Director Nodes to boost throughput for metadata-heavy tasks like model versioning and checkpoint tracking. Add Storage Nodes to scale bandwidth and capacity to support more training data, inference logs, or multi-tenant pipelines. VDURA enables linear scalability and seamless, predictable growth. A 50% increase in Storage Nodes delivers 50% more throughput and capacity, with no bottlenecks and no architectural redesigns.

Director Nodes serve as the brain in the VDURA architecture. VDURA separates the control plane, which handles metadata, orchestration, and policy, from the data plane, which handles user I/O. As the control plane’s core, they command every stage of the AI pipeline, from ingestion and training to checkpointing and inference. Director Nodes continuously adapt to workload changes, ensuring optimal throughput and seamless orchestration across the system.
Each Director runs VeLO, a flash-optimized metadata engine built to handle billions of operations per second. For modern AI, where performance is dictated as much by metadata velocity as data throughput, VeLO is essential. VeLO accelerates everything from tiny files to checkpoint indices to model versions. V12’s Elastic Metadata Engine dynamically scales metadata capacity across nodes, delivering up to 20× improvement in operations.
Director Nodes form the authoritative layer of VDURA’s control structure and every deployment requires a minimum of three. Administrators configure either three or five of the total Director Nodes as a replication set, or “repset,” a voting quorum that maintains a synchronized, fully replicated configuration database. One node from the repset is elected realm president and is tasked with managing configuration, status monitoring, and leading failure recovery. If the current president fails, a new one is elected instantly and automatically.
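The repset's quorum rule can be sketched in a few lines. This is the generic majority-vote property behind any 3- or 5-member voting set, not VDURA's actual election protocol:

```python
# Majority-quorum sketch: a repset of 3 or 5 Directors remains
# authoritative (and can elect a new president) while a strict majority
# of its members are healthy.
def has_quorum(repset_size: int, healthy: int) -> bool:
    """True if the surviving members can still form a voting majority."""
    return healthy > repset_size // 2

assert has_quorum(3, 2)        # a 3-node repset tolerates 1 failure
assert not has_quorum(3, 1)
assert has_quorum(5, 3)        # a 5-node repset tolerates 2 failures
assert not has_quorum(5, 2)
```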
Beyond coordination, Director Nodes also perform essential tasks at the president’s request. These include managing volumes, serving as protocol gateways (NFS, SMB, S3), performing background data scrubbing, recovering failed Storage Nodes, and executing Active Capacity Balancing across VPODs. All changes are non-disruptive to clients; gateways and volumes can migrate transparently across nodes when necessary.
Storage Nodes are the backbone of VDURA’s data plane, enabling seamless scale and sustained performance throughout every stage of the AI pipeline. Designed with flexibility and resilience, these nodes combine the best of both all-NVMe flash and flash with HDD capacity expansion storage, orchestrated under a unified control plane and single global namespace.

Optimized for Every Phase of the AI Pipeline
From high-frequency ingest and bursty checkpointing to real-time inference and long-term retraining, each phase of AI benefits from storage tiers purpose-built for performance and durability:
VDURA Data Platform V12 represents a major release with significant advancements across metadata performance, data management, storage economics, and GPU-native connectivity. V12 delivers a more than 20% increase in throughput, 20× metadata acceleration, and a more than 20% reduction in cost per TB, all available as a zero-downtime in-place upgrade for V11 customers.
The V12 Elastic Metadata Engine dynamically scales metadata capacity across Director Nodes, delivering up to 20× improvement in metadata operations. It supports billions of files and objects under active use, eliminating metadata bottlenecks that have traditionally constrained AI pipelines at scale. The engine automatically rebalances metadata distribution as clusters grow, ensuring consistent performance regardless of namespace size.
V12 introduces native snapshot support with instantaneous, space-efficient, point-in-time copies. Designed for AI pipeline checkpoints, model snapshots, and operational recovery, snapshots can be created manually or via policy-based retention. This capability is essential for protecting training progress, enabling rapid rollback during model development, and maintaining data integrity across complex AI workflows.
A new write-placement engine in V12 organizes sequential zones intelligently for Shingled Magnetic Recording (SMR) drives, unlocking 25–30% more capacity per rack without compromising throughput.
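Sequential-zone placement of this kind can be sketched as a toy allocator. Host-managed SMR zones accept writes only at their write pointer, so the engine must map incoming data onto open zones append-fashion; the 256 MiB zone size and first-fit policy here are illustrative assumptions, not VDURA's implementation:

```python
# Toy sequential-zone write placement for SMR media. Each zone only
# accepts appends at its write pointer; the allocator picks the first
# open zone with room. Zone size and policy are assumptions.
ZONE_SIZE = 256 * 2**20  # 256 MiB zones (assumed for illustration)

class ZoneAllocator:
    def __init__(self, num_zones: int):
        self.write_ptr = [0] * num_zones  # next writable byte per zone

    def place(self, length: int) -> tuple[int, int]:
        """Return (zone, offset) for a sequential append of `length` bytes."""
        for z, ptr in enumerate(self.write_ptr):
            if ptr + length <= ZONE_SIZE:
                self.write_ptr[z] = ptr + length  # advance the write pointer
                return z, ptr
        raise RuntimeError("no open zone can hold the write")

alloc = ZoneAllocator(num_zones=4)
z1, off1 = alloc.place(200 * 2**20)  # fits in zone 0 at offset 0
z2, off2 = alloc.place(100 * 2**20)  # no longer fits zone 0, opens zone 1
```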
Available now for all V5000 systems, RDMA support enables GPU-to-storage data transfers that bypass the CPU entirely. Built on NVIDIA ConnectX-7 networking adapters and AMD EPYC Turin processors, RDMA delivers direct memory access between GPU server nodes and the VDURA Data Platform, eliminating CPU bottlenecks for the low-latency, high-throughput data paths critical to AI training and inference.
Phase 1 of Context-Aware Tiering introduces three capabilities: Extended DirectFlow Buffer to Local SSD, reducing dependency on network storage for hot data; KVCache Writeback for Persistence SLA, minimizing unnecessary I/O while maintaining inference SLA compliance; and Context Cache Tiering Framework for high-speed read/write at LMCache speed, supporting long-context LLM serving and RAG workloads. The roadmap includes deeper application-directed data placement, cross-node cache coherence, and NVIDIA BlueField-4 DPU support.
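The writeback idea behind KV-cache persistence can be sketched as a two-tier cache: hot blocks stay in DRAM and are flushed to a slower tier only on eviction, so steady-state serving avoids unnecessary I/O. Tier names and capacity here are illustrative assumptions, not the V12 implementation:

```python
# Two-tier writeback cache sketch for KV blocks: an LRU DRAM tier backed
# by a persistent tier (standing in for local SSD / VDURA). Writes land
# in DRAM; a block reaches the slower tier only when evicted.
from collections import OrderedDict

class WritebackCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.hot = OrderedDict()  # DRAM tier, least-recently-used first
        self.persistent = {}      # slower persistence tier (illustrative)

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        if len(self.hot) > self.capacity:          # evict LRU block and
            old_key, old_val = self.hot.popitem(last=False)
            self.persistent[old_key] = old_val     # write it back once

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        return self.persistent.get(key)  # miss: fetch from slower tier

cache = WritebackCache(capacity=2)
cache.put("ctx-a", b"kv-block-a")
cache.put("ctx-b", b"kv-block-b")
cache.put("ctx-c", b"kv-block-c")  # evicts ctx-a to the persistent tier
```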
V12 includes a native Container Storage Interface (CSI) plug-in that simplifies multitenant, Kubernetes-based deployments with zero-script persistent-volume provisioning and management. Organizations running containerized AI pipelines on Kubernetes can now provision VDURA storage volumes directly through standard Kubernetes APIs, eliminating custom integration work and accelerating time-to-production for cloud-native AI workloads.
VDURA provides industry-leading end-to-end encryption. V12 delivers comprehensive security with transparent, tenant-per-volume AES-256 encryption that protects data from the moment it leaves the client, through transit, and at rest. This unified encryption architecture replaces the patchwork of TLS for in-flight and SED for at-rest that competitors rely on, providing stronger confidentiality, integrity, and compliance alignment in a single, zero-performance-compromise implementation.
One simple contract covering hardware, software, and support. VDURACare Premier™ includes 10-year, no-cost replacement of drives and 24×7 expert response, delivering comprehensive, risk-free coverage that protects your investment and keeps your AI factory running without interruption. No surprise costs, no separate maintenance contracts, no finger-pointing between vendors.
V12 is available as a zero-downtime in-place upgrade for all V11 customers and reaches general availability in Q2 2026 for all V5000 systems.
Rather than treating the entire server as a single failure domain, each VDURA Storage Node hosts multiple Virtualized Protected Object Devices (VPODs). This architecture introduces a finer unit of failure isolation:
Files are striped across component objects in multiple VPODs using N+2 erasure coding, ensuring high fault tolerance with efficient space utilization. Large POSIX files benefit from this distributed protection model, while small POSIX files are triple-replicated across VPODs, delivering optimal performance and storage efficiency.
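The efficiency and fault-tolerance trade-off behind N+2 striping can be made concrete. The 8+2 stripe width below is an assumed example, not a fixed VDURA geometry:

```python
# Illustrative N+2 erasure-coding math: usable-capacity efficiency versus
# triple replication, plus the two-failure tolerance of an N+P stripe.
from math import comb

def ec_efficiency(n: int, p: int) -> float:
    """Usable fraction of raw capacity for an N-data + P-parity stripe."""
    return n / (n + p)

def stripe_survives(n: int, p: int, failures: int) -> bool:
    """An N+P stripe stays readable while at most P members are lost."""
    return failures <= p

assert stripe_survives(8, 2, 2) and not stripe_survives(8, 2, 3)
print(f"8+2 erasure coding: {ec_efficiency(8, 2):.0%} usable")  # 80%
print(f"3-way replication:  {1/3:.0%} usable")                  # 33%
print(f"ways to lose 3 of 10 stripe members: {comb(10, 3)}")    # 120
```

The same N+2 protection that replication buys at 33% efficiency costs only 20% overhead here, which is why large files are striped while only small files are triple-replicated.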

VDURA Dynamic Data Acceleration™ (DDA) intelligently aligns I/O patterns with the most suitable media layer in real time:
Together, these layers form a high-performance, self-optimizing data fabric that minimizes latency and maximizes cost-efficiency.
In the event of a Storage Node failure, the VDURA Data Platform reconstructs only the affected component objects, not the full node’s data. Files are rebuilt by pulling erasure-coded data fragments from other nodes. Continuous background scrubbing verifies data consistency across the system by validating erasure codes against stored data.
VDURA’s intelligent orchestration engine continuously analyzes file size, access pattern, and data temperature to automate data placement across flash and hybrid tiers. Key features include:
The result is a storage system that adapts as AI pipeline demands shift, scaling performance and capacity without trade-offs.
VDURA V12 performs data reduction at the Storage Node level, ensuring zero impact on client-side CPU or memory resources. Unlike architectures that shift compression or deduplication tasks to the client, consuming valuable compute and memory, VDURA handles all reduction operations within the storage layer itself. This design keeps GPU and application nodes fully dedicated to AI and HPC workloads, maximizing performance and system efficiency. The data reduction feature can be toggled on or off at any time via the GUI or CLI.
VDURA V5000 Certified Platform hardware is engineered for AI and HPC pipelines that demand relentless GPU feed rates. Built on industry-standard servers, V5000 Certified Platform hardware pairs flash performance with optional HDD capacity expansion, giving organizations a cost-balanced path from pilot to exabytes.
V5000 hardware runs the VDURA Data Platform V12, VDURA’s flash-tuned parallel file system, streaming multiple terabytes per second from a single global namespace. Working with the DirectFlow Client, VDURA offers parallel redundant data paths that scale linearly, safeguard data with enterprise-class durability, and keep day-to-day management simple.
Each system begins with a minimum of three Director Nodes and three Storage Nodes, which can be either all-flash or flash with HDD capacity expansion. Additional nodes can be added seamlessly to expand performance, capacity, or metadata throughput independently.

The VDURA V5000 Certified Platform hardware system represents the culmination of decades of engineering expertise in parallel file systems and distributed storage technology. Built for AI/ML and HPC workloads, V5000 hardware combines enterprise-grade reliability with maximum throughput and flexibility. Its modular architecture allows organizations to independently scale performance, capacity, and metadata operations to create the ideal balance for their specific workload requirements without overprovisioning or underutilization.
Each VDURA V5000 cluster can expand incrementally and non-disruptively.
VDURA V5000 Director and Storage Nodes support 400/200/100 GbE networks via two network ports in the rear of each node. The default configuration at initial installation is link aggregation across both ports, a 2×200/100 GbE configuration with one cable attached to each port. VDURA V5000 nodes use Link Aggregation Control Protocol (LACP) by default; static Link Aggregation Group (LAG), single link, and failover modes are also available.
VDURA V5000 Director and Storage Nodes contain two 25/10 GbE ports for corporate network connectivity. All nodes also contain a single 1 GbE port that may be used as a general administrative network port or for troubleshooting.
There are four network configuration options:
The default network configuration for V5000 nodes is LACP across the dual 100 GbE ports. Generally, protocols other than LACP and static LAG operate in active/passive mode.
Active/Active Link Aggregation Mode: When load balancing is required to optimize performance, V5000 systems can be configured to use either dynamic LACP or static LAG. LACP is preferred, as it is significantly more robust than static LAG. In LACP mode, the physical ports are bonded with the IEEE 802.3ad LACP link-layer protocol, providing load balancing, better fault tolerance, and protection against misconfiguration.
Single Link Mode: While single link mode is supported on V5000 systems, it is not optimal: it creates a single point of failure and offers only half the bandwidth of a bonded pair. Single link mode should be used with caution.
Network Failover Mode: Network failover is used on V5000 systems when active/passive redundancy is required.
VDURA provides two mechanisms to manage namespace and capacity: StorageSets and volumes.
StorageSets: A StorageSet is a physical mechanism: a pool of three or more Storage Nodes grouped together to store data. You can grow a StorageSet by adding more hardware, and you can move data within a StorageSet.
Volumes: A volume is a logical mechanism, a sub-tree of the overall system directory structure. A read-only top-level root volume (“/”), under which all other volumes are mounted, and a /home volume are created during setup. All other volumes are created by the user on a particular StorageSet, with up to 1,200 per realm. V12 adds native snapshot support for volumes, enabling instantaneous point-in-time copies for AI pipeline checkpoints and operational recovery.
When planning volume configuration, keep the following points in mind:
Storage Nodes in the VDURA Data Platform host highly sophisticated Virtualized Protected Object Devices (VPODs); you gain the same scale-out and shared-nothing architectural benefits from our VPODs as any object store would.

VDURA defines objects used in our VPODs per the American National Standards Institute (ANSI) T10 standard definition of objects rather than the Amazon S3 object definition. The VDURA Data Platform uses T10 objects to store POSIX files. Instead of storing each file in an object like S3 does, VDURA stripes a large POSIX file across a set of VPODs and adds additional VPODs into that stripe that store the P and Q data protection values of an N+2 erasure coding scheme. Using multiple VPODs per POSIX file enables the striping that is one of the sources of a parallel file system’s performance.
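The stripe-plus-parity idea can be shown in miniature. The sketch below, which is illustrative and not VDURA's actual on-disk layout, splits a byte string into N data shards and adds a single XOR parity shard; the real N+2 scheme adds a second parity shard (Q) computed over a Galois field, which is omitted here for brevity. Losing any one shard, data or parity, is recoverable from the survivors.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def stripe_with_parity(data: bytes, n: int) -> list[bytes]:
    """Split data into n equal data shards and append one XOR parity shard.
    (Illustrative single-parity sketch; N+2 adds a second Galois-field parity.)"""
    size = -(-len(data) // n)                      # ceiling division
    data = data.ljust(size * n, b"\0")             # zero-pad to a full stripe
    shards = [data[i * size:(i + 1) * size] for i in range(n)]
    shards.append(reduce(xor_bytes, shards))       # parity = XOR of all data shards
    return shards

def rebuild(shards: list) -> bytes:
    """Recover the single missing shard (marked None) by XOR-ing the survivors."""
    survivors = [s for s in shards if s is not None]
    return reduce(xor_bytes, survivors)

# Stripe a "file" across 4 data shards + 1 parity shard, lose one, recover it.
shards = stripe_with_parity(b"large POSIX file contents", n=4)
lost = shards[2]
shards[2] = None
assert rebuild(shards) == lost
```

Striping is what lets many VPODs serve one file in parallel; the parity shards are what make the loss of a shard survivable.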
A traditional storage array reconstructs the contents of drives, while VDURA reconstructs the contents of files.
While large POSIX files are stored using erasure coding across multiple VPODs, small POSIX files use triple replication across three VPODs. This approach delivers higher performance than can be achieved by using erasure coding on such small files, while being more space efficient. Unless the first write to a file is a large one, it will start as a small file. If a small file grows into a large file, the Director Node will transparently transition the file to the erasure-coded format at the point that the erasure-coded format becomes more efficient.
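The crossover between replication and erasure coding is simple overhead arithmetic: triple replication stores 3× the file size, while an N+2 stripe stores (N+2)/N×. The stripe width below is an illustrative assumption, not VDURA's internal threshold.

```python
def stored_bytes_replicated(size: int, copies: int = 3) -> int:
    """Raw bytes consumed by storing `copies` full replicas of a file."""
    return size * copies

def stored_bytes_ec(size: int, n: int, parity: int = 2) -> int:
    """Raw bytes consumed by an n+parity erasure-coded stripe:
    n data shards plus `parity` parity shards, each ceil(size/n) bytes."""
    shard = -(-size // n)                          # ceiling division
    return shard * (n + parity)

# With an (illustrative) 8+2 stripe, erasure coding stores 1.25x the data
# versus 3x for triple replication -- but only once the file is big enough
# to amortize per-shard overheads, which is why small files stay replicated.
size = 1 << 20                                     # 1 MiB file
print(stored_bytes_ec(size, n=8))                  # 8+2 stripe
print(stored_bytes_replicated(size))               # three full copies
```

For tiny files the fixed per-shard costs dominate, so triple replication wins on both speed and space until the file grows past the break-even point.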
Any system can experience failures, and as systems grow larger, their increasing complexity typically lowers overall reliability. In a traditional storage system, the odds of a given drive failing are roughly constant from hour to hour, so every additional hour spent in degraded mode raises the odds that another drive fails while the system is still degraded. If enough drives fail at the same time, data is lost, so recovering back to full data protection levels as quickly as possible is the key element of any resiliency plan.
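That intuition can be made concrete. Assuming independent drive failures at a constant annual failure rate (the drive count and AFR below are illustrative assumptions, not measured values), the probability that at least one more drive fails during a rebuild window grows almost linearly with the window length:

```python
import math

def p_additional_failure(drives: int, afr: float, rebuild_hours: float) -> float:
    """P(at least one of `drives` survivors fails during the rebuild window),
    modeling failures as independent exponentials with annual failure rate afr."""
    rate_per_hour = -math.log(1 - afr) / 8760      # convert AFR to an hourly rate
    p_one_survives = math.exp(-rate_per_hour * rebuild_hours)
    return 1 - p_one_survives ** drives

# Illustrative: 500 surviving drives at a 2% AFR. A 24-hour rebuild is
# roughly 12x riskier than a 2-hour rebuild, which is why rebuild speed
# dominates the resiliency equation.
print(p_additional_failure(500, 0.02, rebuild_hours=2.0))
print(p_additional_failure(500, 0.02, rebuild_hours=24.0))
```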
The VDURA Data Platform has linear scale-out reconstruction performance that dramatically reduces recovery time in the event of a Storage Node failure, so reliability increases with scale.
If a VDURA Storage Node fails, the system must reconstruct only those VPODs that were on the failed node, not the node's entire raw capacity as a traditional array would. The system reads the VPODs for each affected file from all the other Storage Nodes and uses each file's erasure code to reconstruct the VPODs that were on the failed node.
When a StorageSet is first created, the system sets aside a configurable amount of spare space on all the Storage Nodes in that StorageSet to hold the output of file reconstructions. When the system reconstructs a missing VPOD, it writes it to the spare space on a randomly chosen Storage Node in the same StorageSet, so a reconstruction uses the combined write bandwidth of every Storage Node in the set. The increased reconstruction bandwidth shortens the total time to reconstruct affected files, which reduces the odds of an additional failure during that window and increases the overall reliability of the realm.
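The bandwidth argument is simple division: if reconstruction writes fan out across every node in the StorageSet, rebuild time shrinks roughly in proportion to node count. The sketch below is an idealized model (it ignores read-side and network limits), and the per-node figures are illustrative assumptions.

```python
def rebuild_hours(data_tb: float, nodes: int, node_write_gb_s: float) -> float:
    """Idealized rebuild time when reconstruction writes are spread across
    all surviving nodes in a StorageSet (assumes no contention)."""
    aggregate_gb_s = nodes * node_write_gb_s       # combined write bandwidth
    return (data_tb * 1000) / aggregate_gb_s / 3600

# Illustrative: 100 TB of affected component objects, 2 GB/s of spare write
# bandwidth per node. Doubling the StorageSet halves the rebuild window.
print(rebuild_hours(100, nodes=10, node_write_gb_s=2.0))
print(rebuild_hours(100, nodes=20, node_write_gb_s=2.0))
```

A traditional array is capped by the write bandwidth of the single spare drive or node it rebuilds onto; declustered spare space removes that cap.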
VDURA also continuously scrubs the data integrity of the system in the background by slowly reading through all files in the system, validating that the erasure codes for each file match the data in that file. Data scrubbing is a hallmark of enterprise-class storage systems and is only found in one HPC-class storage system, the VDURA Data Platform.
Depending on the configured protection level, the erasure coding that VDURA implements protects against one or two simultaneous failures within any given StorageSet without any data loss. The realm can automatically and transparently recover from more than two failures as long as there are no more than two failed Storage Nodes at any one time in a StorageSet.
If, in extreme circumstances, three Storage Nodes in a single StorageSet were to fail at the same time, VDURA has one additional line of defense that limits the effects of that failure. Every directory is independently triplicated, with three complete copies and no two copies on the same Director Node. If a third Storage Node were to fail in a StorageSet while two others were being reconstructed, that StorageSet would immediately transition to a read-only state. Only the files in the StorageSet that had VPODs on all three of the failed Storage Nodes would have lost data; all other files in the StorageSet would be unaffected or recoverable using their erasure coding. The number of affected files in these situations becomes smaller as the size of the StorageSet increases.
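That last point, fewer files affected in larger StorageSets, follows from basic combinatorics. If a file's stripe spans w nodes drawn uniformly from a StorageSet of s nodes (a simplifying assumption; the stripe width below is illustrative), the chance that all three failed nodes fall inside one file's stripe is C(s-3, w-3) / C(s, w):

```python
from math import comb

def frac_files_hit(set_size: int, stripe_width: int, failed: int = 3) -> float:
    """Probability that a file's stripe includes all `failed` nodes, assuming
    its stripe_width nodes are drawn uniformly from the StorageSet."""
    return comb(set_size - failed, stripe_width - failed) / comb(set_size, stripe_width)

# Illustrative 10-wide stripes: growing the StorageSet from 12 to 48 nodes
# sharply cuts the fraction of files that would lose data in a triple failure.
print(frac_files_hit(12, 10))
print(frac_files_hit(48, 10))
```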
VDURA is unique in giving operators clear knowledge of a failure's impact, whereas other architectures leave significant uncertainty about the extent of data loss.
Instead of relying on hardware controllers that protect data at a drive level, VDURA architecture uses per-file distributed erasure coding in software. Files in the same StorageSet, volume, and even directory can have different erasure coding protection levels. In this way, a file can be seen as a single virtual object that is sliced into multiple component objects.
Users have three Erasure Coding Protection Levels available: Dual Parity (n+2), Single Parity (n+1), and Striped Mirror (2x).
The Erasure Coding Protection Level is selected at volume creation time and cannot be changed after a volume is created. You can mix protection levels and any volume layout together in the same StorageSet, with each volume evaluated independently for availability status.
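The capacity trade-off between the three protection levels is easy to quantify: a parity stripe of n data shards yields n/(n+p) usable capacity, while a striped mirror yields 1/2. The stripe width n below is an illustrative parameter, not a VDURA default.

```python
def usable_fraction(scheme: str, n: int = 8) -> float:
    """Fraction of raw capacity available for data under each protection level.
    `n` (data shards per stripe) is an illustrative assumption."""
    return {
        "dual_parity":    n / (n + 2),   # n+2: survives two simultaneous failures
        "single_parity":  n / (n + 1),   # n+1: survives one failure
        "striped_mirror": 1 / 2,         # 2x: every shard is written twice
    }[scheme]

for scheme in ("dual_parity", "single_parity", "striped_mirror"):
    print(f"{scheme}: {usable_fraction(scheme):.0%} usable")
```

Because the protection level is fixed at volume creation, this arithmetic is worth running before provisioning: the choice trades usable capacity against the number of simultaneous failures a volume can absorb.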
Download the full VDURA Data Platform V12 White Paper or visit vdura.com for a tailored AI factory assessment.