How AI’s hunger for data is transforming what organizations need from storage

Ken Claffey, Oct 10, 2025 (CIO article)

AI’s massive appetite for data is breaking old storage systems, pushing companies to rebuild for speed and reliability before their budgets go up in smoke.

AI workloads are radically reshaping enterprise technology infrastructure. Market forecasts underline just how dramatic the change is: according to McKinsey, AI has become “the key driver of growth in demand for data center capacity,” with overall requirements predicted to “almost triple by 2030, with about 70 percent of that demand coming from AI workloads.”

Indeed, the World Economic Forum expects the global data center industry, currently valued at $242.7 billion, to more than double by 2032 to around $584 billion. Behind these figures lies a central challenge: traditional storage approaches were designed for a very different era, and today, they are ill-suited to the more unpredictable demands of powerful AI systems. Unless enterprises rethink the fundamentals of their architecture, much of this investment will go to waste.

The legacy gap

To put this in context: for decades, enterprise storage solutions were designed around predictable workloads such as databases and enterprise applications. That environment generally allowed IT leaders to scale their storage technologies with a reasonable level of precision and flexibility.

AI has disrupted this approach. Training AI models depends on systems being able to read from massive, unstructured datasets (text, images, video, sensor logs and more) that are distributed and accessed in random, parallel bursts. Instead of a handful of applications queuing in sequence, a business might be running tens of thousands of GPU threads, all of which need storage that can deliver extremely high throughput, sustain low latency under pressure and handle concurrent access without hitting performance bottlenecks.
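To make that access pattern concrete, the sketch below (Python, with hypothetical file paths, thread counts and read sizes) simulates many workers issuing random-offset reads against shared dataset files and reports the aggregate throughput. Treat it as a back-of-envelope probe of how a storage tier behaves under this kind of concurrency, not a formal benchmark of any particular product.

```python
# Rough probe of how shared storage behaves under concurrent, random-offset reads,
# loosely mimicking many data-loader workers pulling training samples at once.
# Paths, thread counts and read sizes are illustrative assumptions, not recommendations.
import os
import time
import random
from concurrent.futures import ThreadPoolExecutor

DATASET_FILES = ["/mnt/ai-datasets/shard-000.bin",   # hypothetical dataset shards
                 "/mnt/ai-datasets/shard-001.bin"]
WORKERS = 64                   # stand-in for parallel data-loader threads feeding GPUs
READS_PER_WORKER = 200
READ_SIZE = 4 * 1024 * 1024    # 4 MiB random reads, typical of large-object training data

def random_reads(worker_id: int) -> int:
    """Issue random-offset reads against the shared dataset and return bytes read."""
    total = 0
    for _ in range(READS_PER_WORKER):
        path = random.choice(DATASET_FILES)
        size = os.path.getsize(path)
        offset = random.randrange(max(1, size - READ_SIZE))
        with open(path, "rb") as f:
            f.seek(offset)
            total += len(f.read(READ_SIZE))
    return total

if __name__ == "__main__":
    start = time.time()
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        bytes_read = sum(pool.map(random_reads, range(WORKERS)))
    elapsed = time.time() - start
    print(f"Aggregate throughput: {bytes_read / elapsed / 1e9:.2f} GB/s over {elapsed:.1f}s")
```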

The problem is that if storage cannot feed that data at the required speed, the GPUs sit idle, burning through compute budgets and delaying the development and implementation of mission-critical AI projects.
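The cost of that idle time is easy to estimate in rough terms. The snippet below uses purely illustrative figures for cluster size, GPU pricing and feed rates; swapping in real numbers turns a storage shortfall into a dollar figure leadership can act on.

```python
# Back-of-envelope estimate of what storage-induced GPU idle time costs.
# Every figure here is an illustrative assumption; substitute your own cluster numbers.
GPU_COUNT = 512                 # GPUs in the training cluster
GPU_HOURLY_COST = 3.00          # $ per GPU-hour (cloud list price or amortized on-prem cost)
REQUIRED_FEED_GBPS = 2.0        # GB/s each GPU needs to stay busy
DELIVERED_FEED_GBPS = 1.4       # GB/s the storage tier actually sustains under load

# Fraction of each hour the GPUs spend waiting on data instead of computing.
idle_fraction = max(0.0, 1 - DELIVERED_FEED_GBPS / REQUIRED_FEED_GBPS)
wasted_per_hour = GPU_COUNT * GPU_HOURLY_COST * idle_fraction

print(f"Idle fraction: {idle_fraction:.0%}, wasted spend: ${wasted_per_hour:,.0f}/hour, "
      f"${wasted_per_hour * 24 * 30:,.0f}/month")
```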

Lessons from HPC

These challenges are not entirely new. High-performance computing environments have long grappled with similar issues. In the life sciences sector, for example, research organizations need uninterrupted access to genomic datasets measured in petabytes. A prime example is the UK Biobank, which claims to be the world’s most comprehensive dataset of biological, health and lifestyle information: it currently holds about 30 petabytes of biological and medical data on half a million people. In government, mission-critical applications such as intelligence analysis and defense simulations demand 99.999% uptime, and even brief interruptions in availability can compromise security or operational readiness.

AI workloads, like HPC, require architectures capable of balancing performance and resilience. That often means combining different storage tiers, so that high-performance systems are reserved for the datasets that must be accessed often or at speed, while less critical data is moved to lower-cost environments.
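As a minimal illustration of that tiering principle, the sketch below (with hypothetical mount points and an assumed 30-day threshold) demotes files that have not been read recently from a fast flash tier to cheaper capacity storage. A production system would layer in access-frequency tracking, pinning of active training sets and verification of copies before anything is removed.

```python
# Minimal tiering-policy sketch: demote files that haven't been read recently
# from a fast flash tier to a cheaper capacity tier. Mount points and the
# 30-day threshold are hypothetical assumptions for illustration only.
import shutil
import time
from pathlib import Path

FAST_TIER = Path("/mnt/flash/datasets")         # assumed high-performance tier
CAPACITY_TIER = Path("/mnt/capacity/datasets")  # assumed low-cost capacity tier
COLD_AFTER_DAYS = 30

def demote_cold_files() -> None:
    """Move files whose last access time is older than the threshold to the capacity tier."""
    cutoff = time.time() - COLD_AFTER_DAYS * 86400
    for path in FAST_TIER.rglob("*"):
        if path.is_file() and path.stat().st_atime < cutoff:
            dest = CAPACITY_TIER / path.relative_to(FAST_TIER)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(dest))  # relocate cold data to the cheaper tier
            print(f"Demoted {path} -> {dest}")

if __name__ == "__main__":
    demote_cold_files()
```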

If organizations are to benefit from the experiences of HPC users, they must be open to moving away from one-size-fits-all deployments and toward hybrid storage systems that align infrastructure with the specific demands of training and inference.

Delivering durability

Another big problem organizations are encountering is data durability: the extent to which stored data remains intact, accurate and recoverable over time, even in the face of system failures, data corruption or infrastructure outages.

These issues are having a direct impact on the success of AI projects. According to a recent study by Gartner, “through 2026, organizations will abandon 60% of AI projects unsupported by AI-ready data.” In practice, this reflects an absence of robust data management and storage resilience. Only 48% of AI projects ever make it into production, and 65% of Chief Data Officers say this year’s AI goals are unachievable, with almost all (98%) reporting major data-quality incidents.

If this doesn’t make IT leaders sit up and take notice, then there’s also the issue of cost. Poor data quality already drains $12.9 million to $15 million per enterprise annually, while data pipeline failures cost enterprises around $300,000 per hour ($5,000 per minute) in lost insight and missed SLAs. These failures translate directly into stalled training runs and delayed time-to-value.

Avoiding these outcomes requires both technical and operational measures. On the technical side, multi-level erasure coding (MLEC) provides greater fault tolerance than traditional RAID by offering protection against multiple simultaneous failures. In addition, hybrid flash-and-disk systems can balance ultra-low latency with cost control, while modular architectures allow capacity or performance to be added incrementally. On the operational side, automated data integrity checks can detect and isolate corruption before it enters the training pipeline, while regularly scheduled recovery drills ensure that restoration processes can be executed within the tight timeframes AI production demands.
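On the integrity-check point, one simple pattern is to record a checksum for every file at ingest and verify the full set before each training run. The sketch below assumes a hypothetical JSON manifest and directory layout; any file whose hash no longer matches is quarantined before it can contaminate a training job.

```python
# Sketch of an automated integrity check run before data enters the training
# pipeline: compare each file's SHA-256 against a manifest recorded at ingest
# time and set aside anything that no longer matches. The manifest format and
# paths are hypothetical; real pipelines typically hook this into orchestration.
import hashlib
import json
import shutil
from pathlib import Path

MANIFEST = Path("/mnt/ai-datasets/manifest.json")    # {"relative/path": "sha256-hex", ...}
DATA_ROOT = Path("/mnt/ai-datasets")
QUARANTINE = Path("/mnt/ai-datasets/_quarantine")

def sha256(path: Path) -> str:
    """Stream the file in 1 MiB chunks and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_before_training() -> bool:
    """Return True only if every manifest entry exists and matches its recorded hash."""
    expected = json.loads(MANIFEST.read_text())
    clean = True
    for rel, digest in expected.items():
        path = DATA_ROOT / rel
        if not path.exists() or sha256(path) != digest:
            clean = False
            QUARANTINE.mkdir(parents=True, exist_ok=True)
            if path.exists():
                shutil.move(str(path), str(QUARANTINE / path.name))  # isolate the suspect file
            print(f"Integrity failure: {rel}")
    return clean

if __name__ == "__main__":
    print("Dataset clean" if verify_before_training() else "Corruption detected; halt training")
```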