Vast Data, Weka, and more set out their stalls as AI bottlenecks grow.

(SDxCentral) – If scarcity is a superpower, it seems flash memory has become a superhero of sorts in the AI conversation. But as with all cinematic comic-book sagas these days, there is the specter of repetition, and recycling old content is more problematic than it appears on the surface.
AI storage darling Vast Data arguably established the franchise last month when it launched Amplify, a flash reclamation program to help its customers through the current AI-driven memory shortage.
As exclusively revealed by SDxCentral, Vast claimed the move would help clients navigate the current “dry year” for solid-state drive (SSD) supply. Many of those clients, it said, have only been grappling with the problem since the end of last year, when memory prices began to spike alongside the scaling surge of AI use cases.
Amplify was followed promptly by storage vendor Vdura launching what it called a “Flash Relief Program,” asking customers to submit any configuration from larger rivals Vast and Weka, and promising to beat the price by 50% while matching or exceeding specifications.
That release came off the back of the firm’s Flash Volatility Index (FVI) calculator, which tracks quarterly pricing and models different architecture scenarios. According to Vdura’s data, pricing for 30-terabyte triple-level cell (TLC) enterprise SSDs increased 257% between the second quarter of last year and the first quarter of this year, rising from $3,062 to $10,950. The cost multiple between SSD and hard disk drive (HDD) capacity expanded from 6.2-times to 16.4-times over the same period, which also saw HDD pricing spike 35% and dynamic random-access memory (DRAM) costs rise 205%.
Erik Salo, Vdura’s SVP of business operations, told SDxCentral that its rivals are looking for new solutions in a tough sales market, saying bluntly: “I’ll tell you if I was an all-flash array vendor right now, I don’t know what I’d do, because nobody can afford to buy my product.”
Breaking it down further: for a 25-petabyte deployment delivering 1,000 GB/s of sustained performance, an all-flash architecture carried an annualized cost of $8.5 million at Q2 2025 pricing, according to Vdura. By Q1 2026, the same configuration had risen to $24.54 million, a 189% increase driven mostly by flash media pricing.
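For readers who want to check the math, the percentage increases above follow directly from the quoted figures; the sketch below simply reproduces them (the input prices are Vdura’s own and are not independently verified):

```python
# Reproducing the percentage changes quoted from Vdura's Flash Volatility Index.
# All input figures come from the article and are Vdura's, not independently verified.

def pct_increase(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100

# 30 TB TLC enterprise SSD, Q2 2025 -> Q1 2026
print(f"SSD price rise: {pct_increase(3_062, 10_950):.0f}%")        # ~258%, quoted as 257%

# 25 PB / 1,000 GB/s all-flash deployment, annualized cost
print(f"All-flash cost rise: {pct_increase(8.5e6, 24.54e6):.0f}%")  # ~189%

# SSD-to-HDD cost multiple widening from 6.2x to 16.4x
print(f"Cost-multiple expansion: {16.4 / 6.2:.1f}x")                # ~2.6x wider gap
```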
Vdura’s architecture is a tiered flash-and-disk system, with per-node parallel “engines” that combine one SSD with multiple HDDs. According to Salo, such systems are still susceptible to price increases, but not to the same extent as all-flash architectures.
On the topic of memory recycling, Salo described the idea of using old flash in new systems as an “awful idea.”
“Flash wears out, right? Those bit cells wear out really fast. And, you know, inside a flash drive, they actually, when you buy it new, they’ve given you a whole big bunch of that which is not shown to you, because they’re using it for the failures. If you have … half a dozen flash drives fail, your data is gone, right? I’ll tell you, I would not use old flash,” Salo said with equal bluntness.

Weka’s worries
Weka also had its reservations, confirming that while it can support flash reclamation if customers request it, it’s “generally a bad idea” during shortages.
Val Bercovici, the firm’s chief AI officer, said that, from a pure capacity perspective, reclaiming unused storage could have value – but in theory only.
“Here’s the problem: AI customers aren’t buying storage systems to let data sit idle, nor are they using them for backup and archive,” Bercovici explained. “They’re buying storage for continuous AI use cases: training, fine-tuning, and inference, particularly with key-value (KV) cache running on those systems 24/7. No one shuts off their inference servers at night. These are always-on operations.”
Echoing Salo, Bercovici said reclaiming stranded performance capacity requires serious due diligence, as “you need to know whether you’re buying a lemon, a well-maintained used car, or a new car” when reclaiming a drive.
“And that’s not always transparent,” Bercovici continued. “Drives, unless brand new, come with wear. The drive writes-per-day metric for flash TLC and quad-level cell (QLC) drives is key. You have to inspect self-monitoring, analysis, and reporting technology (SMART) data on the non-volatile memory express (NVMe) devices themselves – every single one you’re considering. You also need to understand usage, wear level, and actual capability. Only if it passes all those tests is it appropriate for high-performance AI workloads versus commoditized capacity use cases.”
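As a concrete illustration of the due diligence Bercovici describes, the sketch below pulls per-drive health data from smartctl’s JSON output and derives a realized drive-writes-per-day figure. It is a rough outline, not a vendor tool: the JSON field names reflect recent smartmontools releases and should be checked against the installed version, and the pass/fail threshold is an arbitrary placeholder rather than any qualification standard.

```python
# Rough sketch of per-drive wear inspection using smartctl's JSON output (smartmontools).
# Field names reflect recent smartctl releases; thresholds are illustrative only.
import json
import subprocess

def inspect_nvme(device: str) -> dict:
    out = subprocess.run(
        ["smartctl", "-j", "-a", device], capture_output=True, text=True, check=False
    )
    data = json.loads(out.stdout)
    health = data["nvme_smart_health_information_log"]

    capacity_bytes = data["user_capacity"]["bytes"]
    hours = data["power_on_time"]["hours"]
    # NVMe reports data units written in thousands of 512-byte units.
    bytes_written = health["data_units_written"] * 512_000
    days_in_service = max(hours / 24, 1)
    dwpd = (bytes_written / capacity_bytes) / days_in_service  # realized drive writes per day

    return {
        "device": device,
        "percentage_used": health["percentage_used"],  # vendor endurance estimate, 0-100+
        "media_errors": health["media_errors"],
        "realized_dwpd": round(dwpd, 3),
    }

if __name__ == "__main__":
    report = inspect_nvme("/dev/nvme0")
    # Illustrative gate: treat heavily worn or error-prone drives as capacity-only candidates.
    suspect = report["percentage_used"] > 80 or report["media_errors"] > 0
    print(report, "-> capacity-only" if suspect else "-> performance candidate")
```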
For Weka, flash reclamation addresses the wrong problem: the answer isn’t squeezing more capacity from aging drives, but eliminating the need for separate storage infrastructure entirely.
“Reclamation from heterogeneous, opportunistically-sourced drives introduces performance variability that undermines graphics processing unit (GPU) utilization,” Bercovici added. “When customers mix drive ages, generations, and manufacturers, premature NAND flash block wear-out risk rises (i.e., drives can’t write data anymore), and consistent performance becomes impossible – creating exactly the storage bottleneck that keeps GPUs at 30 to 50% utilization. Reclamation may recover capacity, but it can’t deliver the microsecond-latency KV cache access that prevents inference workloads from wasting cycles costing millions of dollars daily at scale, on token recomputation.”
In other words, he argued, the question isn’t whether a vendor can run drive reclamation programs, but whether “they can deliver the needed memory-class performance, extend GPU memory capacity by 1,000x, and eliminate hardware dependencies when every component is constrained. What matters is proven production capability at scale, regardless of company size.”
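To make the recomputation point concrete, here is a toy sketch of prefix KV caching: if the key-value states for a shared prompt prefix can be fetched from a cache, the inference server skips re-running the prefill pass over those tokens. The store, function names, and interfaces are invented for illustration and are not Weka’s (or anyone’s) actual API.

```python
# Toy illustration of prefix KV caching for inference. The store, the cost model,
# and the function names are hypothetical; the point is only that a cache hit
# replaces recomputation over the prompt prefix with a single fetch.
from hashlib import sha256

kv_store: dict[str, bytes] = {}  # stand-in for a fast external KV-cache tier

def prefix_key(tokens: list[int]) -> str:
    return sha256(str(tokens).encode("utf-8")).hexdigest()

def compute_kv(tokens: list[int]) -> bytes:
    # Placeholder for the expensive prefill pass over the prompt prefix.
    return bytes(len(tokens))

def get_kv(tokens: list[int]) -> bytes:
    key = prefix_key(tokens)
    if key in kv_store:              # cache hit: one fetch, no recompute
        return kv_store[key]
    kv = compute_kv(tokens)          # cache miss: recompute attention states, then cache
    kv_store[key] = kv
    return kv
```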
The Vast view
In response, Phil Manez, VP for go-to-market execution at Vast Data, similarly argued that flash obviously has limits, meaning vendors must “manage those limits appropriately through explicitly governed and predictable architecture and software control, or mask inefficiency with additional tiers and hardware complexity that introduce latency, variability, and GPU underutilization.”
Most environments, Manez explained, consume flash far faster than the workloads themselves require, driven by replication, rewrite amplification, and fragmented data pipelines.
“That inefficiency is now being exposed by AI workloads that increasingly depend on storing and reusing large volumes of context data on flash to deliver real-time experiences without constantly re-computing at the GPU layer,” Manez said.
The Vast approach centers on an estate intelligence check: when flash is deployed within an architecture that explicitly measures endurance, maintains continuous media visibility, and structurally reduces write pressure, SSDs exhibit highly predictable service lives, even under sustained AI workloads.
Powered by Vast’s Disaggregated Shared-Everything (DASE) architecture, installed flash is converted into what Vast calls a “unified, globally accessible pool.” Continuous, global data reduction across the namespace helps eliminate redundant patterns wherever they occur, with capacity and metadata visible at the platform layer rather than hidden in siloed volumes.
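As a generic illustration of what global reduction across a namespace means in practice, the toy below stores identical content blocks once, no matter where in the pool they are written. It is a textbook content-addressed deduplication sketch, not a description of Vast’s actual DASE implementation.

```python
# Toy content-addressed deduplication: identical chunks anywhere in the namespace
# map to a single stored copy. Chunk size and hashing choices are illustrative.
from hashlib import sha256

class DedupPool:
    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks: dict[str, bytes] = {}      # fingerprint -> stored chunk
        self.files: dict[str, list[str]] = {}   # path -> ordered fingerprints

    def write(self, path: str, data: bytes) -> None:
        refs = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)   # store each unique chunk only once
            refs.append(fp)
        self.files[path] = refs

    def physical_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())

pool = DedupPool()
pool.write("/a", b"x" * 8192)
pool.write("/b", b"x" * 8192)     # duplicate content consumes no extra physical space
print(pool.physical_bytes())       # 4096, not 16384
```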
“Taking into account the wear from previous deployments, reclaimed drives with known health characteristics can operate as predictably as, or more conservatively than, new SSDs deployed into legacy, write-amplifying architectures,” Manez said.
Bercovici, meanwhile, touted Weka’s software-defined architecture for eliminating storage procurement dependency. When customers provision GPU servers to meet their computational requirements, Weka’s storage software is deployed on those servers to use their local NVMe devices for memory-class performance. This configuration, he claimed, removes the need for dedicated storage infrastructure and the additional hardware that introduces latency and extends deployment timeframes. Beyond shorter deployments, the executive claimed Weka can support up to 10-times the typical user capacity and sustain GPU utilization rates exceeding 90%.
Marketing marvels?
The various approaches in the market to tackling the memory scourge show that flash reclamation is but one solution, confirmed Max Smolaks, research analyst at Uptime Institute Intelligence. But the researcher saw most tools as pairing traditional copy-data management techniques with a new marketing message.
“While they are effective at reducing storage footprints, many IT departments will find they have data reduction capabilities available within their existing storage software. If those are not being used, now would be a good time to start,” Smolaks said. “Storage tiering, which combines the capacity of HDDs with the speed of SSDs at a moderate performance penalty, is another time-tested approach that is likely to see more use.”
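The tiering Smolaks refers to can be as simple as an access-recency policy. The sketch below is a generic illustration with arbitrary thresholds, not any particular vendor’s placement logic.

```python
# Generic hot/cold tiering sketch: recently accessed data stays on flash,
# cold data migrates to HDD. The one-week threshold is an arbitrary example.
import time

FLASH, HDD = "flash", "hdd"
COLD_AFTER_SECONDS = 7 * 24 * 3600   # data untouched for a week is treated as cold

def choose_tier(last_access_ts: float, now: float | None = None) -> str:
    """Place recently accessed data on flash, cold data on HDD."""
    now = time.time() if now is None else now
    return HDD if (now - last_access_ts) > COLD_AFTER_SECONDS else FLASH

# Example: an object untouched for 30 days lands on the HDD tier.
print(choose_tier(time.time() - 30 * 24 * 3600))   # -> "hdd"
```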
Smolaks added that upgrades to storage capacity are hard to delay, and that in applications where performance is not a priority, SSDs can be replaced with traditional hard drives.
“Those are going up in price, but not as quickly,” he said, adding that enterprises have an array of options to save the day when dealing with the DRAM shortage.
“Not all workloads are affected by DRAM to the same degree, and modules can be easily moved between servers; some will be able to sacrifice DRAM capacity or performance without a noticeable impact on the workload,” Smolaks said. “In some cases, IT departments might choose to limit hardware refreshes to central processing units (CPUs), rather than entire servers – keeping DRAM modules in place. Yet another option is to deploy modern servers with minimal amounts of DRAM and upgrade them later.”
Returning to Vdura’s FVI data, the report says hybrid flash-and-disk storage architectures, which decouple performance from capacity by combining flash and HDD tiers, experienced significantly lower cost escalation over the same period; architectural flexibility, it argues, can reduce exposure to sudden flash market price rises while maintaining required performance levels.
The FVI report also illustrates compounding cost pressures across the infrastructure stack. DRAM pricing increased 205% over the same Q2 2025 to Q1 2026 period, driven by demand for memory-intensive GPU systems, while high-speed networking components face similar constraints, further inflating total system costs for designs that require higher node counts to meet performance objectives.
According to the FVI report, multi-year hyperscaler purchasing agreements have locked up a significant portion of global SSD manufacturing capacity through 2026. At the same time, large-scale AI infrastructure deployments continue to absorb remaining supply. Industry outlooks suggest pricing pressure may persist into 2027 and beyond.
“After more than a decade of relatively stable NAND pricing, the rules have changed,” said Salo. “Infrastructure leaders need real data to understand what’s happening and plan accordingly. That’s what we’re providing.”