Purely Misleading, Carbon Smart Storage

How we build systems matters! It impacts cost, performance, capacity, environmental footprint, and more. Building systems requires a diverse perspective and consideration from multiple angles.

Pure Storage recently published a blog claiming their DirectFlash Module (DFM) technology is more environmentally friendly than standard SSDs and HDDs. As a flash array vendor, Pure has a clearly biased view. Limited to one type of offering, its perspective lacks the flexibility needed to innovate and improve environmental impact.

When more accurate metrics are considered across the build, use and retirement phases, a Hybrid (HDD & SSD) or HDD based system can come out significantly ahead of an All-Flash System. In fact, my analysis shows that a modern HDD system could be 60% more favorable at launch in terms of CO2e emissions. When a more realistic failure rate is factored in, the edge is maintained at 54%. This is with data available today, and the expectation is that 30TB HDDs will have an even wider margin.

At VDURA, we prioritize flexible architectures that meet our customers’ needs and the world’s demands. Our hybrid architecture allows us to adjust the mix of HDDs and SSDs within the system, providing tailored solutions for optimal performance, sustainability and more.

System Construction – The build phase

The first area to consider is simply how a system is constructed and the components that go into it. The Pure blog starts with a simple 4PB solution built out of HDDs, SSDs or their DFM technology. The data sources it uses are only broad generalizations, which lead to the conclusion that an All-Flash company would want you to see.

Building Block Selection

HDD manufacturers have been continually pushing the boundary of what is possible and squeezing more and more bits into a device. The Pure blog calls out a few different drive types but uses them inconsistently throughout. For this analysis let’s consider a 22TB HDD and then project out what a 30TB HDD system would look like.

Once you determine what sort of devices you want to use in building your system, the next thing to look at is what they are placed in. In the current environment there are many options, ranging from low density 2U12s to high density enclosures that go all the way up to 4U108. For this analysis let’s consider something that is available to all: a dense system in the 4U78 form factor.

System Analysis

When you take our perspective of building a system with denser drives and denser enclosures, you can achieve a significant reduction in overall rack space. A 22TB based system comes out to be 14U, which is 56% more dense than the original Pure perspective. If you push this even further with the latest HDD technology from Seagate and use a Mozaic 3+ 30TB drive, you can see a 68% reduction in rack space.
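The enclosure arithmetic behind this can be sketched roughly as follows. This is a minimal illustration, not the calculation used for the figures above: it assumes 4PB means 4,000TB of raw capacity, ignores RAID or erasure-coding overhead, and the function name and the `overhead_u` parameter are hypothetical.

```python
import math

def enclosure_rack_units(capacity_tb, drive_tb, drives_per_enclosure=78,
                         enclosure_height_u=4, overhead_u=0):
    """Return (drive_count, enclosure_count, rack_units) for a raw capacity target.

    Assumptions (not from the post): raw decimal capacity, no data-protection
    overhead, and overhead_u as a placeholder for any non-enclosure equipment.
    """
    drives = math.ceil(capacity_tb / drive_tb)
    enclosures = math.ceil(drives / drives_per_enclosure)
    rack_units = enclosures * enclosure_height_u + overhead_u
    return drives, enclosures, rack_units

for drive_tb in (22, 30):
    drives, enclosures, units = enclosure_rack_units(4000, drive_tb)
    print(f"{drive_tb}TB: {drives} drives, {enclosures} x 4U78, {units}U")
```

Counting drive enclosures alone gives 12U for 22TB drives and 8U for 30TB drives. The 14U quoted above would correspond to about 2U of additional equipment (`overhead_u=2`); that overhead value is a guess here, not a figure stated in the analysis.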

While I mentioned above that Hybrid based systems win out, you may have noticed that I have not included SSDs in this analysis. We tend to target a ratio of 10% SSD, but in this assessment it has been left out to give a more direct comparison. In future blogs we will dive more into the power of a Hybrid based system.

Assessing the carbon footprint

To understand the carbon footprint of a system like this, we first need to understand where Pure’s numbers came from and then look at what information is available from the HDD manufacturers to arrive at a more accurate assessment of those systems.

The first step in creating a reasonable comparison with a 22TB HDD was to identify the source data. This report from Western Digital (WD) provides the basis, specifically referencing the following metrics for Tons of CO2 per Petabyte (PB):

Using the most recent reported figure of 1.2 Tons per PB, we can work out a rate of 1.2 kg CO2e per TB. Projecting that forward to a 12 TB device, we get 14.4 kg of CO2e. However, this calculation assumes a constant CO2e rate as the device scales up, which may not hold true over time due to improvements in manufacturing efficiency.
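The unit conversion above can be written out as a short sketch. It assumes metric tons and decimal storage units (1 PB = 1,000 TB); the function names are illustrative only.

```python
KG_PER_TON = 1000  # assumption: metric tons
TB_PER_PB = 1000   # assumption: decimal storage units

def kg_per_tb(tons_per_pb):
    """Convert a tons-CO2e-per-PB figure into kg CO2e per TB."""
    return tons_per_pb * KG_PER_TON / TB_PER_PB

def device_kg(tons_per_pb, device_tb):
    """Estimate per-device kg CO2e, assuming a constant per-TB rate."""
    return kg_per_tb(tons_per_pb) * device_tb

print(round(kg_per_tb(1.2), 2))      # per-TB rate
print(round(device_kg(1.2, 12), 1))  # 12 TB device
```

Because tons-to-kg and PB-to-TB both scale by 1,000, the per-TB figure in kg is numerically the same as the per-PB figure in tons, which is why 1.2 Tons per PB becomes 1.2 kg per TB.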

This is a very broad generalization of the carbon cost of a device. It fails to highlight that HDD manufacturers don’t sell one capacity point; they sell many different capacity points. This means that a 12TB and a 22TB device may have very different measurements, as there are normally multi-generation improvements between them.

For this analysis we should consider a less generalized estimate of the CO2e cost of a device.

To estimate the CO2e for a 22TB device, we considered additional sources to understand generational improvements in emissions. Seagate publishes similar data on a per-generation basis, available at Seagate’s Product Sustainability Page.

Simply browsing the various generations of drives in the above-mentioned source, we can see that every generation of HDD produced improves on the previous one in terms of CO2e impact. In fact, a 22TB HDD is 40% lower in CO2e emissions than a 12TB HDD on a per TB basis. When a 30TB HDD is considered, my estimate is that it could be up to 50% lower in terms of CO2e emissions than a 12TB HDD. Significant improvements are continually being made!

With this concept in place, we can take the over-generalized values that Pure created and apply a scaling factor to them so that more modern drives can be used in the calculation. With this we see that a 22TB system could be as much as 67% lower in CO2e emissions than a Pure based DFM system at initial deployment. A 30TB based system could be almost 73% lower!
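A minimal sketch of the scaling idea, at the device level only. It assumes the 1.2 kg CO2e per TB figure is treated as the 12TB baseline and that the per-TB reductions quoted above (40% for 22TB, up to 50% for 30TB) are applied directly to it. It does not reproduce the comparison against DFM, since those inputs are not given here, and the names are hypothetical.

```python
BASELINE_KG_PER_TB = 1.2  # assumption: per-TB figure treated as the 12TB baseline

def scaled_device_kg(device_tb, reduction_vs_baseline):
    """Per-device kg CO2e after applying a per-TB generational reduction."""
    per_tb = BASELINE_KG_PER_TB * (1 - reduction_vs_baseline)
    return per_tb * device_tb

# (capacity in TB, per-TB reduction relative to the 12TB baseline)
for device_tb, reduction in ((12, 0.0), (22, 0.40), (30, 0.50)):
    kg = scaled_device_kg(device_tb, reduction)
    print(f"{device_tb}TB: {kg:.2f} kg CO2e per device")
```

Under these assumptions a 22TB drive comes out at about 15.84 kg and a 30TB drive at about 18 kg, only modestly above the 14.4 kg estimated for a 12TB drive despite holding far more data.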

Reliability

Another aspect to consider is the overall reliability of HDDs vs SSDs vs Pure’s DFMs. While I agree that the data cited from Backblaze shows HDDs failing at a higher rate, what I don’t agree with is the assumption that after 5 years all the devices need to be changed out. Rarely, if ever, has anyone changed all the drives of a system after 5 years. Could you even imagine how you would do this or what would compel you to think you need to?

So how do we consider the effect that time has on a system? How do we understand the impact of replacing failed devices? First, let’s assume the Backblaze reported number of 1.41% for the HDD annualized failure rate (AFR). Assume that this covers the first 5 years of the device’s life. At that point, a decision needs to be made about the failure rate of a drive over the next 5 years. While the data is a bit fuzzy here, the Backblaze report does include HDDs that have over 5 years of life. In fact, the oldest HDD reported was an 8TB Seagate device, which was 104 months old and had a 0.68% failure rate. The data itself shows a range of ages and AFRs.

For my modeling, I assume that after 5 years the AFR increases greatly and continues to rise until the 10-year mark. My model uses 5% for the first year after that point and increases by 1% for each year after. With this model, a more reasonable estimate of the total devices used can be calculated. My calculations show that you would only replace 78 HDDs in the 22TB model. When you factor this into the overall CO2e impact, the HDD based system maintains a significant edge with a 54% improvement in emissions, and a 30TB HDD system is projected to have a margin of 62%.
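The replacement model can be approximated with a short sketch. The details are assumptions, not the exact model behind the figures above: 1.41% is applied in each of the first 5 years, year 6 starts at 5% and rises by one percentage point per year through year 10, the population is held constant at 182 drives (4,000TB divided by 22TB, rounded up), and the function name is hypothetical.

```python
def expected_replacements(drive_count, early_afr=0.0141, early_years=5,
                          late_start_afr=0.05, late_step=0.01, total_years=10):
    """Sum the expected number of failed drives over the system's life.

    Assumes every slot always holds a drive and that each year's AFR
    applies to the whole population, including replacement drives.
    """
    total = 0.0
    for year in range(1, total_years + 1):
        if year <= early_years:
            afr = early_afr
        else:
            # Year 6 uses late_start_afr; each later year adds late_step.
            afr = late_start_afr + late_step * (year - early_years - 1)
        total += drive_count * afr
    return total

print(round(expected_replacements(182), 1))
```

With these inputs the sketch gives roughly 76.5 expected replacements over 10 years, in the same range as the 78 quoted above. The small difference likely comes from inputs this text does not specify, such as the exact drive count or rounding.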

Conclusion

As you can see, when you over-generalize, the conclusions are not very realistic. By applying a bit more diligence and modeling, we see that HDD based systems are superior to all-flash systems. My analysis shows that a modern HDD based system, with dense drives and dense enclosures, starts at 54% better in terms of CO2e emissions and moves to over 60% as the newest 30TB HDDs come to market. All of this is without even considering the price delta between these sorts of systems, but that is a topic for another blog.

At VDURA, we understand the importance of broadening our perspectives when considering total cost of ownership (TCO) and economies of scale (EoS). These concepts provide valuable frameworks for evaluating the complexities of building sustainable systems. While carbon emissions and their equivalents are crucial, there’s much more to ensure our systems contribute to keeping this beautiful blue planet a wonderful place for us and future generations.

 


 

Written By: Michael Barrell

About the Author:

Michael Barrell

Michael Barrell is the Senior Systems Architect and Platform Lead for VDURA. With over 20 years of experience in data storage and management, Michael specializes in developing advanced storage technologies and solving complex performance issues. His work has led to several innovative solutions and patents, significantly impacting the field of storage systems. Michael’s leadership and technical expertise continue to drive VDURA’s cutting-edge platform development.