Non-Disruptive Upgrades: The Pure Storage Ship of Theseus
- By Don Poorman, Pure Storage
- October 15, 2024
The Ship of Theseus Paradox asks the question: If every part of a ship is replaced over time, is it still the same ship? Establishing the identity of anything is tricky because physical presence is not necessarily tied to what something represents. Material construction and identity are two separate variables.
Could the same be true for FlashArray™? It was designed so that its controllers and all other components can be upgraded non-disruptively, in place, over time, without taking the array offline. One Pure Storage customer has done exactly this with the FlashArray FA-420 they purchased in 2013, upgrading its controllers, bus architecture, and DirectFlash® Modules (DFMs) until it became a FlashArray//X70 R3.
This raises the question: Is this customer’s current FlashArray//X70 still the original array, or something completely different?
Non-disruptive Upgrade (NDU) technical benefits
The technical benefits of a Pure Storage NDU are tremendous. First, the array becomes faster and better in place; no data has to be moved across the network to a separate array. This eliminates the complicated and time-consuming process of planning and executing multiple workload outages while data is moved and then cut over to the new array. Most modern arrays hold several hundred terabytes of data, so copying all of it between arrays over the network, or reloading it from backups or replicas, can take weeks or even months.
The second benefit is lower operational risk, thanks to the in-place upgrade of stateless controllers. The process is truly non-disruptive, so it can be executed during production hours. For those who prefer a maintenance window, the streamlined process practically ensures the upgrade fits within a standard one.
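To put the copy times from the first benefit in perspective, here is a rough back-of-the-envelope sketch. The capacities, link speeds, and the `efficiency` factor (the share of line rate a migration actually sustains once throttling and protocol overhead are counted) are illustrative assumptions, not measured figures, and the function name is hypothetical.

```python
def copy_days(capacity_tb: float, link_gbps: float, efficiency: float) -> float:
    """Days needed to copy capacity_tb (decimal terabytes) over a link_gbps link.

    efficiency is the assumed fraction of line rate actually sustained.
    """
    if capacity_tb < 0 or link_gbps <= 0:
        raise ValueError("capacity must be >= 0 and link speed must be > 0")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = capacity_tb * 1e12 * 8                     # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)   # bits / (bits per second)
    return seconds / 86_400                           # seconds -> days


if __name__ == "__main__":
    # Every scenario below is an assumption chosen for illustration.
    scenarios = [(300, 10, 0.5), (300, 10, 0.1), (500, 1, 0.5)]
    for tb, gbps, eff in scenarios:
        days = copy_days(tb, gbps, eff)
        print(f"{tb} TB over {gbps} Gb/s at {eff:.0%} of line rate: {days:.1f} days")
```

Under these assumptions the raw copy alone takes roughly 5.6, 27.8, and 92.6 days, before any planning, cutover, or validation time is added.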
NDU financial benefits
FlashArray non-disruptive upgrades can also provide incredible financial benefits. Because upgrades are accomplished in place, there is no need to pay for professional services to move data between old and new arrays, or for vendor maintenance during that process. This can free up funding to be applied to another function in the data center.
NDUs can also carry a potential tax benefit for some organizations. Some businesses that purchase with capital expenditures may not be able to depreciate a new array until it’s fully operational and supporting all of the workloads of its predecessor. An array upgraded in place keeps serving those workloads throughout, so that waiting period may not apply. This may seem pretty specific, but it could be critical in understanding the bigger picture of enterprise storage technical refreshes.
FlashArray NDUs from 50,000 feet
A non-disruptive upgrade to a storage array’s controllers may seem improbable — that would be true with legacy architectures designed 30-40 years ago. However, that is not the case for FlashArray. When it was designed, component modularity was a core principle that ensured it would be future-proofed for refreshes and upgrades — even for new technologies that hadn’t emerged yet.
For instance, original FlashArray systems used SAS as the array’s backplane. But, in 2017, we began shipping NVMe-based FlashArray systems and allowed customers to non-disruptively upgrade the SAS backplanes on their existing arrays to NVMe.
The same modularity principle is applied to FlashArray’s controllers. Check out this architectural design white paper that gives a great breakdown of how FlashArray and FlashBlade® designs make enterprise storage more agile and the least disruptive part of the data center.
Two design choices enable a non-disruptive controller swap:
- Active/passive controllers (sort of): FlashArray is configured with two storage controllers. Both controllers can receive I/O during normal operations, but the system only processes data through the active one. This architecture may seem half as capable as arrays that use dual active controllers, but that is older thinking from decades ago, when processors, RAM, and bus architectures weren’t as robust as they are today and two controllers were needed to share the array’s load. FlashArray’s design is simpler: make each controller twice as performant, so that either one can carry the array’s full load alone. Performance is then maintained when one controller is unavailable because of component failure, system upgrades, or external causes like network outages.
- Stateless controllers: Legacy storage arrays rely on their controllers to hold operating state and in-flight I/O exclusively onboard, in volatile RAM. That creates a stability problem when a component fails unexpectedly, because that critical data is lost. FlashArray controllers mirror their state and I/O to separate NVMe cache memory, so a replaced controller only needs to reload from that cache.
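The sizing argument in the first point can be illustrated with a toy model. This is a minimal sketch under stated assumptions: the `ArrayDesign` class, the capacity units, and the numbers are all hypothetical and are not Pure Storage measurements. It compares how much of a peak load each design still serves when one of its two controllers is unavailable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayDesign:
    name: str
    controller_capacity: float  # work one controller can do (arbitrary units)
    controllers_serving: int    # controllers processing I/O in normal operation

    def normal_capacity(self) -> float:
        return self.controller_capacity * self.controllers_serving

    def degraded_capacity(self) -> float:
        # With one of the two controllers unavailable, a single controller remains.
        return self.controller_capacity

    def served_fraction(self, load: float) -> float:
        """Fraction of `load` still served while one controller is down."""
        if load <= 0:
            raise ValueError("load must be positive")
        return min(1.0, self.degraded_capacity() / load)


if __name__ == "__main__":
    # Both designs handle the same peak load of 100 units in normal operation.
    dual_active = ArrayDesign("dual active", controller_capacity=50, controllers_serving=2)
    active_passive = ArrayDesign("active/passive", controller_capacity=100, controllers_serving=1)
    for design in (dual_active, active_passive):
        print(design.name, design.normal_capacity(), design.served_fraction(100))
```

In this model the dual-active design serves half of the peak load during an outage, while the active/passive design, with each controller sized for the full load, serves all of it.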
These two characteristics enable an NDU to occur in a few simplified steps:
- Disconnect the older Controller 2 (passive) and connect the new upgraded controller.
- Run some commands to fail over from the older active controller to the new one.
- Disconnect FlashArray Controller 1 (formerly active) and connect the new upgraded controller.
- Run some commands to fail the array over to the newly installed controller.
That’s it! Basically, you use the new controllers to “swing” through the NDU. We call it a “Reverse Headstand.” And, because of FlashArray’s active/passive controller design, performance is unaffected during the entire process, which lasts only one to two hours.
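The four steps above can be walked through with a small simulation. This is only a sketch of the sequence, not Pure Storage tooling: the `Controller` and `Array` classes, the slot names `CT1` and `CT2`, and the `shared_state` list (standing in for the cache that lives outside the controllers) are all hypothetical. The point it illustrates is that only the passive controller is ever pulled, and that writes accepted mid-upgrade survive because state is not held on the controllers.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    model: str  # hypothetical label: "old" or "new"


@dataclass
class Array:
    slots: dict[str, Controller]   # slot name -> installed controller
    active: str                    # slot currently processing I/O
    shared_state: list[str] = field(default_factory=list)  # off-controller cache

    def passive(self) -> str:
        return next(slot for slot in self.slots if slot != self.active)

    def write(self, data: str) -> None:
        # State is kept outside the controllers, so a swap does not lose it.
        self.shared_state.append(data)

    def replace_passive(self, new: Controller) -> None:
        # Only the controller that is not processing I/O is disconnected.
        self.slots[self.passive()] = new

    def fail_over(self) -> None:
        self.active = self.passive()


def upgrade(array: Array) -> list[str]:
    """Run the four steps; return the active controller's model after each."""
    history = []
    for step in range(2):
        array.replace_passive(Controller("new"))   # steps 1 and 3
        array.write(f"io-during-swap-{step}")      # host I/O continues
        history.append(array.slots[array.active].model)
        array.fail_over()                          # steps 2 and 4
        history.append(array.slots[array.active].model)
    return history


if __name__ == "__main__":
    array = Array(slots={"CT1": Controller("old"), "CT2": Controller("old")}, active="CT1")
    print(upgrade(array))        # active model after each of the four steps
    print(array.shared_state)    # writes accepted during the upgrade
```

An active controller exists after every step, which is what the swing through the new controllers relies on. Whether the final failover returns I/O to the first slot, as modeled here, is an assumption of this sketch.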
Get to know more about NDUs
Pure Storage’s non-disruptive upgrades for FlashArray are a much-needed innovation for enterprise storage, and we’ve successfully executed over 11,000 controller upgrades over the last 10 years. Legacy vendors may claim they have a similar capability, but it will involve a lot of advance planning, and array performance will be affected during the process.
The original article is here.
The views and opinions expressed in this article are those of the author and do not necessarily reflect those of CDOTrends.
Don Poorman, Pure Storage
Don Poorman is the senior technical evangelist at Pure Storage. He is a conflict-decorated USAF veteran with 30 years’ experience in the design, implementation, and support of information technology systems. He has worked on projects that affected thousands of end users, including high-profile accounts such as the White House Executive Residence, several global service providers, and many US Federal agencies.