Improving Higher Availability
Regardless of the organization size, every one of our clients is continually assessing ways to make their IT environment more highly available. Depending on budgets, the level of availability considered can vary widely. But, whatever the budget, improving availability is an important endeavor for all organizations and the right approach is usually multi-faceted.
To be clear, I am not talking about backup/recovery or keeping data offsite in case of disaster. Rather, I’m focusing here on ways to keep systems up and maintain continuous operation even when a part of the infrastructure fails.
This week, I want to list a few of the more common ways to achieve higher availability and then focus on one that leverages an industry standard. HPE mid-range storage arrays provide an affordable and easy way to deploy solutions for high availability.
Base Level of Availability
The base level of availability for all our clients begins with solutions that provide no single point of failure in the primary data center. In many cases, this is as simple as ensuring that all storage is protected by some level of RAID and that there are multiple paths to the network and to storage. This level of availability also means that all hardware components have redundant fans and power supplies and that they are connected to redundant power distribution units in the rack.
With so many organizations using colocation services today, power redundancy can affordably be expanded to include multiple power grids and generators. So, you can see that even at the base level, providing high availability has multiple facets.
Usually, the next level of availability is some type of host/OS clustering (Microsoft Clusters, ESXi Clusters, etc.). Again, because the use of colocation services is becoming so prevalent, stretching these clusters across geographic distances is often an affordable consideration. And, since we are talking here about maintaining continuous availability, the latency between sites should be very low to facilitate a stretch cluster.
These types of clusters are often active/active and will serve as failover sites for one another. It is this kind of infrastructure that supports an availability solution from HPE called Peer Persistence.
HPE Peer Persistence
An HPE Peer Persistence solution allows companies to federate storage systems across geographically separated data centers at metropolitan distances. This inter-site federation of storage helps customers use their data centers more effectively by allowing them to support active workloads at both sites. They can move data and applications from one site to another while maintaining application availability, even if one site goes offline completely.
In fact, Peer Persistence allows for planned switchover events where the primary storage is taken offline for maintenance or where the workloads are simply moving permanently to the alternate site. In any event, the failover and failback of the storage is completely transparent to the hosts and the applications running on them.
This capability has been available on HPE 3PAR StoreServ arrays for over five years. And, now, with the latest release of the Nimble OS (5.1), Peer Persistence is supported on the Nimble Platform.
The basis for Peer Persistence is the ALUA standard. ALUA (Asymmetric Logical Unit Access) allows paths to a SCSI device to be marked as having different characteristics. With ALUA, the same LUN can be exported from two arrays simultaneously. Only the paths to the array accepting writes to the volume will be marked as active.
The paths to the secondary side volume (the other array) will be marked as standby. This prevents the host from performing any I/O using those paths. During a non-disruptive switchover of the volume, the standby paths are marked as active, and host traffic to the primary storage array is redirected to the secondary storage array without impact to the hosts.
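To make the path behavior concrete, here is a minimal Python sketch of the idea. It is only an illustration, not how a host multipath driver or the arrays actually implement it. It models just the two states mentioned above (the ALUA standard defines more), and the array names `array-a` and `array-b` and the functions `usable_paths` and `switchover` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class PathState(Enum):
    # Only the two states discussed in this post; real ALUA has more.
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass(frozen=True)
class Path:
    array: str        # hypothetical array name
    state: PathState


def usable_paths(paths):
    """Return the paths the host may send I/O on (active paths only)."""
    return [p for p in paths if p.state is PathState.ACTIVE]


def switchover(paths, new_primary):
    """Mark paths to new_primary as active and every other path as standby."""
    return [
        Path(p.array, PathState.ACTIVE if p.array == new_primary else PathState.STANDBY)
        for p in paths
    ]


# The same LUN is exported from both arrays; array-a accepts writes.
before = [
    Path("array-a", PathState.ACTIVE),
    Path("array-a", PathState.ACTIVE),
    Path("array-b", PathState.STANDBY),
    Path("array-b", PathState.STANDBY),
]

# After a switchover, the host sees the same LUN with the path roles swapped.
after = switchover(before, "array-b")

print([p.array for p in usable_paths(before)])
print([p.array for p in usable_paths(after)])
```

The point of the sketch is that the host never sees a new device. It sees the same LUN with a different set of active paths, which is why the switchover is transparent to applications.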
Whether you use HPE 3PAR StoreServ or Nimble, Peer Persistence requires a few components. First, you must have two arrays that support synchronous replication. In this case, we are talking about either HPE 3PAR StoreServ or Nimble arrays (3PAR requires another 3PAR and Nimble requires another Nimble).
Beyond that, you’ll need:
- Round-trip time (RTT) latency of 5 ms or less between the sites
- Hosts that can support ALUA. Those include:
  - Oracle RAC
  - XEN Server
- A Quorum Witness – This component is software deployed in a third site that receives ongoing status from each array in the Peer Persistence relationship to help define when a failover needs to take place.
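As a rough planning aid, the checklist above can be expressed as a small Python sketch. This is an assumption-laden illustration rather than an HPE tool: the `SitePair` fields and the `unmet_prerequisites` function are hypothetical, and the values would have to come from your own measurements and inventory. Only the 5 ms limit and the same-family rule are taken from the text.

```python
from dataclasses import dataclass

MAX_RTT_MS = 5.0  # round-trip latency limit stated in the list above


@dataclass
class SitePair:
    # All fields are hypothetical inputs gathered by hand or by your own tooling.
    array_family_a: str          # e.g. "3PAR" or "Nimble"
    array_family_b: str
    rtt_ms: float                # measured round-trip time between the sites
    hosts_support_alua: bool
    witness_at_third_site: bool


def unmet_prerequisites(pair):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if pair.array_family_a != pair.array_family_b:
        problems.append("arrays must be of the same family (3PAR with 3PAR, Nimble with Nimble)")
    if pair.rtt_ms > MAX_RTT_MS:
        problems.append(f"RTT of {pair.rtt_ms} ms exceeds the {MAX_RTT_MS} ms limit")
    if not pair.hosts_support_alua:
        problems.append("hosts must support ALUA")
    if not pair.witness_at_third_site:
        problems.append("a Quorum Witness at a third site is required")
    return problems


good = SitePair("Nimble", "Nimble", 3.2, True, True)
bad = SitePair("3PAR", "Nimble", 8.0, True, False)

print(unmet_prerequisites(good))
for problem in unmet_prerequisites(bad):
    print("-", problem)
```

A check like this does not replace the vendor's support matrix, but it keeps the conversation honest about latency and the third-site witness before any hardware is ordered.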
At this point, I had planned on providing a drawing and an explanation of how Peer Persistence utilizes the components mentioned above. However, instead I’m including two videos that do a great job of showing the Peer Persistence solution.
Why Peer Persistence?
If you already use 3PAR or Nimble in your infrastructure, then you should consider this solution to improve your availability. It is a simple way to achieve high availability using a storage solution with which you are already familiar. If you are considering a storage refresh, Peer Persistence is a reason to explore either 3PAR or Nimble as part of your infrastructure.
Zunesis can show you this technology firsthand in our lab. You can also see a case study on our website where we successfully deployed this solution.