Datacenter Disaster Recovery: Who you gonna call?

A flood that damages mission-critical equipment is among the worst scenarios a data center can face, offering little hope for a quick fix. On April 13, 2015, the Zunesis team received a “first responders call” from a long-term public sector customer who had the misfortune to experience a broken water main that left their main data center under 2 feet of standing water.

The call for help came at 4 p.m. on a Thursday afternoon. Zunesis joined forces with HP and immediately responded to assess the damage. By the next morning, it was clear that the data center was living on borrowed time. With 18 racks of equipment compromised, it was imperative that the team immediately instigate a full Data center fail-over. Luckily, a secondary data center had just been completed, along with plans started for a full DR site; but those plans were stalled as the organization waited for additional funding to become available.

A two-fold plan was quickly put into place to swiftly expand the new data center to capacity while keeping the damaged data center up and running. The insurance company was immediately engaged to begin the replacement exercise while Zunesis and HP jumped in to “keep the lights on” in the existing data center. Immediately needed were a large supply of extra drives; power supplies; cables; switches; and, most daunting, an emergency 120TB SAN array to move data to safety.

Zunesis Datacenter Disaster Recovery By Friday morning, emails had been circulated to the highest levels of HPE and Zunesis requesting emergency assistance. Zunesis quickly moved a 3Par 7200 array from its own internal lab to the customer site and expanded an additional 100TB in an emergency drive order to migrate data. HP called in emergency supplies from all over the Americas. Extra drives, power supplies, cables, switches, and even servers arrived throughout the weekend. Within 48 hours, the teams had built up enough reserves to keep the data center live until all of the data could be migrated.
Although all IT organizations know that it’s critical to have a solid DR plan and failover Data center in place, the reality is that the Data center spotlight is on monitoring and compliance; and the loss of a main data center is crippling. The second most common cause of catastrophic failures (after electrical) in the Data center is water leaks. Taking the following three steps can help avoid a flood disaster:

Initial survey to ensure that the location of our data center is not in a flood plain and that the site is well protected from external water sources. Confirm that the fabric of the building is well enough designed and properly maintained to prevent the ingress of rain (even in extreme storm conditions). Check how sources of water inside the building are routed (hot and cold water storage tanks, pipe runs, waste pipes, WCs, as well as water-based fire suppression systems in the office space). Office space above a data center is almost always dangerous from a water ingress perspective.
Protection to ensure that any water entering the data center does not have the opportunity to build up and cause a problem. If you are really worried, install drains and a sump pump under the plenum floor. Ensure that the floor space is sealed and that all cable routes through partition walls are stopped up to be air and water tight.
Monitoring is critical in a data center – we absolutely need to be able to detect water under the plenum floor. Generally, water detection systems use a cable that runs under the floor and causes an alarm to be triggered if it comes into contact with water.

Click to find out more about our Disaster Recovery Assessments.

Datacenter Disaster Recovery: Who you gonna call?

Categories

Search

Blog Categories

Related Resources

Archives

Datacenter Disaster Recovery: Who you gonna call?

Categories

Search

Blog Categories

Related Resources

Archives

Social