In my experience, disaster recovery (DR) and business continuity (BC) are either done very well or very poorly. There doesn't seem to be a middle ground. Most big organizations have redundant data centers, replicated databases, failover servers, and a detailed business continuity plan. Smaller organizations tend to have a few point solutions for DR but do not (or cannot afford to) invest in alternate data centers and do not have failover solutions in place. If we think in terms of the insurance world, major carriers will have effectively managed DR and BC plans and procedures. Agencies, MGAs, and regional carriers probably will struggle to get back on their feet after a real data disaster.
Social networking and the whole Web 2.0 phenomenon have changed the way we do many things. YouTube, Twitter, Facebook, and ubiquitous Web-enabled smart phones now are part of every information worker's life. Maybe we should embrace some of these things when we are planning for disaster recovery.
What We've Got Here Is Failure To Communicate
What is the most important thing we need when recovering from a data disaster? Communications! Strother Martin said it all in Cool Hand Luke. I need to be able to communicate with all my team members no matter where they are. Presumably some will be at the alternate center, others will be at the site of the disaster, and others will be working from home or gathered at a site where they have Internet access. By the way, you need to have a couple of sites scouted and prearranged in case the corporate center or office is unavailable.
So, your team is dispersed, and it needs to interact. Everyone has a cell phone, so they are able to speak with each other. But you know, sometimes an e-mail is 20 times more efficient than a phone conversation. Instant messaging also is effective for technical teams operating remotely. When I work remotely, I always use the corporate IM solution to communicate with fellow workers. I may be logged onto a server from my home office, using IM to tell the techie in the data center what I need him to do. The problem is that now we are in disaster mode. The primary data center is down: we have no corporate e-mail system and no corporate Internet connection. The corporate phone system is voice over IP, so that is down, too. The communications server is gone, so we have no IM. We are left with only cell phones for communication, and they don't even work in the alternate data center.
Tweet Your DR?
When the data center goes down, so do all communications. All members of the team have to be called one at a time to inform them of the situation. Wouldn't it be more efficient to have a corporate DR team Twitter account? A quick tweet, and all members of the DR team could be informed of the problem simultaneously; by prearranged plan, everyone could log onto a Web-based collaboration space and put an action plan into motion. Multiple no-cost solutions exist right now that let dispersed team members meet online. Microsoft, Google, Yahoo, and others provide this type of functionality for free.
There also are no-cost instant-messaging services. You could use Skype or something similar to create a quick online meeting with voice and video (Web cams or online whiteboards). Everyone on the DR team also should have a Google, Hotmail, or Yahoo account. All of these "free" communications tools are available from a smart phone. Everyone on the DR team also should have a data card. Sure, everyone has Internet access at home, but we want to have as many different ways of getting online as possible; if whatever took down your corporate data center was widespread, there is a reasonable expectation the Wi-Fi at the local Starbucks also might be down. All of this can be preplanned and tested. I recommend creating an entire map of preconfigured communications alternatives.
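To make that map concrete, here is a minimal sketch of how a preconfigured communications-fallback list might be kept as a small script on every team member's laptop. The channel names, accounts, and ordering below are hypothetical placeholders; substitute whatever your team has actually prearranged and tested.

```python
# Hypothetical sketch of a preconfigured communications-fallback map.
# Every channel, address, and dependency below is a placeholder for
# whatever your DR team has actually prearranged.

COMM_PLAN = [
    {"channel": "corporate e-mail",   "address": "[email protected]",       "needs": "corporate data center"},
    {"channel": "corporate IM",       "address": "im.corp.example",          "needs": "communications server"},
    {"channel": "DR Twitter account", "address": "@ExampleCorpDR",           "needs": "any Internet access"},
    {"channel": "Gmail fallback",     "address": "[email protected]",  "needs": "any Internet access"},
    {"channel": "Skype conference",   "address": "example.dr.team",          "needs": "any Internet access"},
    {"channel": "cell phone tree",    "address": "see wallet card",          "needs": "cellular voice only"},
]

def next_channels(unavailable):
    """Return the prearranged channels, in order, that do not depend on
    anything in the 'unavailable' set (e.g., the downed data center)."""
    return [c for c in COMM_PLAN if c["needs"] not in unavailable]

if __name__ == "__main__":
    # Example: the primary data center and everything hosted in it are gone.
    down = {"corporate data center", "communications server"}
    for channel in next_channels(down):
        print(f"Try {channel['channel']}: {channel['address']}")
```

Even printed on a wallet card, something this simple beats trying to reconstruct the fallback order in the middle of the disaster itself.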
It's Not Secure!
OK, I understand there may be resistance to using "public" hosted facilities to manage your disaster recovery, although I don't really understand why that is an issue. I guarantee you will be communicating using cellular communications that don't even need to be hacked to be listened to. All you need to do is come up on the appropriate frequency, and all your conversations are there for the listening. Logging onto a Gmail or Skype account at least provides a modicum of security.
On Cloud Nine
There is an alternative. And this is an alternative that should be attractive to those smaller organizations that don't have the ability to maintain multiple data centers. Cloud computing is here to stay. It is a whole different beast than maintaining a hosted data center off site. Cloud computing provides common business applications that are offered as online services by organizations with large distributed infrastructures. Cloud services are delivered by vendors such as Salesforce, Microsoft, Amazon, and others. These firms have geographically distributed infrastructures that are virtually guaranteed to be available in all but major global disasters.
Smaller organizations (and perhaps large ones, too) may not have the staff and resources needed to manage common business applications as well as their line-of-business applications. If I were running an MGA, I would rather have my IT staff dedicated to revenue-producing applications and let someone else manage things such as corporate e-mail, communications servers, CRM, intranets, Internet, and perhaps even productivity suites. Placing these common business applications in the cloud simplifies licensing in addition to reducing the workload on IT staff. It also eliminates the need for a rack of servers that continually need to be updated and maintained.
From the DR/BC perspective, the cloud becomes even more attractive. Even in the teeth of a catastrophic failure of your corporate data center, you still will have access to and use of those common business applications. Your disaster recovery team will be able to devote its full energy to returning the line-of-business applications to functionality, and it will have an entire communications and collaboration suite available on the cloud. Furthermore, since we are concerned with business continuity only for a select group of applications, we can reduce the size of our alternate data center. The total cost of ownership (TCO) for IT applications should be reduced as well: you generally can consume cloud services for fees lower than the TCO of those applications if they were hosted and maintained internally.
Not My Data!
There currently is a significant amount of resistance to Software as a Service and cloud computing in general. Many organizations are uncomfortable relinquishing control of chunks of their corporate data to giants named Google, Microsoft, and Amazon. Search engines are intrusive. "Free" online Web analytics intrude even further. Placing confidential corporate data on the cloud simply is too much for some organizations. I totally understand this perspective, yet I do like the idea of some applications, particularly corporate e-mail and corporate messaging, being provided as a service. Proper training and governance should ensure that data that shouldn't be publicly available under any circumstances is kept off the cloud.
Test Time
If an organization does have a DR plan, chances are pretty good it will periodically test its facilities and procedures. Most often DR tests are planned desktop exercises. The team is gathered together, and the event is kicked off. The scenario is described, the data center is declared offline, certain team members are designated as incapacitated, and the plan goes into effect. Backup data is requested and loaded onto the appropriate servers and databases in the alternate data center. On (very) rare occasions, an application actually is failed over to the DR server. The exercise has been planned for weeks. The procedures were rehearsed, and servers were carefully checked for readiness before the exercise. So, what did the exercise really prove? That given two weeks' notice, we probably can recover from a controlled data center problem?
A Real Test
When I was taking my check flight for a pilot's license, the check pilot reached over and killed the engine when I was just barely airborne. I had to react immediately and lower the nose of the plane to maintain sufficient airspeed. At the same time, I had to reverse the direction of the aircraft while calling the tower to declare an emergency and clear the active runway of all other traffic. I was coming back dead stick and going the wrong direction. Now, that was a real test of how I would react in an emergency. To be fair, I did expect the check pilot would pull some kind of funny business during the flight, but I did not expect it in the first 30 seconds. And I also am certain the check pilot already had notified the tower he would bring us back in immediately after takeoff. Nevertheless, that particular exercise provided some useful output: it "proved" to the check pilot I was reasonably capable of handling an unexpected problem, and it gave me a certain amount of confidence in my abilities.
Perhaps we should do something similar with our disaster recovery training. Every so often, it makes sense to pull a system offline to see how your team reacts to an unscheduled emergency. I am not suggesting you pull your rating or policy admin system offline, but I am suggesting you periodically (say, once a year) and without warning bring down some mission-critical internal system to see how your team handles it. Hopefully, it will come through with flying colors, but if it doesn't, you will have pinpointed flaws in your plan. And that information is invaluable.
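If you want to formalize the unannounced part, a small script run by one authorized person is enough. What follows is a hypothetical sketch, not a recommendation of specific tooling: it assumes a Linux host managed with systemd and a short list of internal services that management has agreed in advance are safe to interrupt. The service names and log file are placeholders.

```python
# Hypothetical drill script: stop one pre-approved internal service,
# unannounced, and record when it went down so detection and recovery
# times can be measured afterward. Assumes a Linux host managed with
# systemd; never point this at the rating or policy admin system.

import random
import subprocess
from datetime import datetime

# Pre-approved, internal-only candidates (placeholders, not real services).
DRILL_CANDIDATES = ["intranet-portal", "report-scheduler", "test-ftp"]

def run_drill(logfile="dr_drill.log"):
    service = random.choice(DRILL_CANDIDATES)
    started = datetime.now().isoformat(timespec="seconds")
    # Stop the service; the DR team is NOT told which one or when.
    subprocess.run(["systemctl", "stop", service], check=True)
    with open(logfile, "a") as log:
        log.write(f"{started} drill started: stopped {service}\n")
    print(f"Drill under way. {service} stopped at {started}. "
          f"Record detection and restore times against {logfile}.")

if __name__ == "__main__":
    run_drill()
```

The only design point that matters is that the stop time is recorded somewhere the responders cannot see it, so detection and recovery times can be measured honestly after the fact.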
Keep an Open Mind
No matter where you come down right now in terms of preparedness for a data disaster, you constantly need to evaluate the tools and options available. Don't be satisfied with a plan that made sense five years ago. Consider alternative ways of doing things. As it happens, my BlackBerry is down this weekend, and I am communicating using my iPod touch: corporate e-mail, Gmail, Skype, IM, and Web browsing. I am not suggesting iPods as an alternative to some of the things we discussed, but I am suggesting you consider other ways of doing things. Free doesn't mean bad; it just means the revenue stream is coming from a different source. And you should feel free to look around and see whether some of the latest "cool" technologies might make sense in the corporate world.
Please address comments, complaints, and suggestions to the author at [email protected]