Banking Industry and Disaster Recovery Planning
By Robert F Bronner, Executive Vice President, SunGard Recovery Services Inc.
Banks were among the earliest adopters of information technology in the business world. They embraced the benefits of computers almost from the birth of the high-tech industry.
Concurrent with the industry's increased reliance on technology has been the birth and evolution of another industry - the disaster recovery industry. The Automated Clearinghouse Association was formed by seven Philadelphia-based banks in the mid-1970s for the sole purpose of confronting the issue of how banks should manage data recovery if their computer systems go down. The formation of this group led to the start of the disaster recovery industry, launched by SunGard Recovery Services in 1978.
In 1983, the federal government stepped in to mandate that banks develop and maintain disaster recovery plans.
These events show why it is no overstatement to say that the banking industry spawned today's multibillion-dollar recovery industry. And the sophisticated needs of the banking community continue to drive the evolution of disaster recovery.
In this article, I explore what the banking industry's specific disaster recovery needs are and explain some state-of-the-art recovery solutions that have emerged as a direct outgrowth of those needs.
Planning is Critical
A proactive approach is critical to banks. Planning is vital to disaster recovery because the primary objective is to avoid problems before they occur.
The importance of this planning became painfully evident during the World Trade Center bombing in 1993. Two-thirds of the companies located in the center were caught without a solid recovery plan. This fact illustrates that despite increased consciousness of the need for disaster planning, plenty of room for improvement exists.
Most of the 170 disaster recoveries that SunGard has supported since 1978 have taken place in the last 10 years. Of those recoveries, 45 were for banks. One of our most recent banking industry recoveries took place last spring in Grand Forks, N.D., when the Grand Forks Community National Bank found its data center under water as a result of massive flooding that struck the region.
We deployed our mobile data center within 48 hours of receiving the disaster declaration from the bank, and bank employees ran computer operations out of the SunGard data center for the next two months, until their own data center was operational again.
Banks - Greater Exposure at Time of Disaster
Disaster recovery is of particular importance for the banks in a locality hit by crisis - more so than other businesses - because their services are in great demand during times of community disaster. The average bank is multi-platformed, with multiple locations and varied operations and computer applications.
Mergers and acquisitions, along with increasingly sophisticated technology, have complicated banks' situation. Through mergers and acquisitions, banks have inherited more varied applications. Typically, banks run 20 to 30 critical applications simultaneously, and when organizations merge or are acquired, this number may double. Also, many banks' operations are becoming decentralized as financial institutions expand their reach beyond the back office into satellite locations. Yet banks also continue to rely heavily on paper, especially at the branch level.
What happens to these decentralized operations and manifold applications if a bank experiences a disaster? What happens to the many paper transactions in branches that have not entered the central system? Whether it's the World Trade Center bombing, Midwestern floods or simply a local crisis, the reality remains the same - disasters can disrupt critical business operations significantly for weeks and sometimes months. Thorough preparation can shorten recovery time dramatically and keep banking operations ongoing.
As most bank security officers know, banks must think first about their employees when developing disaster recovery plans.
In the wake of a disaster, a bank's employees' first and foremost concern will be the safety of their families and personal property. Once needs surrounding those two areas are accommodated, employees will take care of their employer and its customers. For the employer, accommodating those needs may mean providing essentials such as food, shelter, and medical assistance, as well as counseling and information on recovery efforts.
Beyond the Data Center
Business recovery operations are no longer aimed only at what is inside the data center or "glass house." Recovery is a corporate-wide undertaking. As bank operations become decentralized, regional operations centers and satellite offices face greater exposure. Bank work groups such as customer service employees serving clients by telephone from remote locations must be part of the overall disaster recovery plan. The staples of virtually every work group environment, including those in remote locations, need to be accounted for in the recovery plan: office space, personal computers, telephones, automatic call distribution systems, and other critical office equipment.
Business recovery has moved beyond recovering computer systems to restoring and recreating business processes. The outage of whole departments needs to be considered, and how work and information will flow from one place to another or from department to department should be studied. These immediate issues need to be included in every disaster recovery plan.
Buy-In from Senior Management
When business continuity is at stake, delegation does not work. Senior management needs to be educated about disaster recovery so that they support the process. If senior management is not committed to a disaster recovery process, chances for success are poor.
Location - Central to a Recovery Plan
One of the first steps in addressing recovery once senior management has committed to a plan is addressing where employees will relocate in time of disaster. Banks often have the advantage of access to other facilities, including their own branches and regional operations centers, as well as alternate space that can be procured through their real estate departments.
A bank's recovery plan should include geographically independent relocation sites for each work group. Data center professionals may work in an urban area and be more willing to travel or relocate. The average work group employee in branch and remote locations, however, may commute within an hour's radius of home and be less willing to travel. Recovery locations should be planned both for the data center environment and satellite locations.
An alternative available to banks is to subscribe to a recovery service company, which can offer alternate spaces and equipment. Some disaster recovery companies also have a "quick ship" type of program that allows them to ship personal computers and related equipment to a designated recovery site within 48 hours of the declared disaster.
Disaster recovery plans should take into consideration any outsourced functions. Many banks outsource data processing or specialized applications such as trust accounting, credit card operations, and automated teller machine applications.
These functions need to be defined and addressed in a disaster recovery plan. Often, outsourced functions provide a false sense of security because management assumes the contractor will take care of every aspect of the product they provide - including recovery planning. Too frequently, this is not the case at all. In 1989, the federal government extended its disaster recovery regulations to require contingency plans from service bureaus or consultants that serve banks. However, disaster recovery often is not addressed in the contract with the consultant.
Testing is Critical
The final component of a successful recovery planning effort is testing. A great plan on paper is no guarantee for a great plan in action - a bank's recovery plan is only as good as the results of coordinated tests. This testing confirms readiness at all levels in the event of a disaster. SunGard's experience has shown us that most problems that arise during disasters occur in areas that have not been tested.
The frequency of testing depends on your bank's size and rate of organizational change. Smaller banks may only test annually; larger banks might perform exercises two or three times a year or stretch an annual test over several days.
A complete test should include all phases of the recovery process, including one area overlooked by many banks - relocation back to internal operations. Organizations often address steps that should occur when a crisis hits, and what will happen as business processes relocate, but they stop there. They need to have documented procedures about what will happen during the stage when processes go back into the bank. This carries its own set of issues, so it's critical that banks test this phase - even if they only do it on paper.
The true test, though, comes during a real disaster. During actual events as well as during simulations, banks should carefully document recovery efforts, evaluate results, and refine plans accordingly.
Electronic Vaulting - A State-of-the-Art Recovery Solution
When evaluating recovery solutions for banks, the two most important factors that weigh on banks' decisions are:
The recovery window - how long will it take to get up and running if a disaster occurs?
Capturing the point of failure - what would be lost from the point of failure to the point of recovery?
Cost, risk, and time are key considerations when finding solutions. How much money is an organization willing to invest in a recovery solution to minimize risk and time to recover should it be faced with a computer shutdown?
Banks, because they have so much at stake, typically are willing to invest in the most advanced, state-of-the-art recovery solutions to minimize their risk and reduce the recovery window. While some businesses could afford a 24- to 48-hour computer shutdown, that length of recovery window is horrific for a bank and a worst-case scenario. If a bank loses its funds transfer function for a day or two, it could be very damaging to business. Banks tend to run many critical applications simultaneously, so recapturing data lost at the point of failure as quickly as possible is crucial.
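As a rough illustration of the two factors above, both the recovery window and the data lost at the point of failure can be computed from three timestamps. The sketch below is in Python. The function name, the timeline, and the assumption of a single nightly backup are all hypothetical and are not drawn from any bank's actual plan.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_backup, failure, restored):
    """Return (data_loss_window, recovery_window) as timedeltas.

    All three arguments are hypothetical timestamps chosen for illustration.
    """
    if not (last_backup <= failure <= restored):
        raise ValueError("expected last_backup <= failure <= restored")
    # Work performed after the last off-site copy is what must be recaptured.
    data_loss_window = failure - last_backup
    # Time between the failure and the resumption of processing.
    recovery_window = restored - failure
    return data_loss_window, recovery_window


# Assumed timeline: nightly backup, mid-afternoon failure,
# processing resumed at a recovery site the following evening.
last_backup = datetime(1997, 11, 3, 23, 0)
failure = datetime(1997, 11, 4, 15, 30)
restored = datetime(1997, 11, 5, 19, 30)

loss, window = recovery_metrics(last_backup, failure, restored)
print("Data to recapture:", loss)
print("Recovery window:  ", window)
```

In this made-up timeline the bank would have 16.5 hours of transactions to recapture and would be down for 28 hours.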
Many banks today are employing electronic vaulting as a recovery solution because it is the quickest recovery process now available in the information systems environment.
Electronic vaulting is a very flexible process that allows a bank to maintain duplicate data and systems at a recovery site.
Almost limitless variations exist on the level of speed and protection that a vaulting strategy can deliver. Basic decisions depend on the bank's assessment of the importance of the data that needs to be preserved.
Remote shadowing and mirroring, two technological components of electronic vaulting, allow a bank to replicate information as it is created, transaction by transaction. The information is simultaneously transmitted via high speed fiber optic circuits to a remote site, effectively storing transactions at two locations. Once information is stored and protected at a second site, it becomes immediately available in the event of a processing interruption at the originating site.
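The idea of storing each transaction at two locations can be sketched with a toy model in Python. It assumes a synchronous scheme, in which a transaction is acknowledged only after both sites have stored it. That is one possible design, not a description of any particular vendor's product. The class names `Site` and `MirroredLedger` are hypothetical, and in-memory lists stand in for real storage and fiber optic circuits.

```python
class Site:
    """In-memory stand-in for one data center's transaction log."""

    def __init__(self, name):
        self.name = name
        self.log = []
        self.online = True

    def store(self, txn):
        if not self.online:
            raise ConnectionError(f"{self.name} is unavailable")
        self.log.append(txn)


class MirroredLedger:
    """Posts every transaction to a primary site and a remote site."""

    def __init__(self, primary, remote):
        self.primary = primary
        self.remote = remote

    def post(self, txn):
        # Assumed synchronous behavior: the transaction is acknowledged
        # only after both sites have stored it.
        self.primary.store(txn)
        self.remote.store(txn)
        return True


primary = Site("primary data center")
remote = Site("recovery site")
ledger = MirroredLedger(primary, remote)

for txn in ["deposit 100", "transfer 250", "withdraw 40"]:
    ledger.post(txn)

# Simulate a processing interruption at the originating site.
primary.online = False

# Every acknowledged transaction is already available at the recovery site.
recovered = list(remote.log)
print(recovered)
```

Because nothing is acknowledged until the remote copy exists, there is no backup tape to restore or transport before work can resume at the second site.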
Shadowing and mirroring can completely close the recovery window for mission-critical applications. Typically, that window for banks and financial services companies is too narrow to allow for the time needed to restore traditional backup tapes, let alone the time needed to transport those backup tapes to a recovery facility. Remote mirroring provides continuous availability of mission-critical information.
Electronic vaulting is a direct outgrowth of banking's use of advanced technology. In the early 1970s, banks used computers mostly for batch processing. At that time, the disaster recovery industry was in the planning stages.
In the later 1970s, when banks began to go online, the hot site recovery industry was born for mainframe recovery. If a bank's mainframe went down, employees could go to another facility with its own mainframe to recover. Hot sites also included some network recovery to accommodate all of the computer terminals being used by bank employees. It was not until the late 1980s that banks also started deploying distributed or midrange computer systems from companies such as IBM, Hewlett-Packard, Digital, Data General, Sequent, Tandem, and many others.
In the 1990s, the mainframe continues to have a critical role, even with the continued growth in the deployment of distributed systems. This decade also has brought increased use of the personal computer (PC) local area network (LAN) environment for work group computing. Disaster recovery must now take into account this growing multiplatform environment of mainframes, distributed systems, and PC LANs, as well as the important recovery needs of business work groups I discussed earlier.
Electronic vaulting also was born in the early 1990s, and it covers all platforms, even the PC LAN environment. It has emerged as the most viable recovery option for banks.
Disaster recovery will continue to evolve with the banking industry. As banks become more sophisticated technology users, disaster recovery solutions will follow.
But banks must plan for disaster recovery every step of the way. The key to successful disaster recovery is what happens long before a disaster strikes. With a realistic recovery plan, properly tested and committed to by senior management, banks can effectively maintain operations while providing for the safety of people and assets.
Robert F Bronner, SunGard Recovery Services Inc.
Copyright © 1997 Bank Security & Fraud Prevention. Originally appeared in Bank Security & Fraud Prevention, Vol. 4, No. 11, 11/97
First published on 11/01/1997