HOW VULNERABLE IS YOUR BUSINESS TO A DISASTER?
(courtesy of Hughes Network Systems Europe)
1.0 The Need for Communications Disaster Recovery
Every day, enterprises rely more and more on their communications services to meet the goals and objectives of their business. The communications services utilized typically include voice, data and Internet. While telecommunications continuity is only one part of the larger business continuity picture, it is a key component that needs to be thoroughly considered.
A recent Gartner Group survey revealed just how much downtime can cost a business: significant downtime during periods of disaster can bankrupt a company. The study indicated that almost half of the enterprises surveyed declared they would lose up to $50K per hour of downtime (Figure 1-1). In large financial and retail companies the cost of downtime can exceed a staggering one million dollars an hour. This is bad enough if the outage is just a few hours long. However, if a large disaster has occurred and the outage lasts several weeks, losses could be enormous - in the millions or hundreds of millions of dollars. While Figure 1-1 represents any type of downtime (data center, communications, etc.) that affects business continuity, it provides a good financial perspective of the impact an outage can have on a business whether it is communications related or host related. Given this enormous impact, having a communications disaster recovery program is vital to the long-term health of a business. The goal of a communications disaster recovery program is to enable continuity of mission-critical applications and network elements during prolonged disasters lasting up to 2-3 months.

Figure 1-1. Cost of downtime for companies surveyed. The percentage indicates what proportion of the total number of companies experienced a particular downtime cost. Source: Gartner Group, 2002.
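The arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration and not part of the Gartner survey: the function name `downtime_cost` and the outage durations are assumptions chosen for the example, and the hourly cost is treated as constant, which is a simplification. Only the hourly rates come from the text above.

```python
def downtime_cost(cost_per_hour, outage_hours):
    """Estimated loss for an outage, assuming a constant hourly cost."""
    if cost_per_hour < 0 or outage_hours < 0:
        raise ValueError("cost and duration must be non-negative")
    return cost_per_hour * outage_hours


HOURS_PER_DAY = 24

# Hourly rates are the ones quoted in the text; durations are illustrative.
short_outage = downtime_cost(50_000, 4)                     # 4 hours at $50K/hour
long_outage = downtime_cost(1_000_000, 14 * HOURS_PER_DAY)  # 2 weeks at $1M/hour

print(f"4-hour outage at $50K/hour: ${short_outage:,}")
print(f"2-week outage at $1M/hour:  ${long_outage:,}")
```

Under these assumptions the short outage costs $200,000 and the two-week outage $336,000,000, which is consistent with the "hundreds of millions of dollars" range mentioned for prolonged disasters.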
2.0 Types of Disasters
There are two major categories of disasters that can affect communications continuity:
Natural disasters that affect communication services have different causes. Some are typically limited to certain regions of the country: for example, earthquakes on the West Coast, hurricanes in the East and Southeast, and tornadoes in the South and Midwest. Others, such as flooding or severe storms with high winds, can occur virtually anywhere in the U.S. Any of these natural disasters can destroy or significantly damage telecommunications facilities, resulting in lengthy outages.
There are man-made disasters that can be equally destructive. These include:
One of the largest communications service disruptions occurred when a fire gutted the Illinois Bell Central Office in Hinsdale, Ill. This fire caused many businesses to lose service for an extended period of time. Even businesses that had ISDN backup quickly learned that both their primary and backup services were connected to the same central office across the same access facilities.
Another good example is one of the top securities firms in New York City, which lost its communications services as a result of the terrorist attack on Sept. 11, 2001. The firm's main location provided real-time, multi-cast information to remote offices across the country. There was also a high volume of financial transactions between the main center and remote offices. Communications to hundreds of the remote offices were cut off by the loss of the telecommunications services provider's facilities and services. While there was some disaster recovery capability and redundancy built into the network, there was no continuity plan that dealt with a complete loss of central office facilities. In this particular situation, due to the financial importance of this firm - and other NYC firms like it - and the enormous economic impact of the outage, the service provider was able to restore service within a few days. The way their service provider accomplished this, however, was by taking network capacity from other, smaller businesses - which caused their networks to go down.
In the wake of the September 11 attack, Richard Clarke, Special Advisor to the President for Cyberspace Security, stated that having a redundant telecommunications path was one of his top three recommendations to improve service and security in the event of a disaster.
3.0 Telco Vulnerabilities
Even the most reliable terrestrial networks are susceptible to outages. These outages could be the result of natural or man-made disasters or equipment failure. Many enterprises believe they have adequate redundancy within their network or with their communications service provider, only to later find out - usually after a major outage - that there were common/single points of failure in the network. These could be the result of sharing common access facilities for the primary or backup data centers or having a common central office serving these and other key enterprise sites.
Enterprises commonly procure communication services from different service providers (e.g., AT&T and Sprint) to achieve redundancy and path diversity throughout the network. Few realize, however, that having separate service providers does not guarantee path diversity, since many service providers use the same rights-of-way for fiber, typically along railroads, where a derailment or construction accident can cut both providers' fiber at the same time. Many service providers promote the recovery capabilities of their WAN services such as frame relay, VPN or ATM. On the surface, these statements sound reassuring. However, closer examination often reveals common points of failure in the underlying physical infrastructure of the service. Even where enterprises establish a backup capability using ISDN, they often discover that their backup ISDN service uses the same access facilities as their primary service.
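One way to make this check concrete is to list the physical facilities each circuit traverses and look for overlap. The Python sketch below is illustrative only: the function name `shared_facilities` and all facility names are hypothetical, and in practice the path data would have to be obtained from the service providers.

```python
def shared_facilities(primary_path, backup_path):
    """Return the facilities that appear on both paths (common points of failure)."""
    return set(primary_path) & set(backup_path)


# Hypothetical site: two different carriers, yet both circuits pass through
# the same central office and the same railroad right-of-way.
primary = ["building entrance A", "central office 1",
           "railroad right-of-way", "carrier X backbone"]
backup = ["building entrance B", "central office 1",
          "railroad right-of-way", "carrier Y backbone"]

common = shared_facilities(primary, backup)
if common:
    print("Not truly diverse; shared facilities:", sorted(common))
else:
    print("No shared facilities found in the listed path data")
```

An empty result only shows that the listed facilities do not overlap; it cannot rule out shared infrastructure that is missing from the path data.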
4.0 Backup Technology Options
What is the best network technology for a backup solution? One obvious - and popular - option is a terrestrial network. These backup solutions are usually based on dial-up or ISDN (the cost of a dedicated leased-line backup network would be prohibitive). A major advantage of this solution is its wide commercial availability. From a reliability standpoint, however, a terrestrial backup solution may not be the best choice, for two reasons. First, a wide-area terrestrial network is a diverse collection of geographically dispersed devices that are interconnected and integrated into a very complex system. Each device in the network constitutes a potential point of failure, and the sheer number and variety of devices in a large terrestrial network make it vulnerable to outages. Second, terrestrial networks are susceptible to the telco vulnerabilities discussed earlier.
Another technology option is microwave and point-to-multipoint (PMP) networks. These technologies can provide a truly diverse backup network because they do not utilize terrestrial facilities. They are, however, best suited for short-haul or last-mile backup solutions. The cost of a nationwide microwave backup network would be prohibitive, and PMP technology has a distance limitation of only a few miles.
Satellite networks are well suited for backup applications and can provide a truly diverse backup solution over large geographic regions. An advantage of this technology is its simplicity: a satellite network has only two major components: the ground terminals and the satellite. As a result, satellite networks are less complex, significantly less susceptible to the problems associated with terrestrial networks and hence more reliable. Depending on bandwidth requirements and number of sites, a satellite backup solution can be significantly more cost effective than a terrestrial solution.
5.0 The Disaster Recovery Plan
An effective disaster recovery plan must address the following requirements:
| 1 | Physical path diversity from the existing network |
| 2 | Fast service availability - within 24 hours of disaster occurrence is desirable |
| 3 | Scalable bandwidth that can adjust to changing capacity needs |
| 4 | Cost-effective - the cost should be much less than the cost of the primary service |
| 5 | Flexible service support |
| 6 | Bandwidth/capacity available for extended time frames |
| 7 | Ability to connect to primary and backup/disaster recovery data centers |
| 8 | Last-mile bypass of disaster-affected area |
The above requirements typically apply to situations where a disaster recovery plan and associated network backup equipment are deployed prior to a disaster. In certain emergency situations, it may be necessary to deploy network equipment after the disaster has occurred. In this case the following would be additional key requirements for emergency communications:
| 1 | Ability to relocate easily |
| 2 | Rapid deployment |
Each enterprise typically has a unique set of disaster recovery needs based on its specific circumstances. These needs include the services and applications it wants to recover, which key locations need communications continuity, etc. Each recovery plan must be customized to meet the enterprise's particular requirements.
There are four components to a disaster recovery program:
5.1 Vulnerability Assessment
The first step is gaining an accurate understanding of how vulnerable the business is to a communications disaster and what impact, economic and otherwise, an extended outage will have on the business. More often than not, enterprises do not have any idea of what impact a major outage will have on their business. The following are the principal activities performed during the vulnerability assessment phase:
5.2 Continuity Planning
Once the degree of vulnerability has been assessed, the communications continuity plan needs to be developed. The following are the key activities done during the continuity planning phase:
- Determination of bandwidth requirements
- Design of the technical solution (in the context of the existing network)
- Development of the migration plan
5.3 Implementation and Testing
The next step is the implementation and testing of the plan. It is also important to retest the plan on a regular basis (once a year) to ensure that any changes in the network or the applications are provided for in the continuity plan. The following are the key activities performed during the implementation and testing phase:
6.0 Summary
It is critical that companies gain a solid understanding of the impact of downtime to their business and that they have a disaster recovery plan in place that adequately meets their requirements. Furthermore, disaster recovery planners need to understand the vulnerabilities of the communications services they intend to use and develop a solution that mitigates these vulnerabilities and provides a truly diverse backup solution. The survey that follows is a helpful aid for determining if your business needs a disaster recovery plan.
Backup Solution Needs Survey
| 1 | Does your network support mission-critical applications that, when interrupted, will impact revenue and employee productivity? |
| 2 | Do you know the total economic cost to your business of network downtime (Lost revenue plus lost productivity)? |
| 3 | Will having an extended network outage cause you to lose market share? |
| 4 | Does your company provide services to customers where penalties can be assessed or contracts voided for lack of performance due to network outages? |
| 5 | Do your Service Level Agreements with your customers include financial penalties for missed performance targets? |
| 6 | The average availability for corporate networks without redundancy is 99.8% (Gartner Group). Is this figure higher than the availability your service provider contractually commits to deliver to you? |
| 7 | A network availability of 99.8% equates to about 17.5 hours of outage per year. Is this an acceptable level of downtime for your business? |
| 8 | Are there any single points of failure in your network paths (e.g. common telecom central offices or common building entrance facilities)? |
| 9 | Do you need to call your service provider when a failure occurs (as opposed to having the backup network automatically kick in upon detecting a failure in your primary network)? |
If the majority of your answers to this survey are Yes (No for Questions 2 and 7) then your company needs to implement a diverse backup solution for your network or reassess your existing backup plan.
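The scoring rule above, together with the availability arithmetic behind question 7, can be sketched in Python. This is an illustrative reading of the survey, not an official scoring tool: the function names, the sample answers and the use of a strict majority (more than half of the nine questions) are assumptions. With a year of 8,760 hours, 99.8% availability works out to roughly 17.5 hours of downtime, close to the figure quoted in question 7.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years

# Questions for which a "No" answer, rather than "Yes", signals a need.
INVERTED_QUESTIONS = {2, 7}


def annual_downtime_hours(availability_percent):
    """Expected hours of downtime per year at a given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


def needs_backup_solution(answers):
    """answers maps question number (1-9) to True for Yes, False for No."""
    if set(answers) != set(range(1, 10)):
        raise ValueError("answers must cover questions 1 through 9")
    indicators = sum(
        (not answer) if question in INVERTED_QUESTIONS else answer
        for question, answer in answers.items()
    )
    # Assumption: "majority" means more than half of the questions.
    return indicators > len(answers) / 2


# Hypothetical responses from one enterprise.
sample = {1: True, 2: False, 3: True, 4: False, 5: False,
          6: True, 7: False, 8: True, 9: True}

print(f"Downtime at 99.8% availability: {annual_downtime_hours(99.8):.2f} hours/year")
print("Diverse backup solution indicated:", needs_backup_solution(sample))
```

For the sample responses, seven of the nine answers point toward a need, so the sketch reports that a diverse backup solution is indicated.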
Hughes Network Systems Europe
Saxon Street, Linford Wood, Milton Keynes MK14 6LD, UK
tel: +44 1908 326 262
fax: +44 1908 310363
url: http://www.hnseu.com