Today, most organizations use one or more commercial backup software solutions to protect their data, but that’s only part of the puzzle, as many leave their systems and applications unprotected. This means that when unplanned IT outages happen, business operations often come to a halt, frustrated employees sit idle, and customer satisfaction nosedives along with the company’s reputation.


A survey sponsored by CA Technologies has quantified just how bad the problem can be, revealing that businesses in North America and Europe are collectively losing more than 127 million person-hours every year in employee productivity because of IT downtime. That works out to each company losing an average of 545 person-hours annually, and is the equivalent of a 63,500-person workforce being shut down for an entire year.
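
To show how these figures relate to one another, here is a rough back-of-the-envelope check in Python. The three constants are the numbers quoted above; the two derived values (the working year per person and the number of companies the total would cover) are my own inferences from dividing those numbers, not figures reported by the survey.

```python
# Figures quoted from the survey summary above.
TOTAL_LOST_HOURS = 127_000_000   # person-hours lost per year ("more than", so a floor)
LOST_HOURS_PER_COMPANY = 545     # average person-hours lost per company per year
WORKFORCE_EQUIVALENT = 63_500    # workers idle for a full year

# Inference (assumption): the working year per person implied by the
# workforce-equivalent figure.
implied_hours_per_worker = TOTAL_LOST_HOURS / WORKFORCE_EQUIVALENT

# Inference (assumption): the number of companies the total would cover
# if every company lost exactly the average.
implied_companies = TOTAL_LOST_HOURS / LOST_HOURS_PER_COMPANY

print(f"Implied working year: {implied_hours_per_worker:,.0f} hours per person")
print(f"Implied companies covered: {implied_companies:,.0f}")
```

The first result comes out at 2,000 hours, which is roughly a 40-hour week over 50 weeks, so the workforce comparison is consistent with the headline total.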


What’s even more troubling is the fact that a third of the 2,000 survey respondents don’t even have a formal disaster recovery policy in place. That’s despite the fact that 50% of organizations said IT outages damage their reputation, and 18% call it “very damaging.” I discussed these and other findings in a video.

The good news is that the lost productivity and damage caused by IT downtime are quite avoidable with the right systems protection and recovery strategies in place. And the tools available to help are as practical as they are powerful.


There are many technologies available to help speed system recovery, and some enable continuous availability, which is especially useful for more critical systems. Many backup software solutions offer Bare Metal Restore (BMR) technology, which essentially takes a snapshot of the entire system and data and, in many cases, enables recovery to both similar and dissimilar hardware. BMR helps eliminate the need to manually rebuild a server, a process that would otherwise include reinstalling the operating system and application. It’s similar to the “imaging” solutions that were developed back in the 1980s and that IT staff still use today to build and deploy servers, desktops and laptops more quickly.


There are also other technologies that provide even faster system restoration. Replication software not only copies data, it can replicate an entire server or virtual machine environment from a production system and storage to a replica system and storage. Unlike the BMR process, which takes a point-in-time snapshot, replication software provides continuous protection, capturing all ongoing changes to the operating system, system state, application and data. With replication software, administrators can also manually redirect workloads and end users, and use the replica server and storage as the production system until they replace or repair the original production system. Replication also eliminates the storage device as a potential single point of failure, which is especially important for virtual server environments that use shared storage. The replica server and storage can reside on-premises, off-premises or in the cloud. Administrators can also perform many-to-one replication and use virtual servers as the replica servers to reduce IT costs.
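
The difference between a point-in-time snapshot and continuous replication can be illustrated with a toy model. This is only a sketch: the dictionaries standing in for production and replica storage, the `write` helper, and the sample keys are all hypothetical, and real products replicate at the block, file or application level rather than with in-memory copies.

```python
import copy

# Hypothetical stand-ins for production storage and replica storage.
production = {}
replica = {}

def write(key, value):
    """Apply a change to production and forward the same change to the replica."""
    production[key] = value
    replica[key] = value  # continuous replication: every change is captured

write("orders", 100)

# Point-in-time snapshot, as a BMR-style backup would take.
snapshot = copy.deepcopy(production)

# Changes made after the snapshot and before the outage.
write("orders", 140)
write("invoices", 12)

# Production fails here. Compare what each approach can restore.
print("From snapshot:", snapshot)
print("From replica: ", replica)
```

In this model the snapshot can only restore the state as of the moment it was taken, while the replica also holds the two changes made afterward.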


High Availability software takes replication to an even higher level of protection and recovery by providing system and application monitoring with automatic and push-button failover, as well as push-button failback. That means that at the first sign of trouble, end users are automatically redirected to the replica server and continue working while IT investigates and repairs the issues on the production server. Once the production server is repaired or replaced, administrators can perform a failback from the replica server to ease and speed the recovery process. The replica server and storage may reside on-premises with the production system, or they can be deployed off-premises or in the cloud to provide both business continuity and disaster recovery protection. And remember, virtual servers can be used as replica servers to help reduce costs.
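
The monitoring, failover and failback cycle described above can be sketched as a small state machine. This is a minimal illustration, not any vendor's implementation: the `FailoverMonitor` class, the server names, and the threshold of three consecutive failed health checks are all assumptions chosen for the example.

```python
class FailoverMonitor:
    """Hypothetical monitor that redirects users after repeated failed health checks."""

    def __init__(self, production, replica, max_failures=3):
        self.production = production
        self.replica = replica
        self.max_failures = max_failures  # assumed threshold, not from any product
        self.active = production          # server currently handling users
        self.failures = 0                 # consecutive failed checks

    def record_check(self, healthy):
        """Record one health-check result for production; fail over if needed."""
        if self.active != self.production:
            return self.active            # already failed over; wait for failback
        self.failures = 0 if healthy else self.failures + 1
        if self.failures >= self.max_failures:
            self.active = self.replica    # automatic failover
        return self.active

    def failback(self):
        """Push-button failback once production is repaired or replaced."""
        self.active = self.production
        self.failures = 0
        return self.active


monitor = FailoverMonitor("prod-01", "replica-01")
history = [monitor.record_check(ok) for ok in (True, False, False, False)]
after_failback = monitor.failback()
print(history, after_failback)
```

Requiring several consecutive failures before switching is one common way to avoid failing over on a single missed check; a real product would also need to resynchronize data from the replica back to production before failback.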


There is also server clustering software, such as Microsoft Windows Server Failover Clustering, which is included with the Windows Server Enterprise and Datacenter editions. A cluster is basically a group of servers that are connected and share processing power and storage. If one server in the cluster fails, end users rely on the other servers in the cluster to keep working. Just keep in mind that in this scenario, the shared storage device is a potential single point of failure that needs its own protection, and if you want to extend the cluster to a remote site for disaster recovery, you need to purchase a separate replication solution.


For the most critical systems, such as those used in high-volume financial transaction processing environments like banking, brokerage, and retail, IT organizations have been using fully redundant system architectures for decades. While these architectures are ideal for high availability, offsite system and storage solutions for disaster recovery are treated as a separate consideration in this scenario.


Clearly, a variety of practical system recovery options are available. Why risk system downtime that affects all parts of your business and organization, customer service and satisfaction, and even your company’s reputation? Just remember that backup is only part of the equation.