May 30, 2018
Mark Johnson

The importance of disaster recovery testing—and how to get it right

Will your disaster recovery plan see you through ransomware attacks, hardware failures, and natural disasters—or will you be caught flat-footed? If you can’t answer that question with an unequivocal, “We’re ready!” you should be investing greater time and resources in disaster recovery testing.

Today, too many organizations implement backup and disaster recovery solutions and assume they’re prepared to face any eventuality. After all, the solution vendor promised data recovery in minutes.

Well, that’s great, but we simply can’t emphasize enough how important it is to validate your solutions and processes. You need to find your points of failure and fix them before a real disaster strikes.

What should you consider?

Let’s assume you’ve already fully documented your infrastructure, application dependencies, data flows, costs of downtime, and SLAs.

What’s next?

When it comes to disaster recovery testing, here’s what you should keep in mind.

 

Test both your DR solution and your people

While automated disaster recovery tests serve an important purpose, they only test the technical component of your DR plan. In the event of a real disaster, your people will also need to work quickly and confidently to restore service.

Conducting both tabletop tests and simulated technical tests will help ensure your people are prepared to execute against your documented policies and procedures.

Never underestimate the importance of the human element.

 

Commit to regular disaster recovery testing

Your DR plan is only as good as your weakest link.

Yet, in our recent survey of 600 channel partners and IT decision-makers, only 44 percent had a DR plan in place. And, of those, only 31 percent ran DR tests more than once a year.

So, how frequently should you test your plan?

That will depend on your business: what works for a local advertising agency won’t work for a regional financial institution. That said, we recommend you run a full test at least once a year. And, if you’re required to comply with stringent regulations like PCI DSS, we recommend testing more frequently.

Remember: The more frequently you put your people through their paces, the more prepared they’ll be to respond in the face of disaster. And, with regular turnover of your IT staff, regular tests will be critical for bringing new team members up to speed.

 

Design your DR test

Regardless of whether you’re running a short drill to test discrete applications or you’re running a full-scale test, you’ll need to fully document your DR testing plan before you begin.

Consider:

  • How long it’s been since you last DR tested your critical applications, and which should be included in your next test if it’s not a full-scale test
  • Changes to your IT infrastructure that may necessitate updates to your plan, which must then be validated through DR testing
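
One way to answer the "how long has it been" question is to keep a simple record of when each application was last tested and flag anything past your chosen interval. The sketch below is only an illustration: the application names, dates, and the 365-day default are assumptions, not recommendations for any particular business.

```python
from datetime import date, timedelta


def overdue_for_testing(last_tested, today, max_age_days=365):
    """Return the apps whose last DR test is older than max_age_days, oldest first.

    last_tested maps an application name to the date of its last DR test.
    """
    cutoff = today - timedelta(days=max_age_days)
    overdue = [(app, tested) for app, tested in last_tested.items() if tested < cutoff]
    # Sort by test date so the most neglected applications come first.
    return [app for app, _ in sorted(overdue, key=lambda item: item[1])]


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = {
        "billing": date(2017, 1, 15),
        "crm": date(2018, 2, 1),
        "email": date(2016, 11, 3),
    }
    for app in overdue_for_testing(inventory, today=date(2018, 5, 30)):
        print(f"{app} is overdue for a DR test")
```

Applications that come back from a check like this are natural candidates for your next short drill.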

Define:

  • Who will be involved
  • What, specifically, you’re testing
  • The goals for your test
  • Your expected results
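
If you keep your test plans in a structured form, a quick check can confirm that none of the four items above was left blank before test day. This is a minimal sketch; the `DRTestPlan` class and its field names are hypothetical and simply mirror the list above.

```python
from dataclasses import dataclass, field


@dataclass
class DRTestPlan:
    """Hypothetical record of a DR test plan, one field per item to define."""
    participants: list = field(default_factory=list)      # who will be involved
    scope: list = field(default_factory=list)             # what, specifically, is tested
    goals: list = field(default_factory=list)             # the goals for the test
    expected_results: list = field(default_factory=list)  # the expected results

    def missing_sections(self):
        """Return the names of any sections that are still empty."""
        return [name for name, value in vars(self).items() if not value]


if __name__ == "__main__":
    plan = DRTestPlan(
        participants=["IT lead", "observer"],
        scope=["file server failover"],
    )
    print("Still to define:", plan.missing_sections())
```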

 

Thoroughly document your DR test

When running your DR test, it will be crucial to task one person with observing and documenting the test; this should be their sole purpose on test day.

During the test, this person will document any hiccups and record the time it takes to complete each step in your documented disaster recovery procedure.

They should keep note of:

  • Time required to fail over, restore uptime, recover data, and fail back
  • Unexpected technical failures
  • The human response to surprises
  • Instances where people encountered a lack of clarity in the DR plan, which slowed their progress and created anxiety amongst the team
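
The observer’s notes can be as simple as a spreadsheet, but if you prefer a script, the sketch below shows one possible way to time each step and compare it against a target. The `DRTestLog` class, the step names, and the target times are all assumptions made for illustration; substitute the steps and objectives from your own documented procedure.

```python
import time
from contextlib import contextmanager


class DRTestLog:
    """Hypothetical helper for the observer: times each step and keeps notes."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock    # injectable so the log can be exercised without waiting
        self.durations = {}   # step name -> elapsed seconds
        self.notes = []       # free-form observations (failures, confusion, etc.)

    @contextmanager
    def step(self, name):
        """Time the block of work wrapped by this context manager."""
        start = self.clock()
        try:
            yield
        finally:
            self.durations[name] = self.clock() - start

    def note(self, text):
        self.notes.append(text)

    def over_target(self, targets):
        """Return the timed steps that took longer than their target, in seconds."""
        return {
            name: elapsed
            for name, elapsed in self.durations.items()
            if name in targets and elapsed > targets[name]
        }


if __name__ == "__main__":
    log = DRTestLog()
    with log.step("failover"):
        time.sleep(0.1)  # stand-in for the real failover procedure
    log.note("Runbook step 4 did not say which credentials to use")
    print(log.durations, log.over_target({"failover": 0.05}), log.notes)
```

Steps that exceed their targets, together with the observer’s notes, give you a concrete list to bring to the review that follows the test.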

 

Review and update your disaster recovery plan

All the testing in the world serves little purpose if you don’t leverage the insight you’ve gained to address vulnerabilities in your DR plan.

Did the disaster recovery test reveal any holes?

If so, it’s time to gather key stakeholders to determine your acceptable level of risk and how you can reduce the impact of data loss and downtime.

With regular disaster recovery tests and continuous improvements to your plan, you’ll be prepared to weather any storm.