I know it’s a bit of a provocative title if you’re in the data protection space, particularly if you’re very focused on operational aspects.
But the thing is, objectives such as recovery point and recovery time objectives (RPO and RTO) are just that: objectives. They tend to be meaningful to only a few people, are typically grossly misunderstood by the broader organization, and while they should be absolutely key, they often end up being another elusive thing “we’ll focus on at some point”.
The Arcserve team has spoken to many organizations around the world over the past few quarters, and one of the most pervasive manifestations of the “broken state of backup” is that people generally understand where they need to be from an RPO and RTO standpoint. People are smart. They have to be in the data protection business.
IDC surveyed organizations across multiple market segments regarding their RPOs and RTOs at the end of last year, and most in the mid-market (Arcserve’s focus) have what I would call pretty stringent requirements. As a matter of fact, today’s mid-market RPO and RTO requirements feel the same as what I saw in the high end of the enterprise space a few years ago. For mission-critical applications, the vast majority of organizations need RTOs under 4 hours and RPOs under 1 hour, with many needing RPOs to be zero (no acceptable loss of data) to under a minute.
So how do you get from an objective to an actual number you can bet your career on, or at least feel very confident about? I would like to propose a simple methodology to get there, along with some considerations on how we see it here, based on our technology.
First, users need to take a hard look at application and system criticality and establish a protection and recovery prioritization that also reflects the interdependency of related systems. For example, a mission-critical application may have “feeder” systems that need to be protected at the same level. It’s a tough exercise but one that allows organizations to truly understand their recovery point and recovery time (RPO and RTO) needs, establish tiers of recovery, and therefore determine what the data protection levels need to be. Not every application or piece of data is born equal.
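To make the interdependency point concrete, here is a minimal sketch in Python of how a feeder system can inherit the protection tier of the application it feeds. The system names, the tier numbers (1 is the most critical) and the `effective_tiers` helper are all hypothetical, chosen only for illustration; this is one possible way to model the exercise, not a description of any product feature.

```python
from collections import deque


def effective_tiers(declared, feeders):
    """Return each system's tier once feeders inherit the strictest
    (lowest-numbered) tier of any system that depends on them.

    declared: system name -> tier assigned in isolation (1 = most critical)
    feeders:  system name -> list of systems it depends on
    Both structures are illustrative assumptions.
    """
    tiers = dict(declared)
    # Start from the most critical systems and push their tier downstream.
    queue = deque(sorted(tiers, key=tiers.get))
    while queue:
        system = queue.popleft()
        for feeder in feeders.get(system, []):
            # Only tighten a tier, never relax it; this also makes
            # dependency cycles terminate.
            if tiers[system] < tiers.get(feeder, float("inf")):
                tiers[feeder] = tiers[system]
                queue.append(feeder)
    return tiers


# Hypothetical inventory: tiers as first assigned, then the dependencies.
declared = {
    "order-entry": 1,
    "inventory-db": 2,
    "pricing-feed": 3,
    "intranet-wiki": 3,
}
feeders = {
    "order-entry": ["inventory-db"],
    "inventory-db": ["pricing-feed"],
}

result = effective_tiers(declared, feeders)
for name, tier in sorted(result.items()):
    print(f"{name}: tier {tier}")
```

In this made-up inventory, the inventory database and the pricing feed both end up in tier 1 because the mission-critical order entry application depends on them, while the wiki stays in tier 3.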
In addition, organizations should take a fully coordinated approach when it comes to various data protection technologies. Guaranteeing RPO and RTO is difficult when disparate solutions are partially protecting a mixed or hybrid IT infrastructure. A thorough review of the data protection processes, applications and interdependencies is key. Also, in some cases, bespoke scripting may have been developed to “coordinate” certain processes, and these need to be clearly understood.
It’s hard work. But remember that the focus needs to be placed on recovery, not backup. Data protection is only as good as the recovery it offers – and robust testing is critical to ensuring reliability.
This brings us to the concept of Recovery Point and Recovery Time Assurance. RPOs and RTOs are for planning purposes, for prioritization, and for discussion with the business leads. They fund you after all!
RPAs and RTAs are what you can actually deliver against, consistently. They stem from measuring repeated and regular tests that tell you what you can bank on. In addition, these RPA/RTA tests should not be disruptive to operations, nor should they prevent active data protection processes from running.
They also need to be as automated as possible so that you don’t spend too many resources on them, and they can’t cost you a lot of budget either. In other words, it really requires a cost-effective, fully automated test environment – a disaster recovery sandbox, if you will. That’s the only way you get to a number that represents what you can deliver against, every time.
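As a rough illustration of turning repeated test results into a number, the sketch below takes the worst measurement across a series of recovery tests as the assured value. That choice is an assumption on my part: it is one conservative reading of “what you can deliver against, every time”, and a percentile could be used instead. The `RecoveryTest` record, the function names and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTest:
    """One automated recovery test run (hypothetical record layout)."""
    data_loss_minutes: float   # measured recovery point
    recovery_minutes: float    # measured recovery time


def assurance(tests):
    """Return (RPA, RTA) in minutes as the worst values observed.

    Assumption: the worst case across tests is what you can bank on.
    """
    if not tests:
        raise ValueError("at least one recovery test is required")
    rpa = max(t.data_loss_minutes for t in tests)
    rta = max(t.recovery_minutes for t in tests)
    return rpa, rta


def meets_objectives(tests, rpo_minutes, rto_minutes):
    """True only if the assured numbers are within the stated objectives."""
    rpa, rta = assurance(tests)
    return rpa <= rpo_minutes and rta <= rto_minutes


# Made-up results from three sandbox test runs.
runs = [
    RecoveryTest(data_loss_minutes=12, recovery_minutes=150),
    RecoveryTest(data_loss_minutes=45, recovery_minutes=210),
    RecoveryTest(data_loss_minutes=20, recovery_minutes=260),
]

rpa, rta = assurance(runs)
print(f"RPA: {rpa} min, RTA: {rta} min")
# Objectives from the survey figures above: RPO under 1 hour, RTO under 4 hours.
print("Objectives met:", meets_objectives(runs, rpo_minutes=60, rto_minutes=240))
```

With these sample runs the RPA is 45 minutes and the RTA is 260 minutes, so the one-hour RPO is covered but the four-hour RTO is not, even though two of the three tests finished in time.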
That’s what we mean by RPA and RTA: It’s not only “A” for “actuals”, but it should really be “A” for “assurance”. And that’s what really matters.