The recovery time objective (RTO) is the amount of time within which a system must be recovered after an interruption. It's driven by the business and represents the maximum amount of time a company can tolerate an interruption of a business function or process that depends on the system. The RTO plays an important role in developing data recovery strategies and is often used as the basis for a service-level agreement (SLA).
The recovery point objective (RPO) is the point in time to which a system's data must be recovered after an outage (e.g., last night's backup or the last transaction before the outage). The RPO is often used as the basis for developing data backup strategies. It also determines how much data may need to be recreated after the system is recovered or, ultimately, what constitutes acceptable data loss.
Depending on the type of application and the business function it supports, the RTO and RPO can be miles apart. For example, an application that provides a critical service but relies on static data may have an RTO of one hour and an RPO of one day: the application must be back up within one hour, but running it on yesterday's data is acceptable. Conversely, a four-hour email server outage may be tolerable to a company, but losing a single email message may be unacceptable for business or legal reasons. That gives an RTO of four hours and an RPO of zero (down to the last transaction).
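The two examples above can be written down as a small sketch. Everything here is illustrative: the `RecoveryObjectives` class, the `meets_objectives` function, and the sample outage figures are hypothetical names and numbers, not part of any standard or tool.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Hypothetical container for the two objectives of one application."""
    rto: timedelta  # maximum tolerable downtime
    rpo: timedelta  # maximum tolerable age of recovered data (zero = no loss)


def meets_objectives(objectives, downtime, data_loss_window):
    """Compare an actual (or simulated) outage against the objectives.

    downtime: how long the application was unavailable.
    data_loss_window: time between the last recoverable data and the outage.
    """
    return {
        "rto_met": downtime <= objectives.rto,
        "rpo_met": data_loss_window <= objectives.rpo,
    }


# The two examples from the text.
static_service = RecoveryObjectives(rto=timedelta(hours=1), rpo=timedelta(days=1))
email_server = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(0))

# Assumed outage figures, for illustration only.
static_result = meets_objectives(
    static_service,
    downtime=timedelta(minutes=45),
    data_loss_window=timedelta(hours=20),
)
email_result = meets_objectives(
    email_server,
    downtime=timedelta(hours=3),
    data_loss_window=timedelta(minutes=5),
)

print(static_result)
print(email_result)
```

In this sketch the static-data service passes both checks even though it comes back with 20-hour-old data, while the email server fails its RPO despite being restored within its four-hour RTO, because any lost mail at all exceeds an RPO of zero.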
With all that said, it's possible to miss an RTO while trying to meet an RPO. In other words, restoring the data to the point the RPO requires takes so long that the application ends up being down longer than the RTO allows. This is when we need to start rethinking our data recovery strategies.
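A rough way to spot this conflict ahead of time is to estimate how long a full restore would take and compare it with the RTO. The sketch below assumes a simple model: restore time is data volume divided by restore throughput, plus any time spent replaying transaction logs to reach the RPO. The function names, the model, and the numbers are assumptions made for illustration; real restore times depend on many more factors.

```python
from datetime import timedelta


def estimated_restore_time(data_gb, throughput_gb_per_hour, replay_time=timedelta(0)):
    """Estimate restore duration under a simple linear model (an assumption).

    data_gb: amount of data to restore, in gigabytes.
    throughput_gb_per_hour: sustained restore rate, in gigabytes per hour.
    replay_time: extra time to replay logs up to the required recovery point.
    """
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput_gb_per_hour must be positive")
    return timedelta(hours=data_gb / throughput_gb_per_hour) + replay_time


def rto_at_risk(rto, data_gb, throughput_gb_per_hour, replay_time=timedelta(0)):
    """Return True when the estimated restore would take longer than the RTO."""
    estimate = estimated_restore_time(data_gb, throughput_gb_per_hour, replay_time)
    return estimate > rto


# Hypothetical figures: 2,000 GB restored at 250 GB/hour, plus one hour of
# log replay to reach the last transaction, against a four-hour RTO.
estimate = estimated_restore_time(2000, 250, replay_time=timedelta(hours=1))
at_risk = rto_at_risk(timedelta(hours=4), 2000, 250, replay_time=timedelta(hours=1))

print(estimate, at_risk)
```

With these assumed numbers the restore is estimated at nine hours, more than double the four-hour RTO, which is the signal that the recovery strategy needs rethinking rather than the restore simply needing to run faster.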
This was first published in April 2008