Continuous data protection (CDP) is becoming an increasingly attractive data protection and backup option. But is CDP right for your business? Backup guru W. Curtis Preston compares CDP to traditional backup in this FAQ.
CDP, in its purest sense, is replication with a back button. It works exactly like replication in that, as the data is being changed on the source site, it's being automatically copied to the target site. The difference is that the target system maintains a log that allows you, in the case of some sort of logical corruption, to go backwards in time. True CDP allows you to go back to any point in time that the log covers, so literally to just before or just after the particular write that caused a problem.
There is also something called near-CDP, which is a similar concept, also using replication. But near-CDP can only recover to what we call significant points in time, which are the points in time at which you took a snapshot. So, with true CDP you can recover to any point in time prior to the current one, as far back as you've kept the log of transactions, while near-CDP can recover only to the moments when snapshots were taken.
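To make the distinction concrete, here is a minimal Python sketch of the two recovery models. It is an illustration only, not how any particular product works: the class names, the in-memory dictionaries and the integer timestamps are all assumptions made for the example, and writes are assumed to arrive in timestamp order.

```python
from bisect import bisect_right


class CdpTarget:
    """Toy replication target that journals every write (true CDP)."""

    def __init__(self):
        self.journal = []  # (timestamp, key, value) in arrival order

    def apply_write(self, timestamp, key, value):
        # Each replicated write is appended to the log, never overwritten.
        self.journal.append((timestamp, key, value))

    def recover(self, as_of):
        """Replay the log up to and including time `as_of`."""
        state = {}
        for ts, key, value in self.journal:
            if ts > as_of:
                break
            state[key] = value
        return state


class NearCdpTarget:
    """Toy replication target that keeps only periodic snapshots (near-CDP)."""

    def __init__(self):
        self.current = {}
        self.snapshots = []  # (timestamp, copy of the data at that time)

    def apply_write(self, timestamp, key, value):
        self.current[key] = value

    def take_snapshot(self, timestamp):
        self.snapshots.append((timestamp, dict(self.current)))

    def recover(self, as_of):
        """Return the latest snapshot taken at or before `as_of`."""
        times = [ts for ts, _ in self.snapshots]
        i = bisect_right(times, as_of)
        return dict(self.snapshots[i - 1][1]) if i else {}


cdp, near = CdpTarget(), NearCdpTarget()
for ts, key, value in [(1, "a", "v1"), (2, "b", "v1"), (3, "a", "corrupt")]:
    cdp.apply_write(ts, key, value)
    near.apply_write(ts, key, value)
    if ts == 1:
        near.take_snapshot(ts)  # the only "significant point in time"

# Roll back to just before the bad write at time 3.
print(cdp.recover(as_of=2))   # {'a': 'v1', 'b': 'v1'}
print(near.recover(as_of=2))  # {'a': 'v1'}
```

In this toy run the journal can stop exactly before the corrupting write, while the snapshot-based target falls back to time 1 and loses the write to "b" made after the snapshot.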
The first and most obvious difference is that traditional backup, however it's done, is done by transferring bulk copies of the data from the source system to the target system. Generally speaking, it's a series of full and incremental backups. Even products like IBM TSM, which take a progressive incremental approach, still do a series of full and incremental backups for applications like Oracle and SQL Server.
So, that's the first difference: with traditional backup, you're transferring a significantly larger amount of data on a very regular basis, whereas with CDP, whether it's near-CDP or true CDP, all you're transferring from one system to another are the bytes that are changing, as you're changing them. The second difference is that traditional backup is done as a batch process, typically every night, while CDP and near-CDP run throughout the day.
The other difference is that with traditional backup you can only recover to a point 18, 24 or 36 hours ago, depending on where in the backup cycle you happen to have your disaster. With CDP and near-CDP you could recover to minutes or even seconds before the problem.
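The recovery-point gap described above is simple arithmetic: the data you lose is whatever changed between the last usable recovery point and the failure. The sketch below works that out for a nightly backup and for snapshots taken every hour. The function name, the dates and both schedules are made up for illustration.

```python
from datetime import datetime


def data_loss_window(recovery_points, failure_time):
    """Time between the failure and the latest recovery point at or before it."""
    usable = [p for p in recovery_points if p <= failure_time]
    if not usable:
        raise ValueError("no recovery point precedes the failure")
    return failure_time - max(usable)


failure = datetime(2024, 1, 2, 19, 0)

# Hypothetical nightly backups that complete at 01:00.
nightly = [datetime(2024, 1, 1, 1, 0), datetime(2024, 1, 2, 1, 0)]

# Hypothetical snapshots taken at a quarter past every hour.
hourly = [datetime(2024, 1, 2, hour, 15) for hour in range(24)]

print(data_loss_window(nightly, failure))  # 18:00:00
print(data_loss_window(hourly, failure))   # 0:45:00
```

With a true CDP log, the usable recovery point can be the last write before the problem, so the same calculation approaches zero.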
I generally recommend that people do things in the simplest or most traditional way possible. Based on that, I don't recommend CDP or even near-CDP to everyone or even a majority of our customers. But we do have customers that are unable to meet their requirements with traditional backup. They can't go 24 hours or 36 hours between recovery points. Their recovery point objective (RPO) is tighter than that, and they can't lose 36 hours of transactions on their Oracle database. I would have that customer examine EMC Avamar and the use of recovery logs, to see if they can get closer to the point in time they want to find.
But generally at some point we have a customer who says "I simply can't meet my objectives with traditional backup." I would then have that person examine both CDP and near-CDP. If they have an aggressive RPO of zero or one second, then true CDP is the only way they can accomplish that goal.
If they are comfortable with an hour, then near-CDP is the more "traditional" way, if we can talk about CDP being traditional. True CDP is a little newer than near-CDP. Near-CDP is available from a number of storage array vendors, volume management vendors and RAID vendors. All near-CDP is, is a fancy term for snapshots and replication, which have been around for a long time.
So, if their requirements are such that they can be met by near-CDP, I would have them compare the costs of both and see which one is less expensive and easier to maintain, because that ongoing management is a big part of where those costs come from. Then have them select the approach that is most appropriate for them.
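The selection logic above can be sketched as a small function that lists which approaches could meet a given RPO, simplest first. The interval defaults (a 24-hour backup cycle and hourly snapshots) and the function name are assumptions for illustration. Real thresholds depend on the products and schedules involved, and cost and ease of management still have to be compared by hand.

```python
def candidate_approaches(rpo_seconds,
                         backup_interval_s=24 * 3600,
                         snapshot_interval_s=3600):
    """List the approaches that could meet an RPO, simplest first.

    The interval defaults are illustrative assumptions, not vendor figures.
    """
    candidates = []
    if rpo_seconds >= backup_interval_s:
        candidates.append("traditional backup")
    if rpo_seconds >= snapshot_interval_s:
        candidates.append("near-CDP")
    # A write-level log can, in principle, meet any RPO.
    candidates.append("true CDP")
    return candidates


print(candidate_approaches(36 * 3600))  # ['traditional backup', 'near-CDP', 'true CDP']
print(candidate_approaches(3600))       # ['near-CDP', 'true CDP']
print(candidate_approaches(1))          # ['true CDP']
```

When more than one candidate comes back, the first entry is the simplest starting point, and the choice between the remaining ones comes down to the cost and maintenance comparison described above.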
W. Curtis Preston authored "Using SANs and NAS" and "Unix Backup and Recovery," the seminal O'Reilly book on backup. He is also the webmaster of BackupCentral.com. He has been designing storage systems for more than 10 years and has designed systems for environments ranging from backup systems for small businesses to enterprise storage systems for Fortune 100 companies. His passion for backup and recovery began with managing the data growth of a 24x7, mission-critical environment.
Since that time, Preston has been able to help many companies design resilient storage systems, and his client list includes many Fortune 100 and Fortune 500 companies. W. Curtis Preston is Executive Editor and Independent Backup Expert, TechTarget Storage Media Group.