
Continuous data protection: Near vs. real CDP

Continuous data protection (CDP) and near-CDP offer data recovery options that aren't possible with a traditional backup system. Backup expert Curtis Preston explains the differences between real and near-CDP and the pros and cons of each.

W. Curtis Preston

Continuous data protection (CDP) and near-continuous data protection (near-CDP) offer recovery options that are simply not possible with a traditional backup system. Both support instantaneous recovery, allowing your application to immediately mount a recovery image when the primary image is damaged. The difference between the two is the recovery point objective (RPO) each offers: CDP offers an RPO of zero, while near-CDP offers an RPO equal to the interval between snapshots (typically one hour).

Both continuous data protection and near-continuous data protection work by replicating changed data from the protected system to a target system. That target system can be located onsite or offsite, but it is usually onsite, because many people use CDP for very quick recovery and not yet for disaster recovery (DR), although some are starting to use it for both. Depending on the product's capabilities and your budget, a second target system may also receive the data for DR purposes.

The evolution of near-CDP
We didn't call near-CDP systems "near-CDP" until the CDP market was invented. The companies that built the "real" CDP market defined CDP tightly, so that it excluded anything that relied on snapshots. But then every vendor that did snapshots and replication wanted its products to be known as CDP, and two different groups of products were both claiming to do the same thing: CDP. The truth is that both are still very different from traditional backup. Both are block-level-incremental-forever products, so they are more alike than they are different. So the term "near-CDP" was born, and we now call all products that do snapshots and replication "near-CDP."

Near-continuous data protection capabilities have been present in storage systems for years. If you have a storage system that can take snapshots and replicate them, it is a near-CDP system. Protecting files with near-CDP is relatively easy; protecting applications takes a little more work. Some people put the application into some kind of "backup-ready" state (e.g., Oracle's backup mode) and then take a snapshot. Microsoft's Volume Shadow Copy Service (VSS) accomplishes the same thing from the opposite direction: when you take a snapshot using the VSS API, VSS signals supported applications to prepare for the snapshot. The result is the same: you have a virtual picture of what the file system looked like at the time the snapshot was taken. This virtual picture can be used to recover from logical corruption (e.g., someone dropping a table they weren't supposed to), and you can then replicate that snapshot to a target system for protection against physical loss (e.g., a lost disk drive). Whether the problem is logical or physical, the only data you will lose with a near-CDP system is the data created since the last snapshot.
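The snapshot-then-replicate cycle described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class `NearCdpVolume`, its methods and the `quiesce`/`resume` hooks are hypothetical names invented for illustration, not any vendor's API, and a real storage system would use block-level copy-on-write snapshots rather than the full copies made here.

```python
import copy


class NearCdpVolume:
    """Toy volume (hypothetical) that snapshots its data and replicates the snapshots."""

    def __init__(self):
        self.blocks = {}      # live data: block number -> contents
        self.snapshots = []   # local snapshots: (timestamp, frozen image)
        self.replica = []     # snapshots already copied to the target system

    def write(self, block, data):
        self.blocks[block] = data

    def take_snapshot(self, timestamp, quiesce=None, resume=None):
        # Optionally put the application into a "backup-ready" state first.
        if quiesce:
            quiesce()
        try:
            # Full copy for simplicity; real systems use copy-on-write.
            self.snapshots.append((timestamp, copy.deepcopy(self.blocks)))
        finally:
            if resume:
                resume()

    def replicate(self):
        # Send only the snapshots the target has not yet received.
        self.replica.extend(self.snapshots[len(self.replica):])

    def restore_latest(self, from_replica=False):
        # Roll the live data back to the most recent snapshot.
        source = self.replica if from_replica else self.snapshots
        timestamp, image = source[-1]
        self.blocks = copy.deepcopy(image)
        return timestamp


vol = NearCdpVolume()
vol.write(0, "orders v1")
vol.take_snapshot(timestamp=1)
vol.replicate()
vol.write(0, "orders v2")   # written after the last snapshot
vol.blocks.clear()          # simulate logical corruption (e.g., a dropped table)
vol.restore_latest()
print(vol.blocks)           # {0: 'orders v1'}
```

The "orders v2" write is gone after the restore, which is the near-CDP trade-off: everything written since the last snapshot is lost.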

The pros and cons of true continuous data protection

True CDP systems record every write and send it to a target system, which stores these changes in a log. With true CDP, you don't need to place the supported applications into any kind of backup mode to back them up, because the CDP system stores every write in the order it occurred. If there is a physical loss, the CDP system will contain every change up to the last write before the failure, and can restore the system right up to that point. In the case of logical corruption, it can restore the system up to the last write before the corruption occurred.
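To make the idea of an ordered write log concrete, here is a minimal sketch in Python. The `CdpJournal` class, its method names and the use of a simple sequence number as the recovery point are assumptions made for illustration; actual CDP products differ in how they capture, index and replay writes.

```python
class CdpJournal:
    """Toy write journal (hypothetical): every write is logged in order on the target."""

    def __init__(self):
        self.log = []   # ordered list of (sequence number, block, data)

    def record(self, block, data):
        # Append the write and return its sequence number.
        seq = len(self.log) + 1
        self.log.append((seq, block, data))
        return seq

    def restore(self, up_to=None):
        """Rebuild the volume by replaying writes up to and including `up_to`.

        With up_to=None, every recorded write is replayed.
        """
        image = {}
        for seq, block, data in self.log:
            if up_to is not None and seq > up_to:
                break
            image[block] = data
        return image


journal = CdpJournal()
journal.record(0, "orders v1")
last_good = journal.record(0, "orders v2")
journal.record(0, "garbage")               # the write that corrupts the data

print(journal.restore(up_to=last_good))    # {0: 'orders v2'}
print(journal.restore())                   # {0: 'garbage'}
```

Replaying up to `last_good` covers the logical-corruption case, while replaying the whole log covers a physical loss, where the goal is to return to the last write before the failure.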

There are two types of true continuous data protection systems: volume-based and application-based. A volume-based CDP product will protect any application stored on that volume. An application-based CDP system is designed to protect only one application, and will protect that application wherever it happens to be stored. The former tends to be more expensive but more versatile; the latter tends to be less expensive and to offer advanced functionality designed specifically for that application.

Choosing between CDP and near-CDP can be a tricky business. If you need an RPO of zero, your only choice is CDP. If your RPO is an hour or more, then either will do, but near-CDP has more time in the field. If you're leaning toward CDP but like the idea of snapshots, some CDP products offer both, so you should look into that. You should also examine the CDP and near-CDP capabilities of your primary storage vendor and backup software vendor; you may already have capabilities in this area. Just remember to test everything and believe nothing until you see it with your own eyes.
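The RPO part of that decision can be written down as a small helper. This is a rough sketch, not a selection tool: the function name `recovery_options` is hypothetical, the one-hour default reflects the typical snapshot interval mentioned earlier, and treating near-CDP as viable whenever its snapshot interval fits within the required RPO is an assumption that ignores cost, maturity and product features.

```python
def recovery_options(required_rpo_seconds, snapshot_interval_seconds=3600):
    """Return the approaches whose RPO can meet the requirement.

    CDP offers an RPO of zero, so it always qualifies. Near-CDP qualifies
    only if its snapshot interval is no longer than the required RPO.
    """
    if required_rpo_seconds < 0:
        raise ValueError("required RPO cannot be negative")
    if snapshot_interval_seconds <= 0:
        raise ValueError("snapshot interval must be positive")

    options = ["CDP"]
    if snapshot_interval_seconds <= required_rpo_seconds:
        options.append("near-CDP")
    return options


print(recovery_options(0))       # ['CDP']
print(recovery_options(3600))    # ['CDP', 'near-CDP']
```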

For more information on CDP, listen to W. Curtis Preston's CDP FAQ podcast.

W. Curtis Preston (a.k.a. "Mr. Backup"), Executive Editor and Independent Backup Expert, has been singularly focused on data backup and recovery for more than 15 years. From starting as a backup admin at a $35 billion credit card company to becoming one of the most sought-after consultants, writers and speakers in this space, it's hard to find someone more focused on recovering lost data. He is the webmaster of, the author of hundreds of articles, and the books "Backup and Recovery" and "Using SANs and NAS."
