CDP products are available in two forms: near-CDP and real-CDP. Near-CDP products take frequent, periodic snapshots of application data (typically about every 15 minutes) using the application-, file system- or volume-level snapshot features that are built into some applications, server operating systems and storage systems.
Conversely, real-CDP products continuously capture or track data modifications and store changes independent of the primary data. These products may be block-, file- or application-based and provide finer levels of granularity in restoring data by providing a theoretically infinite number of recovery points.
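To make the idea of a "theoretically infinite number of recovery points" concrete, here is a minimal sketch of a block-level change journal. It is an illustration only, not how any particular product works: the class name WriteJournal, its methods and the integer timestamps are all hypothetical.

```python
from bisect import bisect_right


class WriteJournal:
    """Toy block-level journal that records every write with a timestamp.

    Hypothetical illustration; real products differ in many details.
    """

    def __init__(self):
        self._times = []   # timestamps, kept in non-decreasing order
        self._writes = []  # (block_number, data) for each timestamp

    def record(self, timestamp, block, data):
        # Assumption: writes arrive in non-decreasing timestamp order.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("writes must be recorded in time order")
        self._times.append(timestamp)
        self._writes.append((block, data))

    def restore(self, baseline, point_in_time):
        """Return the volume image as it looked at point_in_time."""
        image = dict(baseline)  # start from the initial full copy
        # Replay every write made at or before the requested moment.
        count = bisect_right(self._times, point_in_time)
        for block, data in self._writes[:count]:
            image[block] = data
        return image


baseline = {0: "A0", 1: "B0"}
journal = WriteJournal()
journal.record(10, 0, "A1")
journal.record(20, 1, "B1")
journal.record(30, 0, "A2")

print(journal.restore(baseline, 25))  # {0: 'A1', 1: 'B1'}
```

Because the journal is stored separately from the baseline copy, any timestamp can serve as a recovery point, at the cost of keeping every change.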
Most near-CDP products are limited to the maximum number of snapshots that the application, operating system or storage system can create. Once a product reaches that threshold, it starts to overwrite earlier snapshots. The amount of available storage space, and even the type of disk drive used, affects how many snapshots can be taken and what it costs to retain them. For example, storage system-based near-CDP implementations may only work if the data is journaled on Fibre Channel disk drives as opposed to larger capacity, more economical SATA disk drives.
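The overwrite behavior can be sketched with a fixed-size ring of snapshots. This is a simplified model under stated assumptions: the SnapshotRing class, the limit of four snapshots and the 15-minute interval are chosen purely for illustration.

```python
from collections import deque


class SnapshotRing:
    """Toy model of a snapshot store with a hard limit on snapshot count."""

    def __init__(self, max_snapshots):
        # A deque with maxlen silently drops the oldest entry when full,
        # which mirrors overwriting the earliest snapshot.
        self._snapshots = deque(maxlen=max_snapshots)

    def take(self, timestamp, data):
        # Store a copy so later changes to the live data do not alter it.
        self._snapshots.append((timestamp, dict(data)))

    def recovery_points(self):
        return [timestamp for timestamp, _ in self._snapshots]


ring = SnapshotRing(max_snapshots=4)
for minute in range(0, 90, 15):  # snapshots at 0, 15, 30, 45, 60, 75
    ring.take(minute, {"minute": minute})

print(ring.recovery_points())  # [30, 45, 60, 75]
```

With a limit of four snapshots taken every 15 minutes, the oldest recovery point is never more than 45 minutes behind the newest one.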
What users can do with these near-CDP snapshot images also depends on how near-CDP is implemented. Some vendors' near-CDP implementations provide only read access to the snapshots, so a snapshot must be restored before users can write to the data.
Real-CDP introduces limitations of a different sort. These products must first make a complete copy of the data at the target, which can take hours, even days, to complete depending on the amount of data, the bandwidth and the distance between the target and source. The change rate of the application will also impact how long the company can keep the data and how much storage will be necessary.
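A back-of-the-envelope calculation shows why the initial copy can take hours or days. The function below is a rough sketch: the name initial_copy_hours, the decimal gigabyte convention and the 80% link efficiency default are assumptions, and it ignores latency, compression and competing traffic.

```python
def initial_copy_hours(data_gb, bandwidth_mbps, efficiency=0.8):
    """Estimate hours needed to copy data_gb over a link of bandwidth_mbps.

    efficiency is an assumed fraction of the link that is actually usable.
    """
    if data_gb < 0 or bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid input")
    bits_to_send = data_gb * 8 * 1000**3               # decimal gigabytes
    usable_bits_per_second = bandwidth_mbps * 1_000_000 * efficiency
    return bits_to_send / usable_bits_per_second / 3600


# 2 TB of data over a 100 Mbps link at 80% efficiency
print(round(initial_copy_hours(2000, 100), 1))  # 55.6
```

Under these assumptions, seeding 2 TB over a 100 Mbps link takes more than two days, which is why the amount of data and the available bandwidth dominate the initial deployment time.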
While real-CDP generally allows companies to keep data on any type of storage (assuming it's deployed on the host or on a network switch or appliance), keeping track of every change in applications with high change rates can rapidly consume available storage space. CDP products can typically keep between three and 30 days of application data online and available before they start to run out of space and overwrite older data.
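The relationship between change rate and retention can be approximated with simple arithmetic. This sketch assumes a fixed journal capacity and a steady daily change rate; the function name retention_days and the sample figures are hypothetical, and the estimate ignores compression, deduplication and journal metadata overhead.

```python
def retention_days(journal_capacity_gb, daily_change_gb):
    """Estimate how many days of changes fit in the journal."""
    if daily_change_gb <= 0:
        raise ValueError("daily_change_gb must be positive")
    return journal_capacity_gb / daily_change_gb


# The same 3 TB journal under a low and a high change rate
print(retention_days(3000, 100))   # 30.0
print(retention_days(3000, 1000))  # 3.0
```

In this example, a tenfold increase in the daily change rate shrinks the recovery window from 30 days to three.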
Whether to choose real-CDP or near-CDP typically depends on the amount of application data loss that a company can withstand, application change rates and the level of integration that the CDP product has with the application. File servers are generally good candidates for near-CDP because the data on them is less mission-critical and users are usually more tolerant of data loss on file servers.
However, in environments where this level of data loss isn't an option, companies need to take one of two approaches. One approach is to use a real-CDP product that can recover to any previous point in time. In these cases, companies still need to verify that the real-CDP product integrates with their specific applications so it can insert markers at designated times into the CDP journals to ensure application data consistency.
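One way to picture the role of these markers: although the journal can roll back to any moment, an administrator would normally pick the most recent marker at or before the desired time, since that is where the application is known to be consistent. The helper below is a hypothetical sketch; the function name and the integer timestamps are assumptions.

```python
from bisect import bisect_right


def latest_consistent_point(marker_times, target_time):
    """Return the newest marker at or before target_time, or None.

    marker_times is assumed to be sorted in ascending order.
    """
    index = bisect_right(marker_times, target_time)
    if index == 0:
        return None  # no consistent point exists that early
    return marker_times[index - 1]


markers = [100, 200, 300]  # times at which the application was quiesced
print(latest_consistent_point(markers, 250))  # 200
```

Recovering to time 250 directly might capture a half-finished transaction, whereas recovering to the marker at 200 gives up a little more data in exchange for a consistent application state.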
The other approach is to use a near-CDP product that can provide consistent recoveries of applications and databases because it captures the application at a known good point in time. Recovering data using this method, however, may take more time than using real-CDP, since the snapshot needs to be recovered first and then administrators must access the application and roll the transaction logs forward or back in order to recover the data.
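The two-step recovery described above, restoring the snapshot and then rolling the logs forward, can be sketched as follows. The data layout is invented for illustration: the snapshot and log entry dictionaries, their field names and the recover function are all hypothetical, and the log is assumed to be in time order.

```python
def recover(snapshot, transaction_log, target_time):
    """Restore a snapshot, then roll the log forward to target_time."""
    # Step 1: start from the known good snapshot image.
    state = dict(snapshot["data"])
    # Step 2: replay log entries made after the snapshot, up to the target.
    for entry in transaction_log:
        if snapshot["time"] < entry["time"] <= target_time:
            state[entry["key"]] = entry["value"]
    return state


snapshot = {"time": 900, "data": {"balance": 100}}
transaction_log = [
    {"time": 890, "key": "balance", "value": 100},  # already in snapshot
    {"time": 905, "key": "balance", "value": 120},
    {"time": 912, "key": "balance", "value": 90},   # after the target
]

print(recover(snapshot, transaction_log, 910))  # {'balance': 120}
```

The extra replay step is what makes this approach slower than a direct real-CDP rollback, but it also lets recovery reach points in time between snapshots.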
About the author: Jerome M. Wendt is lead analyst and president of DCIG Inc.
This was first published in February 2008