
CDP: An overview

The downtime needed to recover terabytes of data can cost a company dearly in income and customer satisfaction. Continuous data protection (CDP) is emerging as an attractive supplement to traditional backup schemes when downtime isn't an option. CDP isn't a perfect or universal backup solution: some implementations require agents, and some can impact network performance. CDP users must also watch for potential configuration and interoperability issues. Even so, CDP solutions are appearing and proving themselves in very demanding enterprise applications.

Continuous data protection (CDP), also referred to as "continuous backup" or "time addressable storage," tracks and records changes to enterprise data in real time. CDP creates a running journal of storage activity, with a new entry generated each time a change occurs to the system. Depending on a specific product's granularity, this record can be detailed enough to track write operations or even individual I/O cycles. When trouble strikes, technicians can reconstruct an image of the system as it stood just prior to the event, and then restore the system to that point.
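The journal idea described above can be illustrated with a minimal Python sketch. It is a toy model, not any vendor's implementation: the names JournalEntry, CdpJournal, record_write and restore_to are hypothetical, timestamps are plain numbers, and the "volume" is just a dictionary of block numbers to contents.

```python
from dataclasses import dataclass, field


@dataclass
class JournalEntry:
    timestamp: float  # when the write happened (hypothetical clock, in seconds)
    block: int        # which block was written
    data: bytes       # the new contents of that block


@dataclass
class CdpJournal:
    """Toy journal: every write is appended; nothing is overwritten."""
    entries: list = field(default_factory=list)

    def record_write(self, timestamp, block, data):
        # One new journal entry per change, as the article describes.
        self.entries.append(JournalEntry(timestamp, block, data))

    def restore_to(self, point_in_time):
        """Rebuild the volume as it looked at point_in_time by replaying writes."""
        volume = {}
        for entry in sorted(self.entries, key=lambda e: e.timestamp):
            if entry.timestamp > point_in_time:
                break  # ignore everything after the chosen recovery point
            volume[entry.block] = entry.data
        return volume


journal = CdpJournal()
journal.record_write(1.0, 0, b"ledger v1")
journal.record_write(2.0, 1, b"index v1")
journal.record_write(3.0, 0, b"ledger v2")
journal.record_write(4.0, 0, b"CORRUPTED")  # the fault event

# Roll back to a moment just before the fault.
recovered = journal.restore_to(3.5)
print(recovered)
```

Because the journal keeps every write rather than only the latest state, any moment between two entries is a valid recovery point. A real product would also have to bound journal growth and store the journal independently of the primary data, which this sketch ignores.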

Understanding differing definitions

While many of today's smaller startup CDP vendors echo this definition, not all vendors are on board. The underlying problem, it seems, is the issue of "granularity" – how many recovery points are needed in a given period for protection to really be considered "continuous." Vendors like Storactive Inc. and Mendocino Software support the Storage Networking Industry Association's (SNIA) formal definition of CDP as "a methodology that continuously captures or tracks data modifications and stores changes independent of the primary data, enabling recovery points from any point in the past." CDP products that maintain running logs of individual transactions allow a system to be restored to within milliseconds of a fault event. Other vendors, such as Network Appliance Inc., seek to expand the SNIA's definition to embrace a "snapshot" methodology that records system states at regular intervals (anywhere from every few hours to every few minutes).

Still, vendors and analysts agree that snapshots offer a powerful complement to CDP. W. Curtis Preston, vice president of data protection at GlassHouse Technologies Inc., points out that snapshot products are still a great leap over more traditional backup processes. "Snapshot-based backups are probably 'the' most prevalent, most used, alternate backup methodology today," he says. "CDP doesn't even come close to the market share [currently held by snapshot products]. I think the value is absolutely there. So I don't want to dismiss them – they're just not 'CDP.' " CDP products have the ability to reach a specific write or I/O operation just prior to the event – absolutely minimizing recovery time. It all comes back to an issue of granularity.
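The granularity gap between interval snapshots and a per-write journal can be made concrete with a small calculation. The sketch below is only illustrative: the function names, the write times and the snapshot intervals are all invented for the example, and it assumes snapshots are taken at time 0 and at every multiple of the interval thereafter.

```python
def snapshot_recovery_point(fault_time, interval):
    """Time of the last snapshot at or before the fault (snapshots at 0, interval, 2*interval, ...)."""
    return (fault_time // interval) * interval


def journal_recovery_point(write_times, fault_time):
    """Time of the last journaled write before the fault, or None if there is none."""
    earlier = [t for t in write_times if t < fault_time]
    return max(earlier) if earlier else None


def writes_lost(write_times, recovery_point, fault_time):
    """Count writes made after the recovery point but before the fault."""
    return sum(1 for t in write_times if recovery_point < t < fault_time)


# Hypothetical workload: write times in seconds, fault just before the hour.
write_times = [100, 950, 1800, 2500, 3400, 3590]
fault_time = 3595

hourly = snapshot_recovery_point(fault_time, 3600)         # snapshot every hour
quarter_hourly = snapshot_recovery_point(fault_time, 900)  # snapshot every 15 minutes
per_write = journal_recovery_point(write_times, fault_time)

for label, point in [("hourly snapshot", hourly),
                     ("15-minute snapshot", quarter_hourly),
                     ("per-write journal", per_write)]:
    print(label, "recovers to t =", point,
          "and loses", writes_lost(write_times, point, fault_time), "writes")
```

Under these assumptions, shortening the snapshot interval narrows the window of lost writes, but only the per-write journal closes it completely, which is the distinction the SNIA definition is drawing.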

Important implementation choices

CDP products differ in their hardware/software implementation. Vendors and analysts agree that prospective users must understand the applications that CDP is intended to protect. File-based applications are often best served by a file-based CDP product, which can restore individual files on demand and is often quicker for smaller or limited restorations. By comparison, block-based applications often run on raw volumes for improved performance. Block-based CDP products operate at a lower level and can handle all types of applications, but recovery takes a bit longer since the entire volume must be restored before particular files can be recovered. Block-based CDP products will likely include plug-ins to support higher-level file operations.
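The difference between the two restore paths can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular product works: both journals are plain lists already in time order, and the file_blocks list standing in for the file system's block map is hypothetical.

```python
def restore_file_from_file_journal(journal, path, point_in_time):
    """File-based: journal holds (timestamp, path, contents); only entries for `path` matter."""
    latest = None
    for timestamp, entry_path, contents in journal:
        if entry_path == path and timestamp <= point_in_time:
            latest = contents  # keep the newest version at or before the recovery point
    return latest


def restore_file_from_block_journal(journal, file_blocks, point_in_time):
    """Block-based: journal holds (timestamp, block, data); the volume is rebuilt first."""
    volume = {}
    for timestamp, block, data in journal:
        if timestamp <= point_in_time:
            volume[block] = data  # replay every block write, whatever file it belongs to
    # Only after the volume is rebuilt can the file be read from its blocks.
    return b"".join(volume[b] for b in file_blocks)


# Hypothetical journals describing the same history at two different levels.
file_journal = [
    (1, "a.txt", b"hello"),
    (2, "b.txt", b"other"),
    (3, "a.txt", b"hello world"),
]
block_journal = [
    (1, 0, b"hel"),
    (1, 1, b"lo"),
    (2, 7, b"other"),
    (3, 1, b"lo world"),
]

from_files = restore_file_from_file_journal(file_journal, "a.txt", 2)
from_blocks = restore_file_from_block_journal(block_journal, [0, 1], 2)
print(from_files, from_blocks)
```

Both paths arrive at the same file contents, but the block-based path has to replay writes for the whole volume and then consult a block map, which is roughly why file-level restores tend to be quicker for small recoveries.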

Prospective users must also choose a "recovery engine" from a mix of host-based and network-based products. Host-based CDP employs software running on each server you're protecting. This eliminates the need for dedicated hardware, but imposes additional software maintenance overhead (and you may need to buy or license software for each server). Network-based CDP appliances may protect numerous servers simultaneously, but "in-band" appliances (in the data path) can become a performance bottleneck. "Out-of-band" (outside the data path) appliances don't limit throughput, though agent software is often needed for each server. The general consensus appears to be that host- and network-based approaches are equally acceptable. It's just a matter of matching your IT needs to the most compatible product.
