CDP poised to replace traditional backup methods

Continuous data protection (CDP) applications are quickly becoming a force in the backup market. In fact, CDP is poised to replace traditional backup methods.

We've witnessed major changes in data management during a decade of SAN evolution. Advances in disk capacity, huge reductions in the cost of disk and ever-developing tools have significantly changed the way data is managed. Today, we're able to provision and allocate storage online, create snapshots and local volume copies automatically, and replicate data across long distances. The technologies used to perform these tasks are far from "bleeding edge"; many have been widely adopted and are generally accepted as best practices for managing and protecting data.

Compared to the pre-SAN era, the shift is quite dramatic. But amid this data management revolution, backup is a notable exception. Despite advances such as SAN-based shared tape drives and disk technology like virtual tape libraries, the backup process is fundamentally the same as it was 20 years ago. Backup remains a costly and highly intrusive batch operation that's prone to error and consumes an exorbitant amount of time and resources.

Defining CDP
    "Continuous data protection (CDP) is a methodology that continuously captures or tracks data modifications and stores changes independent of the primary data, enabling recovery points from any point in the past. CDP systems may be block-, file- or application-based and can provide fine granularities of restorable objects to infinitely variable recovery points."

    --From the SNIA Data Management Forum CDP Special Interest Group

Based on this definition, products like Microsoft Data Protection Manager (DPM) and other snapshot-based solutions aren't technically CDP because they're not continuous--they don't immediately store every change. But if you view data recovery as a continuum with nightly backups at one end of the scale and true CDP at the other, snapshot management tools must be viewed as dramatic enhancements to recoverability. Each environment has specific recovery needs--for those currently dependent on backups, DPM represents a leap forward despite falling short of the CDP ideal.
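To make the distinction between true CDP and snapshot-based products concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not how any shipping product works: the names `ChangeJournal` and `latest_snapshot_before` are hypothetical, timestamps are plain integers, and a "volume" is just a dictionary of blocks. The journal records every write, so it can rebuild the volume as of any instant; a snapshot schedule can only return to the last snapshot taken.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ChangeJournal:
    """Toy block-level journal: every write is recorded with its timestamp."""
    # Each entry is (timestamp, block_id, data), appended in time order.
    entries: list = field(default_factory=list)

    def record_write(self, timestamp, block_id, data):
        self.entries.append((timestamp, block_id, data))

    def state_at(self, timestamp):
        """Rebuild the volume as of `timestamp` by replaying earlier writes."""
        state = {}
        for ts, block_id, data in self.entries:
            if ts > timestamp:
                break
            state[block_id] = data
        return state


def latest_snapshot_before(snapshot_times, timestamp):
    """Return the most recent snapshot time at or before `timestamp`, or None."""
    i = bisect_right(snapshot_times, timestamp)
    return snapshot_times[i - 1] if i else None


journal = ChangeJournal()
journal.record_write(100, "blk0", "v1")
journal.record_write(160, "blk0", "v2")
journal.record_write(170, "blk0", "corrupt")  # the bad write

# Journal-based recovery can target the instant just before the bad write.
cdp_state = journal.state_at(169)

# Snapshot-based recovery can only go back to the last snapshot
# (assumed here to be taken at times 0, 60 and 120).
snap_time = latest_snapshot_before([0, 60, 120], 169)
snapshot_state = journal.state_at(snap_time)
```

In this example the journal recovers the "v2" write made at time 160, while the snapshot schedule falls back to time 120 and loses it. That gap is the difference in recovery point between the two approaches.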

I'm not suggesting there aren't well-run backup operations. Many organizations have invested in technologies and staff to make their backup process function as effectively as possible. What I am suggesting is that it may be time to step back and rethink the backup process itself. Does it still make sense or can it be replaced by something better?

The emerging technology lifecycle
Technologies exist today to replace traditional backup and to effectively eliminate the nightly backup cycle. It's possible to provide data protection in an integrated and transparent manner without the invasiveness of nightly backups. This can be accomplished in a number of ways. More importantly, it can be done affordably.

I'm referring to continuous data protection (CDP) and snapshot-based CDP-like products that have emerged in the market. Perhaps the greatest promise of these products is their ability to shift the focus of data protection from backup to where it should be--recoverability.

Although quite promising, these products aren't considered part of the data protection mainstream yet. All new technologies face hurdles, but the adoption curve here, compared to VTLs for example, seems to be particularly long. At what point does a technology evolve from "emerging" to "arrived"? The ultimate metric is the number of adopters, but that raises the question, "What compels people to become adopters?" Here are some considerations:

  • The technology must provide significant benefits over current approaches
  • There must be multiple vendors of the technology
  • The adoption risk can't be too high

Let's measure CDP against those criteria:

  • Initial reaction to CDP products is typically excitement about the possibility of eliminating backups and having near-zero recovery time objectives (RTOs) and recovery point objectives (RPOs) without spending a fortune.

  • The number of vendors in the CDP arena is expanding. The Storage Networking Industry Association's "CDP Buyer's Guide" lists approximately nine vendors/products. If you also include snapshot-based, near-CDP products, the list grows.

  • Adoption risk is the sticking point for CDP-type technologies. Initial positive reactions may be replaced by skepticism or questions about a product's maturity and reliability.

Enter the giants
There are signs CDP is gaining traction, as evidenced by the large vendors embracing the concept. Oracle has incorporated CDP-like functionality, called Flashback, into Oracle 10g (see "Oracle Flashback," below) that enables fast rewind of databases to earlier points in time. IBM is testing the waters with IBM Tivoli CDP for Files, a product focused primarily on protecting desktops and laptops. Symantec/Veritas and EMC are also talking about introducing CDP products.

Oracle Flashback
Since the introduction of RMAN in Oracle 8.0, Oracle has steadily improved data protection functionality in its products. Introduced in Oracle 9i and significantly enhanced in Oracle 10g, Flashback provides a set of SQL commands that lets users view data as it existed at various points in time. This allows you to quickly identify points of corruption and to restore a database or table to a point immediately prior to the corruption.

Designed to protect against logical corruption only, Flashback must be combined with other technologies, such as backup and replication, to protect against physical loss. While not a complete continuous data protection solution, it's a dramatic step toward a significantly reduced recovery point objective and recovery time objective.

But the most significant entry into this space has to be Microsoft with its System Center Data Protection Manager (DPM) 2006. CDP purists might cringe to see DPM grouped with other CDP products (see "Defining CDP," above). Technically, DPM might be considered a snapshot repository and management product. Using replication and the Windows Volume Shadow Copy Service (VSS) infrastructure, DPM provides automated data protection functionality to file servers with far better RPO/RTO capability than traditional backup. It also eliminates the nightly backup window.

Microsoft DPM works with Windows 2000 Server, Windows Server 2003 and Windows Storage Server 2003 to protect server volumes, folders or shares. A DPM agent initially creates and sends a replica of each protected object to a DPM server. The agent then logs byte-level changes and periodically (typically hourly) replicates those changes to the DPM server. The DPM server catalogs this information in its SQL Server database and uses VSS to create point-in-time copies of protected objects based on administrator-defined policies.
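The agent/server flow described above can be sketched in a few lines of Python. This is only an illustration of the general pattern (initial replica, logged byte-level changes, periodic shipment, point-in-time copies), not DPM's actual implementation or API. The classes `Agent` and `ProtectionServer` and all of their methods are hypothetical, and the sketch stores full copies for simplicity, whereas a real shadow-copy mechanism would be expected to be far more space-efficient.

```python
class Agent:
    """Hypothetical agent running on the protected file server."""

    def __init__(self, volume: bytearray):
        self.volume = volume
        self.change_log = []  # pending (offset, new_bytes) pairs

    def initial_replica(self) -> bytes:
        """Full copy sent once when protection starts."""
        return bytes(self.volume)

    def write(self, offset: int, data: bytes):
        """Apply a write locally and log it at byte level."""
        self.volume[offset:offset + len(data)] = data
        self.change_log.append((offset, data))

    def drain_changes(self):
        """Hand over the logged changes and start a fresh log."""
        changes, self.change_log = self.change_log, []
        return changes


class ProtectionServer:
    """Hypothetical server holding the replica and point-in-time copies."""

    def __init__(self, replica: bytes):
        self.replica = bytearray(replica)
        self.point_in_time_copies = []  # (label, bytes) pairs

    def apply(self, changes):
        """Bring the replica up to date with shipped changes."""
        for offset, data in changes:
            self.replica[offset:offset + len(data)] = data

    def take_copy(self, label: str):
        """Record a recoverable point according to policy."""
        self.point_in_time_copies.append((label, bytes(self.replica)))


agent = Agent(bytearray(b"hello world"))
server = ProtectionServer(agent.initial_replica())

agent.write(0, b"J")
agent.write(6, b"W")
server.apply(agent.drain_changes())  # periodic (e.g., hourly) replication
server.take_copy("08:00")

agent.write(0, b"Y")
server.apply(agent.drain_changes())
server.take_copy("12:00")
```

Note that only the changed bytes cross the wire after the initial replica, which is why this style of protection avoids a nightly backup window.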

The default policy is to create shadow copies three times a day. Given the VSS limit of 64 shadow volumes for each protected object, this provides approximately 30 calendar days (20 business days) of disk-based DPM recoverability. The granularity can range from hourly to daily, and the oldest copy is removed when the 64-copy limit is reached. This limit dictates that an organization should establish an archiving policy to ensure maintenance of older data.
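The retention arithmetic is easy to check. The short Python sketch below works through it; the function names are hypothetical. One assumption is mine rather than the article's: the "30 calendar days" figure seems to line up only if shadow copies are taken on business days (five days out of seven), so the sketch converts on that basis.

```python
from collections import deque

MAX_SHADOW_COPIES = 64  # VSS limit per protected object, as cited above


def retention_days(copies_per_day, max_copies=MAX_SHADOW_COPIES):
    """Days of history available before the oldest copy is dropped."""
    return max_copies / copies_per_day


def business_to_calendar_days(business_days):
    """Assumption: copies are only taken on 5 of every 7 days."""
    return business_days * 7 / 5


default_days = retention_days(3)      # default policy: three copies a day
calendar_days = business_to_calendar_days(default_days)
hourly_days = retention_days(24)      # finest granularity mentioned
daily_days = retention_days(1)        # coarsest granularity mentioned

print(round(default_days, 1))   # 21.3
print(round(calendar_days, 1))  # 29.9
print(round(hourly_days, 1))    # 2.7
print(daily_days)               # 64.0

# The rolling limit behaves like a fixed-size queue: once 64 copies
# exist, each new copy pushes out the oldest one.
copies = deque(maxlen=MAX_SHADOW_COPIES)
for copy_number in range(70):
    copies.append(copy_number)
oldest_remaining = copies[0]
```

The trade-off is visible in the numbers: hourly copies give finer recovery points but cover less than three days, which is why an archiving policy for older data matters.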

Data restores can be performed by administrators or by permitting users to browse previous versions using Windows Explorer or Microsoft Office 2003 applications.

Some factors to consider:

  • DPM depends on Active Directory to manage access to data. Because of this, it can also identify systems or volumes that aren't protected, which can be valuable for discovering "orphan" systems and ensuring they're properly protected.

  • In its initial version, DPM protects only files. It doesn't handle e-mail or databases, although that functionality will be added in subsequent versions.

  • DPM is a "Windows Server-only" solution. It doesn't protect desktops, laptops or non-Windows servers.

  • Microsoft is positioning DPM as a solution for businesses with 10 to 99 file servers, and for enterprises implementing centralized data protection for branch offices.

  • The DPM server must be protected through replication, backup or a combination of the two. It's important to note that Microsoft is positioning DPM in a disk-to-disk-to-tape architecture and has provided an interface to enable backup software vendors to integrate DPM support into their products. With a fully integrated backup product, client restores from tape, if required, can be performed directly without the intermediate step of recovery to DPM.

Several other CDP products provide similar or greater levels of data protection. Some also protect applications such as e-mail. Whether Microsoft's entry into data protection chills the third-party market or raises awareness and broadens the market remains to be seen.

A number of promising storage technologies have taken time to gain traction: virtualization almost died but has been reborn; iSCSI was long awaited and is now growing; and intelligent switches--well, they're still emerging. Is it finally time to think about dumping your batch-oriented backup infrastructure? For most companies, the answer is "Not yet." However, it's clearly time to assess where these new technologies can be applied. It'll take a few years, but I expect that one day we'll be referring to transparent backup as a "best practice."
