Any-point-in-time backups

Continuous data protection captures changes at a file- or block-level as they happen, and provides running recovery journals for all historical data states. This shifts data protection to a more flexible any-point-in-time framework.

The ability of CDP systems to recover application data from any point in time may forever change the way critical data is protected.

Continuous data protection (CDP) is a buzz-inspiring new technology. Because CDP captures all data write changes at a file or block level, and provides running recovery journals for all historical data states, data protection shifts from a point in time to a vastly more flexible any-point-in-time framework.

The old methods of restoring individual backups from mirrors, synchronizing them in time against an archive log and then staging that data back to a live environment may soon be eclipsed by a more unified process comprising far fewer steps. For example, to execute a database recovery with CDP, you simply roll back to the correct time for the event, stage the CDP data set and allow the application to initiate its restore. Theoretically, IT managers could discard their traditional scheduled backups completely and reap significant savings in time, money and management effort.
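The roll-back step described above can be illustrated with a minimal sketch of a write journal. This is not any vendor's implementation: the class name `WriteJournal`, its methods and the integer timestamps are all assumptions made for illustration. The sketch records every block write with its timestamp and rebuilds the volume as it stood at any requested time.

```python
from dataclasses import dataclass, field


@dataclass
class WriteJournal:
    """Hypothetical block-level journal: every write is kept, none is overwritten."""
    entries: list = field(default_factory=list)  # (timestamp, block_no, data)

    def record(self, timestamp: int, block_no: int, data: bytes) -> None:
        # Append only; this sketch assumes writes arrive in timestamp order.
        self.entries.append((timestamp, block_no, data))

    def image_at(self, timestamp: int) -> dict:
        """Replay the journal up to and including `timestamp`."""
        image = {}
        for ts, block_no, data in self.entries:
            if ts > timestamp:
                break
            image[block_no] = data  # later writes to a block replace earlier ones
        return image


journal = WriteJournal()
journal.record(100, 0, b"orders v1")
journal.record(105, 1, b"index v1")
journal.record(110, 0, b"orders v2 (corrupt)")

# Roll back to just before the bad write at t=110.
restored = journal.image_at(109)
print(restored)
```

Because no write is ever discarded, the recovery point is chosen at restore time rather than fixed in advance by a backup schedule. A real product would also have to bound journal growth and coordinate with the application so the restored image is consistent, which this sketch ignores.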

A tale of two applications
Interest in continuous data protection (CDP) is generally related to two major application groups:

DATABASES. The vast majority of enterprise interest in CDP focuses on enhancing database recovery.

E-MAIL. The complexities associated with Microsoft Exchange recovery are turning Exchange-focused CDP into a booming new business.

So far, CDP technology has been limited to a small niche of early adopter and specialty deployments (see "A tale of two applications"). But CDP has an immediate and far-ranging role to play in the enterprise because it has the ability to transform application recovery.

Many enterprises deploying CDP are attempting to remedy some pronounced pain or complexity that existed in their prior recovery processes. Those processes might involve a Microsoft Exchange environment that takes three days to recover, or a top-level application that spans multiple databases and requires massive scripting to juggle rotating mirror schedules.

Most IT shops are comfortable with some level of recovery complexity, but beyond an acceptable threshold a breaking point may be reached. That breaking point may stem from the limitations of a given technology or from the management team's inability to handle a certain level of complexity with reliable outcomes.

An IT executive at a major media services company has added Mendocino Software's CDP solution to the company's storage environment to handle database recovery, even though it's still using Veritas Software Corp.'s NetBackup (now owned by Symantec Corp.) for its main backup program. Mendocino's CDP solution is used as a rapid recovery platform that enables the firm's database managers to directly and easily control recovery operations. Rather than using "application blind" split-mirror or snapshot programs to support critical database recovery, this user wanted to bring the recovery operations directly under the control of database managers.

We also spoke with an IT manager at a leading nutritional product retailer who chose Storactive Inc.'s LiveServ for Exchange because he wanted greater application-level controls for Exchange-based recovery events. Storactive's ability to perform e-mail object-level recovery in seconds, or minutes, from any point in the protected history constituted a significant step ahead for this user because the recovery operation used to take days.

Most importantly, the new arrangement let the company's Exchange managers skip an entire range of recovery steps that used to heavily impact the storage team, while providing fast, incremental recovery services to business users. The company hasn't abandoned its existing Veritas NetBackup program either. It still uses NetBackup to handle normal backup operations, with the Storactive product dedicated to application-specific recovery.

A healthcare management company uses Revivio Inc.'s Continuous Protection System (CPS) for application recovery across multiple Oracle and Microsoft SQL Server environments running in a 100TB-plus environment. This company selected Revivio because of problems with the existing hot backup architecture for its multiple databases, which was extremely failure-prone and often corrupted. Since deploying the Revivio product, the company has eliminated the complexity of its rotating mirrors for backups and allowed database administrators to take more control of the recovery process, which had been a long-term goal of the team.

There are three basic elements for application-enabled recovery:

  • Application-centric recovery. The application executes and controls the recovery process from any point in its protection history.
  • Little human intervention. CDP technologies should minimize the need for the administrator to touch any infrastructure elements outside of the application.
  • Event-based recovery. Groups of managers should be able to define any number of business values from which recovery operations can take place for one or more applications. These could be recurring business events such as "quarterly closes" and "CRM software updates," one-time events such as "new servers brought online," or even a meta class of correlated events such as "pre-consolidation/post-consolidation."
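The event-based element in the list above can be sketched as a catalog that maps business event labels to points in the protection history. The `EventCatalog` class, its method names and the integer timestamps are hypothetical and chosen only for illustration; the article does not describe how any product stores such markers.

```python
import bisect


class EventCatalog:
    """Hypothetical catalog mapping business event labels to journal timestamps."""

    def __init__(self):
        self._events = {}  # label -> sorted list of timestamps

    def mark(self, label: str, timestamp: int) -> None:
        # Keep each label's timestamps sorted so lookups can use binary search.
        bisect.insort(self._events.setdefault(label, []), timestamp)

    def latest_before(self, label: str, timestamp: int) -> int:
        """Return the most recent occurrence of `label` at or before `timestamp`."""
        times = self._events.get(label, [])
        i = bisect.bisect_right(times, timestamp)
        if i == 0:
            raise LookupError(f"no '{label}' event at or before {timestamp}")
        return times[i - 1]


catalog = EventCatalog()
catalog.mark("quarterly close", 1000)
catalog.mark("quarterly close", 2000)
catalog.mark("CRM software update", 1500)

# Recover to the last quarterly close that preceded a problem noticed at t=1800.
recovery_point = catalog.latest_before("quarterly close", 1800)
print(recovery_point)
```

The point of the design is that a manager asks for "the last quarterly close" and the catalog resolves it to a timestamp, so nobody has to know when that event actually occurred.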

CDP products should be carefully evaluated in the context of specific application environments (see "Pros and cons of CDP products"). The following questions will help you decide whether a CDP product is something you should consider for your environment (see "Does your application need CDP?").

Do you have an application that's routinely taken offline during operations, and can this downtime be managed on a regular basis without business loss?

If the answer to this question is "Yes," you probably shouldn't consider using CDP for this application. Existing snapshot and mirroring technologies should provide adequate disk-based recovery capabilities. CDP might be overkill.

What's your sensitivity to data loss for this application? Is it measured in data lost over days, hours, minutes or seconds? What's the recovery point objective (RPO) trend over the past year?

If you're measuring RPO in minutes (or less) of data loss for this application, you should definitely consider CDP because you can more easily achieve those fine-tuned goals. On the other hand, if a data loss of hours (or longer) is currently acceptable, CDP enablement for this app may prove to be excessive.

What's your sensitivity to downtime for this application?

If you answered "It's always highly sensitive" or "It's becoming a five-nines environment," CDP can improve the recovery time objective (RTO) for this application; however, a detailed analysis of costs and benefits must be conducted.

Is your application recovery failure rate in the double digits? Do you rely heavily on recovery scripts that require ongoing updating and attention? Do common storage activities such as provisioning place applications at risk?

A "Yes" response to any of these questions could mean your environmental complexity is sufficient to merit an evaluation of CDP technologies. These questions force you to analyze the indirect costs of a complex application environment. One of CDP's advantages is that it removes complexity by collapsing formerly discrete functions into one action (for example, providing a single, user-specific restore image for a database vs. synchronizing multiple, independent backup images as in traditional hot backups).
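The screening questions above can be collected into a small checklist function. This is a sketch under stated assumptions: the `AppProfile` fields are hypothetical, the 60-minute cut-off between "minutes" and "hours" of acceptable data loss is an assumed threshold, and "double digits" is read as a recovery failure rate of 10% or more. Treating regular, manageable downtime as an immediate "no" is also an interpretation of the first answer.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Answers to the screening questions for one application (hypothetical fields)."""
    tolerates_regular_downtime: bool
    rpo_minutes: float            # acceptable data loss, in minutes
    downtime_sensitive: bool
    recovery_failure_rate: float  # failed recoveries as a fraction, 0.0 to 1.0
    script_heavy: bool
    provisioning_risk: bool


def reasons_to_evaluate_cdp(app: AppProfile) -> list:
    """Return the reasons, if any, that an application merits a CDP evaluation."""
    if app.tolerates_regular_downtime:
        return []  # snapshots and mirroring are probably adequate
    reasons = []
    if app.rpo_minutes < 60:  # assumed cut-off between "minutes" and "hours"
        reasons.append("RPO measured in minutes or less")
    if app.downtime_sensitive:
        reasons.append("high sensitivity to downtime (RTO pressure)")
    if app.recovery_failure_rate >= 0.10:  # "double digits", read as 10% or more
        reasons.append("double-digit recovery failure rate")
    if app.script_heavy:
        reasons.append("heavy reliance on recovery scripts")
    if app.provisioning_risk:
        reasons.append("routine storage tasks put the application at risk")
    return reasons


exchange = AppProfile(False, 5, True, 0.15, True, False)
archive = AppProfile(True, 1440, False, 0.02, False, False)
print(reasons_to_evaluate_cdp(exchange))
print(reasons_to_evaluate_cdp(archive))
```

A non-empty result suggests only that an evaluation is worthwhile. It does not replace the cost and benefit analysis the questions call for.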

If you decide to investigate CDP products, the next question you need to answer is "What application(s) will CDP protect?" Protecting multiple database applications that span multiple platforms requires a fundamentally different product than protecting Exchange.

To help you narrow the field, there are three general approaches being pursued by CDP vendors today. They are multiplatform, platform-centric and application-specific.

MULTIPLATFORM. The vendors in this category, Mendocino Software and Revivio, have taken a bottom-up approach to CDP. Their goal is to develop a block-based CDP engine with the broadest applicability possible across the widest range of computing platforms and applications. Users with large, mission-critical database deployments--especially those that span multiple operating platforms--should start their CDP implementation projects by evaluating multiplatform CDP products.

PLATFORM-CENTRIC. The products in this category from TimeSpring Software Corp. and XOsoft Inc. are deeply integrated into the Windows platform and support a wide range of Microsoft applications, including Windows file servers, SQL Server and Exchange. These products will appeal to users who want to leverage a single technology base across as many of their applications as possible. This is especially true for medium-sized companies seeking a comparatively economical entry into application-enabled recovery.

APPLICATION-SPECIFIC. The CDP vendors in this category have a narrower focus than those in the platform-centric group, opting to develop a deeply integrated solution for one particular application. The main application that's targeted in this category is Microsoft Exchange. Notable application-specific vendors include FilesX Inc., Mimosa Systems Inc. and Storactive.

Does your application need CDP?

DOES IT FACE recovery point objective or recovery time objective pressure?

IS RECOVERY a complex process?

If you answered "YES" to any of these questions, you should consider evaluating continuous data protection (CDP) to enhance your application recovery processes.

Blocks or files?
In the block approach, all data entering the CDP application is stored in much the same way a traditional volume manager writes data. Whether that data capture takes place on the host or in a network device, any data state for the application can be recovered. Because it's free of file-level semantics, block-level data capture works across all data types: structured, semi-structured and unstructured content.

When it comes to recovering data, a block-level approach creates "recovery objects" determined by the application requesting them: database tables or rows, e-mail items, mailboxes, etc. However, block-level approaches don't automatically recover file-level information because this requires integration with a file system. For database application recovery, block-level data capture is the preferred method employed by CDP vendors today.

File-level CDP products are typically built with extended functionality for particular applications, such as SQL Server or Exchange. By recovering data directly to the file level, the CDP product can achieve a much tighter integration with a top-level application because no conversion is required between the physical and logical layers, as would be the case in a block-based approach. Users who want to implement CDP across multiple platforms will have to use a CDP application based on the block approach.

The future
Today, approaches to application recovery vary widely among CDP products, as do the degrees to which these products integrate with supported applications. Within the next 18 months, many of these architectural differences will recede. Multiplatform vendors will bring to market a range of application-specific modules and integrated tools, deepening support across key applications. At the same time, application-specific CDP vendors will supply more automation capabilities and direct integration with business processes.

Looking further out, over the next three years, CDP technologies will likely become an integrated part of an emerging discipline of application recovery. Ultimately, top-level applications riding above databases (such as ERP, CRM and Web-based services) will be able to execute a range of self-healing or self-recovering functions without direct administrative interaction from the storage layer. There will also come a time when the application will be able to identify a corruption event and correct it without any intervention.

You can expect the major application software vendors, such as Microsoft Corp., Oracle Corp. and SAP AG, to expose APIs in their applications or provide tool suites that enable CDP-driven recovery. CDP technologies, working in direct conjunction with the app, will chart the future of information recovery.
