BOSTON -- There's one thing users, vendors and analysts alike at the Storage World Conference in Boston can agree on -- continuous data protection (CDP) is a confusing term. Even more confusing is who needs it, and why.
"I'm looking at it for my SQL database," said Tory Skyers, network administrator for Fox and Roach Realtors, the nation's fourth largest realtor and a subsidiary of Prudential Financial Inc. "But it's difficult to figure out if it's worth the investment and to make a business case on it to management."
Among the concerns for Skyers is the possibility of writing corrupted data to a CDP store. Most CDP products will roll data back to the last good state of the application, to which Skyers asked, "What's the difference between that and just doing a snapshot?"
Skyers also said he was concerned about using storage space to store corrupted data or to store every single mistake made to files or databases, along with the blocks representing their correction. That is another valid concern, Preston said, and another reason to consider CDP very carefully.
Although CDP has become a commonplace term over the last year, the average user is still struggling to understand the technology and the products on the market. That adds up to a scene like the one that took place Tuesday during a Storage World session headed by Preston, in which only one person in a room of about 60 attendees raised his hand when asked who was implementing either a true-CDP or near-CDP product.
Moreover, that user, Christopher Baer, an account executive with storage services firm Broadleaf Services LLC, is using IBM's CDP product, which is actually near-CDP. In Baer's environment, files on laptops are backed up every hour and sent to the home directory when the users of those laptops log in to the company's network.
"Getting down to the second really doesn't matter," Baer said. There hasn't been a big cry for it among the storage customers he serves, either, he said. "Even in the healthcare and legal professions, hourly snapshots work just fine."
Sorting out the use cases for CDP and near-CDP
Out of user discussions at Storage World, as well as the session moderated by Preston, two key usage scenarios emerged in which true-CDP products have a clear benefit. The first: environments in which multiple servers need to be backed up using snapshots all at once. The second: environments with recovery point objectives (RPO) of zero.
The rest, according to Preston, is just hype -- or better served by near-CDP products, snapshots or some other form of disk-based backup.
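The zero-RPO scenario can be made concrete with a small calculation. The following Python sketch is only an illustration under simplifying assumptions: the function name `writes_lost`, the timestamps in seconds and the idea that a snapshot fires exactly on every interval boundary are all hypothetical, and no vendor's product is being modeled. It counts how many writes made before a failure cannot be recovered, first with hourly snapshots and then with a journal that captures every write.

```python
from typing import Optional, Sequence


def writes_lost(
    write_times: Sequence[int],
    failure_time: int,
    snapshot_interval: Optional[int] = None,
) -> int:
    """Count writes made before a failure that cannot be recovered.

    Times are in seconds from an arbitrary start (hypothetical units).
    snapshot_interval=None models a true-CDP journal, which is assumed
    to capture every write as it happens (RPO of zero).
    """
    completed = [t for t in write_times if t <= failure_time]
    if snapshot_interval is None:
        return 0  # every completed write is already in the journal
    if snapshot_interval <= 0:
        raise ValueError("snapshot_interval must be positive")
    # Assumption: snapshots fire at exact multiples of the interval.
    last_snapshot = (failure_time // snapshot_interval) * snapshot_interval
    return sum(1 for t in completed if t > last_snapshot)


writes = [10, 1800, 3500, 3700, 5000]
hourly_loss = writes_lost(writes, failure_time=5400, snapshot_interval=3600)
cdp_loss = writes_lost(writes, failure_time=5400)
```

Whether the writes that fall between snapshots matter is the business question users such as Baer and Skyers are weighing; the sketch only shows where the gap comes from.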
Adding further confusion to the market is a process marketed by both EMC Corp. and Hitachi Data Systems Inc. (HDS) called Business Continuance Volumes (BCVs), which are really split mirrors, Preston said, but are referred to as snapshots by vendors. BCVs are full copies of a volume rather than a virtual recreation of a system. They thus fall prey to further attempts at confusion by true-CDP and near-CDP vendors about the amount of disk space taken up by snapshots as opposed to true-CDP and near-CDP products copying every block of data. A full snapshot or even an incremental BCV in a busy environment can take up as much disk space as a fully continuous backup, or more -- and in environments where BCVs have found a happy home, true-CDP can have a value proposition to offer over them, Preston said.
True-CDP is also easier from a management perspective in large heterogeneous environments that need to take frequent snapshots. "It is true that when you have multiple different applications on multiple different servers, and you need to quiesce those applications and do a snapshot on them all at the same time, CDP will work better than traditional snapshots," Preston said.
In environments such as banks or online retailers, in which large databases are processing hundreds or even millions of transactions per second and in which even one lost transaction could do the business serious harm, true-CDP is also the way to go, Preston said. There are tools within databases, like Oracle's Recovery Manager, that technically offer their own transaction-level recovery, he pointed out. However, these tools often keep the most recent transactions in online redo logs, which aren't copied until they're filled. That means that, depending on how big the log is, transactions could be lost in the event of a failure.
Baer predicted that true-CDP could become more attractive as a tool for regulatory compliance as legislation becomes more and more stringent, since true-CDP tracks and keeps every change made to every block within an application, showing when and what changes were made.
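Preston's point about redo logs comes down to simple arithmetic: if a log is copied only when it fills, whatever sits in the partly filled log at the moment of failure is exposed. The Python sketch below is a toy model of that behavior, not a description of how Oracle or any other database actually manages its logs. The class `OnlineRedoLog`, the helper `lost_on_failure` and the assumption that the online log is lost along with the server are all hypothetical.

```python
class OnlineRedoLog:
    """Toy redo log that is copied to safe storage only when it fills."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.pending = []   # transactions that exist only in the online log
        self.archived = []  # transactions already copied elsewhere

    def write(self, txn: int) -> None:
        self.pending.append(txn)
        if len(self.pending) == self.capacity:
            # The log is full: copy it out and start an empty one.
            self.archived.extend(self.pending)
            self.pending.clear()


def lost_on_failure(num_transactions: int, capacity: int) -> int:
    """Transactions lost if the server fails after num_transactions writes.

    Assumption: the online log is destroyed in the failure, so only
    archived transactions survive.
    """
    log = OnlineRedoLog(capacity)
    for txn in range(num_transactions):
        log.write(txn)
    return len(log.pending)


big_log_loss = lost_on_failure(2500, capacity=1000)
small_log_loss = lost_on_failure(2599, capacity=100)
```

In this model the worst case is always one transaction short of the log's capacity, which is why the size of the log sets the size of the exposure.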
Before true-CDP can gain broader appeal, both Baer and Preston said products in this space must, at the very least, become better integrated with traditional backup products and other applications and processes.
"CDP products make the case that they are unobtrusive, which is true, but they're so different from how we've always done things, and traditional backup applications can't control them the way they can other disk-based backup products, like virtual tape libraries (VTL)," Preston said. "It's a nice idea to have everything backed up to within a second, but if you're telling me I still need to run my backup application alongside it if I want to keep that infrastructure, as a user, I'm going to tell you, 'I've got enough problems.' "
Baer said he saw some value in CDP products tying in with document management processes that delete documents from machines that aren't authenticated with a key server. These products are used for compliance purposes so that multiple copies of documents or sensitive information aren't floating around on company laptops.