Continuous data protection (CDP) was bleeding-edge a few years ago, but it’s re-emerging as the best technology for protecting organizations’ virtual environments.
In 2006, Taneja Group published a technology brief titled “Continuous Data Technologies: A Paradigm Shift.” Back then, we maintained that the traditional method of data protection was seriously flawed and needed a fundamental overhaul.
For decades, the basic method of data protection was based on copy making. To protect a file, we made a copy of it and stored it elsewhere -- but we did it as inefficiently as possible. For backups, we would start with a full backup to tape. That meant every bit of data in a volume was transferred from the primary storage where it resided, through the application server, over the local-area network (LAN), into the media server and then onto tape. Nightly incrementals came next, and any file that had even a slight change was dragged through the backup process again. Snapshots were taken as well; these stayed on primary storage and hogged space. Some snapshots were also backed up and occupied space on tape. In a typical IT environment, it wasn’t unusual to find anywhere from 10 to 100 copies of the data on primary storage and tape combined. The cost of protecting data often exceeded the cost of primary storage, by a factor as high as 5-to-1.
We argued in 2006 that after a volume’s base image was copied, each change to the data should be captured only once, at the time it was made. Because each change was time-stamped, the recovery system should be able to build the contents of the volume as of any point in time (APIT). Using this methodology, we would never need a backup window, or fulls and incrementals, and the recovery point objective (RPO) would be whatever we wanted it to be, even zero. The recovery time objective (RTO) would be very short, too, since the volume image could be grabbed from any point in time. Companies like Mendocino and Revivio promoted this method, but failed. Still, we felt the fundamentals were right and that perhaps the concept was ahead of the available technology.
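The capture-once, rebuild-anywhere idea can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not any vendor's implementation: the `ChangeJournal` class, its method names and the block-level granularity are all hypothetical, and the volume is held in memory as a dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ChangeJournal:
    """Toy volume model (hypothetical): a base image plus a time-stamped write log."""
    base_image: Dict[int, bytes]  # block number -> contents when the base was copied
    writes: List[Tuple[float, int, bytes]] = field(default_factory=list)

    def record_write(self, timestamp: float, block: int, data: bytes) -> None:
        # Each change is captured exactly once, with the time it was made.
        self.writes.append((timestamp, block, data))

    def image_at(self, point_in_time: float) -> Dict[int, bytes]:
        # Rebuild the volume by replaying, in time order, every write
        # made at or before the requested point in time.
        image = dict(self.base_image)
        for timestamp, block, data in sorted(self.writes, key=lambda w: w[0]):
            if timestamp > point_in_time:
                break
            image[block] = data
        return image


journal = ChangeJournal(base_image={0: b"A0", 1: b"B0"})
journal.record_write(10.0, 0, b"A1")
journal.record_write(20.0, 1, b"B1")
journal.record_write(30.0, 0, b"A2")

# The volume as it looked at time 25: the first two writes applied, the third not yet.
image_at_25 = journal.image_at(25.0)
```

No full or incremental pass appears anywhere in the sketch: the journal is the only copy of each change, and the recovery point is simply the timestamp passed to `image_at`.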
In parallel, other developments were poised to impact data protection in a big way. Vendors like Data Domain (now EMC), ExaGrid, FalconStor, Quantum and Sepaton said that rather than storing multiple copies of data on slow, unreliable tape, we should toss out all that duplicate data and store it only once on inexpensive SATA disks. Files were split into chunks and only one copy of each chunk was kept on disk. When data was replicated, these new systems sent only unique chunks across the wide-area network (WAN) and thereby maintained a capacity-efficient environment at the remote site as well. Good, sound thinking, we said. And IT responded well, as demonstrated by the success of many of these companies and a drastic drop in tape sales over the past four years.
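A minimal Python sketch of the chunk-and-store-once idea follows. It rests on assumptions of our own: fixed-size chunks, SHA-256 digests as chunk identifiers, and an in-memory store. The `ChunkStore` class and the `chunks_to_replicate` helper are hypothetical names, and real products may chunk and index data quite differently.

```python
import hashlib
from typing import Dict, List, Set


class ChunkStore:
    """Toy deduplicating store (hypothetical): each unique chunk is kept once."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size  # assumption: fixed-size chunks
        self.chunks: Dict[str, bytes] = {}  # digest -> chunk contents

    def put(self, data: bytes) -> List[str]:
        # Split the data into chunks and keep only chunks not seen before.
        # The returned "recipe" lists the digests needed to rebuild the data.
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe: List[str]) -> bytes:
        # Reassemble the original data from its recipe.
        return b"".join(self.chunks[digest] for digest in recipe)


def chunks_to_replicate(recipe: List[str], remote: ChunkStore) -> Set[str]:
    # Only chunks the remote site does not already hold cross the WAN.
    return {digest for digest in recipe if digest not in remote.chunks}


local = ChunkStore(chunk_size=4)
first = local.put(b"AAAABBBBAAAA")   # three chunks, two of them identical
second = local.put(b"AAAACCCC")      # shares the AAAA chunk with the first file

remote = ChunkStore(chunk_size=4)
remote.put(b"AAAA")                  # the remote site already holds this chunk
to_send = chunks_to_replicate(second, remote)
```

Five chunks were written in total, but `local` holds only three, and replicating the second file to `remote` requires sending a single chunk.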
But the fundamental process of data protection still hadn’t changed. We still ran fulls and incrementals, and we maintained a remote location. And, typical of conservative storage professionals, we often still maintained tape behind disk. So our Iron Mountain expenses stayed with us. But we felt better because backups were faster and more reliable, as were recoveries.
In the past few years, we’ve seen a resurgence of the continuous data technologies (CDT) idea. And this time, vendors have developed products that work. Finally, we think the idea of CDT will get a fair shake and a shot at commercial success.
So why would these new products be successful now when they weren’t in 2006? Two things are different today. On the conceptual front, we have all come to recognize that just because we could create the image of a volume as of any point in time, it didn’t mean we should. APIT images may take you to an RPO of zero, but an image that’s inconsistent with the state of the application isn’t very useful. Your RPO for the data may be zero, but your RPO for the application could be hours or days. Instead, the more meaningful point in time for recovery is the last consistent state. To make this concept work, one needed the ability to generate very rapid snapshots. And for mission-critical applications that often ran on multiple systems and had multiple databases, the system needed to be quiesced across the board for a consistent snapshot to be taken. This level of sophistication wasn’t available in 2006, but it’s now commonplace.
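The gap between the RPO for the data and the RPO for the application can be shown with a small Python sketch. The `RecoveryPoint` type, the `last_consistent_point` function and the timestamps are hypothetical, chosen only to illustrate the distinction; how a real product marks an image as application-consistent is outside the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RecoveryPoint:
    timestamp: float       # when the image was captured (seconds, illustrative)
    app_consistent: bool   # True if the application was quiesced for this image


def last_consistent_point(points: List[RecoveryPoint],
                          failure_time: float) -> Optional[RecoveryPoint]:
    # Pick the most recent application-consistent image at or before the failure.
    candidates = [p for p in points
                  if p.app_consistent and p.timestamp <= failure_time]
    return max(candidates, key=lambda p: p.timestamp, default=None)


points = [
    RecoveryPoint(100.0, True),
    RecoveryPoint(160.0, False),
    RecoveryPoint(220.0, True),
    RecoveryPoint(290.0, False),
]
failure_time = 300.0

chosen = last_consistent_point(points, failure_time)

# Data RPO: distance to the newest image of any kind.
data_rpo = failure_time - max(p.timestamp for p in points
                              if p.timestamp <= failure_time)
# Application RPO: distance to the newest image the application can actually use.
app_rpo = failure_time - chosen.timestamp
```

With these made-up numbers the newest image is only 10 seconds old, but the newest usable one is 80 seconds old. Taking consistent snapshots more rapidly is what closes that gap.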
On the technology front, a fundamental piece that hadn’t yet matured in 2006 was virtualization. Virtualization, along with the availability of multicore processors, makes the CDT concept come to life. But first things first. Because our focus has now shifted away from “true continuous” to “very rapid but consistent,” we need to rename the technical approach. We now define this new category of storage as vDPAS, for Virtual Data Protection and Availability Storage. The benefits of this type of storage are numerous, including:
- No backups ever
- Excellent capacity optimization
- Near instantaneous recovery of applications, not just data
- Easy to assign different service-level agreements (SLAs) to different applications
- Images can be mounted instantly, requiring no full image to be created before a volume can be mounted
- Works the same way across physical and virtual servers
- Minimal IT involvement
- Ability to make current images available to test and development groups at the press of a button
- One main source for all data protection tasks
- Applicability to cloud
Several companies, in our view, have made the vDPAS concept a reality in the last few years. Probably the best example is Actifio’s implementation. But many others have been striving toward this concept, such as Dell AppAssure and InMage. Almost certainly, legacy vendors are feverishly working toward this new functionality, but they have to juggle sales of current products with introducing a product that could have a negative impact on their current revenue. Still, in the next three years we expect the entire market to offer products in the vDPAS category. It behooves you to take a closer look at these products and to start planning for a major overhaul of your data protection environment. Once you see the magic of this idea you’ll say goodbye to full backups forever. Good riddance!
BIO: Arun Taneja is founder and president at Taneja Group, an analyst and consulting group focused on storage and storage-centric server technologies.