In case you think you're missing out on what's happening in the data backup and recovery space, here's my best attempt at predicting what 2010 holds, and how your data backup and recovery planning may change.
My first prediction for 2010 is there will be increased adoption of target deduplication in its various forms. Target dedupe is when regular backups are deduplicated at the backup server -- usually by an appliance. Target deduplication will be big in 2010 for a variety of reasons, starting with the fact that I believe it is the right technology for a lot of customers. Target dedupe allows customers to continue using the backup software they already know, while having their backups deduped when written to disk. In addition to allowing increased retention on disk, it also allows users to replicate their backups offsite without using tapes.
Other reasons for target dedupe's increased adoption have more to do with some of the companies that are pushing it. EMC Corp. acquired Data Domain last year for $2.1 billion, and its sales teams have been given marching orders to show the stockholders that this wasn't a bad move. Their response has been as expected: They're selling it like hotcakes. I'm not saying that EMC/Data Domain has the best product in the market, but I am saying that it is a strong product with an even stronger sales force behind it. Expect your EMC rep to be very interested in selling you the company's latest product addition.
And then there's ExaGrid Systems Inc. Its CEO, Bill Andrews, said that ExaGrid beats Data Domain 74% of the time when the two go head-to-head, and that this number has held since EMC acquired Data Domain. One of the big reasons is that ExaGrid has a global dedupe story and Data Domain still does not. I predict that this product differentiator will continue to give ExaGrid even more sales in 2010. Its sales will, of course, be dwarfed by EMC's, but ExaGrid is doing just fine. If you really want to make your EMC sales rep nervous, though, bring ExaGrid in.
Another target dedupe company is CommVault Systems. CommVault has designed from the ground up a product that leverages the power of deduplication and the power of its Common Technology Engine. While IBM Corp. and Symantec Corp. also have target dedupe software, CommVault's product is the first target dedupe product I've seen from a backup software company that made me truly rethink my general position of preferring appliance solutions. The company has sold a lot of the product so far, and I believe it will continue to do so in 2010. I also believe that it has to be working on the final steps the product needs to become a source dedupe product, too.
The other target dedupe companies, FalconStor Software Inc., IBM, NEC Corp., Quantum Corp., and Sepaton Inc., will also continue to sell their products into this space. They will have varying degrees of success for a variety of reasons -- suffice it to say that target dedupe will continue to grow as a percentage of the backup device market.
Source deduplication is backup software that communicates with a source dedupe backup server and only sends unique data across the network -- data that has been seen before from any client is never transferred again from any other client. (For example, a file that has been backed up from one client will not be backed up from any other client.)
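To make the idea concrete, here is a minimal sketch of how a source dedupe client might decide what to send. It is not how any particular product works: the `DedupeServer` class, the `backup` and `restore` functions, the fixed-size chunking, and the tiny chunk size are all assumptions chosen for illustration (real products typically use far larger, often variable-size, chunks and a network protocol rather than an in-memory dictionary).

```python
import hashlib


class DedupeServer:
    """Hypothetical backup server that keeps one copy of each unique chunk."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def has_chunk(self, digest):
        return digest in self.chunks

    def store_chunk(self, digest, data):
        self.chunks[digest] = data


def backup(server, data, chunk_size=4):
    """Back up `data` from one client.

    Returns (payload_bytes_sent, recipe), where recipe is the ordered list
    of chunk digests needed to rebuild the data later.
    """
    sent = 0
    recipe = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        recipe.append(digest)
        # Only chunks the server has never seen cross the network.
        if not server.has_chunk(digest):
            server.store_chunk(digest, chunk)
            sent += len(chunk)
    return sent, recipe


def restore(server, recipe):
    """Rebuild the original data from its list of chunk digests."""
    return b"".join(server.chunks[digest] for digest in recipe)


server = DedupeServer()
file_data = b"AAAABBBBAAAACCCC"

sent_a, recipe_a = backup(server, file_data)  # first client
sent_b, recipe_b = backup(server, file_data)  # second client, same file
print(sent_a, sent_b)
```

In this sketch the first client sends only the chunks that are new to the server, and the second client, backing up an identical file, sends no chunk data at all, only the digests used to check for duplicates.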
The market for source dedupe is remote offices and laptops, as they really need a product that uses very little bandwidth. Saving bandwidth was its primary design goal. As a result, it's really good at saving bandwidth and not so good at high levels of throughput (compared to target dedupe products). This is why it is most appropriate for remote offices and mobile data where bandwidth is the primary consideration.
A related issue that source dedupe can help with is server virtualization. If you think about it, virtual machines (VMs) are very similar to remote offices: Servers with very little bandwidth. Since source dedupe is good at saving bandwidth, it can help solve this new backup problem. This is why source dedupe has experienced a lot of growth in this area.
One word of caution, though, on using source dedupe to back up virtual servers. Make sure you find out the source dedupe product's top restore speed, and make sure that this meets your recovery time objectives (RTOs) if you have to recover an entire ESX or Hyper-V server. For example, a fully configured Avamar grid's top restore speed is just over 200 MBps. If you have a 10 TB ESX server, you're looking at more than half a day (roughly 14 hours) to restore it -- and that's if you bought a full Avamar grid of 10 nodes. Make sure you do that math.
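The math is simple enough to script. The sketch below assumes decimal units (1 TB = 1,000,000 MB) and a sustained restore rate; the function names and the example RTO values are hypothetical, picked only to illustrate the check.

```python
def restore_hours(data_tb, throughput_mb_per_s):
    """Hours needed to restore `data_tb` terabytes at a sustained rate.

    Assumes decimal units: 1 TB = 1,000,000 MB.
    """
    total_mb = data_tb * 1_000_000
    seconds = total_mb / throughput_mb_per_s
    return seconds / 3600


def meets_rto(data_tb, throughput_mb_per_s, rto_hours):
    """True if a full restore fits within the recovery time objective."""
    return restore_hours(data_tb, throughput_mb_per_s) <= rto_hours


hours = restore_hours(10, 200)  # 10 TB server at 200 MBps
print(f"{hours:.1f} hours")
print(meets_rto(10, 200, rto_hours=8))   # hypothetical 8-hour RTO
print(meets_rto(10, 200, rto_hours=24))  # hypothetical 24-hour RTO
```

At exactly 200 MBps, a 10 TB restore works out to about 13.9 hours, so it would miss an 8-hour RTO but fit inside a 24-hour one. Plug in your own server sizes and the restore speed your vendor quotes.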
The companies that are selling these products (i365, Asigra, EMC, Symantec) have just begun to scratch the surface of all of the companies that are backing up remote data and virtual servers. Expect them to come calling on you if you have remote sites and/or an ESX or Hyper-V server. (And if you don't have one, what are you waiting for?)
While many are debating the virtues of cloud storage for other areas in the data center, cloud backup providers have really taken off. EMC/Mozy and Carbonite have really changed things with their $5/month pricing model, and other vendors such as Backblaze and CrashPlan have come out with similar products and pricing. EMC says it has more than a million Mozy customers, and Carbonite has over half that, but with over 110 million households in the U.S. alone, they too have just scratched the surface. I truly believe this is the best way for most households to get reliable, offsite backup for an incredibly affordable price, and it seems that the market is finally ready for it.
For business customers, cloud backup is another way to solve their remote office and mobile data backup problems. Instead of buying a source dedupe product, consider trying out one of these services. They also have business pricing.
VMware vStorage API
VMware Inc. has really changed a lot of things with vSphere, and backup is one of them. Where VMware almost ignored backup in its initial product design, backup is a critical element of vSphere. vSphere is an entirely new architecture that allows other things to plug into it, and VMware chose backup as one of the first two technologies it would plug in. The vStorage API is so much better than VMware Consolidated Backup (VCB) and will make backups of VMware virtual machines much, much easier -- but backup software companies have to program to the API first.
Smaller backup vendors (ESXpress, Veeam Inc.) jumped early on the vStorage API bandwagon. To date, Avamar (a source dedupe product) has been the only major product to do so. I predict that all the major players will support the vStorage API in 2010, and the adoption rate of vStorage-based backups will dwarf VCB adoptions to date in just one year.
Tape backup and recovery
Tape isn't going anywhere, and large tape systems (like Spectra Logic's T-Finity) will continue to roll off the shelf and into data centers. It may no longer be the primary target for backups, but it is still the most economical way to do long-term retention for both backup and archive.
CDP and Near-CDP
Continuous data protection (CDP) and its lesser-acknowledged cousin, near-CDP, will continue to be adopted more and more in 2010. It is true that this technology has not lived up to its early market expectations, but things have really changed in the five or so years since CDP was introduced. The biggest change is that big OEMs are on board. EMC, Symantec, and IBM all acquired CDP offerings. Hitachi Data Systems (HDS) has partnered with InMage. This should help alleviate concerns about buying from a startup. In addition, the products have really matured over the years and this should help as well.
A CDP product can recover your critical server in seconds and support a 0-second RPO. That is, assuming you're using synchronous replication between a client and backup server, you would lose no data in a disaster. A near-CDP product can also recover it in seconds, but offers RPOs based on how frequently you make your snapshots -- usually an hour.
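The difference in RPO can be sketched with a little arithmetic. This is a simplified model, not a description of any product: it assumes snapshots are taken at fixed intervals starting at minute 0, and it treats an interval of 0 as synchronous CDP. The function name and the sample times are hypothetical.

```python
def minutes_of_data_lost(failure_minute, snapshot_interval_minutes):
    """Minutes of changes lost when a failure hits at `failure_minute`.

    Snapshots are assumed at minute 0, interval, 2 * interval, and so on.
    An interval of 0 models synchronous CDP, where every write is captured.
    """
    if snapshot_interval_minutes == 0:
        return 0
    # Everything written since the most recent snapshot is lost.
    return failure_minute % snapshot_interval_minutes


failure = 150  # failure occurs 150 minutes after the first snapshot

print(minutes_of_data_lost(failure, 0))   # CDP with synchronous replication
print(minutes_of_data_lost(failure, 60))  # near-CDP with hourly snapshots
```

With hourly snapshots, the worst case is a failure just before the next snapshot, which costs almost the full hour. That interval is the RPO you are actually signing up for with near-CDP.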
The real reason I believe in CDP and near-CDP for the future is that both of these options provide much better recovery options than traditional backup. Things have got to change, people. We can't keep doing things the way we've been doing them as we move quickly into petabyte land. CDP and near-CDP are the future; the only question is whether 2010 will be the year they take off or not.
Hopefully I'll remember to look back at this article in 2011 and see how right (or not) I was.
About this author: W. Curtis Preston (a.k.a. "Mr. Backup"), Executive Editor and Independent Backup Expert, has been singularly focused on data backup and recovery for more than 15 years. From starting as a backup admin at a $35 billion credit card company to being one of the most sought-after consultants, writers and speakers in this space, it's hard to find someone more focused on recovering lost data. He is the webmaster of BackupCentral.com, the author of hundreds of articles, and the author of the books "Backup and Recovery" and "Using SANs and NAS."