For years, any storage pro worth his or her salt would've scoffed at the idea of combining backup, disaster recovery and archive. While all three are, at their core, data protection processes, there are very different goals, methods and tools associated with each.
Is it backup, DR or archive data?
Of course, there have always been organizations that cut corners in one or more of their data protection processes to save time or money, or to eliminate redundancy. In some companies, backup, disaster recovery (DR) and archive are merely stops along a continuum, with each stage roughly defined by how much dust has collected on the tape cartridges tucked away on storeroom shelves. A delicate patina of motes indicates a fresh backup tape; a more thorough coating denotes DR data; and those barely distinguishable shapes among the dust bunnies must be archive.
But this path can be perilous -- especially when an emergency occurs and recovery is complicated by not being able to identify the most recent data copy.
The new data protection: Copy data management
For years, the good data protection management mantra was clear: keep backup, DR data and archive processes separate. That made a lot of sense, but it also created a lot of duplication with the possibility that a single piece of data could exist in all three places at the same time. That's where the idea of copy data management comes in: We can focus on the similarities found in each of those three data sets to reduce redundancy, but recognize that we'll do very different things with the data in a backup recovery, disaster recovery or legal discovery situation. So, maybe a single copy could suffice.
Copy data management is particularly important today as most companies are loath to toss any data. So along with all that hoarding, there's a constant struggle with capacity requirements. The concept of unified data protection fits neatly into that equation as it relies on one copy of data but uses three different sets of tools to manipulate it.
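The "one copy, three sets of tools" idea can be illustrated with a minimal sketch. Everything in it is hypothetical: the `CopyStore` class, its method names and the file names are invented for illustration and do not represent any vendor's product. It assumes data is stored once, keyed by a content hash, while separate backup, DR and archive catalogs simply point at that single copy.

```python
import hashlib


class CopyStore:
    """Hypothetical single-copy pool: each unique payload is kept once,
    keyed by its SHA-256 digest, and referenced from per-purpose catalogs."""

    def __init__(self):
        self.blobs = {}  # digest -> bytes (the one physical copy)
        # One catalog per data protection purpose: name -> digest
        self.catalogs = {"backup": {}, "dr": {}, "archive": {}}

    def protect(self, name, data, purposes):
        """Store data once and register it under each requested purpose."""
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # no second copy if already present
        for purpose in purposes:
            self.catalogs[purpose][name] = digest
        return digest

    def fetch(self, purpose, name):
        """Retrieve data through the catalog for a given purpose."""
        return self.blobs[self.catalogs[purpose][name]]


store = CopyStore()
store.protect("q3-report.xlsx", b"quarterly numbers", ["backup", "dr", "archive"])
store.protect("q3-report-copy.xlsx", b"quarterly numbers", ["backup"])

print(len(store.blobs))                         # unique payloads actually stored
print(store.fetch("archive", "q3-report.xlsx")) # same bytes, reached via the archive catalog
```

In this toy model, the same file is reachable through all three catalogs, yet only one physical copy consumes capacity. Real products would layer retention rules, replication and search on top of each catalog.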
So, the best practices that called for keeping backup, DR and archive data protection processes separate are now yielding to another, more practical approach that promises to work much better.
Beyond data protection
Interestingly, the idea of a single pool of data that can serve multiple data protection needs has also spawned some new ideas about using that type of data for applications beyond its traditional use case.
Many companies have realized that backup data may represent the single most complete pool of data under management. And that's a substantial resource that can be tapped for a variety of applications.
Already, backup vendors such as Code42, Commvault and Druva make their backup data available via file sharing apps to allow users to access the data using their mobile devices. The neat thing about this approach is that all the data that's viewed on mobile devices is already protected, so there's no need to worry about backing up those phones and tablets. And on the data center side, no additional storage capacity or special setups are required -- the backup data is there already, but now it's just getting used in a different way.
Other possibilities include big data analytics. A backup data pool could help make short work of one of the toughest parts of implementing big data analysis -- collecting all the data to churn through the analytic engines. In most cases, backup is an eclectic mix of data types culled from a variety of applications, so it may be the best single source of data to fuel big data analytics. ETL -- extract, transform and load -- is the key to big data acquisition and preparation; by tapping the backup data pool, the "extract" part of that equation may be shortened considerably or possibly even completely satisfied. Of course, the "transform" step could be a considerable undertaking given the proprietary nature of most backup data sets, but there are companies that specialize in tools for that purpose, so it's reasonable to hope (and expect) that issue will be addressed soon.
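As a rough illustration of how a backup pool might feed the ETL steps, here is a minimal sketch. The `BACKUP_POOL` records, their field names and the `facts` table are all invented for this example; real backup sets are typically stored in proprietary formats, so the "transform" step would be far more involved than the simple CSV and JSON parsing assumed here.

```python
import csv
import io
import json
import sqlite3

# Hypothetical backup pool: items from different applications in different
# formats. In this sketch, "extract" is trivial because the data is already
# collected in one place.
BACKUP_POOL = [
    {"app": "crm", "format": "csv", "payload": "id,amount\n1,10\n2,5"},
    {"app": "billing", "format": "json", "payload": '[{"id": 3, "amount": 7}]'},
]


def extract(pool):
    """Yield each backed-up item; no separate collection step is needed."""
    yield from pool


def transform(item):
    """Normalize an item into (app, id, amount) rows; skip unknown formats."""
    if item["format"] == "csv":
        records = csv.DictReader(io.StringIO(item["payload"]))
    elif item["format"] == "json":
        records = json.loads(item["payload"])
    else:
        return []
    return [(item["app"], int(r["id"]), float(r["amount"])) for r in records]


def load(conn, rows):
    """Insert normalized rows into the analytics table."""
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (app TEXT, id INTEGER, amount REAL)")
for item in extract(BACKUP_POOL):
    load(conn, transform(item))

total = conn.execute("SELECT SUM(amount) FROM facts").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM facts").fetchone()[0]
print(row_count, total)
```

The point of the sketch is the shape of the pipeline: because the pool already holds data from several applications, the work shifts from gathering data to normalizing it.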