Best practices for remote data backups
IT organizations have been searching for the silver bullet of backup since the first mechanical head hovered over a platter: a solution that's easy to use, fast and reliable. For several years, cloud has been touted as the next great hope for backup admins. But the term cloud is too generic to facilitate a meaningful discussion. Organizations can choose from public cloud repositories that offer low cost but generally low levels of service. A private cloud can be an extension of the in-house IT data protection operation, perhaps at a lower cost and with less effort. But can either alternative offer the restore performance to assist large enterprises in recovering tens, hundreds or thousands of terabytes of information?
Storage pros have learned that the cloud, of any variety, isn't the long-sought backup silver bullet. Used as a repository to protect against specific risks and in certain scenarios, however, the cloud can make a significant contribution to the overall solution.
Tiering is another term that requires some clarification in the context of backup. Tiering connotes not only different hardware types, but also the dynamic movement of data between them. Backup tiering is more traditional in that specific data is moved to a specific repository and moved again only by manual intervention. Thus, cloud backup, whether public or private, will be a consideration for specific workload types. Examples include remote office/branch office (ROBO) data, desktop and laptop files, and other backup situations where the restore data set is small enough to be recovered practically over the WAN or Web. (Cloud is also useful for archive, but that's not truly backup and therefore not germane to this discussion.) Cloud backup is just another service in the data protection catalog that has specific service levels and costs.
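To make the catalog idea concrete, the sketch below models cloud backup as one more catalog entry distinguished only by its service level and cost. The service names, restore-time objectives and per-gigabyte prices are hypothetical, not drawn from any vendor's price list.

```python
# Hypothetical data protection service catalog: each entry records a tier,
# a restore-time objective (RTO) and a monthly storage cost per GB.
CATALOG = {
    "onsite-disk":   {"tier": 1, "rto_hours": 2,  "cost_gb_month": 0.10},
    "private-cloud": {"tier": 2, "rto_hours": 12, "cost_gb_month": 0.04},
    "public-cloud":  {"tier": 3, "rto_hours": 48, "cost_gb_month": 0.01},
}

def cheapest_meeting_rto(max_rto_hours):
    """Return the lowest-cost service whose restore time meets the
    requirement, or None if no catalog entry qualifies."""
    candidates = [(svc["cost_gb_month"], name)
                  for name, svc in CATALOG.items()
                  if svc["rto_hours"] <= max_rto_hours]
    return min(candidates)[1] if candidates else None
```

A workload that must be restorable within 24 hours would land on the private cloud tier here, while one that can wait two days drops to the cheaper public cloud.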
Adding cloud as a backup service tier can yield numerous advantages for an IT organization:
- Cloud backup outsources the physical infrastructure management without outsourcing the human intelligence that makes a backup strategy effective.
- Cloud as a backup service tier may broaden an organization's data protection strategy, if desktop/laptop backup isn't currently provided, without increasing the workload on IT. The hallmark of many desktop/laptop backup solutions is user self-service.
- Public cloud services and some private clouds are pay-as-you-go arrangements that can reduce the impact on the IT budget.
- Cloud repositories can reduce or replace the need for offsite tape vaulting and rotation, assuming the retention policy is appropriate. Organizations are increasingly replacing disk-to-disk-to-tape strategies with disk-to-disk-to-cloud (D2D2C). In this case, the cloud becomes the "repository of last resort," as recalling large quantities of data can be a lengthy process. Then again, recalling and reading hundreds or thousands of tapes isn't exactly an instantaneous operation.
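The D2D2C pattern described above often boils down to copying aged, completed backup images from local disk to an object store. This is a minimal sketch of that final hop, assuming Amazon S3 via the boto3 SDK; the bucket name and file paths are hypothetical, and real deployments would add error handling, encryption and catalog updates.

```python
import os
from datetime import datetime, timedelta, timezone

def backups_to_replicate(entries, min_age_days=1, now=None):
    """Select completed local backup images that are at least
    min_age_days old and therefore ready for the cloud hop.
    entries is a list of (path, finished_at) tuples."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=min_age_days)
    return [path for path, finished_at in entries if finished_at <= cutoff]

def replicate_to_cloud(paths, bucket="example-backup-bucket"):
    """Copy each image to the cloud 'repository of last resort'.
    boto3 is imported lazily so the selection logic above can be
    exercised without the AWS SDK or credentials present."""
    import boto3  # assumes the AWS SDK is installed at runtime
    s3 = boto3.client("s3")
    for path in paths:
        # GLACIER-class storage trades restore speed for low cost,
        # matching the slow-recall caveat noted above.
        s3.upload_file(path, bucket, os.path.basename(path),
                       ExtraArgs={"StorageClass": "GLACIER"})
```

The deliberate use of a cold storage class reflects the "repository of last resort" role: cheap to keep, slow to recall.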
Multiple cloud backup tiers
Cloud backup tiers fall into Tiers 1, 2 and 3 just like other tier schemes. Tier 1 would be a direct backup target. Here, the ROBO and desktop/laptop use cases come into play. Cloud backup services from traditional storage vendors, such as EMC's Mozy and Hewlett-Packard's (HP) LiveVault and Helion cloud services, are examples of solutions where IT organizations can completely outsource small system backup to a third party, though these products also have options for in-house deployments for organizations that have security or control concerns. Druva's inSync product is similarly deployed and can also be used for Apple OS X, iPhone/iPad iOS and Android backup. If deployed in-house, Druva supplies an on-premises CloudCache server with a cached copy of the data, while the metadata remains in the cloud. Druva describes its solution as a way to gain control over data at the edge and speed up data movement to and from the cloud.
But caveat emptor, as these applications are truly standalone solutions. IT organizations will want the backup catalog from any such solution to be integrated with the in-house backup catalog. Depending on the organization, this lack of integration can be a backup beauty (IT has no responsibility for it) or a beast (IT has no control over it).
Where does cloud fit into backup tiers?
Tier 1: Direct backup target for remote office/branch office or laptop/desktop sources, usually provided by a third-party service.
Tier 2: Private cloud target that can offer higher service levels or specialized services, such as advanced security.
Tier 3: Public cloud general-purpose targets as a low-cost, offsite repository.
Tiers 2 and 3 could be considered indirect backup targets, such as our earlier D2D2C example. One might also consider Tier 2 as a private cloud where more control is available and Tier 3 as a public cloud that's truly general purpose in nature. However, the implementation becomes substantially more complicated. Implementation models may include backup appliances, such as EMC Data Domain, which can function as Tier 1, 2, 3 or any combination thereof. Data may be backed up to a Data Domain device in the data center and replicated to another in the cloud.
Whereas Data Domain is an example of a purpose-built device, HP StoreOnce VSA is a software-defined storage device that runs on general-purpose servers and virtualization hypervisors. This solution is positioned for a broad spectrum of uses, from small organizations looking for a cost-effective backup solution to large organizations wanting to federate remote offices to backup-as-a-service providers looking for a hardware-agnostic deployment. HP calls these service providers CloudAgile partners and has special programs for them.
Symantec, as a backup software developer, has enhanced its NetBackup product to specifically use cloud repositories as backup targets. This involves building in the APIs, such as the RESTful API, needed to communicate with cloud providers such as Amazon and Rackspace. In addition, Symantec offers its Auto Image Replicator for block-level replication on Data Domain devices. In that case, Symantec's OpenStorage API is needed for NetBackup to "see" the replicated images. Another ancillary product is the recently announced Symantec Disaster Recovery Orchestrator, which uses Microsoft Azure as a backup target for disaster recovery. The advantages are application-based, automated failover and failback with the low cost of public cloud storage.
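The article doesn't detail the exact calls NetBackup makes, but a RESTful cloud target ultimately reduces to HTTP verbs issued against object URLs. The sketch below builds the PUT request that would write one backup object to an S3-compatible endpoint; the endpoint, bucket and bearer-token auth are simplified placeholders (real providers such as Amazon require signed headers, e.g. AWS Signature Version 4).

```python
from urllib.request import Request

def build_put_request(endpoint, bucket, key, body, token):
    """Construct the HTTP PUT that stores one backup object at an
    S3-compatible REST endpoint. `token` is a placeholder credential;
    production clients compute provider-specific request signatures."""
    url = f"{endpoint}/{bucket}/{key}"
    return Request(url, data=body, method="PUT",
                   headers={"Authorization": f"Bearer {token}",
                            "Content-Length": str(len(body))})
```

GETs against the same URLs retrieve objects on restore, which is why a generic REST interface lets one backup product address many cloud providers.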
The backup software plays a central role in using tiered backup effectively, especially when the cloud is used for Tiers 2 and 3. Without a centralized backup catalog, backup to the cloud risks fragmenting the backup environment into the silos of yore. It may be acceptable to have an entirely separate desktop/laptop backup scheme if the intent is to make it entirely user self-service or supported by a service provider. However, it would be unacceptable for an enterprise to have one backup repository and a separate cloud repository where no orchestrator assures backup consistency. That would be a prescription for a disaster from which there could be no acceptable recovery.
In addition to NetBackup, HP Data Protector is another product architected to manage backup images, whether those images are on traditional disk and tape backup, a backup appliance or a cloud target. Data Protector's 9.0 release uses the OpenStack API to access the HP Helion Public Cloud. However, users should not assume that any given product or cloud is supported by their backup software. For example, HP Data Protector does not currently support an integrated catalog with LiveVault, though it is integrated with 3PAR.
The cons of cloud backup tiers
There are, of course, a few downsides to using cloud storage as a backup tier. First, organizations need to make a mental or cultural shift from managing systems and technology to managing vendors and service levels. That may not truly be a downside, but storage teams accustomed to specifying a bill of materials for specific implementations may get a rude awakening when they aren't given a vote on the specific implementation in the public cloud. Even in a private cloud, the options may be more a menu of service levels than of technologies and products. Fundamentally, cloud consumers shouldn't care what the underlying technology is so long as the specified service level is achieved. IT managers should resist the temptation to tell cloud vendors how to run their businesses.
Second, organizations with all in-house solutions may have very mature monitoring and management regimes for troubleshooting and solving backup failures. Those monitoring tools are unlikely to extend to the cloud environment, whether public or private. The cloud will be somewhat opaque and administrators may be dependent upon the cloud support team to solve problems.
The pros and cons of cloud backup tiers
Why you need to add cloud as a tier to your storage strategy
- Reduce the total cost of your backup strategy
- Outsource some of the backup headache to somebody else
- Protect laptop/desktop systems that otherwise are unprotected
- Replace offsite tape rotation with the cloud
Why cloud tiers will disillusion you
- The cloud offering may not be very flexible
- Troubleshooting may be complicated by having to work through multiple vendors or layers
- Management of backup data stored in the cloud likely won’t integrate with in-house storage management
- Your cloud vendor may go out of business
Finally, some cloud backup vendors have failed spectacularly in the marketplace, leaving organizations scrambling to get their data back. Vendor failure is dire for primary storage or archive data, but less so for backup, which is inherently a second copy. Nevertheless, suddenly having a gaping hole in one's backup strategy due to vendor failure isn't good. It may not be necessary to go as far as having a backup of the backup, but a contingency plan is certainly prudent.
Cloud backup bottom line
When organizations consider adding cloud to their backup hierarchy, the motivating factors are usually lower cost and the convenience of outsourcing at least a piece of the backup headache to someone else. Not surprisingly, there are tradeoffs in those transactions. Organizations shouldn't expect on-premises service levels at off-premises prices. That may be perfectly acceptable as long as the business units and application owners understand any tradeoff between price and recovery services.
Backup isn't a strategic market advantage for any organization other than those that provide it. For everyone else, the idea is to meet the required service level and then minimize the cost of attaining it. Cloud tiers are an opportunity to do just that.
About the author:
Phil Goodwin is a storage consultant and freelance writer.