Published: 02 May 2016
Data is the lifeblood of most businesses today. And yet the job of backing up that data is probably one of the least loved, but most important processes in IT.
Few organizations could survive without the email and productivity tools they use every day; not to mention the data (both current and archived) these applications generate. And, at the other end of the spectrum, entire business sectors, such as finance, couldn't operate without huge IT infrastructures and the volumes of data they contain. This makes it essential to implement a data protection plan that includes putting into place a reliable process for backing up that data.
The public cloud and, in particular, cloud storage provide organizations with a huge opportunity to implement scalable, manageable and dependable backups. Platforms such as Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform effectively offer unlimited storage capacity at the end of a network, with no need to understand how the supporting infrastructure is constructed, managed or upgraded.
Public cloud vendors have also introduced multiple tiers of storage into their products to stay competitive. AWS, for example, offers three levels of storage (Standard, Infrequent Access and Glacier), each of which delivers different service levels and price points. Google's public cloud mirrors AWS' offerings with its Standard, Nearline and Durable Reduced Availability storage tiers.
There's plenty of raw infrastructure available to store your backup data. The questions to ask now are: What data should be stored in the cloud, and which cloud backup options should you use to back it up?
Where applications run matters
To determine what data to store in the cloud and how to back it up, we need to first see how IT deploys applications. Nowadays, businesses can run applications from four main areas:
1. On-premises (including private cloud). This happens when running applications within a private data center managed by local IT teams. Systems are built on internal infrastructure and historically have been backed up using similar infrastructure within the data center, replicating data to another location or taking backups off-site with removable media.
2. Co-located. Rather than sit in a customer's data center, physical rack space is rented at a co-location facility that manages the environmental aspects of the data center, while the customer continues to own the server hardware. Co-location provides an opportunity for third-party businesses to offer services like backup that are deployed in the same co-location facility. This offloads the work of backup while delivering low-latency, high-throughput connectivity to the backup infrastructure, thanks to physical proximity.
3. Public cloud (IaaS). The public cloud can be used to deploy virtual servers and applications without businesses owning or managing any of the underlying hardware. Infrastructure as a service (IaaS) vendors won't provide backup capabilities beyond restoring failed systems to normal operation, however. So if a server crashes or data is lost, the IaaS vendor will simply return the system to its previous operational state.
4. Public cloud (PaaS and SaaS). Platform as a service (PaaS) and software as a service (SaaS) have been widely adopted for the most easily packaged and transferrable services, such as email (e.g., Office 365), and applications, such as CRM (e.g., Salesforce). PaaS and SaaS offerings operate in a similar way to IaaS, in that the platform provider ensures systems are always up and running with the latest version of applications and data. They won't directly provide the ability to recover historical data (e.g., when a user inadvertently deletes vital account records), however.
Backup options for the public cloud
Organizations have a number of choices among cloud backup options that take advantage of public cloud storage, including:
Back up directly to the public cloud. Write data directly to AWS Simple Storage Service (S3), Azure, Google or one of many other cloud infrastructure providers.
Back up to a service provider. Write data to a service provider offering backup services in a managed data center.
Implement disaster recovery as a service (DRaaS). A number of vendors offer DR services that manage the backup and restore process directly, focusing on the application/virtual machine, rather than just data. These DRaaS offerings also work with PaaS/SaaS applications to secure data already stored in the public cloud.
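The first of these options can be sketched in a few lines. The following is a minimal illustration using boto3, AWS' Python SDK; the bucket name and key prefix are hypothetical, and the storage class maps to the AWS tiers (Standard, Infrequent Access, Glacier) discussed earlier.

```python
"""Minimal sketch: back up a file directly to Amazon S3 with boto3."""
import datetime
import pathlib


def backup_file(s3_client, bucket, path, prefix="backups",
                storage_class="STANDARD_IA"):
    """Upload one file, keyed by date so older backups are retained.

    storage_class selects the S3 tier: STANDARD, STANDARD_IA
    (Infrequent Access) or GLACIER.
    """
    stamp = datetime.date.today().isoformat()
    key = f"{prefix}/{stamp}/{pathlib.PurePath(path).name}"
    s3_client.upload_file(
        str(path), bucket, key,
        ExtraArgs={"StorageClass": storage_class},
    )
    return key


# In production, a real client would be created first:
#   import boto3
#   s3 = boto3.client("s3")
#   backup_file(s3, "my-backup-bucket", "/var/backups/db.dump")
```

Note that the date-stamped key is what gives you point-in-time restores; overwriting a single key would only ever keep the latest copy.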
Backups and the public cloud
Cloud backup is no longer simply about shipping data to cheap storage locations. Today, entire applications can be migrated to, run from and backed up to and within public cloud infrastructure.
Existing backup software providers have extended their products to take advantage of cloud storage as a native backup target. Veritas (formerly part of Symantec) updated NetBackup to version 7.7.1 toward the end of 2015, extending AWS S3 support to cover the Standard-Infrequent Access (IA) tier. (Version 7.7 originally introduced a cloud connector feature with the ability to write directly to S3.)
The Commvault Data Platform (formerly called Simpana) natively supports all of the major public cloud providers and a range of object store vendors -- including Caringo and Data Direct Networks. It also supports an extended set of vendors through standardization on the S3 protocol, highlighting how S3 as a standard is being used to provide interoperability between object stores and backup platforms, even if those systems are not running in the public cloud.
A number of storage vendors have also started to support native S3 backups from within their storage platforms. SolidFire introduced the ability to archive snapshots to S3 or other Swift-compatible object stores as part of the release of its Element OS Version 6 in March 2014. Zadara Storage, which offers a Virtual Private Storage Array (VPSA) either on customer premises or deployed at a co-location site, provides S3 support to archive snapshots that can be restored either to Amazon's Elastic Block Store (EBS) service or to any other vendor's storage hardware.
One word of caution when deciding to use public cloud storage: Data written to S3 and other services won't be deduplicated by the cloud provider to reduce the amount of space the user consumes (although providers may deduplicate behind the scenes). This means data must be deduplicated before being written to the cloud if that feature is not built into a backup product. One way to overcome this issue is to use software such as that from StorReduce. Its cloud-based virtual appliance deduplicates S3 data, storing only the unique data in the customer's S3 account. (Backup software writes to StorReduce as its target in real time, and StorReduce writes the deduplicated data on to S3.) This significantly reduces the amount of data stored on S3, which translates to cost savings, both in data stored and in the transfer costs for reading and writing to S3 itself.
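The principle behind client-side deduplication can be sketched simply: split the stream into chunks, hash each chunk, and upload only chunks not already stored. The fixed-size chunking and in-memory "object store" below are simplifications of what a product like StorReduce does, not its actual implementation.

```python
"""Sketch of deduplicating data before writing it to cloud storage."""
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB; real products tune chunk size


def dedup_upload(data, store):
    """Split data into chunks; upload only chunks the store lacks.

    Returns the manifest (ordered chunk hashes) needed to restore,
    plus the count of chunks actually sent."""
    manifest, sent = [], 0
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:        # only unique data is stored...
            store[digest] = chunk      # ...in practice, a PUT to S3
            sent += 1
        manifest.append(digest)
    return manifest, sent


def restore(manifest, store):
    """Rebuild the original data stream from the manifest."""
    return b"".join(store[d] for d in manifest)
```

Because only unique chunks cross the wire, both the stored volume and the transfer charges shrink in proportion to how repetitive the backup data is.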
Opening the door to MSP and SaaS backups
Managed service providers (MSPs) offer backup services that take advantage of co-location facilities. If IT is already using hosting services from companies such as Equinix, then backups can be performed within the data center across the high-speed network implemented by the hosting company, rather than going out over the public Internet.
A number of software vendors, including Asigra and Zerto, deliver versions of their products specifically designed so that MSPs can deliver a white-label backup platform to their customers. The benefit of using a service provider for backup is in the security of keeping data within the MSP's facilities. That way, data doesn't have to traverse the public Internet, which may resolve issues of compliance for some organizations. MSPs can also deliver "value-added" services that let customers run applications in DR mode if primary systems fail.
SaaS, meanwhile, has allowed many IT shops to outsource common applications to the public cloud -- most notably email, customer relationship management and collaboration tools. While SaaS removes the need to manage infrastructure and applications, it doesn't fully provide data management capabilities. A SaaS provider will, for example, recover data from hardware or application failure, but not from common user errors such as the accidental deletion of files or emails.
Products such as Spanning (acquired by EMC in 2014) and Backupify (acquired by Datto the same year) enable organizations to back up SaaS data. Pricing is typically calculated on a per-user, per-month basis, which has to be added into the overall cost of using the SaaS application.
What to back up? That is the question
An important consideration when examining cloud backup options is deciding what exactly to back up. It is possible to back up only application data or an entire virtual machine, for example. The advantage of a VM backup is that it makes it possible to restart an application in the cloud in the event of a disaster at the primary site. This also means IT doesn't need to have specific DR hardware and can instead operate applications from within the cloud.
Amazon's common backup standard
The S3 API provides a common standard that allows backup applications to write data to both object storage and public cloud providers.
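The sidebar's point can be illustrated in miniature: because every S3-compatible store exposes the same object addressing and HTTP verbs, backup software only needs to swap the endpoint to change targets. The private object store URL below is hypothetical.

```python
"""Why the S3 API works as a common backup standard: only the
endpoint differs between targets -- buckets, keys and verbs do not."""


def object_url(endpoint, bucket, key):
    """Path-style URL for an S3 GET/PUT against any compatible store.

    The request shape and signing are identical regardless of endpoint,
    which is what lets one backup product target many object stores."""
    return f"{endpoint}/{bucket}/{key}"


# Same backup object, two interchangeable targets:
aws_target = object_url("https://s3.amazonaws.com",
                        "backups", "2016/db.dump")
private_target = object_url("https://objectstore.example.com",
                            "backups", "2016/db.dump")
```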
Datto is an example of a vendor that lets customers run applications in DR mode in a cloud. It offers a number of appliances that back up VMs locally, replicating them to a private cloud Datto purpose-built so customers can fail over their applications in the event of a disaster.
Druva provides a similar service with Phoenix DRaaS, where entire applications can be backed up to the cloud (through the replication of VM snapshots) and restarted within AWS. The Druva application handles details such as the IP address changes required when the application is moved to run in a different network.
Cloud backups: Traditional vs. appliance
Traditional backup software applications have been modified to write directly to the cloud, typically using standard protocols like Amazon's S3 API. In this instance, the application needs to perform any data reduction tasks like deduplication before pushing the data out, as stored data is charged per terabyte.
By comparison, appliance gateways can be used to cache data as it is written to cloud storage. The appliance can then perform deduplication and also cache data locally, allowing for quicker restores from backup where needed. Typically, the majority of restores occur within the first few days of a backup being taken.
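The gateway write path described above amounts to a write-through cache: every backup lands in the cloud for durability, but recent copies also stay local so most restores avoid a cloud round trip. A minimal sketch, with a dict standing in for the cloud tier:

```python
"""Sketch of an appliance gateway's write-through cache for backups."""
from collections import OrderedDict


class GatewayCache:
    def __init__(self, cloud, capacity=3):
        self.cloud = cloud          # slow, durable tier (e.g., S3)
        self.cache = OrderedDict()  # fast local cache, limited size
        self.capacity = capacity

    def backup(self, key, data):
        self.cloud[key] = data      # write through to the cloud...
        self.cache[key] = data      # ...and keep a local copy
        self.cache.move_to_end(key)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the oldest backup

    def restore(self, key):
        if key in self.cache:       # most restores hit recent backups
            return self.cache[key]
        return self.cloud[key]      # older restores fall back to cloud
```

This mirrors the observation that most restores happen within days of a backup: those requests are served from the local cache, while older data is still recoverable from the cloud copy.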
How the cloud simplifies DR
Public cloud takes away the need for many IT shops to build and manage their own DR site.
The choice between traditional and appliance-based backups matters because the public cloud is increasingly becoming a practical target for data backups. The effectively limitless scale of the cloud removes many of the operational headaches associated with managing backup infrastructure.
Obviously, there is a tradeoff between running backup locally and using the cloud as the target, particularly in managing network bandwidth. However, with the ability to move entire VMs into the cloud and run them there in DR mode, we could see a serious decline in the use of traditional backup applications as IT realizes it no longer needs to build out dedicated DR facilities or deal with the impracticality of shipping physical media off-site.