
Which data protection methods are here to stay?

Despite a changing market, disk-based approaches like continuous backup and snapshots might still have a place in your backup and archiving strategy.

In a world where backup and archiving seem to be moving inexorably toward the cloud, the future of in-house backup seems bleak. Does disk-based backup still have a chance?

To figure out what in-house backup will look like in a few years, we have to consider where core data protection methods are headed. Storage is becoming less expensive as commercial off-the-shelf boxes take a large percentage of the market. As a result, continuous backup systems will converge with snapshot approaches to give us a perpetual storage model, where every change is added to the stored data set rather than overwriting it, and rollbacks can be run to some point in the distant past.
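The perpetual storage model can be illustrated with a minimal sketch, assuming a simple in-memory store. The `PerpetualStore` class, its method names and the integer timestamps are hypothetical and chosen only for illustration; they do not describe any vendor's product. The point is that writes only ever add versions, so a read can be rolled back to any earlier moment.

```python
from bisect import bisect_right
from collections import defaultdict


class PerpetualStore:
    """Hypothetical append-only block store: writes add versions, never overwrite."""

    def __init__(self):
        # block id -> parallel lists of write timestamps and data, oldest first
        self._times = defaultdict(list)
        self._data = defaultdict(list)

    def write(self, block_id, timestamp, data):
        """Record a new version of a block; older versions are kept."""
        times = self._times[block_id]
        if times and timestamp < times[-1]:
            raise ValueError("writes must arrive in time order")
        times.append(timestamp)
        self._data[block_id].append(data)

    def read(self, block_id, as_of=None):
        """Return the newest version written at or before `as_of`.

        With as_of=None the latest version is returned. Returns None if
        the block did not exist at that time.
        """
        times = self._times.get(block_id, [])
        idx = len(times) if as_of is None else bisect_right(times, as_of)
        if idx == 0:
            return None
        return self._data[block_id][idx - 1]


store = PerpetualStore()
store.write("blk0", 100, b"v1")
store.write("blk0", 200, b"v2")

latest = store.read("blk0")                 # current contents
rolled_back = store.read("blk0", as_of=150)  # contents as they were at time 150
```

A real system would also need garbage collection or tiering of old versions, which this sketch leaves out.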

Both snapshots and continuous backup are disk-based data protection methods, but they differ in that continuous backup typically creates an incremental record of changes on a different medium, while snapshots use the primary storage pool. Snapshots are therefore exposed to rootkit hacks and admin errors. It is often argued that, with remote replication, snapshots use a different medium, too, but the replica is co-mounted with the primary, so it can be destroyed along with it.

As backup data migrates, the approach to disaster recovery will converge with the backup task. For disaster recovery, a remote copy is adequate to protect the primary data set from natural disasters, though not from rootkit hacks. I predict we'll see a complete convergence of the three data protection methods to yield perpetual storage with a snapshot-based primary, coupled with a continuously updated replica of the snapshot as a backup. That replica will be geographically remote in the cloud, and independently encrypted and mounted. Archiving will be achieved within the backup framework by policy-driven tiering and deduplication services.

As several backup service vendors have pointed out, there is a lot of value in keeping the most recently backed-up data in a local drive-based cache, because their studies show there is a reasonable likelihood some of this data will be recalled within 48 hours of backup. This cache can reside with primary data storage, because the official backup is the remote copy.
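The local cache idea can be sketched as follows, assuming a 48-hour retention window taken from the recall figure above. The `RecentBackupCache` class and the `fetch_remote` callback are hypothetical names used for illustration only; the remote cloud copy remains the official backup, and the cache merely shortens restores of recent data.

```python
class RecentBackupCache:
    """Hypothetical local cache holding only recently backed-up items."""

    def __init__(self, retention_seconds=48 * 3600):
        # Assumed window: 48 hours, matching the recall period cited by vendors
        self.retention_seconds = retention_seconds
        self._entries = {}  # key -> (backed_up_at, data)

    def put(self, key, data, now):
        """Keep a local copy of an item that was just sent to the remote backup."""
        self._entries[key] = (now, data)

    def evict_expired(self, now):
        """Drop items older than the retention window; return the evicted keys."""
        expired = [
            key
            for key, (backed_up_at, _) in self._entries.items()
            if now - backed_up_at > self.retention_seconds
        ]
        for key in expired:
            del self._entries[key]
        return expired

    def restore(self, key, now, fetch_remote):
        """Serve from the local cache when possible, else fall back to the remote copy."""
        self.evict_expired(now)
        entry = self._entries.get(key)
        if entry is not None:
            return entry[1], "local"
        return fetch_remote(key), "remote"


def fetch_from_cloud(key):
    # Stand-in for a real remote restore call (assumption for this example)
    return b"remote copy of " + key.encode()


cache = RecentBackupCache()
cache.put("report.doc", b"local copy", now=0)

recent = cache.restore("report.doc", now=3600, fetch_remote=fetch_from_cloud)
stale = cache.restore("report.doc", now=72 * 3600, fetch_remote=fetch_from_cloud)
```

Here the first restore is served locally because it falls inside the window, while the second has aged out and is fetched from the remote copy.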

The result of these changes is that there is no longer a need for a dedicated backup server and storage. Software-only data protection methods residing in virtual instances can do the backup task, with the option to scale out to reduce backup windows or recovery point objectives (RPOs) if necessary. This converged approach is already evolving, but it will take some years to reach a majority of IT organizations.
