Why arrays with erasure codes are not a backup substitute

Many shops struggle with slow backup and restore times when using traditional backup products. Erasure-coded arrays are no stand-in for traditional backup, but they can help protect data.

Over the last several years, traditional backups have become increasingly challenging for some organizations, and IT pros have been phasing out legacy backup in favor of next-generation products. Technologies such as continuous data protection have proven to be effective, but what about erasure codes? Could erasure coding eventually eliminate the need for backups?

Erasure coding is a data protection method similar to RAID, but with more flexibility. Consider how RAID 5 works. In a RAID 5 scheme, data is striped at the block level across the disks in the array set, and each stripe also includes a parity block, with the parity blocks distributed across the disks. The idea is that if one disk in the array fails, the data on the remaining drives can be combined with the parity data to reconstruct the missing data. RAID 6 adds a second parity block to each stripe and can survive the failure of two disks, but doing so requires twice the parity overhead of RAID 5.
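To make the parity idea concrete, here is a minimal Python sketch of how a single lost block in one stripe could be rebuilt. It assumes the parity block is the byte-wise XOR of the data blocks, which is the usual RAID 5 approach. The block contents and the `xor_blocks` helper are made up for illustration and do not reflect any particular controller's implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks from one stripe (hypothetical contents, equal length).
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data)

# Simulate losing the disk that held data[1], then rebuild its block
# by XOR-ing the surviving data blocks with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Because XOR is its own inverse, combining the survivors with the parity cancels out everything except the missing block. A single XOR parity block can recover only one lost block per stripe, which is why RAID 6 needs a second, differently computed parity block.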

Erasure coding works similarly to RAID 5 or RAID 6, except that it allows the data storage administrator to choose the required level of protection. For example, an erasure-coded array might be designed to survive the simultaneous failure of eight disks. The administrator defines the number of disks to be used and the number of disks that can fail without bringing down the array. An algorithm determines the amount of redundant data to be stored on each disk to achieve the administrator's requirements.
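The tradeoff the administrator is making can be sketched with some simple arithmetic. The Python example below assumes an ideal erasure code (a maximum distance separable code, such as Reed-Solomon), in which an array that tolerates m failures gives up exactly m disks' worth of capacity to redundancy. The `ErasureLayout` class and its field names are hypothetical, and real products may add further overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErasureLayout:
    """Hypothetical model of an array built on an ideal (MDS) erasure code."""

    total_disks: int
    tolerated_failures: int

    def __post_init__(self):
        if not 0 <= self.tolerated_failures < self.total_disks:
            raise ValueError("tolerated_failures must be >= 0 and < total_disks")

    @property
    def data_disks(self):
        # Capacity left for data once redundancy is accounted for.
        return self.total_disks - self.tolerated_failures

    @property
    def usable_fraction(self):
        # Share of raw capacity available for data.
        return self.data_disks / self.total_disks

    def survives(self, failed_disks):
        # The array stays readable while failures do not exceed the tolerance.
        return failed_disks <= self.tolerated_failures


raid5_like = ErasureLayout(total_disks=5, tolerated_failures=1)
wide_array = ErasureLayout(total_disks=16, tolerated_failures=8)

print(raid5_like.usable_fraction)  # 4 of 5 disks hold data
print(wide_array.usable_fraction)  # 8 of 16 disks hold data
print(wide_array.survives(8), wide_array.survives(9))
```

Under this assumption, a 16-disk array that survives eight simultaneous failures leaves only half of its raw capacity for data, so higher protection levels come at a direct cost in usable space.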

Like RAID arrays, erasure-coded arrays are designed to provide operational fault tolerance. In other words, the array is designed to protect against disk failures, not act as a data backup. Even so, some believe that if a storage system is sufficiently redundant then traditional backups become unnecessary. After all, a backup is nothing more than a "redundant" copy of an organization's data.

Erasure codes vs. backup

Although erasure codes can create a very high degree of redundancy, some aspects keep erasure coding from replacing legacy backups, at least by itself.

For starters, erasure-coded arrays are not designed to provide point-in-time recovery capabilities. The array is designed to guard against disk failure, not to perform data recovery. Since organizations need the ability to recover lost or corrupted files, there needs to be a mechanism for point-in-time recovery.

One possible option would be to store production data in virtual hard disks on the erasure-coded array, and then use application-aware snapshots to provide point-in-time recovery. Creating a snapshot does not actually create a copy of the data, so a snapshot is only as safe as the storage beneath it. However, because an erasure-coded array can have a high level of built-in redundancy, snapshots could work. This is similar to how snapshots are used in conjunction with backup software today.
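The reason a snapshot is not a copy can be seen in a toy copy-on-write model. The Python sketch below is an assumption-laden illustration, not any vendor's snapshot implementation: the `Volume` class and its methods are hypothetical, and real snapshot engines differ in many details.

```python
class Volume:
    """Toy copy-on-write volume; blocks are keyed by a numeric block ID."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        # Each snapshot stores only the original contents of blocks
        # that were overwritten after the snapshot was taken.
        self.snapshots = []

    def snapshot(self):
        # Taking a snapshot copies no data; it starts an empty change record.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, block_id, data):
        # Preserve the old contents for every snapshot that has not
        # already saved this block, then overwrite the live block.
        for preserved in self.snapshots:
            if block_id not in preserved:
                preserved[block_id] = self.blocks.get(block_id)
        self.blocks[block_id] = data

    def read_snapshot(self, snap_id, block_id):
        # Unchanged blocks are read straight from the live volume.
        preserved = self.snapshots[snap_id]
        if block_id in preserved:
            return preserved[block_id]
        return self.blocks.get(block_id)


vol = Volume({0: b"report v1", 1: b"ledger"})
snap = vol.snapshot()
vol.write(0, b"report v2")

print(vol.read_snapshot(snap, 0))  # point-in-time view of block 0
print(vol.read_snapshot(snap, 1))  # still served from the live volume
```

In this model the snapshot holds only the one block that changed. Everything else is read from the live volume, so the snapshot is lost along with the underlying storage if that storage fails.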

Another problem with using erasure-coded arrays as a backup substitute is that the arrays are not fully immune to hardware failures. Sure, such an array can survive multiple, simultaneous disk failures, but what happens if the disk controller fails? Such a failure could corrupt the entire array.

If an organization is seriously considering using an erasure-coded array as a substitute for traditional backups, then it will also need a way to guard against the array becoming a single point of failure. One option is to replicate the array's contents to a secondary array. This allows the organization to have a copy of its data that is isolated from the primary storage array and that will not be impacted by an array-level failure.

About the author: 
Brien M. Posey, MCSE, has received Microsoft's MVP award for Exchange Server, Windows Server and Internet Information Server. Brien has served as CIO for a nationwide chain of hospitals and has been responsible for the Department of Information Management at Fort Knox.
