- W. Curtis Preston, Druva
Traditional backup typically copies any files and database tables that have changed since the last backup to a storage device on a media server. This approach is referred to as an incremental backup. Traditional backup also occasionally does a full backup, where it copies all files and the entire database to a storage device on a media server.
Backup-less backup does neither of these things. Also referred to as flat backup, backup-less backup is usually accomplished by creating storage-level snapshots on the server or filer being backed up and then replicating them to another server or filer for preservation. This approach is different from a typical backup in a couple of ways.
First, there's no media server between protected storage and the protection copy. Second, the protection copy is just that, a copy. It's in the same format as the original data. One of the hallmarks of backup-less backup is that the protection copy can immediately take the place of the protected copy if the protected copy is damaged. Traditional backups require a restore.
Why would you do a flat backup?
One reason to do flat data backups is that they offer much faster restores, allowing you to meet a much tighter recovery time objective (RTO). It's difficult to argue with something that lets you immediately recover from any disaster.
These snapshot-based backups also can be performed more frequently than regular ones, allowing a tighter recovery point objective (RPO). Typical snapshot intervals are once an hour, but many products let you create them as often as every five minutes. This means you would lose at most about five minutes of data in case of a disaster or outage.
In addition to allowing for much tighter RTOs and RPOs, flat data backup uses fewer resources than regular backups. It consumes little CPU and memory on the protected server, while running fewer I/O operations there. There's also less network traffic, less CPU and memory used, and fewer I/O operations on the protection server.
Network traffic is reduced because flat backup systems replicate only changed blocks, whereas a traditional backup transfers the entire file whenever any part of it has changed. Traditional backups also occasionally perform full backups, sending even more data across the network. CPU and I/O are reduced because the backup system doesn't have to crawl the file system to figure out what needs to be backed up. It simply asks the changed block tracking system which blocks must be transferred. That's a single, simple query instead of a CPU- and I/O-intensive file system crawl.
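To make the traffic difference concrete, here is a minimal sketch that compares the bytes sent by a file-level incremental with the bytes sent by changed-block replication. The 4 KB block size, the file names and the function names are assumptions made for illustration and are not taken from any particular product.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def file_level_transfer(file_sizes, changed_blocks):
    """Bytes sent when any file with at least one changed block is copied whole."""
    return sum(
        size for name, size in file_sizes.items() if changed_blocks.get(name)
    )


def block_level_transfer(changed_blocks, block_size=BLOCK_SIZE):
    """Bytes sent when only the changed blocks are replicated."""
    return sum(len(blocks) * block_size for blocks in changed_blocks.values())


# Hypothetical example: a 10 GiB database file with three changed blocks
# and a small text file with no changes.
file_sizes = {"db.dat": 10 * 1024**3, "notes.txt": 8192}
changed_blocks = {"db.dat": {17, 42, 99}, "notes.txt": set()}

print(file_level_transfer(file_sizes, changed_blocks))   # whole 10 GiB file
print(block_level_transfer(changed_blocks))              # three 4 KB blocks
```

In this example, the file-level approach resends the entire database file, while the block-level approach sends 12 KB.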
CDP: Related, but different
Continuous data protection looks similar to flat backup. One big difference is that no snapshots are created for a CDP backup. All changed blocks are immediately replicated to the protection system.
With a snapshot-based system, nothing is replicated until a snapshot is taken, so the recovery point objective is based on snapshot frequency. True CDP systems offer an RPO of zero, since they're copying every changed block.
Another difference is how the two types of systems store the protected copy, particularly historical data. CDP is not a mirror of the protected data in the way that a flat data backup is. It's actually a log of changes that can be used to virtually present a version of the protected volume from any point in time.
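The idea of a change log that can present a volume from any point in time can be sketched as follows. This is a toy model under simplifying assumptions (an in-memory list, integer timestamps, a hypothetical `ChangeLog` class), not a description of how any CDP product stores its journal.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Toy CDP journal: every block write is appended with its timestamp."""

    entries: list = field(default_factory=list)  # (timestamp, block_no, data)

    def record(self, timestamp, block_no, data):
        self.entries.append((timestamp, block_no, data))

    def view_at(self, timestamp):
        """Replay all writes up to and including `timestamp` into a block map."""
        volume = {}
        # sorted() is stable, so writes with equal timestamps keep arrival order.
        for ts, block_no, data in sorted(self.entries, key=lambda e: e[0]):
            if ts > timestamp:
                break
            volume[block_no] = data
        return volume


log = ChangeLog()
log.record(1, 0, b"a")
log.record(2, 1, b"b")
log.record(3, 0, b"c")   # block 0 is overwritten at time 3

print(log.view_at(2))    # the volume as it looked before the overwrite
print(log.view_at(3))    # the volume after the overwrite
```

A snapshot-based system, by comparison, could only present the volume as it looked at the moments when snapshots were taken.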
Returning to flat backups, a final reason they make sense is that you can use them for a variety of tasks. Traditional backup is generally only for recovery, but snapshot-based backups can be the central part of a copy data management (CDM) system. Backup-less backup also can be used for development and testing, as well as data analytics against historical data. Having another copy of your data that may have a longer history than your production copy lets you do all sorts of operations without affecting the production copy's performance.
Snapshot-based backup isn't new, having been popularized by NetApp some time ago. Today, most modern storage and software vendors support it, including all newer CDM products.
NetApp was the first storage vendor to create hundreds of snapshots without affecting storage performance. Most other vendors used the copy-on-write (COW) method to store snapshot history, which has inherent performance challenges. NetApp, on the other hand, used a method similar to what is referred to as redirect-on-write (ROW).
The difference between these two methods appears when an application changes a block after a snapshot has been taken. A COW-based snapshot system copies the before image of a block to an alternate location before overwriting that block with new data (i.e., it copies on write). This means such a write requires three I/O operations: reading the original block, writing that before image to another location and overwriting the original block with the new data.
In contrast, a ROW-based system leaves the before image in place and writes the updated block in another location -- that is, it redirects that write. The storage system does a single I/O, then updates the pointer to where the current image of the block is. This is why ROW-based snapshot systems are much more appropriate for flat backups, although there are some COW-based systems that have figured out a way to work around COW's performance limitations. Unfortunately, how they have done this is a trade secret.
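The I/O difference between the two methods can be illustrated with a toy model that counts data operations for a write after a snapshot. The class names and the counting rules are assumptions for illustration: pointer updates are treated as metadata and not counted, and the COW model preserves a before image only the first time a block is overwritten after the snapshot.

```python
class CowVolume:
    """Copy-on-write: the before image is moved aside before the overwrite."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # live data, overwritten in place
        self.snapshot_area = {}      # before images kept for the snapshot
        self.io_ops = 0

    def write(self, block_no, data):
        if block_no not in self.snapshot_area:
            before = self.blocks[block_no]
            self.io_ops += 1                      # 1: read the original block
            self.snapshot_area[block_no] = before
            self.io_ops += 1                      # 2: write the before image elsewhere
        self.blocks[block_no] = data
        self.io_ops += 1                          # 3: overwrite the original block

    def read_snapshot(self, block_no):
        # Unchanged blocks are still served from the live volume.
        if block_no in self.snapshot_area:
            return self.snapshot_area[block_no]
        return self.blocks[block_no]


class RowVolume:
    """Redirect-on-write: new data goes to a new location; pointers move."""

    def __init__(self, blocks):
        self.store = []              # physical locations
        self.live = {}               # logical block -> index into store
        for block_no, data in blocks.items():
            self.live[block_no] = len(self.store)
            self.store.append(data)
        self.snapshot = dict(self.live)  # a snapshot is a frozen pointer map
        self.io_ops = 0

    def write(self, block_no, data):
        self.store.append(data)
        self.io_ops += 1                          # 1: write the new block elsewhere
        self.live[block_no] = len(self.store) - 1  # pointer update (metadata)

    def read_snapshot(self, block_no):
        return self.store[self.snapshot[block_no]]


cow = CowVolume({0: b"old"})
row = RowVolume({0: b"old"})
cow.write(0, b"new")
row.write(0, b"new")
print(cow.io_ops, row.io_ops)
```

Both models still return the original data when the snapshot is read, but the COW model performs three data operations for the overwrite where the ROW model performs one.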
The importance of integration
Most storage systems today can take snapshots, and most newer ones let you take hundreds of snapshots without any degradation in performance. Flat backups are usually used for unstructured data, but they also work with structured data. However, if they aren't integrated with the applications that are writing data to storage, your snapshot might be "blurry" and not useful for a recovery.
There are four ways to integrate snapshots with the applications they're protecting: scripting, Volume Shadow Copy Service (VSS), vSphere Storage APIs -- Data Protection, and custom APIs.
The old-school method for integrating an application with its snapshots is to use a script that puts the application into a special mode prior to taking the snapshot. This is often referred to as "quiescing" the application, although that's not technically what happens with all applications. For example, telling Oracle to go into backup mode simply changes how it records changes in the redo logs. It's not truly quiescing anything, but rather it's changing how it records changes. Other applications, such as SQL Server, halt writes to data files when they are quiesced.
Regardless of what the database does when it is told to quiesce, a script tells it to do whatever that is before a snapshot, issues the snapshot and then tells the database the snapshot has completed. That snapshot will then be application-consistent when it's replicated to the other storage system.
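The script-based sequence described above might look roughly like the following sketch. The `begin_backup`, `end_backup` and `create_snapshot` names are hypothetical stand-ins for whatever commands your database and storage system actually provide; the demo classes exist only to show the order of the calls.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(app):
    """Put the application into its backup mode, and always take it out again."""
    app.begin_backup()          # hypothetical: e.g. enter backup mode or halt writes
    try:
        yield
    finally:
        app.end_backup()        # runs even if the snapshot fails


def consistent_snapshot(app, storage, volume):
    """Tell the app to prepare, take the snapshot, then tell the app it is done."""
    with quiesced(app):
        return storage.create_snapshot(volume)   # hypothetical storage call


# Demo stand-ins that only record the order of events.
class DemoApp:
    def __init__(self, events):
        self.events = events

    def begin_backup(self):
        self.events.append("begin")

    def end_backup(self):
        self.events.append("end")


class DemoStorage:
    def __init__(self, events):
        self.events = events

    def create_snapshot(self, volume):
        self.events.append("snapshot")
        return f"{volume}@snap1"


events = []
snap = consistent_snapshot(DemoApp(events), DemoStorage(events), "vol1")
print(events, snap)
```

The `finally` clause is the design choice worth noting: a script that leaves the application in backup mode after a failed snapshot can cause more trouble than the missed snapshot itself.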
Microsoft introduced the VSS snapshot management system, which simplifies the process for applications running on Windows. Backup software simply needs to use the VSS infrastructure to tell all supported applications it's about to take a snapshot and that they should do whatever it is they do before that happens. Once the snapshot is completed, it uses VSS to let the application know it can resume normal operations.
Another snapshot integration system is VMware's vSphere APIs. Backup applications wishing to back up VMware integrate with this system, which then integrates with systems like VSS that are running inside virtual machines (VMs).
This means that, depending on your storage vendor, it's possible to take an application-consistent flat backup of VMs by placing them on storage capable of doing such things. A fully integrated storage vendor could communicate with vSphere APIs, which would communicate with VSS, which would then communicate with the applications running inside the VMs. Once that three-level system creates a snapshot, the storage vendor can create its own snapshot that can then be replicated for backup purposes.
Flat backup's shortcomings
Traditional backup systems have features that some flat systems don't have. Specifically, they have sophisticated scheduling and reporting mechanisms that ensure backups are taken on a regular schedule and make sure the right people know when backups don't work. Many flat data backup systems leave this up to you.
Another feature sometimes missing from these systems is the catalog. With a flat backup system, you can crawl the file system tree the same way you would with the file system you're protecting. You know the directory where the file was located before it was deleted or damaged, and you simply go to that same directory on the protection system. Because of this capability, some people feel a catalog isn't needed on a flat backup system.
However, a catalog can be useful when you find yourself missing a single piece of information, such as where a file was located. A catalog lets you search the entire database for files with particular names. That said, if you have an alternate way to do such searches, a catalog shouldn't be necessary. Some storage products let you search across multiple directories and points in time. NetApp filers, for instance, store historical copies of data, including deleted files, in a user-accessible .snapshot directory.
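Where snapshots are exposed as ordinary directories, a catalog-style search by file name can be approximated with a short script. This is a minimal sketch that assumes each snapshot appears as a subdirectory of a snapshot root such as .snapshot; the function name is hypothetical, and a crawl like this will be far slower than a real catalog on large file systems.

```python
from pathlib import Path


def find_in_snapshots(snapshot_root, filename):
    """Return (snapshot_name, relative_path) for each copy of `filename`.

    Note: `filename` is passed to rglob(), so glob characters such as
    '*' or '?' in it are treated as wildcards.
    """
    root = Path(snapshot_root)
    hits = []
    # Each immediate subdirectory of the root is assumed to be one snapshot.
    for snap in sorted(p for p in root.iterdir() if p.is_dir()):
        for path in sorted(snap.rglob(filename)):
            if path.is_file():
                hits.append((snap.name, path.relative_to(snap).as_posix()))
    return hits
```

Calling `find_in_snapshots("/mnt/share/.snapshot", "report.txt")`, where the mount path is only an example, would list every snapshot that still holds a file named report.txt along with the directory it lived in, which is the missing piece of information a catalog would otherwise supply.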
The copy data management advantage
Copy data management takes flat and snapshot-based backups to a whole other level. Where the flat data backup idea only concentrates on making a native copy of data for backup and recovery purposes, CDM creates a copy of data for several purposes, including recovery, development and testing, and data analytics.
Actifio was the first vendor to put forth the idea that copies can be used for multiple purposes. It creates a single gold copy of all data remotely on its storage systems and incrementally updates that copy over time. Actifio manages both the transfer of data from protected storage and storing the gold copy on its storage. This approach is different from competitor Catalogic, which offers a management system for your existing storage products' snapshot and replication capabilities.
Catalogic's selling point is that its copies are exactly like what you would get if you're doing it manually with your existing storage systems; Catalogic simply automates and manages the process. Actifio's selling point is that by creating a single gold copy, it gets rid of a lot of duplicate data and adds a consistent feature set across a variety of storage platforms.
Are snapshots backups?
When discussing flat backups, it's important to mention that the snapshot itself isn't really a backup until it's replicated to another storage system. A snapshot is a virtual representation of the volume or file system at a particular point in time, but it's reliant on the volume it's protecting for the before and after images of all of the blocks. If you don't replicate your snapshot to another storage system, it's as useful as a snapshot of your house after your house burns down.
What about my storage vendor?
Just about every modern storage array can perform flat backups. A lot of traditional backup products support managing the creation and replication of snapshots on their storage arrays as well. There are a few special arrangements, however, where one type of storage system can use another for flat data backups: Hewlett Packard Enterprise 3PAR StoreServ, for instance, can do them to HPE StoreOnce; Dell EMC VMAX to Dell EMC Data Domain; and Pure Storage to Cohesity.
Essentially, if your storage system can take snapshots at least every hour and lets you create hundreds of snapshots without degrading performance, it's possible that it can be used as part of a flat backup system. However, even when a system can create hundreds of snapshots, the scheduling and reporting features may be lacking. Even NetApp, which has been doing this for a long time, gets a lot of complaints in that area, and that's why companies like Actifio and Catalogic are around.