A disk backup target is the primary destination for data protection jobs. But over the years, disk-based backup targets have evolved to become far more than just cheap disk arrays. As IT professionals consider upgrading or replacing their current technology, they need to understand the different capabilities these systems can offer.
Understanding data efficiency features
Disk backup has traditionally focused on data efficiency features, typically deduplication and compression. Without them, disk would never have replaced tape libraries as the initial backup landing area. For the most part, debates over deduplication and compression types have ended, except for the question of when optimization should occur.
There are two options for when data deduplication occurs: inline or post-process. The advantage of inline deduplication is that it needs only a single storage area; post-process needs two, one for native data and one for deduplicated data. However, inline deduplication adds latency to the backup process because it has to compare inbound data with data already on the hard disk drives to determine whether it is unique. This latency is offset to some extent because inline deduplication eliminates redundant writes to disk before they occur.
A newer feature appearing in backup software, in-place recovery, is raising greater concern over deduplication. This popular feature allows a virtual machine to instantiate its data store directly from the backup appliance. But if that data store is on a deduplicated volume, I/O performance may be so poor that the VM is unusable.
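The inline trade-off can be made concrete with a minimal sketch. It assumes fixed-size chunks fingerprinted with SHA-256 and an in-memory dictionary standing in for disk; the class name, the chunk size and the sample data are all hypothetical, and real appliances use their own chunking and indexing schemes.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, in bytes


class InlineDedupStore:
    """Toy inline-deduplicating store: each chunk is fingerprinted before it is written."""

    def __init__(self):
        self.chunks = {}          # fingerprint -> chunk bytes (stands in for disk)
        self.bytes_received = 0   # everything the backup software sent
        self.bytes_written = 0    # what actually had to be stored

    def ingest(self, data: bytes) -> list:
        """Store a backup stream and return its recipe (ordered list of fingerprints)."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            self.bytes_received += len(chunk)
            # This lookup is the inline cost: it happens before any write.
            if fingerprint not in self.chunks:
                self.chunks[fingerprint] = chunk
                self.bytes_written += len(chunk)
            recipe.append(fingerprint)
        return recipe

    def restore(self, recipe: list) -> bytes:
        """Rebuild a backup stream from its recipe."""
        return b"".join(self.chunks[fp] for fp in recipe)


store = InlineDedupStore()
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096  # only the last chunk differs from Monday
monday_recipe = store.ingest(monday)
tuesday_recipe = store.ingest(tuesday)
print(store.bytes_received, store.bytes_written)
```

Every chunk pays for a fingerprint lookup, which is the added latency, but duplicate chunks never reach the store, which is the offsetting saving in writes.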
The importance of ingestion rates
Ingestion rates remain an important feature for IT professionals because data sets continue to grow and backup windows continue to shrink. The faster the disk backup target can receive data, the sooner a backup completes and the more often it can run. Since most disk backup appliances will continue to be HDD-based, scaling ingestion rates is challenging.
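The relationship between ingestion rate and backup window is simple arithmetic, sketched below. The function names and the sample figures (a 40 TB data set, a 5 TB-per-hour target, a six-hour window) are illustrative assumptions, not measurements from any product.

```python
def backup_hours(data_tb: float, ingest_tb_per_hour: float) -> float:
    """Hours needed to land a backup of data_tb at a sustained ingestion rate."""
    if ingest_tb_per_hour <= 0:
        raise ValueError("ingestion rate must be positive")
    return data_tb / ingest_tb_per_hour


def required_rate(data_tb: float, window_hours: float) -> float:
    """Sustained ingestion rate (TB/hour) needed to finish inside the window."""
    if window_hours <= 0:
        raise ValueError("backup window must be positive")
    return data_tb / window_hours


# Hypothetical figures for illustration only.
hours = backup_hours(data_tb=40, ingest_tb_per_hour=5)
fits = hours <= 6
needed = required_rate(data_tb=40, window_hours=6)
print(f"{hours:.1f} h needed, fits 6 h window: {fits}, required rate: {needed:.2f} TB/h")
```

In this made-up case the job overruns the window, so either the data set has to shrink (for example through source-side reduction) or the target has to ingest faster.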
Scaling hard drive performance requires many HDDs and a lot of processing power. Scale-up systems enhance ingestion rates by increasing the processing power of the head unit so it can drive data to hundreds of HDDs. Scale-out systems spread the I/O load across storage nodes and divide backup jobs across these nodes. Individual scale-out nodes typically don't need as powerful a processor, nor do they need as many HDDs.
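One way to picture how a scale-out system divides backup jobs is a greedy "largest job to the least-loaded node" placement, sketched below. This is an assumed policy chosen for illustration; actual products use their own placement logic, and the job names and sizes are hypothetical.

```python
import heapq


def assign_jobs(job_sizes_tb: dict, node_count: int) -> dict:
    """Greedily place each job, largest first, on the currently least-loaded node."""
    if node_count < 1:
        raise ValueError("need at least one node")
    # Min-heap of (current load in TB, node id); the lightest node is popped first.
    heap = [(0.0, node) for node in range(node_count)]
    heapq.heapify(heap)
    assignment = {node: [] for node in range(node_count)}
    largest_first = sorted(job_sizes_tb.items(), key=lambda kv: kv[1], reverse=True)
    for job, size in largest_first:
        load, node = heapq.heappop(heap)
        assignment[node].append(job)
        heapq.heappush(heap, (load + size, node))
    return assignment


# Hypothetical nightly jobs and their sizes in TB.
jobs = {"db": 8, "mail": 5, "files": 4, "vm": 3}
assignment = assign_jobs(jobs, node_count=2)
loads = {node: sum(jobs[j] for j in names) for node, names in assignment.items()}
print(assignment, loads)
```

Because no single node has to absorb the whole job list, each node can be built with a more modest processor and fewer drives than a scale-up head unit serving the same load.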
The scaling approach also affects how much backup data the disk backup target can store. Scale-up systems do have an advantage: once IT professionals buy the initial unit, they need only add a shelf and connect it to the head unit to increase capacity. However, scale-up systems typically support a finite number of drive shelves. When that limit is reached, IT professionals have to either add new systems or upgrade the head unit.
Scale-out disk backup targets increase capacity by adding nodes to the system. In theory, they have almost limitless potential capacity. However, IT professionals need to understand how the scale-out architecture works. For example, some systems allow a mixture of different node types, while others require all nodes to be identical. In practice, most scale-out architectures also limit the number of nodes they will support.
Integration with backup software and applications is the next frontier for the disk backup target market. Most options focus on increasing performance by creating a protocol alternative to traditional IP that is more suitable for the large block transfers of the backup process. Scale-out offerings have integrated with databases so they can automatically back up the discrete components of a database across nodes.
Another new trend is the convergence of backup software onto backup hardware. An increasing number of backup hardware products can host their own or third-party backup software directly on the disk backup device. The result is a direct transfer of data between the software and the hardware, which can greatly improve performance.
Disk backup targets are no longer just fat, cheap disk arrays. These backup target products are highly scalable, high-performance appliances that can integrate with existing solutions. Each vendor varies in features offered and how they deliver them. IT professionals need to understand that not all backup targets are created equal, and they need to examine the capabilities that these systems offer to find the best combination for their data center.