While mobile and cloud platforms are relatively new, data centers have been under the watchful eye of IT professionals for decades; so why is backup not solved yet? There are at least two primary reasons that even data center data protection continues to challenge IT:
- Changes in workload recovery requirements and workload protection mechanisms.
- The sheer amount of production and protection storage required.
Part one of our three-part series on modernizing data protection and disaster recovery takes a look at how data center backup and DR are evolving today.
Protection and recovery requirements are changing
As the platforms that host our production resources change, the protection methods must change with them. As one notable example, with the mass adoption of virtualization, many traditional methods for backing up server data have evolved, been supplemented or been replaced outright. Whereas each production server used to run its own backup agent, the ideal scenario for most environments today is host-centric data protection that uses hypervisor-specific APIs to back up whole virtual machines, while still offering granular restore capabilities. And as production data continues to migrate from traditional data center servers to mobile devices and cloud platforms, protection and recovery requirements have to evolve accordingly.
Because of the increasing dependence on data, tolerance for downtime or data inaccessibility of any kind is shrinking. But to gain a broader range of recovery agility, one must often use a broader range of protection mechanisms, including snapshots and replication, in addition to traditional backups.
Data growth is forcing changes in protection and recovery
The other primary driver -- beyond the desire to improve recovery agility and keep pace with evolving production platforms -- is simply that the status quo is unsustainable with today's data growth. Enterprise Strategy Group research indicates that primary storage is growing by nearly 40% annually, but overall IT spending and storage-specific spending are growing at nowhere close to that rate. IT professionals are being forced to store data more efficiently, while also broadening their protection and recovery capabilities. At first glance, those two trends might appear contradictory; in fact, the synergies between them are driving the most exciting parts of IT's evolution from a backup mentality to a data protection strategy -- one that includes not only backups, but snapshots and replication as well.
While not necessarily new, the use of snapshots has evolved over the past few years. By reverting to a snapshot within primary storage, users can recover to a previous, albeit fairly recent, point in time much faster than by restoring from any backup on secondary storage. And because of the very granular nature of snapshots, whereby unchanged disk blocks consume no additional storage, snapshots can also partially address storage-scale issues related to multiple near-term copies held within a backup server's secondary storage pool.
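The space-sharing behavior described above can be illustrated with a minimal sketch. This is not how any particular storage array implements snapshots; it only shows the principle that a snapshot holds block references, so only blocks that change after the snapshot cost extra storage.

```python
# Illustrative snapshot sketch (hypothetical, not a vendor implementation):
# a snapshot is just a map of block references taken at a point in time.
# Unchanged blocks are shared between the live volume and the snapshot,
# so only blocks overwritten afterward consume additional space.

class Volume:
    def __init__(self, blocks):
        self.blocks = dict(enumerate(blocks))  # block number -> data
        self.snapshots = []

    def snapshot(self):
        # Record only references to the current blocks -- no data copy.
        snap = dict(self.blocks)
        self.snapshots.append(snap)
        return snap

    def write(self, block_no, data):
        # The live volume points at new data; the snapshot still
        # references the old block, preserving the point in time.
        self.blocks[block_no] = data


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")          # only this block diverges from the snapshot
assert snap[1] == "b"      # snapshot still sees the pre-write block
assert vol.blocks[1] == "B"
```

Reverting to the snapshot is then just swapping the block map back in, which is why snapshot-based recovery is so much faster than restoring from secondary storage.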
Those capabilities aren't new, but the extended management and flexible usability of snapshots are -- and that is making all the difference. In the past, snapshots (as a storage-centric technology) were managed solely by the storage administrator, typically without coordination with the upper-level applications or backup applications. Today, many storage array manufacturers have developed extensions so that snapshots of common business applications can be taken in a more coordinated fashion, thereby ensuring a more application-consistent recovery. In addition, the usability of snapshots has evolved to enable granular file- or object-level restores that can be invoked from the snapshot management UI, an application/platform UI (e.g., database or compute hypervisor), or from within the backup application. By integrating the management (invocation schedules for snaps and restores) and monitoring (health awareness of the underlying storage), snapshots are now a much more holistic aspect of an overall data protection strategy.
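The coordinated, application-consistent snapshot workflow described above generally follows a quiesce/snapshot/resume pattern. The sketch below is hypothetical -- the `App` and `Array` classes and their method names stand in for a real database's quiesce hooks and a real array's snapshot API -- but it shows why the coordination matters: the application's buffers are flushed and writes held while the snapshot is taken, and the application is resumed even if the snapshot fails.

```python
from contextlib import contextmanager

class App:
    """Stand-in for a database or hypervisor that can be quiesced."""
    def __init__(self):
        self.log = []
    def flush(self):
        self.log.append("flush")    # push in-memory buffers to disk
    def pause_writes(self):
        self.log.append("pause")    # hold new writes for consistency
    def resume_writes(self):
        self.log.append("resume")

class Array:
    """Stand-in for a storage array's snapshot interface."""
    def create_snapshot(self, volume_id):
        return f"snap-of-{volume_id}"

@contextmanager
def quiesced(app):
    # Ensure the on-disk state is application-consistent for the
    # duration of the snapshot, and always resume afterward.
    app.flush()
    app.pause_writes()
    try:
        yield
    finally:
        app.resume_writes()

def take_consistent_snapshot(app, array, volume_id):
    with quiesced(app):
        return array.create_snapshot(volume_id)


app, array = App(), Array()
snap = take_consistent_snapshot(app, array, "vol0")
assert snap == "snap-of-vol0"
assert app.log == ["flush", "pause", "resume"]
```

Without the quiesce step, the array would capture a crash-consistent image: recoverable, but potentially requiring application-level repair on restore.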
While snapshots provide a complement to backups through rapidly restorable versions within the primary storage, replication creates yet another copy of the data -- most often on tertiary storage. This provides a survivable copy of data at a geographically separate location, typically as part of a business continuity or disaster recovery scenario.
It is essential to understand the mechanisms that facilitate replication, as they affect both the efficiency of the replication itself and the usability of the replicated data. Replication can be achieved at multiple levels within an infrastructure stack.
Application-centric replication (e.g., SQL database mirroring) is accomplished between the primary application engine and one or more partner application engines. It provides an immediately usable secondary instance of the data, since the entire stack (OS, platform and storage) exists under each application engine. Efficiency will vary by platform, but each platform must be managed separately -- through separate UIs, with separate strategies, often by separate individuals (e.g., database administrators).
OS/platform-centric replication encompasses a variety of technologies, including file system-centric replication (e.g., Windows Distributed File System/DFS), virtual machine replication as facilitated between hypervisors, and third-party block- and file-centric replication offerings. Most of these products are designed to replicate data as part of enabling a high-availability scenario. Notably, resuming functionality may not be transparent to users in many cases, but the switchover window is often negligible.
Storage-centric replication typically has the least impact on application and server CPUs, since the work is done by the storage array -- often an external appliance with advanced capabilities beyond replication. While storage-based replication achieves the same data survivability goals as the other tiers of replication, the secondary instance of the data isn't necessarily geographically separate. Some environments replicate a second copy within the original or a nearby site, so that the higher layers of the stack (application, OS, VM) have twin copies of data to access with transparent/synchronous capabilities. In other environments, the storage copies reside at separate facilities, but the second infrastructure stack must be recreated (in advance or upon crisis) before the secondary storage copy can be mounted and utilized.
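One reason storage-centric replication is so light on host CPU is that, in its asynchronous form, the array simply tracks which blocks have changed and ships only those on each replication cycle. The sketch below is purely illustrative -- the class and method names are hypothetical, not any vendor's API -- but it shows the dirty-block-tracking idea.

```python
# Hypothetical sketch of asynchronous storage-centric replication:
# the primary array records dirty (changed) blocks between cycles and
# ships only those blocks to the secondary array, independent of the
# host that generated the writes.

class ArrayVolume:
    def __init__(self):
        self.blocks = {}   # block number -> data
        self.dirty = set() # blocks changed since the last cycle

    def write(self, block_no, data):
        self.blocks[block_no] = data
        self.dirty.add(block_no)

    def replicate_to(self, secondary):
        # Ship only the changed blocks, then reset the dirty set.
        for block_no in sorted(self.dirty):
            secondary.blocks[block_no] = self.blocks[block_no]
        self.dirty.clear()


primary, secondary = ArrayVolume(), ArrayVolume()
primary.write(0, "a")
primary.write(1, "b")
primary.replicate_to(secondary)   # initial sync: two blocks shipped
primary.write(1, "b2")
primary.replicate_to(secondary)   # second cycle: only block 1 shipped
assert secondary.blocks == {0: "a", 1: "b2"}
```

Synchronous replication differs only in timing: each write is acknowledged to the host only after the secondary has it, which is what enables the transparent twin-copy access described above.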
Continuous data protection (CDP) and near-CDP. CDP products often combine aspects of the other replication mechanisms: application integration, multi-platform management and highly granular replication. Storage Networking Industry Association purists would add that, along with truly continuous replication, CDP products should offer granular recovery to any previous point in time using journal-like behaviors, while near-CDP products provide near-continuous protection (seconds or less of latency) without the any-point-in-time restore option.
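The journal-like behavior that distinguishes true CDP can be sketched simply: every write is appended to a timestamped journal, so the volume can be reconstructed as of any previous moment by replaying writes up to that point. This is an illustrative model, not a real CDP product's design.

```python
# Illustrative CDP journal sketch: an append-only log of timestamped
# writes. Any-point-in-time recovery is a replay of the journal up to
# the requested timestamp. (Near-CDP would instead retain frequent
# point-in-time copies, without this arbitrary-timestamp replay.)

class CDPJournal:
    def __init__(self):
        self.entries = []  # (timestamp, block_no, data), append-only

    def record(self, ts, block_no, data):
        self.entries.append((ts, block_no, data))

    def restore_as_of(self, ts):
        # Replay all writes at or before ts to rebuild the volume state.
        state = {}
        for t, block_no, data in self.entries:
            if t <= ts:
                state[block_no] = data
        return state


journal = CDPJournal()
journal.record(1, 0, "a")
journal.record(2, 0, "a2")   # block 0 overwritten at time 2
journal.record(3, 1, "b")
assert journal.restore_as_of(1) == {0: "a"}
assert journal.restore_as_of(2) == {0: "a2"}
assert journal.restore_as_of(3) == {0: "a2", 1: "b"}
```

The trade-off is journal growth: every write is retained until the journal is truncated, which is why real CDP products bound the recovery window.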
By combining the agility of snapshots, the durability of replicas, and the flexibility of backups, you have what you'll need to truly modernize the protection of your data center.