Two trends are pushing data out of the data center. First, branch offices are growing in number. Second, wireless technologies have expanded the use of laptops and other mobile computing devices, making each mobile user a "branch office." But with more critical business data residing outside the primary corporate data center, storage administrators need to find ways to protect that data without impairing user productivity. Moving data from remote sites or users to a central location requires an understanding of remote backup technologies.
Remote office backups have historically been made to tape drives located in each remote office. Full tapes were then shipped to the data center or moved directly to off-site storage using services from companies such as Iron Mountain Inc. The problem with this approach is that non-technical personnel were usually pressed into service to perform the backup and handle the tapes, which often led to missed or incomplete backups.
Recently, storage administrators have been replacing remote tapes with WAN links. Dedicated links can be used, but inexpensive and readily available broadband Internet services are commonly employed to collect critical data from remote offices. In most cases, an initial full backup is completed first. After that, only changed data must be passed to the data center. This approach offers storage administrators more control over the backup process without involving remote personnel. A popular software tool for remote backups is LiveVault's InControl (owned by Iron Mountain).
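The "initial full backup, then changed data only" pattern described above can be illustrated with a minimal sketch. It assumes files are compared by content hash against a manifest saved after the previous backup; the function names (build_manifest, changed_files) and the in-memory dictionary standing in for a remote file system are hypothetical and are not taken from any particular backup product.

```python
import hashlib


def build_manifest(files):
    """Map each file name to a SHA-256 digest of its contents.

    `files` is a dict of name -> bytes, a stand-in for a remote site's disk.
    """
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def changed_files(previous_manifest, files):
    """Return the sorted names of files that are new or modified
    since `previous_manifest` was recorded."""
    current = build_manifest(files)
    return sorted(
        name
        for name, digest in current.items()
        if previous_manifest.get(name) != digest
    )


if __name__ == "__main__":
    site = {"sales.db": b"q1 figures", "notes.txt": b"hello"}
    baseline = build_manifest(site)           # recorded after the initial full backup
    site["sales.db"] = b"q1 and q2 figures"   # modified file
    site["new.doc"] = b"draft"                # new file
    print(changed_files(baseline, site))      # only these need to cross the WAN
```

Only the names returned by changed_files would be sent over the link on the next run. Commercial tools generally track changes at a finer granularity than whole files, so this is an illustration of the idea rather than of any vendor's method.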
WAN backups require planning
WAN backups require a certain amount of preparation and planning. The most important issue for a storage administrator is data volume -- understanding just how much data needs to be backed up in a typical evening from each remote site. Once you know how much data must be transferred, find a WAN service that offers adequate bandwidth (and fits your budget). Bandwidth costs can be mitigated by adjusting backup needs. For example, incremental or delta differencing backups can involve far less data than full backups, and some organizations reduce demand even further by backing up only business-critical files, such as sales databases or email records, across the WAN.
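The sizing exercise above comes down to simple arithmetic: data volume divided by the length of the backup window. The sketch below is a rough estimate only. The function name required_mbps and the efficiency parameter (an assumed fraction of the nominal link speed that is actually usable after protocol overhead and competing traffic) are illustrative assumptions, not figures from the article.

```python
def required_mbps(data_gb, window_hours, efficiency=0.7):
    """Estimate the nominal link speed, in megabits per second, needed to
    move `data_gb` gigabytes within `window_hours` hours.

    `efficiency` is an assumed usable fraction of the nominal bandwidth.
    """
    if data_gb < 0 or window_hours <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid sizing parameters")
    megabits = data_gb * 8 * 1000        # decimal gigabytes -> megabits
    seconds = window_hours * 3600
    return megabits / seconds / efficiency


if __name__ == "__main__":
    # A full backup versus a nightly incremental for the same site,
    # both squeezed into an eight-hour overnight window.
    for label, gigabytes in (("full", 200), ("incremental", 10)):
        print(f"{label}: {required_mbps(gigabytes, 8):.1f} Mbps")
```

Running the numbers for both a full and an incremental backup makes the budget trade-off concrete: the smaller the nightly change set, the cheaper the link that can carry it.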
Other factors, such as data deduplication, are also gaining importance for WAN backups. While a typical backup may record numerous copies of the same file, data deduplication records only one copy and stores pointers in place of each additional copy. For example, if there are 20 copies of the same 1 MB file, a typical backup would need 20 MB for those copies. Data deduplication saves the file once and references it for the other 19 copies, using roughly 1 MB instead of 20 MB. Compression, which further shrinks the data sent across the link, and encryption, which protects data crossing public Internet connections, should also be considered.
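The 20-copy example can be reproduced with a small file-level deduplication sketch. It assumes identical files are detected by SHA-256 content hash; the DedupStore class and its method names are hypothetical, and real products often deduplicate at the block or sub-file level instead.

```python
import hashlib


class DedupStore:
    """Toy file-level deduplicating store: each unique content is kept once."""

    def __init__(self):
        self.blobs = {}   # digest -> file contents, stored a single time
        self.index = {}   # file path -> digest (the "pointer" to the stored copy)

    def add(self, path, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)   # keep the bytes only if unseen
        self.index[path] = digest

    def read(self, path):
        return self.blobs[self.index[path]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


if __name__ == "__main__":
    one_mb = b"x" * (1024 * 1024)
    store = DedupStore()
    for i in range(20):
        store.add(f"user{i}/report.doc", one_mb)
    print(len(store.index), "paths,", store.stored_bytes(), "bytes stored")
```

All 20 paths remain readable, but the payload is held once; the per-path pointers add only a small amount of index overhead.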
Several WAN acceleration technologies are now available to boost WAN performance. For example, IP data can be repackaged into larger packets, requiring far fewer data exchanges to complete a file transfer. WAN acceleration can also reduce the number of handshakes needed to open or transfer a file. Acceleration typically requires a dedicated appliance installed at each end of the WAN link.
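The effect of larger transfer units can be approximated with a deliberately simplified model. It assumes a chatty protocol that waits one full round trip for every block it sends, which overstates the cost for protocols that keep several blocks in flight. The function names, block sizes, and the 80 ms round-trip time below are illustrative assumptions only.

```python
import math


def round_trips(file_bytes, block_bytes):
    """Number of request/acknowledge exchanges when one block is sent per exchange."""
    if block_bytes <= 0:
        raise ValueError("block_bytes must be positive")
    return math.ceil(file_bytes / block_bytes)


def latency_cost_seconds(file_bytes, block_bytes, rtt_ms):
    """Time spent purely waiting on round trips, ignoring transmission time."""
    return round_trips(file_bytes, block_bytes) * rtt_ms / 1000


if __name__ == "__main__":
    file_size = 10_000_000   # a 10 MB file
    rtt_ms = 80              # assumed WAN round-trip time
    for block in (4096, 65536):
        trips = round_trips(file_size, block)
        wait = latency_cost_seconds(file_size, block, rtt_ms)
        print(f"{block}-byte blocks: {trips} round trips, {wait:.1f} s waiting")
```

Under these assumptions, moving from 4 KB to 64 KB blocks cuts the exchanges for a 10 MB file from 2,442 to 153, which is why latency rather than raw bandwidth often dominates transfer time on long links.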
Role of WAFS
Wide-area file services (WAFS) is receiving increased attention as a means of IT consolidation. WAFS uses WAN connections to share applications and data directly from the corporate data center. Changed files are saved directly to the data center. This effectively eliminates servers and other IT infrastructure at remote sites. Although WAFS is not a remote backup technology, it can potentially eliminate the need for remote backups simply because there is no remote data to back up; all data used by remote offices is obtained from (and retained in) the primary data center.
To mitigate bandwidth limitations and latency issues, WAFS is typically implemented through remote appliances that locally cache the most frequently used files from the data center. One example of this is the Steelhead appliance family from Riverbed Technology Inc. Any changes to the data are held in the local appliance's cache until the cache can be resynchronized with the data center. Techniques like data deduplication and compression are frequently employed to reduce WAFS bandwidth use and improve apparent performance.
WAN interruptions can be particularly troublesome for a WAFS implementation -- locally cached files cannot be updated back at the data center, and new (uncached) files cannot be obtained until the WAN link returns. Organizations considering WAFS deployments must pay particular attention to WAN reliability. In some cases, a backup WAN link is used to provide redundancy.
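The caching and outage behavior described above can be modeled with a short sketch. It assumes a write-back cache in which a plain dictionary stands in for the data center; the EdgeCache class, its method names, and the online flag are hypothetical, and the sketch omits cache eviction and conflict handling, which a real appliance would need.

```python
class EdgeCache:
    """Toy model of a remote-office cache in front of a central file store."""

    def __init__(self, origin):
        self.origin = origin   # dict of path -> bytes, standing in for the data center
        self.cache = {}        # locally cached copies
        self.dirty = set()     # paths changed locally but not yet synchronized
        self.online = True     # whether the WAN link is currently up

    def read(self, path):
        if path in self.cache:
            return self.cache[path]            # served locally, no WAN traffic
        if not self.online:
            raise ConnectionError(f"{path} is not cached and the WAN link is down")
        self.cache[path] = self.origin[path]   # fetch once, then keep locally
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data
        self.dirty.add(path)
        self.sync()                            # no-op while the link is down

    def sync(self):
        """Push pending changes to the data center; return how many were sent."""
        if not self.online:
            return 0
        sent = len(self.dirty)
        for path in self.dirty:
            self.origin[path] = self.cache[path]
        self.dirty.clear()
        return sent


if __name__ == "__main__":
    data_center = {"plan.doc": b"v1", "budget.xls": b"2024"}
    edge = EdgeCache(data_center)
    edge.read("plan.doc")            # now cached at the remote office
    edge.online = False              # WAN outage begins
    edge.write("plan.doc", b"v2")    # held locally; the data center still has v1
    edge.online = True               # link restored
    print(edge.sync(), "change(s) resynchronized")
```

During the simulated outage, cached files stay usable and changes queue up locally, while a request for an uncached file fails outright. That second case is the reason a redundant WAN link is worth considering.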