Published: 07 Oct 2005
Wide-area file services (WAFS) products address remote-office file access by:
- Locking open files globally to prevent multiple people from simultaneously editing a file
- Controlling which files get sent to specific remote offices and under what circumstances
- Transmitting only file changes when files are saved or closed
- Working in conjunction with existing remote file servers or by replacing them
- Minimizing WAN traffic
WAFS products--whether software or appliances--address different file-server configurations and enterprise needs. For example, Mobilize, from Burlington, MA-based Signiant Inc., is deployed as an agent on remote file servers, but it provides only point-in-time replication and lacks the edit locking and synchronous file services that most WAFS products offer. Conversely, Steelhead, from San Francisco-based Riverbed Technology Inc., is installed as an appliance in the network, where it maintains real-time file consistency and accelerates all TCP-based application traffic, such as e-mail.
Most WAFS software is intended for small companies or departments that need to keep their existing remote file servers and allow remote users to collaborate on files at near-LAN speeds. Hardware WAFS products deliver files at LAN speeds, and manage data and file servers centrally. Many WAFS products also offer disk caches with file segmentation, application optimization and data compression as standard features.
Software-only WAFS products require installing an agent on file servers at the data center and at each remote location. The storage administrator must configure which directories and/or files are replicated to the remote sites. (Tacit's IShared, classified as a hardware product in this overview, is also available in a software-only version that doesn't require server agents.) One distinct benefit that most software products provide and appliances don't is that they allow groups of remote offices to collaborate and share data using their existing file servers.
Signiant's Mobilize provides point-in-time file replication that can run as often as once a minute to keep files consistent. Mobilize sends changes in byte-level increments, but it was originally designed as a data replication and protection product, and files can fall out of sync if users at different locations modify the same file between replication runs. To help recover from such inconsistencies, Mobilize includes file versioning that tracks changes and can retain up to 39 versions of the same file. If consistency problems do occur, someone must review each version of the file to determine which is the most current or correct.
Availl 3.0, from Andover, MA-based Availl Inc., is another software-only WAFS product. It tries to overcome potential file inconsistencies by maintaining a real-time copy of all files on the file servers it manages. Availl creates a mirrored disk cache on all of the different file servers. Every byte-level change to each file is mirrored to all of the file servers as changes occur. Availl claims this lets users at multiple sites work on the same file at the same time and see their edits and changes as they occur. But this real-time approach can give users a false sense of security; if a WAN link to a site fails, a change made at the offline site may still need to be reconciled if the file was also changed at other sites.
The initial synchronization of files between sites is slow and should be scheduled for off-hours or other relatively inactive periods. To minimize network traffic, Availl and Signiant use bandwidth optimization techniques such as data compression, and send only file changes once the initial synchronization completes. However, both products add CPU and memory overhead, so the performance of busy file servers may suffer during peak times.
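The idea of sending only file changes can be illustrated with a small sketch. It is not how Availl or Signiant implement their products, which the article does not describe; it assumes Python's standard `difflib.SequenceMatcher` as the comparison engine, and the names `make_delta` and `apply_delta` are hypothetical. `SequenceMatcher` is slow on large inputs, so this is suitable only for illustration.

```python
from difflib import SequenceMatcher

def make_delta(old: bytes, new: bytes):
    """Return the edits that turn old into new, omitting unchanged ranges.

    Each edit is (start, end, replacement): replace old[start:end]
    with the replacement bytes.
    """
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

def apply_delta(old: bytes, delta) -> bytes:
    """Rebuild the new file from the old copy plus the delta."""
    out, pos = [], 0
    for start, end, replacement in delta:
        out.append(old[pos:start])  # unchanged bytes before this edit
        out.append(replacement)     # inserted bytes (empty for a deletion)
        pos = end                   # skip the bytes this edit removed
    out.append(old[pos:])           # unchanged tail
    return b"".join(out)

old = b"quarterly report draft"
new = b"quarterly sales report"
delta = make_delta(old, new)
payload = sum(len(replacement) for _, _, replacement in delta)
print(f"full file: {len(new)} bytes, delta payload: {payload} bytes")
```

Only the delta needs to cross the WAN; the receiving site applies it to the copy it already holds.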
These two products also differ in the number of operating systems and file-sharing network protocols they support. Availl 3.0 supports Windows file servers and the CIFS protocol. Signiant's Mobilize works with a much wider range of platforms, including Linux, VMware, Windows and different varieties of Unix, and it supports the CIFS and NFS protocols. A major deficiency of both products is that they cannot accommodate NAS appliances such as Network Appliance (NetApp) Inc. filers or EMC Corp.'s Celerra.
WAFS appliances are available in three forms:
- An integrated hardware and software appliance
- Software loaded on a standard, off-the-shelf server
- A blade on an Ethernet switch
Compared with software-only products, appliances offer these advantages:
- No agents to deploy
- Operating system agnostic
- Works with any vendor's NAS appliances
- Removes most responsibility for data management from the remote office
- Stores all files in the central office data center
- Allows organizations to introduce new functions beyond WAFS
While all appliances deliver protocol optimization and some replication features, not all appliances work with file systems, an important consideration for a WAFS implementation. Juniper Networks Inc., Network Executive (NetEx) Software and Orbital Data Corp. offer protocol optimization and replication in their respective WX, HyperIP and TotalTransport products. Each product complements existing file-server replication and synchronization approaches by eliminating or reducing the network overhead created by the chattiness of the CIFS protocol.
WAFS products offer many attractive benefits, but the following questions should be considered before they're implemented:
Do you know what applications remote offices run? Applications use chatty protocols like CIFS and MAPI to communicate with file servers. Although they work fine in remote offices, they're not optimized for WAN connections. Document which applications branch offices use and which files the applications open, create, save and delete.
What happens to existing file servers? When WAFS appliances go into remote offices, remote file servers or NAS appliances go out. Establish how these servers or filers will be managed or disposed of to prevent the remote office from reusing them without your knowledge.
Is there a pre-installation checklist? All vendors promise nearly transparent installations, but ask for a pre-install checklist to ensure it goes as smoothly as possible. The vendor's reaction to this request can tip you off as to how well-prepared they are to do the install.
How will the disk cache be managed? Disk caches increase the performance and availability of files at remote sites, but they also introduce remote management issues. Make sure you verify what RAID levels the disk cache supports, how the appliance reports drive failures and who will manage the appliance when problems occur.
Remote office users may suffer when WAN links slow down, CAD programs open large numbers of files or Microsoft Office applications create multiple temporary files. To ease these pain points, four vendors--Cisco, DiskSites, Riverbed Technology and Tacit Networks Inc.--now include disk caches in their appliances that keep local copies of centrally stored files. Disk caches at the remote office maintain a limited or complete copy of the central office files needed by the remote office and provide remote users with the following benefits:
- Near-LAN speed access to centrally stored files
- Local access to files should the WAN link drop or the central server go offline
Cisco also offers disk caches on its Wide Area Application Engine (WAE) line of products. "The working set of data for branch-office users is typically less than 10% of the total data repository," says Baruch Deutsch, director of product marketing for Cisco's caching division. "Generally, 90% of the branch user file requests can be served locally for the data set in the cache. We've discovered that the amount of storage needed to handle these requests and store these files is often only in the hundreds of megabytes." Despite this statistic, Cisco's remote office WAE disk cache starts at 80GB and tops out at 840GB.
But as the size and number of files on remote disk caches increase, the challenge of maintaining and supporting the availability, consistency and integrity of these files across the enterprise grows. In addition, apps like CAD, e-mail, Office and enterprise resource planning create unique problems such as opening multiple network files, creating temporary files or using chatty network protocols that are acceptable in LAN environments but can choke WAN links. Meeting these issues head-on requires WAFS vendors to fine-tune their products for different mission-critical applications.
With applications designed to run in a high-bandwidth LAN environment, a routine operation like "Open" or "Save" can clog the WAN with network traffic. For instance, opening a single AutoCAD drawing that links to other files causes AutoCAD to access and open those files as well. Similarly, a "Save" operation from a Microsoft Office application generates 1,000 to 1,500 transactions on the network. While each transaction may be small, thousands of them together slow performance across a WAN link.
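A rough calculation shows why the transaction count matters more than the size of each transaction. The sketch below assumes every transaction costs one sequential round trip, and the round-trip times (0.5 ms for a LAN, 80 ms for a WAN) are illustrative assumptions rather than measurements; only the range of 1,000 to 1,500 transactions comes from the figures above.

```python
def save_delay_seconds(transactions: int, round_trip_ms: float) -> float:
    """Time spent waiting on the network if every transaction is one
    sequential round trip (a simplifying assumption)."""
    return transactions * round_trip_ms / 1000.0

LAN_RTT_MS = 0.5   # assumed, illustrative
WAN_RTT_MS = 80.0  # assumed, illustrative

for transactions in (1000, 1500):
    lan = save_delay_seconds(transactions, LAN_RTT_MS)
    wan = save_delay_seconds(transactions, WAN_RTT_MS)
    print(f"{transactions} transactions: LAN {lan:.2f} s, WAN {wan:.1f} s")
```

Under these assumptions, a save that completes in under a second on the LAN takes between one and two minutes over the WAN, even though little data is moved.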
Appliance vendors use different techniques to minimize the impact of application file operations on WAN performance. For temporary files that applications such as AutoCAD or Microsoft Office create as scratch files, DiskSites' FileController and Tacit Networks' IShared use file-aware differencing technology. This configurable option lets these applications store temporary files on the remote appliance's disk cache, and can prevent the files from being transmitted to the central office until the primary file is actually saved.
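The effect of keeping scratch files off the WAN can be sketched with a toy cache model. This is not DiskSites' or Tacit's implementation; the `RemoteCache` class and the file-name patterns are hypothetical, and for simplicity temporary files are never queued at all.

```python
from fnmatch import fnmatch

# Hypothetical patterns for application scratch files; a real product
# would presumably make this list configurable.
TEMP_PATTERNS = ("~$*", "*.tmp", "*.ac$")

class RemoteCache:
    """Toy model of a remote-office disk cache."""

    def __init__(self):
        self.local = {}      # every file written at the remote office
        self.wan_queue = []  # names of files waiting to go to the central office

    def is_temp(self, name: str) -> bool:
        return any(fnmatch(name, pattern) for pattern in TEMP_PATTERNS)

    def write(self, name: str, data: bytes) -> None:
        self.local[name] = data
        # Scratch files stay in the local cache; only primary files are queued,
        # and each primary file is queued at most once.
        if not self.is_temp(name) and name not in self.wan_queue:
            self.wan_queue.append(name)

cache = RemoteCache()
cache.write("~$plan.doc", b"lock info")     # scratch file
cache.write("~WRL0001.tmp", b"scratch")     # scratch file
cache.write("plan.doc", b"saved document")  # primary file
print(cache.wan_queue)
```

All three files are available locally, but only the primary document is scheduled for transmission.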
Cisco's WAE and Riverbed Technology's Steelhead appliances send and receive all files, but minimize network traffic by first breaking up and storing files as file segments. Cisco's WAE transmits only changed file segments across the WAN. Riverbed Technology similarly stores like segments from different files together, but takes the concept one step further. Its Steelhead appliance stores file segments approximately 100 bytes long; the appliance then recognizes which segments have the same underlying structure and stores them as one logical segment. This approach led to the discovery that many of the files companies work with are copies of files that already exist. The technique reduces the amount of new data, the amount of data held in the remote office disk cache, and the number of changes that must be transmitted over the WAN.
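A minimal sketch of segment-level deduplication follows. It is not Cisco's or Riverbed's algorithm, which the article does not detail; it assumes fixed 100-byte segments identified by SHA-256 digests, and the `SegmentStore` class is hypothetical. A fixed-size split does not cope well with insertions that shift later data, so treat it as an illustration of the storage idea only.

```python
import hashlib

SEGMENT_SIZE = 100  # bytes; a fixed size is an assumption of this sketch

def segments(data: bytes):
    """Split data into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)]

class SegmentStore:
    """Stores each distinct segment once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.by_digest = {}

    def add_file(self, data: bytes):
        """Return (recipe, new_bytes): the digest list that rebuilds the file
        and the number of bytes that were not already in the store."""
        recipe, new_bytes = [], 0
        for segment in segments(data):
            digest = hashlib.sha256(segment).hexdigest()
            if digest not in self.by_digest:
                self.by_digest[digest] = segment
                new_bytes += len(segment)
            recipe.append(digest)
        return recipe, new_bytes

    def rebuild(self, recipe) -> bytes:
        return b"".join(self.by_digest[digest] for digest in recipe)

store = SegmentStore()
original = bytes(range(256)) * 4                               # 1,024-byte file
edited = original[:500] + b"X" * SEGMENT_SIZE + original[600:]  # one segment changed

_, first_cost = store.add_file(original)
recipe, second_cost = store.add_file(edited)
print(f"first file: {first_cost} new bytes, edited copy: {second_cost} new bytes")
```

Because the edited copy shares all but one segment with the original, only that segment is new to the store, and only it would need to cross the WAN.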
WAFS products are evolving to address a variety of enterprise-wide concerns. While some of the available WAFS products effectively solve the problem of real-time file consistency, an extended WAN outage can, of course, affect the currency of file synchronization operations. Still, they can help organizations get much closer to providing central and branch-office users with secured, simultaneous access to files. The introduction of file segmentation and application optimization techniques gives remote users the near-LAN speed they require and the centralized file protection and management data centers seek.