In data backup and recovery parlance, a storage snapshot is a copy of a set of files, directories and/or volumes as they were at a specific point in time.
Snapshot technology was originally architected to solve several data backup problems, including:
- Backing up data that's too large to complete in the allocated time
- Failing to back up data because it has moved from a directory that hasn't been backed up to one that already has been backed up
- Corruption of backed up data that can occur when it's being written to while it's being backed up
- The effect on application performance while a backup is in progress
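The corruption problem in the list above can be illustrated with a toy model. The following is only a sketch under stated assumptions: the two-record "volume", the `transfer` function and the use of a dictionary copy as a stand-in for a snapshot are all hypothetical, and real snapshot implementations do not copy an entire volume this way.

```python
# Toy model: a "volume" of two account balances whose total must always be 100.
live = {"a": 60, "b": 40}

def transfer(volume, amount):
    """An application write that touches two records at once."""
    volume["a"] -= amount
    volume["b"] += amount

# Naive backup: copy record "a", let the application write, then copy record "b".
naive_backup = {}
naive_backup["a"] = live["a"]      # copies a = 60
transfer(live, 25)                 # a write lands in the middle of the backup
naive_backup["b"] = live["b"]      # copies b = 65

# Snapshot-style backup: freeze a point-in-time image first, then copy from it.
snapshot = dict(live)              # point-in-time image: a = 35, b = 65
transfer(live, 10)                 # later writes do not alter the image
snapshot_backup = {key: snapshot[key] for key in ("a", "b")}

print(sum(naive_backup.values()))     # 125: a mix of two moments in time
print(sum(snapshot_backup.values()))  # 100: a single moment in time
```

The naive backup holds values from two different moments, so its total is wrong even though every individual copy succeeded. The backup taken from the frozen image is internally consistent.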
How to create a snapshot
A series of steps is required to initiate a snapshot:
- It starts with a command signaling that a backup is about to occur
- This command tells the system to quiesce the file system and apps running at that point in time
- The file system is then flushed so that any pending file transactions are completed
- The snapshot is then created
- Afterwards, the file system and applications are released to resume normal operations
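The sequence above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: `ToyVolume`, `quiesce` and `create_snapshot` are hypothetical names, and the in-memory dictionaries merely stand in for a file system with pending writes.

```python
from contextlib import contextmanager

class ToyVolume:
    """Hypothetical stand-in for a file system with pending (unflushed) writes."""
    def __init__(self):
        self.on_disk = {}
        self.pending = {}
        self.quiesced = False

    def write(self, path, data):
        if self.quiesced:
            raise RuntimeError("volume is quiesced; writes are held")
        self.pending[path] = data

    def flush(self):
        # Complete any pending file transactions.
        self.on_disk.update(self.pending)
        self.pending.clear()

@contextmanager
def quiesce(volume):
    # Hold application writes, and always release them afterwards.
    volume.quiesced = True
    try:
        yield volume
    finally:
        volume.quiesced = False

def create_snapshot(volume):
    with quiesce(volume):               # quiesce file system and apps
        volume.flush()                  # flush pending transactions
        image = dict(volume.on_disk)    # create the point-in-time image
    return image                        # normal operations have resumed

vol = ToyVolume()
vol.write("/db/orders", "v1")
snap = create_snapshot(vol)
vol.write("/db/orders", "v2")
vol.flush()
print(snap["/db/orders"], vol.on_disk["/db/orders"])  # v1 v2
```

Putting the release step in a `finally` clause reflects the last bullet: applications should resume even if the snapshot itself fails.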
Snapshot technology has also moved beyond just data protection. Snapshots are an efficient and non-disruptive way to test application software against real data without endangering live production data. They're also ideal for data mining and e-discovery. Snapshots have also evolved into a very effective -- even preferred -- disaster recovery methodology that protects against malware, human errors and data corruption.
Where snapshot technology resides
The common perception may be that snapshotting is a storage system feature, but that's only one place that the technology may reside. Snapshot technologies are generally available in seven different types of implementations:
- File systems of servers, desktops and laptops
- Logical volume managers (LVMs)
- Network-attached storage (NAS)
- Storage arrays
- Storage virtualization appliances
- Server virtualization hypervisors
- SQL databases
File system-based snapshots
File system-based snapshots are available in Microsoft Corp.'s Windows NTFS via Volume Shadow Copy Service (Shadow Copy in Vista); Novell Storage Services (NSS) on NetWare 4.11 or better; Novell's OES-Linux in SUSE Linux; and the Zettabyte File System (ZFS) on Sun Microsystems Inc.'s Solaris and Apple Mac OS X 10.6 (Snow Leopard).
One advantage of file system-based snapshots is that they tend to be "free" because they come with the file system. They also work well, and the latest file systems make them fairly easy to use. On the downside, each file system must be managed separately, which can become onerous as the number of systems proliferates. It also means that if snapshot replication is required, each file system must be set up to replicate its own snapshots. In addition, different file systems will likely vary in the kinds of snapshots they provide; snapshot frequency; the amount of capacity that must be reserved (if any); and snapshot setup, operations and manageability. The complexity increases as more servers and file systems must be managed.
Logical volume manager snapshots
Logical volume manager snapshot technology is available with Hewlett-Packard (HP) Co.'s HP-UX Logical Volume Manager, Linux Logical Volume Manager and Linux Enterprise Volume Management System; Microsoft's Logical Disk Manager for Windows 2000 and later; Sun Solaris 10 ZFS; and Symantec Corp.'s Veritas Volume Manager (part of Symantec Veritas Storage Foundation).
Logical volume manager snapshot technology can sometimes run across a number of file systems; for example, Symantec's Veritas Volume Manager can function with most common operating systems. LVMs also usually include storage multi-pathing and storage virtualization features.
When using LVMs, there are typically additional costs per server for license/maintenance fees. You may also confront the same issues of coordination and complicated implementations found with file system-based snapshots.
NAS-based snapshots
Network-attached storage (NAS) is essentially an optimized or specialized file system running on an appliance or an appliance integrated with storage. Most midrange and enterprise-class NAS systems provide snapshot capabilities, including those with proprietary operating systems and the wide variety of NAS systems that are based on Microsoft Windows Storage Server.
There's a lot to like about NAS-based snapshotting, including a common standard for all of the physical and virtual servers, desktops and laptops that connect to the NAS device. It's also very easy to implement, operate and manage. NAS-based snapshot technology tends to be integrated with Windows Volume Shadow Copy Service (VSS), as well as with backup servers and their agents. Some NAS vendors have their own agents for non-Windows structured data applications. Other NAS snapshot offerings include data deduplication (EMC Corp., FalconStor Software Inc. and NetApp), and some even offer thin snapshot provisioning that minimizes the amount of storage reserved for snapshots.
But there's a price to pay for the convenience and added features: fairly hefty software licensing and maintenance charges that are often system or capacity based. NAS systems tend to proliferate in most companies and, as they do, the number of touchpoints required for snapshots will also increase, making operations and management more complex.
Storage array-based snapshots
Storage array-based snapshots are included with most block-storage arrays' operating systems.
The advantages of using snapshotting that comes with the storage array operating system are similar to those of NAS-based snapshots. They provide a common standard and touchpoint for all of the physical and virtual servers, desktops and laptops connected to the array, and are easy to implement, operate and manage. And, like NAS, many storage arrays integrate their snapshot technology with Windows VSS, as well as with backup servers and their agents. Some vendors even provide their own agents for non-Windows structured data applications.
The drawbacks include hefty license and maintenance fees, limited integration with non-Windows-based structured data applications unless the vendor supplies its own agents, and increasing complexity as the number of storage systems grows.
Snapshots with storage virtualization appliances
Storage virtualization appliances are primarily SAN based with the exception of F5 Networks Inc.'s Acopia ARX, which is file (NFS) based. Other examples of virtualization appliances (or storage systems that incorporate virtualization) include Cloverleaf Communications Inc.'s Intelligent Storage Networking System (iSN), DataCore Software Corp.'s SANsymphony and SANmelody, EMC's Celerra Gateway blades, FalconStor's IPStor, Hewlett-Packard's XP series, Hitachi Data Systems' Universal Storage Platform V/VM, IBM's SAN Volume Controller, LSI Corp.'s StoreAge Storage Virtualization Manager (SVM) and NetApp's V-Series storage controllers.
Storage virtualization approaches to snapshots have the same advantages as storage array- and NAS-based snapshots, but offer others as well. They provide a common standard and point of management for multiple storage systems from a single or several vendors, aggregating them into fewer or just one image. This greatly simplifies snapshot management, operations and training.
The negatives related to storage virtualization-based snapshots are a bit different. These devices will add some transaction latency, even those that have split-path architectures, which ultimately affects app response time. The extra layer also complicates troubleshooting and has the potential to exacerbate multivendor finger-pointing. And while the additional hardware or software comes with a price, it may be offset by lower software license or maintenance fees for the virtualized storage.
Snapshots with server virtualization hypervisors
The ascendancy of server virtualization has made hypervisor-based snapshot technology progressively more popular. This technology is available with virtualization software such as Citrix Systems Inc.'s XenServer, Microsoft's Hyper-V, Sun's xVM Ops Center, and VMware's ESX and vSphere 4.
The advantages of using hypervisor-based snapshots are straightforward. The technology comes bundled with the hypervisor; it provides the same snapshot methodology for all virtual machines (VMs); it's integrated with Microsoft's VSS; and it's easy to implement, use and manage.
What's not to like about this approach? Snapshots must be managed separately for each hypervisor, and when snapshots are used for any OS other than Windows, only the entire VM will be imaged. That means restores are coarse-grained and time consuming, and the snapshots aren't structured-data-aware outside of Windows and may produce inconsistent images.
Snapshots with SQL databases
In SQL databases, snapshotting is called "snapshot isolation." Databases such as Oracle and PostgreSQL depend on snapshot isolation for their serializable transaction mode, in which transactions appear to be isolated and serially executed. Other SQL databases also support snapshot isolation but don't require it for serialization. In general, SQL databases' backup features take advantage of snapshot isolation to provide crash-consistent dumps of tables.
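The idea behind snapshot isolation can be sketched with a toy multi-version store. This is an illustration only, assuming a simple scheme in which each commit gets an increasing number and a reader sees only versions committed at or before its snapshot number. `VersionedStore` and its methods are hypothetical and do not represent how Oracle, PostgreSQL or any other database is actually implemented.

```python
class VersionedStore:
    """Toy multi-version store: every committed write keeps its commit number."""
    def __init__(self):
        self.commit_no = 0
        self.versions = {}   # key -> list of (commit_no, value), oldest first

    def commit(self, changes):
        # Apply a whole transaction under one new commit number.
        self.commit_no += 1
        for key, value in changes.items():
            self.versions.setdefault(key, []).append((self.commit_no, value))

    def begin_snapshot(self):
        # A reader remembers the latest commit that existed when it started.
        return self.commit_no

    def read(self, key, snapshot_no):
        # Return the newest version committed at or before the snapshot.
        visible = [value for number, value in self.versions.get(key, [])
                   if number <= snapshot_no]
        return visible[-1] if visible else None

store = VersionedStore()
store.commit({"stock": 10, "price": 5})
dump_view = store.begin_snapshot()     # a table dump starts here
store.commit({"stock": 9})             # concurrent transactions commit
store.commit({"price": 6})

dump = {key: store.read(key, dump_view) for key in ("stock", "price")}
latest = {key: store.read(key, store.begin_snapshot()) for key in ("stock", "price")}
print(dump)    # {'stock': 10, 'price': 5}
print(latest)  # {'stock': 9, 'price': 6}
```

The dump sees the tables exactly as they stood when it began, even though other transactions committed while it was running. That is the property a backup feature relies on for a consistent dump.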
The main advantage of using SQL database snapshot technology is that snapshots of the database, and any applications based on the database, will be crash-consistent.
But there are some significant disadvantages. The snapshot technology is very limited and it only works with that particular database and the apps tied to it. It doesn't work with the file system, any other application on the server, or with other databases or servers. So you'll need other snapshot technologies or data protection, thus complicating operation and management.
This article originally appeared in Storage magazine.
About the author: Marc Staimer is the founder, senior analyst, and CDS of Dragon Slayer Consulting in Beaverton, OR. The consulting practice of 11 years has focused in the areas of strategic planning, product development, and market development. With over 28 years of marketing, sales and business experience in infrastructure, storage, server, software, and virtualization, he's considered one of the industry's leading experts. Marc can be reached at firstname.lastname@example.org.