

Can LTFS save tape?

The Linear Tape File System (LTFS) makes tape look like a file system, enabling drag-and-drop operations that resemble a NAS share. We'll see broader applications soon.

The Linear Tape File System has been available for almost three years. It was released when Linear Tape-Open (LTO) drives reached their fifth generation (LTO-5). As a file system, LTFS has since matured and currently supports the major operating systems: Windows, Linux and Mac OS X. Now, LTO-6 with LTFS seems ready to reestablish tape as a primary data center storage device and may be able to move tape beyond its old boundaries of backup and archive.

LTFS defined

The goal of LTFS was to simplify the way users interact with the tape device itself. The designers of LTFS wanted to make tape as easy to use as a USB stick: just plug it in and start writing data. But due to tape's capacity, data would be measured in terabytes instead of the gigabytes a USB stick could store. In large part, LTFS has achieved that goal. Users simply need to insert an LTFS-formatted tape into an LTO-5 or LTO-6 tape drive, and within about a minute the LTFS volume appears on the user's desktop. From there it can be browsed just like an external hard disk or flash memory stick, just a little slower.

LTFS relies on two changes from previous LTO generations. The first is the creation of partitions on tape, one of which LTFS uses to store catalog information about the files written to that tape. The second is a series of drivers for each of the major operating systems listed earlier.

When the driver is installed and a tape is inserted into an attached tape drive, the driver reads the catalog partition and presents a folder structure similar to what a user sees when they browse a hard disk or USB drive. Files can be dragged to and from this tape volume just like a hard disk, and the driver updates the catalog and stores the data on tape.
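Because the driver presents the tape as an ordinary volume, a script can write to it with standard file operations. The following is a minimal Python sketch under that assumption; the function name copy_to_ltfs and the mount point in the usage note are hypothetical and not part of any LTFS package.

```python
import shutil
from pathlib import Path


def copy_to_ltfs(files, ltfs_mount):
    """Copy files onto a mounted LTFS volume with ordinary file operations.

    ltfs_mount is wherever the operating system exposes the tape
    (a hypothetical path here). Nothing tape-specific is called: the
    LTFS driver is assumed to handle the catalog update and the write.
    """
    mount = Path(ltfs_mount)
    if not mount.is_dir():
        raise FileNotFoundError(f"no LTFS volume mounted at {mount}")

    copied = []
    for src in map(Path, files):
        dest = mount / src.name
        shutil.copy2(src, dest)  # plain copy, same call as for a disk target
        copied.append(dest)
    return copied
```

A call such as copy_to_ltfs(["project.tar"], "/mnt/ltfs") would behave the same whether the target is a tape, a USB stick or a disk directory, which is the point of the design: the application needs no knowledge of the medium.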

Why external storage is so important

In today's data center, backup and replication happen in near-real time and repeatedly throughout the day. The relative ease with which near-zero data loss and high application availability can be delivered is unprecedented. While valuable for data center managers, this ability to make nearly instant copies carries some risks. As a result, there are specific needs for external storage, and tape, especially with LTFS, competes with other external storage technologies to meet them.

First, thanks to repeated backups and replicated storage, data that has been improperly modified or corrupted can be instantly propagated through the infrastructure, making the "last known good copy" almost impossible to find. There have even been cases where an accidental deletion is replicated and data is erased throughout the environment. Similarly, a virus infection can be replicated into the copy data set, driving the need for an unconnected, offline copy of data that is immune to these situations.

The second use case is caused by data growth outpacing available wide-area network (WAN) bandwidth. While bandwidth has improved and is now able to keep secondary sites synchronized to within seconds of primary sites thanks to incremental replication, there are times when a full data set still needs to be transferred to another location. Examples include seeding a cloud storage facility with a data baseline, sharing a large project with a collaborating firm or delivering the legal documents requested in a discovery motion. These are situations where the bandwidth of a FedEx truck can deliver a larger overnight payload than the fastest WAN connections. There's a need for a storage device that can be easily transported in such a fashion.
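The comparison between a WAN link and a courier shipment comes down to simple arithmetic. The sketch below is illustrative only: the function names, the 80% link efficiency and the 24-hour transit time are assumptions chosen for the example, not figures from any vendor.

```python
def wan_transfer_hours(data_tb, link_mbps, efficiency=0.8):
    """Hours needed to send data_tb terabytes over a WAN link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable for payload.
    """
    bits = data_tb * 1e12 * 8                  # decimal terabytes to bits
    usable_bps = link_mbps * 1e6 * efficiency  # usable bits per second
    return bits / usable_bps / 3600


def shipment_equivalent_mbps(data_tb, transit_hours=24):
    """Effective bandwidth, in Mbps, of physically shipping data_tb terabytes."""
    bits = data_tb * 1e12 * 8
    return bits / (transit_hours * 3600) / 1e6
```

Under these assumptions, a 50 TB data set needs about 139 hours (nearly six days) on a 1 Gbps link, while shipping the same 50 TB overnight works out to an effective rate of roughly 4.6 Gbps.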

Finally, there's the financial reality of storing so much copy data on disk for years. In addition to the physical cost of the disk capacity itself, the cost of data center floor space, electricity and cooling can be significant. With the move to big data real-time analytics, there's a greater need to keep more data online. But there are also sets of data for which the recovery point is known or where each iteration of a file doesn't need to be kept online. When these delineations are understood, a strategy of moving as much of the non-analytic data to an external, offline and compact device makes sense. For example, a database backup has limited online value after two or three more recent backups have been taken, eliminating the need to keep dozens of copies on disk and available for immediate restore.

The LTFS advantage

Tape with LTFS has several advantages over the other external storage devices it would typically be compared to. First, tape has been designed from Day 1 to be an offline device and to sit on a shelf. External hard drives weren't designed to be powered off and stored for years on a shelf. With the appropriate LTFS driver installed, tapes can be inserted into any LTO-5 or LTO-6 drive and read. No special application is required, eliminating what has historically been a big shortcoming of tape: the need to run, at every location, the same proprietary software that wrote the tape in order to read its proprietary format.

An LTFS-formatted LTO-6 tape can store 2.5 TB of uncompressed data and roughly 6 TB with compression, depending on how compressible the data is. That means many data centers could fit their entire data set into a small FedEx box. Also, tape is more rugged than other portable forms of storage and better designed for transport. Again, with LTFS the sending and receiving data centers no longer need to be running the same application to access the data on the tape.
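To estimate how many cartridges a shipment would take, divide the data set by the per-tape capacity and round up. This short sketch uses the 2.5 TB uncompressed LTO-6 figure cited above; the function name is hypothetical, and the calculation ignores any space LTFS reserves for its catalog partition.

```python
import math


def tapes_needed(data_tb, capacity_tb=2.5):
    """Cartridges required for data_tb terabytes of uncompressed data.

    capacity_tb defaults to the uncompressed LTO-6 capacity; catalog
    partition overhead is not modeled (a simplifying assumption).
    """
    if data_tb <= 0:
        return 0
    return math.ceil(data_tb / capacity_tb)
```

For example, a 40 TB data set would need 16 LTO-6 cartridges without compression, and a 41 TB data set would need 17.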

While data deduplication can help make storage more efficient by eliminating redundant copies of data on disk, it's seldom implemented on primary storage. Usually, a single file exists on primary, secondary and backup storage, each having to run its own independent deduplication processes. Tape allows for a clean sweep of data that simply doesn't need to be on any form of disk but still needs to be kept. The cost and capacity of tape make these "just in case" copies very affordable.

Archivers leverage LTFS

The previously cited examples of LTFS applications can all be achieved without any additional software, using just the LTFS drivers that are freely available. Archive application vendors have been quick to adopt LTFS to make their products more appealing. Most of these offerings will integrate a hard disk system of your choosing with a tape library and automate the movement of data between disk and tape.

Users interact with the archive as if it were just another NFS or CIFS mount point on their networks. Data is then automatically copied to one or multiple tapes as a matter of policy, but also retained on disk for fast access. As the disk area fills up with data, again based on policy, data is removed from the disk and is then only available on tape.
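The disk-then-tape policy described above can be sketched as a simple watermark rule. This is an illustration, not any vendor's implementation: the CachedFile record, the function name and the 90%/70% thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size_gb: float
    last_access: float  # timestamp; a smaller value means older access
    on_tape: bool       # True once the policy has copied the file to tape


def select_for_eviction(files, cache_capacity_gb, high_water=0.9, low_water=0.7):
    """Choose files to remove from the disk cache.

    Nothing is evicted until usage passes the high watermark. After that,
    the least recently accessed files that already have a tape copy are
    selected until usage falls to the low watermark.
    """
    used = sum(f.size_gb for f in files)
    if used <= cache_capacity_gb * high_water:
        return []

    target = cache_capacity_gb * low_water
    evict = []
    for f in sorted(files, key=lambda f: f.last_access):
        if used <= target:
            break
        if f.on_tape:  # never drop the only copy of a file
            evict.append(f)
            used -= f.size_gb
    return evict
```

Requiring a tape copy before eviction mirrors the sequence in the text: data is copied to tape first and removed from disk only later, so a file that leaves the cache is always still retrievable.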

The technology to merge disk and tape into a single mount point has been around for decades. While archive products pairing disk and tape were successful in a few niche markets, broad adoption has been hampered in large part because of the proprietary nature of the way the archive products wrote data to tape. It meant that applications had to remain running for decades, although few applications had that life expectancy.

LTFS removes that concern. If the archive application can output data in the LTFS format, that data can be moved from application to application or sent to a site with no archive application at all. This capability allows users to select different applications based on changing needs. It also forces archive application vendors to remain competitive from a development standpoint and allows the movement of project data between companies without requiring them to standardize on the same application.

The future of LTFS

LTFS backup. The next step for LTFS is for backup application vendors to embrace the standard. This will allow data portability between backup applications in the same manner it has among archive applications. Any time a data center decides to switch backup applications, it needs to factor in the cost of running a single copy of the old application even if it's backing up to disk, since most backup vendors write to disk and tape in proprietary formats. With LTFS support, a user could simply keep historical backups on tape that could be imported into the new application when needed.

LTFS-integrated NAS. One of the biggest challenges facing storage managers is keeping up with the growth of unstructured data. Much of this data doesn't need high-performance hardware. An ideal solution would be a NAS with integrated tape, essentially a primary storage version of the archive example above, but with a larger, higher-speed disk cache. Data could then be automatically protected and eventually migrated from primary storage. Imagine a high-speed but cost-effective set of solid-state drives acting as the primary storage tier, with older data being moved to and recalled from LTFS tape without IT intervention.

Direct execution. The final step in LTFS evolution is direct execution of data from the tape device and even direct modification. This means data wouldn't need to be restored to a disk area prior to recovery. A simple example would be streaming a video file from tape instead of transferring it to disk first. This would be ideal in situations where something needed to be looked up from the archive but not recovered. A database application with direct access to LTFS would be able to do this by extending its database to tape and enabling the search for old records or documents directly on tape. Another example would be using Microsoft SharePoint's Remote BLOB Storage feature to move older documents or spurious revisions of documents to a tape-based storage area.

The lowdown on LTFS

LTFS has the potential to change the way data centers think of and use tape. In the past, LTO and other tape formats were clumsy, slow and difficult to deal with. Now, with LTFS, it's as easy to interact with tape as it is with any other storage device. It has immediate application when data transfer volumes are high and bandwidth is low. But it also has broader implications for backup and archive processes, and potentially for the way databases and file systems access data.

About the author: 
George Crump is president of Storage Switzerland, an IT analyst firm focused on storage and virtualization.
