One of the givens in our industry is that storage capacities are continuing to explode. Data is going to grow 50% to 60% per year over the next five years, and in excess of 85% of that growth will be unstructured data; in other words, file system-based data.
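To put that growth rate in perspective, here is a minimal sketch of what 50% to 60% per year compounds to over five years. It assumes the rate holds steady and compounds annually, which is a simplification; the function name is purely illustrative.

```python
def growth_multiplier(annual_rate: float, years: int) -> float:
    """Factor by which data grows if it compounds at annual_rate each year."""
    return (1.0 + annual_rate) ** years


if __name__ == "__main__":
    # Rates and horizon taken from the figures quoted above;
    # steady annual compounding is an assumption.
    for rate in (0.50, 0.60):
        factor = growth_multiplier(rate, 5)
        print(f"{rate:.0%} per year for 5 years -> {factor:.1f}x today's data")
```

Under those assumptions, a shop sitting on 100 TB today would be looking at somewhere between roughly 760 TB and 1 PB in five years, which is why petabyte-class platforms come into the picture.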
So now, companies have to realistically think about how they put platforms in place that can store petabytes of data over time. The older architectures, the more monolithic approaches, don't provide a good way to do that cost-effectively. So we're seeing the introduction of storage grids and scale-out approaches targeted at unstructured data -- things like Web 2.0 and then also secondary applications like backup and archive.
These dense storage platforms use a very different approach from the monolithic architectures of the past. You can pay as you grow with them and you can add performance, I/O or capacity to these things independently, which gives you a lot of flexibility in building the configuration that best meets your performance requirements.
The key issues to look for are petabyte-class scalability and the ability to maintain high levels of I/O performance at that scale of storage capacity. You're also going to look at data reliability, and you'll want data deduplication technologies that are integrated into the platform so that you can use the deduplication multiplier to lower the overall cost of storage.
For example, if you're paying $8 a gigabyte for SATA-based storage but achieving a 10:1 data reduction ratio with data deduplication, you're getting that down to under $1 a gigabyte. That's a key economic argument from the vendors playing in this space. If you're pitching these platforms for backup or archive use, they are clearly going up against tape, and tape is 20 cents to 30 cents per gigabyte.
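The arithmetic behind that argument can be sketched in a few lines. This is a rough illustration using only the prices quoted above; it compares media cost per gigabyte and nothing else (no power, floor space, software licensing or handling costs), and the function names are made up for the example.

```python
def effective_cost_per_gb(raw_cost_per_gb: float, reduction_ratio: float) -> float:
    """Cost per gigabyte of logical data once deduplication is applied."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


def break_even_ratio(raw_cost_per_gb: float, target_cost_per_gb: float) -> float:
    """Reduction ratio needed for disk to match a target cost per gigabyte."""
    if target_cost_per_gb <= 0:
        raise ValueError("target_cost_per_gb must be positive")
    return raw_cost_per_gb / target_cost_per_gb


if __name__ == "__main__":
    sata = 8.00  # dollars per GB, from the example above
    print(f"10:1 dedupe: ${effective_cost_per_gb(sata, 10.0):.2f} per GB")
    for tape in (0.20, 0.30):  # quoted tape price range, dollars per GB
        ratio = break_even_ratio(sata, tape)
        print(f"Ratio needed to match tape at ${tape:.2f}/GB: {ratio:.0f}:1")
```

At 10:1 the disk figure works out to 80 cents a gigabyte, so on media cost alone it is still above the quoted tape range. The break-even calculation shows how much higher the reduction ratio would have to go to close that gap.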
There's one final area you would need to evaluate these platforms on: whether or not they've got an integrated replication capability, which is really the way you solve backing up that data storage cell, providing a disaster recovery solution, etc. If we're talking about a data store that's 400 TB in size, there's no way you're going to be able to back that up with tape. There has to be another approach, and what's becoming the accepted way to address that these days is by replicating that to an off-site location, such as a mirror platform.
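To see why tape struggles at that scale, here is a back-of-the-envelope estimate of a full backup window. The drive count and per-drive throughput are hypothetical placeholders, not figures from any particular product, and the sketch assumes every drive streams at full speed with no tape changes or verification passes, so real windows would be longer.

```python
def full_backup_hours(size_tb: float, drives: int, mb_per_sec_per_drive: float) -> float:
    """Hours to stream size_tb (decimal terabytes) across parallel drives."""
    if drives <= 0 or mb_per_sec_per_drive <= 0:
        raise ValueError("drives and throughput must be positive")
    total_bytes = size_tb * 1e12
    bytes_per_sec = drives * mb_per_sec_per_drive * 1e6
    return total_bytes / bytes_per_sec / 3600.0


if __name__ == "__main__":
    # 400 TB comes from the example above; 8 drives at 120 MB/s each
    # are assumed values chosen only for illustration.
    hours = full_backup_hours(400, drives=8, mb_per_sec_per_drive=120)
    print(f"Full backup: about {hours:.0f} hours ({hours / 24:.1f} days)")
```

Plug in your own drive counts and rates. Even with generous numbers, a full pass over a data store this size runs to days rather than hours, which is the practical case for replication.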