
EMC's Slootman: Data Domain planning global deduplication, NetWorker integration this spring

Head of EMC's backup division discusses plans to add high availability pairs and backup software integration to Data Domain data deduplication series, streamline EMC's data protection portfolio, and move data deduplication beyond backup.

EMC Corp. reported in its fourth-quarter earnings call last month that it had taken share in the data deduplication market thanks to its $2.1 billion acquisition of Data Domain last summer. EMC CEO Joe Tucci said he expects the data backup product group including Data Domain and data deduplication to be EMC's fastest growing segment, and the company plans to increase its R&D spending on data backup and data archiving this year.

Former Data Domain CEO Frank Slootman, now president of EMC's backup and recovery division, sat down with SearchDataBackup this week to discuss EMC's data backup plans.

SearchDataBackup: You've said in the past Data Domain is working on global deduplication. Is that still in the works, or is it not the direction you're headed in now?

Slootman: It is in the works. Actually, it's coming out in the next release of the product, but you've probably heard me say this on previous occasions ... one of the things that has made it sort of a luxury for Data Domain is that our individual nodes were growing so that the need for global dedupe was never acute for us. We were able to build very, very large, very, very fast systems without having to resort to global dedupe. The whole point of global dedupe is to create larger mount points, but every time we'd come out with our next generation controller, it would sort of ride right up the back of whatever we had going on with global dedupe. We've now gotten to a point where we are going to release it, and then we'll see how the market reacts to it. You're going to see it in the flesh here, before the mid-year timeframe.

SearchDataBackup: Will the first release of global dedupe link together more than two nodes or will it start with a high availability configuration?

Slootman: We're going to start with a pair – two DD880 systems, which is the high end of our product line. A single 880 will already stack 200 terabytes of raw storage behind it, so with two you get 400 terabytes of raw storage. Tie in the compression factor, and you're getting five to six petabytes of backup storage. That's awfully large behind a single pair of controllers, and it's blistering fast.
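The capacity figures in that answer imply a data reduction ratio that Slootman does not state outright. As a rough sketch, the Python below works backward from the quoted numbers (200 terabytes raw per DD880, two nodes, five to six petabytes of backup storage). It assumes decimal units (1 PB = 1,000 TB) and ignores RAID and file system overhead, so the result is only an approximation of what the quote implies, not a vendor specification.

```python
# Back-of-the-envelope check of the capacity figures quoted in the interview.
# Assumptions (not from the source): decimal units, no RAID or metadata overhead.

RAW_TB_PER_NODE = 200   # raw capacity behind one DD880, as quoted
NODES = 2               # the initial global dedupe configuration is a pair
TB_PER_PB = 1000        # assumption: 1 PB = 1,000 TB


def implied_reduction_ratio(logical_pb: float, raw_tb: float) -> float:
    """Return the reduction ratio needed to fit logical_pb into raw_tb."""
    return logical_pb * TB_PER_PB / raw_tb


raw_tb = RAW_TB_PER_NODE * NODES
low = implied_reduction_ratio(5, raw_tb)
high = implied_reduction_ratio(6, raw_tb)

print(f"{raw_tb} TB raw, implied reduction ratio {low:.1f}x to {high:.1f}x")
```

Under these assumptions, the quoted five to six petabytes corresponds to a combined deduplication and compression factor of roughly 12.5x to 15x on 400 terabytes of raw disk.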

SearchDataBackup: Symantec Corp. has been gaining support among deduplication vendors with its OpenStorage (OST) API. EMC could do something like that with NetWorker and Data Domain because it owns both sides of the IP. Is that a thought for EMC?

Slootman: It's more than a thought, it's coming to a theater near you. It's very obvious because the advantages of OST are super, super compelling. We're even getting approached by other data backup software vendors to do a similar thing. It won't be OST -- OST is really proprietary to Symantec, but we have a Data Domain software client that can work with OST as well as with other people's implementation of that concept.

SearchDataBackup: How do you respond to people who say EMC has too many different data replication and backup products?

Slootman: We're definitely rationalizing the relative R&D agendas that we have between Avamar and NetWorker, and between the Disk Library and Data Domain. Essentially, we have two front-end products and two back-end products, and customers are being fairly clear with us, saying 'Hey, figure out how to make that make more sense than what it does today.' That's one of the reasons why I agreed to take this on, because while I was blissfully happy running Data Domain all by itself, being a part of EMC, we have to get our act together on that.

We can't keep running in four separate lanes, if you will, between all these products. Between Disk Library and Data Domain, there's already a huge amount of substitution. So yeah, there's work to do there. The same is true between Avamar and NetWorker. Avamar is really not a backup software product, it's much more a replication product in the sense that it's very much next-generation in the way that it approaches data protection. NetWorker is sort of the classic, same generation of approach as products like [Symantec] NetBackup and [IBM] TSM.

SearchDataBackup: We've talked in the past about Data Domain for nearline storage, as well as the potential to use solid-state drives to boost the performance to primary storage levels. Is that something you're still considering, or is Data Domain now firmly in the backup category within EMC?

Slootman: It is true that solid-state storage can change some of the natural impediments that we have with the random I/O capabilities of disk-based storage. There are no imminent announcements in that department, but it's certainly something we're not giving up on. But as I probably said in our prior conversations, we're always hamstrung by the fact that backup redesign is such a hot topic. If we don't pay attention to that, we could become unfocused and open the door for competition.

We are going to scope Data Domain to tackle non-backup workloads, notably in long-term retention-type applications. It just comes up every single time with customers. It's one of those areas where it's very uncomfortable to the customer. There's not a clear blueprint there at all in terms of how people should tackle that. We're all sitting here with our boatloads of data and it's not going away -- we have to hang on to it, how do you do that on an economic basis? How much should now be on tape and what should be on tape, and those issues. Just forget the compliance end of it for a second, but how do you economically hang on to all that data? That's a big part of our world right now and in years to come. Backup has been at the top of the agenda, but it won't always be that way.
