Cohesity founder: Secondary data needs a new approach

Cohesity founder Mohit Aron claims the Cohesity Data Platform can consolidate all secondary data functions, similar to the way Nutanix's hyper-convergence consolidated primary storage.

Startup Cohesity Inc. last week publicly launched the Cohesity Data Platform and OASIS (Open Architecture for Scalable, Intelligent Storage) software designed to let customers converge all secondary storage workloads. Cohesity sells its software bundled on Intel-based 2U CS2000 appliances as a minimum four-node cluster of hybrid storage. The appliances serve as the building block for its scale-out architecture.

Each Cohesity block provides up to 96 TB of hard disk drive storage and 6.4 TB of solid-state drives. The appliances are powered by the vendor's OASIS software, which includes storage quality-of-service (QoS) management to converge analytics, archiving and data protection on a single unified platform.

Mohit Aron, founder and CEO, Cohesity

We caught up with Cohesity founder and CEO Mohit Aron, who previously helped start hyper-converged pioneer Nutanix and helped design the Google File System as a Google engineer.

Aron describes himself as the "co-founder and brains of Nutanix, which invented the concept of hyper-convergence," adding, "hyper-convergence disrupted primary storage. Now, Cohesity is looking to disrupt secondary storage."

Give us a high-level view of what Cohesity does.

Aron: At the core of what we've built is a distributed file system, a distributed storage system, which we call OASIS, [which gives] Google-like scalability. It scales infinitely. You can write data to it on any of the nodes simultaneously and grow the cluster incrementally. We give you a way to store tons of data, without worrying about space.

How does Cohesity differ from traditional secondary storage?

Aron: Today, there is a bunch of fragmentation in secondary storage. A customer goes and buys a bunch of different products from multiple vendors, and somehow has to interface them together manually, managing them through multiple UIs [user interfaces]. That becomes a major manageability headache.

Cohesity Data Platform converges all your data protection workflows on one appliance. We have a single pane of glass that can be used to manage all these workloads, and all the pieces of software run natively on our appliance. The analogy I use is that our infrastructure is similar to what Apple did with the iPhone. We are building the infrastructure and the platform that can deploy some native applications to solve these customer use cases. In the future, we want to expand and have third parties write software on our platform.

We bring the cost down by using commodity hardware, but we don't treat data protection as insurance against disaster. We put more workloads there -- even nomadic workloads that aren't mission-critical. One of them is analytics; another is test and development. Our architecture supports QoS, so we can segregate workloads and prevent them from interfering with one another.

How do you impact backups?

Aron: One other thing that's a big pain point for our customers: Backups are taken only at night. For the most part, the [backup] device sits idle. Again, just like an insurance policy. We enable them to do continuous backups. That requires that we have cloning/snapshot technology that is capable of taking snaps every few minutes.

Second, we take distributed snapshots. Most systems don't even scale, especially in secondary storage. In traditional snapshots, whenever you take a snap, you start to grow a chain. Every [snap] adds another link. You wind up with very long chains, and that's when performance can start to suffer.

Most vendors break that chain in the background by copying metadata around. A big, distinguishing part of our technology is that it never allows those chains to form. The beauty of our snapshot and cloning is that it makes continuous backups possible, and makes very fine-grained RPO [recovery point objective] possible.
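To illustrate the chain problem Aron describes, here is a minimal Python sketch. It is not Cohesity's implementation, which the interview does not detail; the class names `ChainedSnapshot` and `FlatSnapshot` and the helper `take_chained_snapshots` are hypothetical. In the chained model, each snapshot stores only the blocks changed since its parent, so reading an old block means walking back through every link. The flat model is one generic alternative in which each snapshot holds a complete block map, so reads never traverse a chain.

```python
class ChainedSnapshot:
    """Stores only the blocks changed since its parent snapshot."""

    def __init__(self, changed_blocks, parent=None):
        self.changed_blocks = dict(changed_blocks)
        self.parent = parent

    def read(self, block_id):
        """Return (data, hops), walking back through parents until found."""
        hops = 0
        snap = self
        while snap is not None:
            if block_id in snap.changed_blocks:
                return snap.changed_blocks[block_id], hops
            snap = snap.parent
            hops += 1
        raise KeyError(block_id)


class FlatSnapshot:
    """Hypothetical alternative: every snapshot holds a full block map."""

    def __init__(self, block_map):
        self.block_map = dict(block_map)

    def snapshot(self, changes):
        """Create a new snapshot from this one plus the changed blocks."""
        new_map = dict(self.block_map)  # naive full copy, for illustration only
        new_map.update(changes)
        return FlatSnapshot(new_map)

    def read(self, block_id):
        """Single lookup, regardless of how many snapshots came before."""
        return self.block_map[block_id]


def take_chained_snapshots(count):
    """Build a chain of `count` snapshots; only the first holds block 0."""
    snap = ChainedSnapshot({0: "base"})
    for i in range(1, count):
        snap = ChainedSnapshot({i: f"delta-{i}"}, parent=snap)
    return snap


# One snapshot every five minutes for a day gives 288 snapshots.
latest = take_chained_snapshots(288)
data, hops = latest.read(0)
print(data, hops)  # base 287

flat = FlatSnapshot({0: "base"})
for i in range(1, 288):
    flat = flat.snapshot({i: f"delta-{i}"})
print(flat.read(0))  # base
```

The full copy in `FlatSnapshot.snapshot` is deliberately naive and would be expensive at scale. The sketch only shows why read cost grows with chain length in the first model and stays constant in the second.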

Fragmented secondary storage, and lack of visibility into it, is a long-standing problem. How did Cohesity evolve from an idea to an actual product?

Aron: We can't look at secondary storage as another point problem, which is how it's been looked at in the past.

The answer we came up with was consolidation. That's simple to say, but hard to execute. We had to build a highly scalable platform -- not easy to do, even in primary storage. But that's not enough. Taking converged workloads that have never sat together and making them interoperate on one system means you have to apply features such as QoS. It's a huge task to undertake. We want to attack immediate pain points first, and then possibly partner with software vendors to write software that runs on our platform.

How will you convince the backup software vendors to treat you as a partner, rather than a competitor?

Aron: Partnering with software vendors is going to be easy. We expose industry-standard protocols like NFS [Network File System], and soon, SMB [Server Message Block]. So, a vendor like Veeam can easily write to us and we can work together to possibly have a single pane of glass. Partnering with vendors that provide backup storage infrastructure, which, frankly, is outdated, doesn't scale, only does backup, doesn't give insight … that type of partnership I don't see happening, because we are displacing them.

You are trying to sell a new model for secondary storage, combining analytics, backups and data protection. How will you convince enterprise customers to trust a startup with all of those processes?

Aron: We are not looking to rip and replace anything. But if you have 75 [EMC] Data Domain appliances, each managed with a separate UI, our pitch is: When you are about to buy the 76th one, think of buying from us.

The second entry point is management of nomadic applications -- a subset of which is test and development. Right now, those applications don't have a home. People will try to run them on primary storage and pay premium dollars, and worse, they will contend with their primary apps. 

Where do you play in relation to the cloud, which is playing a greater role in secondary storage?

Aron: Like I said, the big problem in secondary storage is fragmentation. Cloud has only added to that fragmentation. My philosophy is that the cloud is like renting a hotel room, and your private data center is like owning a house. There are [positives] in both.

Our appliance will interface with the cloud to do multiple things. We'll take care of your archival needs, should you choose to port them to the cloud. For example, we interface with Google Nearline and, possibly, even with Amazon Glacier in the future to use [the public cloud] as an archival tier. You set the policies to move data after so many days and we will do it for you.

In the future, we plan to add a subscription service to run a soft copy of our appliance in the cloud to let some of our smaller customers [set up] active/active failover in the cloud.
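The age-based archival policy Aron mentions ("move data after so many days") can be sketched roughly as follows. This is an illustration under stated assumptions, not Cohesity's API: `BackupObject`, `select_for_archive` and the `archive_after_days` parameter are hypothetical names, and the actual upload to a cloud archive tier is left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class BackupObject:
    """Hypothetical record for a piece of backed-up data."""
    name: str
    created: date


def select_for_archive(objects, today, archive_after_days):
    """Return names of objects at least `archive_after_days` old."""
    cutoff = today - timedelta(days=archive_after_days)
    return [obj.name for obj in objects if obj.created <= cutoff]


objects = [
    BackupObject("vm-a-snap", date(2015, 1, 1)),
    BackupObject("vm-b-snap", date(2015, 10, 1)),
]

# With a 90-day policy, only the January object qualifies for the archive tier.
print(select_for_archive(objects, today=date(2015, 10, 15), archive_after_days=90))
# ['vm-a-snap']
```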

How do you price Cohesity Data Platform?

Aron: The discounted price for a four-node box would come to somewhere between $80,000 and $100,000. For any other backup storage and software you can buy with a similar capacity, you would be paying at least $200K. That's just the Capex cost.

How frequently do you plan to add different protocols and storage features?

Aron: We will have rapid-fire addition of features. At the end of October, we will roll out a release that will have encryption for data at rest. SMB support is around the corner, later this year. I'm going to keep some of the announcements for the future as a surprise. But we have a bunch of announcements coming that are going to add really big functionality.

Do you support all vendors' storage and work with all hypervisors?

Aron: We have concentrated first on the VMware ESXi hypervisor. As soon as we roll out SMB at the end of the year, we will start supporting Microsoft Hyper-V. Support for KVM will come in the future. We will also be adding support for data center container technology. All of those features will probably come sometime next year.
