GreyBeards deconstruct storage with Brian Biles and Hugo Patterson, CEO and CTO, Datrium

In this our 32nd episode we talk with Brian Biles (@BrianBiles), CEO & Co-founder and Hugo Patterson, CTO & Co-founder of Datrium a new storage startup. We like to call it storage deconstructed, a new view of what storage could be based on today and future storage technologies.  If I had to describe it succinctly, I would say it’s a hybrid between software defined storage, server side flash and external disk storage.  We have discussed server side flash before but this takes it to a whole another level.

Their product, the DVX consists of Hyperdriver host software and a NetShelf, external disk storage unit. The DVX was designed from the ground up based on the use of host/server side flash or non-volatile memory as a given and built everything else around that. I hesitate to say this but the DVX NetShelf backend storage is pretty unintelligent, just a dual controller disk storage with a multi-task coordinator. In contrast, the DVX Hyperdriver host software used to access their storage system is pretty smart and is installed as a VIB in vSphere. Customers can assign up to 8TB of host-based, server side flash/non-volatile memory to the storage system per server. The Datrium DVX does the rest.

The Hyperdriver leverages host flash, DRAM and compute cores to act as a caching layer for read and write IO and as a data management engine. Write data is write-thru straight from the server side flash to the NetShelf storage system which has Non-volatile DRAM (NVRAM) caching. Once write data is in NetShelf cache, it’s in two places, one on the host server side flash and the other in storage NVRAM. Reads are easier to handle, just being cached from the NetShelf storage in the server side flash. There’s no unique data residing in the hosts.

The Hyperdriver looks like a NFS mount to vSphere and the DVX uses a proprietary protocol to talk with the backend DVX NetShelf. Datrium supports up to 32 hosts and you can define the amount of Flash, DRAM and host compute allocated to the DVX Hyperdriver activity.

But the other interesting part about DVX is that much of the storage management functionality and storage control logic is partitioned between the host  Hyperdriver and NetShelf, with both participating to do what they do best.

For example,  disk rebuilds are done in combination with the host Hyperdriver. DVX RAID rebuild brings data from the backend into host cache, computes rebuild data and writes the reconstructed data back out to the NetShelf backend. This way rebuild performance can scale up with the number of hosts active in a cluster.

DVX data are compressed and deduplicated at the host before being sent to the NetShelf. The NetShelf backend also does a global deduplication on the host data. Hashing computations and data compression activities are all done on the host and passed on to the NetShelf.  Brian and Hugo were formerly with EMC Data Domain, and know all about data deduplication.

At the moment DVX is missing some storage functionality but they have an extensive roadmap with engineering resources to match and are plugging away at all of it. On the other hand, very few disk storage devices offer deduped/compressed data storage and warm server side caches during vMotion. They also support QoS functionality to limit the amount of host resources consumed by DVX Hyperdriver software

The podcast runs ~41 minutes and episode covers a lot of ground about how the new DVX product came about, how they separated storage functionality between host and backend and other aspects of DVX storage.  Listen to the podcast to learn more.

AAEAAQAAAAAAAAK8AAAAJGQyODQwNjg1LWI3NTMtNGY0OC04MGVmLTc5Nzg3N2IyMmEzYQBrian Biles, Datrium CEO & Co-founder

Prior to Datrium, Brian was Founder and VP of Product Mgmt. at EMC Backup Recovery Systems Division. Prior to that he was Founder, VP of Product Mgmt. and Business Development for Data Domain (acquired by EMC in 2009).

Hugo Patterson, Datrium CTO & Co-founderAAEAAQAAAAAAAANZAAAAJDhiMTI2NzMyLTdkZDAtNDE5Yy1hMTM5LTNiMWM2MWM3NTlmMA

Prior to Datrium, Hugo was an EMC Fellow serving as CTO of the EMC Backup Recovery Systems Division, and the Chief Architect and CTO of Data Domain (acquired by EMC in 2009), where he built the first deduplication storage system. Prior to that he was the engineering lead at NetApp, developing SnapVault, the first snap-and-replicate disk-based backup product. Hugo has a Ph.D. from Carnegie Mellon.