113: GreyBeards talk storage for next gen. workloads with Liran Zvibel, Co-Founder & CEO WekaIO

I’ve known Liran Zvibel, Co-founder and CEO of WekaIO, for many years now, and this is the second time he’s been on our show (see: Episode 56: GreyBeards talk high performance file storage...). In those days, WekaIO was just coming out and hitting the world with its extremely high-performing, scale-out unstructured data solution. Well, since then, they’ve just gotten better.

Keith and I had a great time talking with Liran again. Liran has deep knowledge about unstructured data and how enterprises use it these days. Over the last two years, WekaIO’s story has gone beyond great performance to real-world hybrid cloud offerings, as well as going after persistent storage for cloud native apps (read Kubernetes [K8S]). Listen to the podcast to learn more.

We started with a history lesson on WekaIO. Back in those days (and this persists today, I might add) there were many IO workloads that required companies to purchase different solutions for different work. For example, they needed DAS or SAN for performance, NAS for ease of access and object for scale. WekaIO came out with an answer to all these problems in a single, scalable storage system. That is, it performed IO as fast as DAS or SAN block, had all the ease of access of NAS, and could scale as much as object.

However, the real culprit holding the world back was “NFS”. At the outset, NFS was designed (back in the 1990s) for the then-current networking speeds (10-100Mbps), and it performed just fine at those speeds. But when 10-100GbE came out in the 2000s, NFS’s metadata traffic was too chatty to support wire speeds. Thus, any storage that depended on NFS protocols couldn’t supply (small) files fast enough for modern applications.
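To put rough numbers on the "too chatty" point, here is a back-of-the-envelope sketch. It is not a model of any real NFS client: the function name, the four metadata round trips per file, and the latencies are all assumptions chosen for illustration. It only shows how a fixed per-file round-trip cost matters little on a slow link but swamps the transfer time on a fast one.

```python
def small_file_link_utilization(file_bytes, round_trips, rtt_s, link_bps):
    """Fraction of the link's raw bandwidth actually used when each file
    costs `round_trips` serialized request/response exchanges before and
    around the data transfer. All inputs are illustrative assumptions."""
    transfer_s = file_bytes * 8 / link_bps   # time on the wire for the data
    overhead_s = round_trips * rtt_s         # time spent waiting on round trips
    effective_bps = file_bytes * 8 / (transfer_s + overhead_s)
    return effective_bps / link_bps


# Hypothetical 4 KiB file needing 4 round trips (e.g. lookup, attributes,
# access check, read). The count is assumed, not measured.
old_lan = small_file_link_utilization(4096, 4, rtt_s=1e-3, link_bps=10e6)
fast_lan = small_file_link_utilization(4096, 4, rtt_s=100e-6, link_bps=100e9)

print(f"10 Mbps link, 1 ms RTT:    {old_lan:.1%} of wire speed")
print(f"100 Gbps link, 100 us RTT: {fast_lan:.3%} of wire speed")
```

Under these assumed numbers the slow link stays reasonably busy, while the fast link sits almost idle waiting on round trips, which is the small-file problem described above.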

This is why WekaIO has moved to support not only NFS and SMB but also POSIX and NVIDIA® GPUDirect® Storage interfaces. By offering POSIX, WekaIO is able to plug into standard Linux and Windows server systems and provide excellent small file performance. Of course, the applications that demand small file performance today are mostly data analytics and AI/ML/DL workloads.

Consequently, NVIDIA came out with their GPUDirect Storage protocol to address getting small file (data) into GPUs faster. With GPUDirect, storage systems can RDMA data directly from storage to GPU memory and vice versa, with no OS intervention (other than to set up the transfer). If you happen to have a high-performing, small file storage system attached to your fabric that supports GPUDirect, like WekaIO, you can significantly speed up your AI/ML/DL workloads.

Next, we started talking K8S storage. WekaIO uses their POSIX interface in their CSI plugin to support K8S container persistent storage. Again, supplying high performance for small files seems to be tailor-made for the K8S container applications that exist today and will for the foreseeable future.

Enter the cloud. Among other things, WekaIO is an AWS primary storage vendor. It also offers snap to cloud. And with both of these in tandem, it’s just become a lot easier to move and access your unstructured data in the cloud. Liran mentioned that WekaIO primary storage in AWS operates across AZs. This means it can be configured to support better availability than EBS.

Large BioPharma companies are using WekaIO in AWS to store and process field data and research data, so that this work can be done around the world. Some companies have run out of compute in a single AZ (unbelievable, I know, but it’s COVID). With WekaIO’s multi-AZ support for unstructured data access, these companies can spread their compute across AZs and regions and still access their data. And when their products are ready for gov’t certification, having all this data in the cloud can provide an easy way for the gov’t to access this same data.

Liran Zvibel, Co-founder and CEO WekaIO

As Co-Founder and CEO, Mr. Liran Zvibel guides long term vision and strategy at WekaIO. Prior to creating the opportunity at WekaIO, he ran engineering at social startup and Fortune 100 organizations including Fusic, where he managed product definition, design, and development for a portfolio of rich social media applications.

Liran also held principal architectural responsibilities for the hardware platform, clustering infrastructure and overall systems integration for XIV Storage System, acquired by IBM in 2007.

Mr. Zvibel holds a BSc. in Mathematics and Computer Science from Tel Aviv University.

GreyBeards talk scale-out storage with Coho Data CTO/Co-founder, Andy Warfield

Welcome to our fifth episode. We return now to an in-depth technical discussion of leading edge storage systems, this time with Andrew Warfield, CTO and Co-founder of Coho Data. Coho Data supplies VMware scale-out storage solutions with PCIe SSDs and disk storage using the NFS protocol. Howard and I talked with Andy and Coho Data at Storage Field Day 4 last November, but we thought he was so interesting that he deserved a second conversation.

This month’s podcast comes in at a little over 40 minutes. I apologize for the occasional poor sound quality. I was on WiFi for the call, as I recorded it while recuperating from foot surgery. Hopefully, next month I will be back in my normal office and using LAN.

Andy comes at storage from a stint at XenSource and Citrix Systems and sees many parallels between server virtualization and storage. In the case of servers, CPUs had become so powerful that in order to take advantage of all that speed you needed to run multiple independent workloads using a non-intrusive hypervisor to coordinate it all. In storage, the case can be made that PCIe SSDs can now supply more IOPS and throughput than most single applications can possibly use, and the way to take effective advantage of that performance is to support multiple IO workloads using a non-intrusive storage hyper/supervisor to coordinate it all. For Coho Data, all IO lands on PCIe SSD first and is only migrated to disk if it’s cold enough not to warrant flash residency.
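The flash-first, demote-when-cold idea can be sketched in a few lines. This is a toy illustration only, not Coho Data’s implementation: the `TieredStore` class, the logical clock, and the `cold_after` threshold are all hypothetical names and choices made up for the example.

```python
class TieredStore:
    """Toy two-tier store: writes land on 'flash', cold items move to 'disk'.
    Hypothetical sketch; the coldness rule (accesses since last touch) is assumed."""

    def __init__(self, cold_after):
        self.flash = {}             # fast tier, receives every write
        self.disk = {}              # slow tier, holds demoted data
        self.last_access = {}       # key -> logical time of last read/write
        self.clock = 0              # logical clock, ticks once per access
        self.cold_after = cold_after

    def _touch(self, key):
        self.clock += 1
        self.last_access[key] = self.clock

    def write(self, key, value):
        # All writes land on flash first; drop any stale copy on disk.
        self.disk.pop(key, None)
        self.flash[key] = value
        self._touch(key)

    def read(self, key):
        # Serve from flash if resident, otherwise from disk.
        # Whether a disk read should promote data back to flash is left out.
        value = self.flash[key] if key in self.flash else self.disk[key]
        self._touch(key)
        return value

    def demote_cold(self):
        # Move items not touched for `cold_after` ticks from flash to disk.
        cold = [k for k in self.flash
                if self.clock - self.last_access[k] >= self.cold_after]
        for k in cold:
            self.disk[k] = self.flash.pop(k)
        return cold


store = TieredStore(cold_after=3)
store.write("a", b"rarely used")
store.write("b", b"hot data")
store.read("b")
store.read("b")
demoted = store.demote_cold()   # "a" has gone untouched long enough to be cold
```

A real system would track coldness per block and over wall-clock time, but the shape is the same: flash absorbs the incoming IO, and only data that stops earning its place there gets pushed down to disk.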

The other interesting thing about Coho Data was their inclusion of an OpenFlow SDN switch in their scale-out storage system. They use SDN switching to help implement the NFS presentation layer, balance IO workload across different nodes and direct IO to an appropriate node.

Although I may have made mention of using 8″ floppies to gather data from storage systems in the old days, contrary to popular myth, I never played frisbee with them.

Listen to the podcast to learn more…

Andrew Warfield, CTO/Co-founder Coho Data


Andy is an established researcher in computer systems, specializing in storage, virtualization, and security. At Coho Data, Andy leads the technology vision and directs the engineering team in building elegant and functional systems that enable customers to focus on the data and applications instead of the underlying infrastructure that drives them. As a PhD student at the University of Cambridge, he was one of the original authors of the Xen hypervisor, and has since done award-winning research in virtualization and high availability. At XenSource and Citrix Systems, he was the Technical Director for Storage & Emerging Technologies.