83: GreyBeards talk NVMeoF/TCP with Muli Ben-Yehuda, Co-founder & CTO and Kam Eshghi, VP Strategy & Bus. Dev., Lightbits Labs

This is the first time we’ve talked with Muli Ben-Yehuda (@Muliby), Co-founder & CTO and Kam Eshghi (@KamEshghi), VP of Strategy & Business Development, Lightbits Labs. Keith and I first saw them at Dell Tech World 2019, in Vegas as they are a Dell Ventures funded organization. The company has 70 (mostly engineering) employees and is based in Israel, with offices in NY and the Valley as well as elsewhere around the world. Kam was previously with (Dell) EMC DSSD and Muli’s spent years as a Master Inventor with IBM Research.

[This was the first time Keith Townsend (@CTOAdvisor & The CTO Advisor) was a GreyBeard co-host and we had a great time with him on the show.]

I would have to say it was a far ranging discussion but focused on their software defined, NVMeoF/TCP storage. As you may recall, we talked with Solarflare Communications last year, who were also working on NVMeoF/TCP, only in their case it was an accelerator board. After the recording, Muli said the hardware accelerator they have is their own design.

Why NVMeoF/TCP?

Most NVMeoF over Ethernet today requires RoCE or iWARP compatible NICs and switches. Lightbits Labs has long been active in the NVMeoF/RoCE-iWARP marketplace. Early on they noticed that enterprise and cloud service providers were reluctant to adopt NVMeoF technology because of the need to change out all their networking equipment to use it. This is what brought about their focus on NVMeoF/TCP.

The advantage of NVMeoF/TCP is that it can be run on any Ethernet NIC and switch available today. From Muli’s perspective, NVMeoF/TCP is going to become the next SAN of choice for the data center. They were active, early on, in the standards committee to push for NVMeoF/TCP adoption.

How does it work?

Their software defined solution runs LightOS® storage software, a Linux based package, and uses off the shelf, server hardware with persistent storage (Optane DC PM/SSDs, NV DIMMs, V-NAND, etc.). They use persistent memory for a FAST write buffer and a place where they can “mold” the written data into something that can be better written to backend NVMe SSDs.

One surprise about Lightbits’ solution is that it offers a decent set of data services. These include erasure coding, thin provisioning, wire-speed inline compression, QoS and wide striping. It seems any of these can be disabled if a customer wants, but they add very little overhead. I think Muli mentioned one Lightbits customer with encrypted data that disabled compression.

Lightbits also offers a global FTL (flash translation layer), which means they control, at the storage system level, the SSD addressing that maps data to physical/raw NAND locations. If done well, a global FTL can help improve flash endurance and may offer better write performance (through increased parallelism).

Lightbits’ claim to inline, wire speed data compression is premised on the use of more current CPUs with high (>=28) core counts in a storage server. If the storage server has older CPUs (<28 cores), they suggest you install their LightField™ hardware accelerator add-in card. LightField offers a number of hardware based performance accelerations in addition to compression speedups.

LightOS requires no host (client) software. Muli’s a long time Linux kernel contributor and indicated that the only thing LightOS needs is a current Linux Kernel (5.0 or later) which has the NVMeoF/TCP driver software (and persistent memory). Lightbits believes that it’s only a matter of time until other OSs also implement NVMeoF/TCP drivers.
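As a rough illustration of that kernel requirement, here is a minimal Python sketch that checks whether a kernel release string is 5.0 or later. The function name `kernel_at_least` is ours for illustration and is not anything supplied by Lightbits. Passing the check only says the kernel is new enough; a given distribution could still ship without the NVMe/TCP driver built or loaded.

```python
import re


def kernel_at_least(release, minimum=(5, 0)):
    """Return True if a kernel release string (e.g. '5.4.0-91-generic')
    is at or above the given (major, minor) version."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    return version >= minimum


# Example release strings (hypothetical hosts, not measured systems).
for release in ("4.19.0-25-amd64", "5.0.0", "5.15.0-91-generic"):
    verdict = "new enough" if kernel_at_least(release) else "too old"
    print(release, "->", verdict)
```

On a Linux host you could pass in `platform.release()` to test the running kernel.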

Lightbits business considerations

Long term, Lightbits sees a need for compute-storage disaggregation in hyperscaler and enterprise cloud environments. Early on it was relatively easy for these customers to replicate servers with DAS storage, but as NVMe SSDs came out, the expense of doing this throughout their >>1000 server environments started to become exorbitant. If only they had an easy way to disaggregate their storage from compute and still enjoy all the performance advantages of DAS NVMe SSDs. With LightOS they can do that.

Lightbits can be sold today through Dell, as a partner solution, which means that Dell can integrate, test and validate their servers with LightField accelerator card and deliver that package to your data center. I believe you still need to purchase and install their LightOS software yourself.

Lightbits charges for LightOS software on a per storage node basis, but they have different charges based on the maximum number of NVMe SSD slots available in a server. There is no capacity charge. They also offer worldwide service and support for LightOS software and LightField hardware.

It’s all about performance

From a performance perspective, one Fortune 500 hyperscaler benchmarked their storage solution against a DAS NVMe server and found it added about 30 µsec to the IO latency as compared to DAS NVMe SSDs. From their perspective, the added data services, better endurance, and disaggregated compute-storage environment provided by LightOS more than made up for the additional overhead.

Finally, I asked whether multiple LightOS storage servers could be clustered together. Muli intervened and, after stating some legal stuff, said they were working on the next generation of LightOS, which will support clustered storage servers, local data replication as well as distributed (across storage servers) erasure coding.

The podcast is a long one and runs ~47 minutes. There was a lot to talk about and Kam and Muli seem to know it all. It was interesting to hear the history of their pivot to TCP. They seem to have the right technology to address the market. Listen to the podcast to learn more.

Muli Ben-Yehuda, Co-founder and CTO, Lightbits Labs

Muli Ben-Yehuda is the CTO and Co-Founder of Lightbits Labs, where he leads technological developments.

Prior to founding Lightbits, he was chief scientist at Stratoscale and a researcher and Master Inventor at IBM Research.

He holds an M.Sc. in Computer Science (summa cum laude) from the Technion — Israel Institute of Technology and a B.A. (cum laude) from the Open University of Israel.

He is a long time Linux kernel contributor and his code and ideas are most likely included in an operating system or hypervisor running near you. He is also one of the authors of the NVMe/TCP standard and technology. 

Kam Eshghi, VP Strategy & Business Development, Lightbits Labs

Kam joined Lightbits Labs from Dell EMC and has over 20 years of experience in strategic marketing and business development with startups and public companies.

Most recently as VP of strategic alliances at startup DSSD, Kam led business development with technology partners and developed DSSD’s partnership with EMC, leading to EMC’s acquisition of DSSD.

Previously as Sr. Director of Marketing & Business Development at IDT, Kam built their NVMe Controller business from scratch. Previous to that, Kam worked in data center storage, compute and networking markets at HP, Intel, and Crosslayer Networks. 

Kam is a U.C. Berkeley and MIT graduate with a BS and MS in Electrical Engineering and Computer Science and an MBA.

75: GreyBeards talk persistent memory IO with Andy Grimes, Principal Technologist, NetApp

Sponsored By:  NetApp
In this episode we talk new persistent memory IO technology with Andy Grimes, Principal Technologist, NetApp. Andy presented at the NetApp Insight 2018 TechFieldDay Extra (TFDx) event (video available here). If you get a chance, we encourage you to watch the videos, as Andy did a great job describing their new MAX Data persistent memory IO solution.

The technology for MAX Data came from NetApp’s Plexistor acquisition. Prior to the acquisition, Plexistor had also presented at SFD9 and TFD11.

Unlike NVMeoF storage systems, MAX Data is not sharing NVMe SSDs across servers. What MAX Data does is supply an application-neutral way to use persistent memory as a new, ultra fast, storage tier together with a backing store.

MAX Data performs a write or an “active” (Persistent Memory Tier) read in single digit µseconds for a single core/single thread server. Their software runs in user space and as such, for multi-core servers, it can take up to 40 µseconds. Access times for backend storage reads are the same as NetApp AFF, but once read, data is automatically promoted to persistent memory and, while there, is read ultra fast.

One of the secrets of MAX Data is that they have completely replaced the Linux Posix File IO stack with their own software. Their software is streamlined and bypasses a lot of the overhead present in today’s Linux File Stack. For example, MAX Data doesn’t support metadata-journaling.

MAX Data works with many different types of (persistent) memory, including DRAM (non-persistent memory), NVDIMMs (DRAM+NAND persistent memory) and Optane DIMMs (Intel 3D Xpoint memory, slated to be GA end of this year). We suspect it would work with anyone else’s persistent memory as soon as it comes on the market.

Even though the (Optane and NVDIMM) memory is persistent, server issues can still lead to access loss. In order to provide data availability for server outages, MAX Data also supports MAX Snap and MAX Recovery. 

With MAX Snap, MAX Data will upload all persistent memory data to ONTAP backing storage and ONTAP snapshot it. This way you have a complete version of MAX Data storage that can then be backed up or SnapMirrored to other ONTAP storage.

With MAX Recovery, MAX Data will synchronously replicate persistent memory writes to a secondary MAX Data system. This way, if the primary MAX Data system goes down, you still have an RPO-0 copy of the data on another MAX Data system that can be used to restore the original data, if needed. Synchronous mirroring will add 3-4 µseconds to the access time for writes, quoted above.

Given the extreme performance of MAX Data, it’s opening up a whole new set of customers to talking with NetApp. Specifically, high frequency traders (HFT) and high performance computing (HPC). HFT companies are attempting to make their stock transaction access times as fast as humanly possible. HPC vendors have lots of data and processing all of it in a timely manner is almost impossible. Anything that can be done to improve throughput/access times should be very appealing to them.

To configure MAX Data, one uses a 1:25 ratio of persistent memory capacity to backing store. MAX Data also supports multiple LUNs.
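To make the 1:25 ratio concrete, here is a small Python sketch of the sizing arithmetic in both directions. The function names and the example capacities are ours for illustration; we are assuming the ratio is applied as a simple multiplier, and the episode didn't say whether it is a hard limit or a guideline.

```python
PMEM_TO_BACKING_RATIO = 25  # 1:25, persistent memory : backing store (from the text)


def backing_store_for(pmem_gib, ratio=PMEM_TO_BACKING_RATIO):
    """Backing store capacity that a given persistent memory tier maps to."""
    return pmem_gib * ratio


def pmem_for(backing_gib, ratio=PMEM_TO_BACKING_RATIO):
    """Persistent memory capacity needed in front of a given backing store."""
    return backing_gib / ratio


# Hypothetical example sizes, not NetApp configurations.
print(backing_store_for(512), "GiB of backing store for 512 GiB of persistent memory")
print(pmem_for(10000), "GiB of persistent memory for 10000 GiB of backing store")
```

So under that assumption, 512 GiB of persistent memory would front 12,800 GiB of backing store.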

MAX Data only operates on Linux OS and supports (IBM) RedHat and CentOS. But Andy said it’s not that difficult to add support for other Linux distros and customers will dictate which other ones are supported, over time.

As discussed above, MAX Data works with NetApp ONTAP storage, but it also works with SSD/NVMe SSDs as backend storage. In addition, MAX Data has been tested with NetApp HCI (with SolidFire storage, see our prior podcasts on NetApp HCI with Gabriel Chapman and Adam Carter) as well as E-Series storage. The Plexistor application has already been available on AWS Marketplace for use with EC2 DRAM and EBS backing store. It’s not much of a stretch to replace this with MAX Data.

MAX Data is expected to be GA released before the end of the year.

A key ability of the MAX Data solution is that it requires no application changes to use persistent memory for ultra-fast IO. This should help accelerate persistent memory adoption in data centers when the hardware becomes more available. Speaking to that, at Insight2018, Lenovo, Cisco and Intel were all on stage when NetApp announced MAX Data.

The podcast runs ~25 minutes. Andy’s an old storage hand (although no grey beard) and talks the talk, walks the walk of storage religion. Andy is new to TFD but we doubt it will be the last time we see him there. Andy was very conversant on the MAX Data technology and the market that it is apparently opening up. Listen to our podcast to learn more.

Andy Grimes, Principal Technologist, NetApp

Andy has been in the IT industry for 17 years, working in roles spanning development, technology architecture, strategic outsourcing and healthcare.

For the past 4 years Andy has worked with NetApp on taking the NetApp Flash business from #5 to #1 in the industry (according to IDC). During this period NetApp also became the fastest growing Flash and SAN vendor in the market and regained leadership in the Gartner quadrant.

Andy also works with NetApp’s product vision, competitive analysis and future technology direction and works with the team bringing the MAX Data PMEM product to market.

Andy has a BS degree in psychology, a BPA in management information systems, and an MBA. He currently works as a Principal Technologist for the NetApp Cloud Infrastructure Business Unit with a focus on PMEM, HCI and Cloud Strategy. Andy lives in Apex, NC with his beautiful wife and has 2 children, a 4 year old and a 22 year old (yes don’t let this happen to you). For fun Andy likes to Mountain Bike, Rock Climb, Hike and Scuba Dive.

69: GreyBeards talk HCI with Lee Caswell, VP Products, Storage & Availability, VMware

Sponsored by:

For this episode we preview VMworld by talking with Lee Caswell (@LeeCaswell), Vice President of Product, Storage and Availability, VMware.

This is the third time Lee’s been on our show; the previous one was back in August of last year. Lee’s been at VMware for a couple of years now and, among other things, is leading the HCI journey at VMware.

The first topic we discussed was VMware’s expanded HCI software defined data center (SDDC) solution, which now includes compute, storage, networking and enhanced operations with alerts/monitoring/automation that ties it all together.

We asked Lee to explain VMware’s SDDC:

  • HCI operates at the edge – with ROBO-2-server environments, VMware’s HCI can be deployed in a closet and remotely operated by a VI from the central site.
  • HCI operates in the data center – with vSphere-vSAN-NSX-vRealize and other software, VMware modernizes data centers for the pace of digital business.
  • HCI operates in the public Cloud – with VMware Cloud (VMC) on AWS, IBM Cloud and over 400 service providers, VMware HCI also operates in the public cloud.
  • HCI operates for containers and cloud native apps – with support for containers under vSphere, vSAN and NSX, developers are finding VMware HCI an easy option to run container apps in the data center, at the edge, and in the public cloud.

The importance of the edge will become inescapable, as 50B edge connected devices power IoT by 2020. Lee heard Pat saying compute processing is moving to the edge because of 3 laws:

  1. the law of physics, light/information only travels so fast;
  2. the law of economics, doing all processing at central sites would take too much bandwidth and cost; and
  3. the law(s) of the land, data sovereignty and control is ever more critical in today’s world.

VMware SDDC is a full stack option that executes just about anywhere the data center wants to go. Howard mentioned that one customer he talked with at FMS18 just wanted to take their 16 node VMware HCI rack and clone it forever, to supply infinite infrastructure.

Next, we turned our discussion to Virtual Volumes (VVols). Recently VMware added replication support for VVols. Lee said VMware intends to provide an SRM SRA for VVols. But the real question is why hasn’t there been higher field VVol adoption? We concluded it takes time.

VVols wasn’t available in vSphere 5.5 and nowadays, three or more years have to go by before a significant amount of the field moves to a new release. Howard also said early storage systems didn’t implement VVols right. Moreover, VMware vSphere 5.5 is just now (9/16/18) going EoGS.

Lee said 70% of all current vSAN deployments are AFA. With AFA, hand tuning storage performance is no longer something admins need to worry about. It used to be we all spent time defragging/compressing data to squeeze more effective capacity out of storage, but hand capacity optimization like this has become a lost art. Just like capacity, hand tuning AFA performance doesn’t make sense anymore.

We then talked about the coming flash SSD supply glut. Howard sees flash pricing ($/GB) dropping by 40-50%, regardless of interface. This should drive AFA shipments above 70%, as long as the glut continues.

The podcast runs ~21 minutes. Lee’s always great to talk with and is very knowledgeable about the IT industry, HCI in general, and of course, VMware HCI in particular.  Listen to the podcast to learn more.

Lee Caswell, V.P. of Product, Storage & Availability, VMware

Lee Caswell leads the VMware storage marketing team driving vSAN products, partnerships, and integrations. Lee joined VMware in 2016 and has extensive experience in executive leadership within the storage, flash and virtualization markets.

Prior to VMware, Lee was vice president of Marketing at NetApp and vice president of Solution Marketing at Fusion-IO. Lee was a founding member of Pivot3, a company widely considered to be the founder of hyper-converged systems, where he served as the CEO and CMO. Earlier in his career, Lee held marketing leadership positions at Adaptec, and SEEQ Technology, a pioneer in non-volatile memory. He started his career at General Electric in Corporate Consulting.

Lee holds a bachelor of arts degree in economics from Carleton College and a master of business administration degree from Dartmouth College. Lee is a New York native and has lived in northern California for many years. He and his wife live in Palo Alto and have two children. In his spare time Lee enjoys cycling, playing guitar, and hiking the local hills.

65: GreyBeards talk new FlashSystem storage with Eric Herzog, CMO and VP WW Channels IBM Storage

Sponsored by:

In this episode, we talk with Eric Herzog, Chief Marketing Officer and VP of WorldWide Channels for IBM Storage about the FlashSystem 9100 storage series.  This is the 2nd time we have had Eric on the show (see Violin podcast) and the 2nd time we have had a guest from IBM on our show (see CryptoCurrency talk). However, it’s the first time we have had IBM as a sponsor for a podcast.

Eric’s a 32 year storage industry veteran who’s worked for many major storage companies, including Seagate, EMC and IBM and 7 startups over his career. He’s been predominantly in marketing but was CFO at one company.

New IBM FlashSystem 9100

IBM is introducing a new FlashSystem 9100 storage series, using new NVMe FlashCore Modules (FCM) that have been re-designed to fit a small form factor (SFF, 2.5″) drive slot. It also supports standard NVMe SFF SSDs in a 2U appliance package. The new storage has dual active-active RAID controllers running the latest generation IBM Spectrum Virtualize software, which is running over 100K storage systems in the field today.

FlashSystem 9100 supports up to 24 NVMe FCMs or SSDs, which can be intermixed. The FCMs offer up to 19.2TB of usable flash and have onboard hardware compression and encryption.

With FCM media, the FlashSystem 9100 can sustain 2.5M IOPS at 100µsec response times with 34GB/sec of data throughput. Spectrum Virtualize is a clustered storage system, so one could cluster together up to 4 FlashSystem 9100s into a single storage system and support 10M IOPS and 136GB/sec of throughput.

Spectrum Virtualize just introduced block data deduplication within a data reduction pool. With thin provisioning, data deduplication, pattern matching, SCSI Unmap support, and data compression, the FlashSystem 9100 can offer up to 5:1 effective capacity:useable flash capacity. That means with 24 19.2TB FCMs, a single FlashSystem 9100 offers over 2PB of effective capacity.
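The capacity and clustering figures above are simple multiplications, and a short Python sketch makes them easy to rework for other configurations. All inputs are the numbers quoted in this post; the function names are ours, the 5:1 ratio is an "up to" figure so the result is a best case, and we assume decimal units (1PB = 1000TB) and linear scaling across clustered systems.

```python
FCM_USABLE_TB = 19.2      # usable flash per FlashCore Module (from the text)
FCM_SLOTS = 24            # NVMe slots per FlashSystem 9100 (from the text)
MAX_DATA_REDUCTION = 5.0  # "up to 5:1" effective:usable (best case)


def effective_capacity_tb(modules=FCM_SLOTS, usable_tb=FCM_USABLE_TB,
                          reduction=MAX_DATA_REDUCTION):
    """Best-case effective capacity in TB for one system."""
    return modules * usable_tb * reduction


def cluster_totals(systems, iops_per_system=2.5e6, gb_per_sec_per_system=34):
    """Aggregate IOPS and GB/sec, assuming linear scaling across systems."""
    return systems * iops_per_system, systems * gb_per_sec_per_system


tb = effective_capacity_tb()
print(f"Effective capacity: {tb:.0f}TB, or about {tb / 1000:.1f}PB")

iops, gb_per_sec = cluster_totals(4)
print(f"4-way cluster: {iops / 1e6:.0f}M IOPS, {gb_per_sec}GB/sec")
```

With the quoted inputs this works out to roughly 2.3PB per system and 10M IOPS and 136GB/sec for a 4-way cluster, matching the "over 2PB" and cluster figures above.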

In addition to the appliance’s 24 NVMe FCMs or NVMe SSDs, FlashSystem 9100 storage can also attach up to 20 SAS SSD drive shelves for additional capacity. Moreover, Spectrum Virtualize offers storage virtualization, so customers can attach external storage arrays behind a FlashSystem 9100 solution.

With FlashSystem 9100, IBM has bundled additional Spectrum software, including:

  • Spectrum Virtualize for Public Cloud – which allows customers to migrate data and workloads from on premises to the cloud and back again. Today this only works for IBM Cloud, but plans are to support other public clouds soon.
  • Spectrum Copy Data Management – which offers a simple way to create and manage copies of data while enabling controlled self-service for test/dev and other users to use snapshots for secondary use cases.
  • Spectrum Protect Plus – which provides data backup and recovery for FlashSystem 9100 storage, tailor made for smaller, virtualized data centers.
  • Spectrum Connect – which allows Docker and Kubernetes container apps to access persistent storage on FlashSystem 9100.

To learn more about the IBM FlashSystem 9100, join the virtual launch experience July 24, 2018 here.

The podcast runs ~43 minutes. Eric has always been knowledgeable on the enterprise storage market, past, present and future. He had a lot to talk about on the FlashSystem 9100 and seems to have mellowed lately. His grey mustache is forcing the GreyBeards to consider a name change – GreyHairsOnStorage anyone? Listen to the podcast to learn more.

Eric Herzog, Chief Marketing Officer and VP of Worldwide Channels for IBM Storage

Eric’s responsibilities include worldwide product marketing and management for IBM’s award-winning family of storage solutions, software defined storage, integrated infrastructure, and software defined computing, as well as responsibility for global storage channels.

Herzog has over 32 years of product management, marketing, business development, alliances, sales, and channels experience in the storage software, storage systems, and storage solutions markets, managing all aspects of marketing, product management, sales, alliances, channels, and business development in both Fortune 500 and start-up storage companies.

Prior to joining IBM, Herzog was Chief Marketing Officer and Senior Vice President of Alliances for all-flash storage provider Violin Memory. Herzog was also Senior Vice President of Product Management and Product Marketing for EMC’s Enterprise & Mid-range Systems Division, where he held global responsibility for product management, product marketing, evangelism, solutions marketing, communications, and technical marketing with a P&L over $10B. Before joining EMC, he was vice president of marketing and sales at Tarmin Technologies. Herzog has also held vice president business line management and vice president of marketing positions at IBM’s Storage Technology Division, where he had P&L responsibility for the over $300M OEM RAID and storage subsystems business, and Maxtor (acquired by Seagate).

Herzog has held vice president positions in marketing, sales, operations, and acting-CFO roles at Asempra (acquired by BakBone Software), ArioData Networks (acquired by Xyratex), Topio (acquired by Network Appliance), Zambeel, and Streamlogic.

Herzog holds a B.A. degree in history from the University of California, Davis, where he graduated cum laude, studied towards a M.A. degree in Chinese history, and was a member of the Phi Alpha Theta honor society.

57: GreyBeards talk midrange storage with Pierluca Chiodelli, VP of Prod. Mgmt. & Cust. Ops., Dell EMC Midrange Storage

Sponsored by:

Dell EMC Midrange Storage

In this episode we talk with Pierluca Chiodelli (@chiodp), Vice President of Product Management and Customer Operations at Dell EMC Midrange storage. Howard talked with Pierluca at SFD14 and I talked with Pierluca at SFD13. He started working at EMC as a customer engineer and has worked his way up to VP since then.

This is the second time (Dell) EMC has been on our show (see our EMCWorld2015 summary podcast with Chad Sakac) but this is the first sponsored podcast from Dell EMC. Pierluca seems to have been with (Dell) EMC forever.

You may recall that Dell EMC has two product families in their midrange storage portfolio. Pierluca provides a number of reasons why both continue to be invested in, enhanced and sold on the market today.

Dell EMC Unity and SC product lines

Dell EMC Unity storage is the outgrowth of unified block and file storage that was first released in the EMC VNXe series storage systems. Unity continues that tradition of providing both file and block storage in a dense, 2 rack U system configuration, with dual controllers, high availability, AFA and hybrid storage systems. The other characteristic of Unity storage is its tight integration with VMware virtualization environments.

Dell EMC SC series storage continues the long tradition of Dell Compellent storage systems, which support block storage and which invented data progression technology.  Data progression is storage tiering on steroids, with support for multi-tiered rotating disk (across the same drive), flash, and now cloud storage. SC series is also considered a set it and forget it storage system that just takes care of itself without the need for operator/admin tuning or extensive monitoring.

Dell EMC is bringing together both of these storage systems in their CloudIQ, cloud based, storage analytics engine and plans to have both systems supported under the Unisphere management engine.

Also Unity storage can tier files to the cloud and copy LUN snapshots to the public cloud using their Cloud Tiering Appliance software.  With their UnityVSA Software Defined Storage appliance and VMware vSphere running in AWS, the file and snapshot data can then be accessed in the cloud. SC Series storage will have similar capabilities, available soon.

At the end of the podcast, Pierluca talks about Dell EMC’s recently introduced Customer Loyalty Programs, which include: Never Worry Data Migrations, Built-in Virtustream Storage Cloud, 4:1 Storage Efficiency Guarantee, All-inclusive Software pricing, 3-year Satisfaction Guarantee, Hardware Investment Protection, and Predictable Support Pricing.

The podcast runs ~27 minutes. Pierluca is a very knowledgeable individual and although he has a beard, it’s not grey (yet). He’s been with EMC storage forever and has a long, extensive history in midrange storage, especially with Dell EMC’s storage product families. It’s been a pleasure for Howard and me to talk with him again. Listen to the podcast to learn more.

Pierluca Chiodelli, V.P. of Product Management & Customer Operations, Dell EMC Midrange Storage

Pierluca Chiodelli is currently the Vice President of Product Management for Dell EMC’s suite of Mid-Range solutions including Unity, VNX, and VNXe from heritage EMC storage and Compellent, EqualLogic, and Windows Storage Server from heritage Dell Storage.

Pierluca’s organization is comprised of four teams: Product Strategy, Performance & Competitive Engineering, Solutions, and Core & Strategic Account engineering. The teams are responsible for ensuring Dell EMC’s mid-range solutions enable end users and service providers to transform their operations and deliver information technology as a service.

Pierluca has been with EMC since 1999, with experience in field support and core engineering across Europe and the Americas. Prior to joining EMC, he worked at Data General and as a consultant for HP Corporation.

Pierluca holds one degree in Chemical Engineering and a second one in Information Technology.