74: Greybeards talk NVMe shared storage with Josh Goldenhar, VP Cust. Success, Excelero

In this episode we talk NVMe shared storage with Josh Goldenhar (@eeschwa), VP, Customer Success at Excelero. Josh has been on our show before (please see our April 2017 podcast), the last time with Excelero’s CTO & Co-founder, Yaniv Romem.

This is Excelero’s 1st sponsored GBoS podcast and we wish to welcome them again to the show. Since Excelero’s NVMesh storage software is in customer hands now, Josh is transitioning to add customer support to his other duties.

NVMe storage industry trends

We started our discussion with the maturing NVMe market. Howard mentioned he heard that NVMe SSD sales have overtaken SATA SSD volumes. Josh mentioned that NVMe SSDs are getting harder to come by, driven primarily by purchases from the Super 8 (the 8 biggest hyper-scalers). And even when these SSDs can be found, customers are paying a premium for NVMe drives.

The industry is also starting to sell larger capacity NVMe SSDs. Customers view this as a way of buying cheaper ($/GB) storage. However, most NVMe shared storage systems use mirroring for data protection, which cuts effective (protected) capacity in half, doubling cost/GB.

Another change in the market is that, with today’s apps, many customers no longer need both the read AND write IO performance available from their NVMe storage. For newer applications/workloads, writes are less frequent and, as such, less of a driver of application performance. But read performance is still critical.

The other industry trend is a number of new vendors offering NVMeoF (Ethernet) storage arrays (see: Pavilion Data’s, Attala Systems’, and Solarflare Communications’ podcasts in just the last few months). Most of the startup systems are essentially top of rack shared NVMe SSDs, some with limited data protection/management services.

Excelero’s NVMesh has offered a logical volume manager as well as protected NVMe shared storage since the start, with unprotected RAID 0 and protected RAID 1/10 storage.

Excelero is coming out with a new release of its NVMesh™ software defined storage.

NVMesh 2

We were particularly interested in one of NVMesh 2’s new capabilities, its distributed data protection, which is based on Erasure Coding (EC, like RAID 6), with a stripe that includes 8+2 segments. Unlike mirroring/RAID 1/10, EC only reduces effective NVMe storage capacity by 20% for protection. It also protects against 2 drive failures within a RAID group.
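The capacity arithmetic behind that comparison can be sketched in a few lines. This is only an illustration of the stripe math described above; the function names are made up for the example, and mirroring is modeled as a 1+1 stripe.

```python
def usable_fraction(data_segments: int, protection_segments: int) -> float:
    """Fraction of raw capacity left for user data in one stripe."""
    return data_segments / (data_segments + protection_segments)


def cost_multiplier(data_segments: int, protection_segments: int) -> float:
    """Raw capacity that must be bought per unit of protected capacity."""
    return (data_segments + protection_segments) / data_segments


# Mirroring (RAID 1/10) modeled as 1 data copy + 1 mirror copy.
# 8+2 is the erasure coding stripe layout described for NVMesh 2.
for name, data, protection in [("mirroring", 1, 1), ("8+2 EC", 8, 2)]:
    print(
        f"{name}: {usable_fraction(data, protection):.0%} usable, "
        f"{cost_multiplier(data, protection):.2f}x raw capacity per protected GB"
    )
```

Under these assumptions mirroring leaves half of the raw capacity usable and doubles cost/GB, while the 8+2 stripe leaves 80% usable and costs 1.25x.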

However, with distributed data protection, write IO will not perform as well as reads. But reads perform just as fast as ever.
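One textbook reason parity-based protection writes more slowly than it reads is the read-modify-write cycle for small updates. The sketch below counts device IOs under generic RAID 6 style assumptions. It is not a description of how NVMesh 2 implements its erasure coding, which wasn’t detailed on the podcast, and full-stripe writes would avoid the extra reads.

```python
def small_write_ios(parity_segments: int) -> int:
    """Device IOs for a classic read-modify-write update of one data segment.

    Read the old data and each old parity segment, then write the new
    data and each recomputed parity segment.
    """
    reads = 1 + parity_segments
    writes = 1 + parity_segments
    return reads + writes


MIRROR_WRITE_IOS = 2  # write both copies; no reads required
HEALTHY_READ_IOS = 1  # a read only touches the segment holding the data

print("small write, 2 parity segments:", small_write_ios(2), "device IOs")
print("small write, mirrored:", MIRROR_WRITE_IOS, "device IOs")
print("read, no failed drives:", HEALTHY_READ_IOS, "device IO")
```

With two parity segments, as in an 8+2 stripe, a small write costs 6 device IOs in this model versus 2 for a mirror, while a healthy read costs 1 either way.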

As with any data protection, customers will need sufficient spare capacity to rebuild data for a failed device.

The latest release will be available to all current customers, on service contract. When available, customers should immediately start benefiting from the space efficient, distributed data protection for new data on the system.

The new release also adds Fibre Channel (as Howard correctly guessed on the podcast) and TCP/IP protocols to their current InfiniBand, RoCE, and NVMeoF support, as well as new performance analytics to help diagnose performance issues faster and at scale.

The podcast runs ~25 minutes. Josh has an interesting perspective on the NVMe storage market as well as competitive solutions and was great to talk with again. The new data protection functionality in Excelero NVMesh 2 signals an evolving NVMe storage market. As NVMe storage matures, the tradeoff between performance and data services looks to be an active war zone for some time to come. Listen to the podcast to learn more.

Josh Goldenhar, Vice President Customer Success, Excelero

Josh has been responsible for product strategy and vision at leading storage companies for over two decades. His experience puts him in a unique position to understand the needs of customers.
Prior to joining Excelero, Josh was responsible for product strategy and management at EMC (XtremIO) and DataDirect Networks. Previous to that, his experience and passion was in large scale, systems architecture and administration with companies such as Cisco Systems. He’s been a technology leader in Linux, Unix and other OS’s for over 20 years. Josh holds a Bachelor’s degree in Psychology/Cognitive Science from the University of California, San Diego.

55: GreyBeards storage and system yearend review with Ray & Howard

In this episode, the Greybeards discuss the year in systems and storage. This year we kick off the discussion with a long running IT trend which has taken off over the last couple of years. That is, recently the industry has taken to buying pre-built appliances rather than building them from the ground up.

We can see this in all the hyper-converged solutions available today but it goes even deeper than that. It seems to have started with the trend in organizations to get by with fewer people.

This led to a desire to purchase pre-built software applications and now appliances, rather than build from parts. It just takes too long to build, and lead architects have better things to do with their time than checking compatibility lists, testing and verifying that hardware works properly with software. The pre-built appliances are good enough and doing it yourself doesn’t really provide that much of an advantage over the pre-built solutions.

Next, we see the coming systems using NVMe over Fabric storage as sort of a countertrend to the previous one. Here we see some customers paying well for special purpose hardware with blazing speed that takes time and effort to get working right, but the advantages are significant. Both Howard and I were at the Excelero SFD12 event and it blew us away. Howard also attended the E8 Storage SFD14 event, which was another example in a similar vein.

Finally, the last trend we discussed was the rise of 3D TLC and the failure of 3DX and other storage class memory (SCM) technologies to make a dent in the marketplace. 3D TLC NAND is coming out of just about every fab these days, resulting in huge (but costly) SSDs in the multi-TB range. Combine these with NVMe interfaces and you have msec access to almost a PB of storage without breaking a sweat.

The missing 3DX SCM tsunami some of us predicted is mainly due to the difficulties in bringing new fab technologies to market. We saw some of this in the stumbling with 3D NAND but the transition to 3DX and other SCM technologies is a much bigger change to new processes and technology. We all believe it will get there someday but for the moment, the industry just needs to wait until the fabs get their yields up.

The podcast runs over 44 minutes. Howard and I could talk for hours on what’s happening in IT today. Listen to the podcast to learn more.

Howard Marks is the Founder and Chief Scientist of DeepStorage, a prominent blogger at Deep Storage Blog and can be found on twitter @DeepStorageNet.

Ray Lucchesi is the President and Founder of Silverton Consulting, a prominent blogger at RayOnStorage.com, and can be found on twitter @RayLucchesi.

46: Greybeards discuss Dell EMC World 2017 happenings on vBrownBag

Howard and I were both at Dell EMC World 2017 this past month, and Alastair Cooke (@DemitasseNZ) asked us to do a talk at the show for the vBrownBag group (YouTube video here). The GreyBeards asked for a copy of the audio for this podcast.

Sorry about the background noise, but we recorded live at the show, with a huge teleprompter in the background that was re-broadcasting keynotes/interviews from the show.

At the show

Howard was at Dell EMC World 2017 on a media pass and I was at the show on an industry analyst pass. There were parts of the show that he saw that I didn’t, and vice versa, but all keynotes and major industry outreach were available to both of us.

As always the Dell EMC team put on a great show, and kudos have to go to their AR and PR teams for having both of us there and creating a great event. There was lots of news at the show and both of us were impressed by how well Dell EMC has come together in such a short time.

In addition, there were a number of Dell partners at the show. Howard met Datadobi on the show floor, who have a file migration tool that walks a filesystem tree and migrates files, as well as reporting on the files it can’t migrate. And we both saw Datrium (who we talked with last year).

Servers and other news

We both liked Dell’s new 14th generation server. But Howard objected to the lack of technical specs on it. Apparently, Intel won’t let specs be published until they announce their new CPU chipsets, sometime later this year. On the other hand, there were a few server specs discussed. For example, I was impressed the new servers would support many more NVMe cards. Howard liked the new server support for NV-DIMMs, mainly for the potential latency reduction they could provide for software defined storage.

That led us off on a tangent about whether there is a place for non-software defined storage anymore. Howard mentioned the downside of HCI/software defined storage when upgrading server (DIMM, PCIe card) hardware.

However, appliance hardware seems to be getting easier to upgrade. The new Unity AFA storage can be upgraded non-disruptively from the low end to the high end appliance by just swapping out controller hardware canisters.

Howard was also interested in Dell EMC’s new CloudFlex purchasing model for HCI solutions. This supplies an almost cloud-like purchasing option for customers: for a one year commitment, you pay as you go (no money down, just monthly payments) rather than making an up front capital purchase. After the year’s commitment expires you can send the hardware back to Dell EMC and stop paying.

We talked about Tier 0 storage. EMC DSSD was an early attempt to provide Tier 0 but came with lots of special purpose hardware. When commodity hardware and software emerged last year with NVMe SSD speed, DSSD was no longer viable at the premium pricing needed for all that hardware and was shut down. Howard and I discussed how special hardware has to be much faster (10-100X) than commodity hardware solutions to succeed, and how that gap has to be sustained.

The other big storage news was the new VMAX 950F AFA and its performance numbers. Dell EMC said the new VMAX could do 6.7M IOPS of RRH (random read hit) and had a 350µsec response time. Howard noted that Dell EMC didn’t say at what IO load they achieved the 350µsec response time. I told him it almost didn’t matter: even if it was a single IO at that response time, it was significant.
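Howard’s question about IO load can be made concrete with Little’s Law (outstanding IOs = throughput x response time). The sketch below is a back-of-the-envelope illustration only. It assumes the two quoted figures were measured at the same operating point, which Dell EMC did not state, and the function names are made up for the example.

```python
def outstanding_ios(iops: float, latency_seconds: float) -> float:
    """Little's Law: average IOs in flight = throughput x response time."""
    return iops * latency_seconds


def iops_at_queue_depth(queue_depth: int, latency_seconds: float) -> float:
    """Throughput reachable with a fixed number of IOs in flight."""
    return queue_depth / latency_seconds


LATENCY = 350e-6  # 350 microseconds, as quoted for the VMAX 950F

# If 6.7M IOPS and 350 microseconds held at the same time, this many
# IOs would have to be in flight on average.
print(f"IOs in flight at 6.7M IOPS: {outstanding_ios(6.7e6, LATENCY):,.0f}")

# At the other extreme, a single IO at a time at 350 microseconds.
print(f"IOPS at queue depth 1: {iops_at_queue_depth(1, LATENCY):,.0f}")
```

The gap between those two extremes is why the load behind a quoted response time matters when comparing systems.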

The podcast runs about 40 minutes. It’s just Howard and me talking about what we saw/heard at the show and the occasional tangential topic. Listen to the podcast to learn more.


Howard Marks, DeepStorage

Howard Marks is the Founder and Chief Scientist of DeepStorage, a prominent blogger at Deep Storage Blog and can be found on twitter @DeepStorageNet.

Ray Lucchesi, Silverton Consulting

Ray Lucchesi is the President and Founder of Silverton Consulting, a prominent blogger at RayOnStorage Blog, and can be found on twitter @RayLucchesi.

43: GreyBeards talk Tier 0 again with Yaniv Romem CTO/Founder & Josh Goldenhar VP Products of Excelero

In this episode, we talk with another next gen, Tier 0 storage provider. This time our guests are Yaniv Romem, CTO/Founder, & Josh Goldenhar (@eeschwa), VP Products, from Excelero, another new storage startup out of Israel. Both Howard and I talked with Excelero at SFD12 (videos here) last month in San Jose. I was very impressed with their raw performance and wrote a popular RayOnStorage blog post on their system (see my 4M IO/sec@227µsec 4KB Read… post) from our discussions during SFD12.

As we have discussed previously, Tier 0, next generation flash arrays provide very high performing storage at very low latencies with modest to non-existent advanced storage services. They are intended to replace server, direct access SSD storage with a more shared, scalable storage solution.

Our last podcast was with E8 Storage, which sells a hardware Tier 0 appliance. As an alternative, Excelero is a software defined Tier 0 solution intended to be used on any commodity or off the shelf server hardware with high end networking and (low to high end) NVMe SSDs.

Indeed, what impressed me most with their 4M IO/sec was that the target storage system had almost 0 CPU utilization. (Read the post to learn how they did this). Excelero mentioned that they were able to generate high (11M random 4KB) IO/sec on an Intel Core i7, desktop-class CPU. Their one need in a storage server is plenty of PCIe lanes. They don’t even need dual socket storage servers; single socket CPUs work just fine as long as the PCIe lanes are there.

Excelero software

Their intent is to bring Tier 0 capabilities out to all big storage environments. As a software only solution, it could be easily OEMed by cluster file system vendors or HPC system vendors to deliver the IO performance their clients need.

That’s also one of the reasons that they went with high end Ethernet networking rather than just InfiniBand, which would have limited their market to mostly HPC environments. Excelero’s client software uses RoCE/RDMA hardware to perform IO operations with the storage server.

The other thing that little to no target storage server CPU utilization per IO operation gives them is the ability to scale up to 1000s of hosts or storage servers without reaching any storage system bottlenecks. Another concern eliminated by minimal target server CPU utilization is the noisy neighbor problem, because there’s no target CPU processing to be shared. Yet another advantage with Excelero is that bandwidth is only limited by storage server PCIe lanes and networking. A final advantage of their approach is that they can support any of the current and upcoming storage class memory devices supporting NVMe (e.g., Intel Optane SSDs).

The storage services they offer include RAID 0, 1 and 10 and a client side logical volume manager which supports multi-pathing. Logical volumes can span up to 128 storage servers, but can be accessed by almost any number of hosts. And there doesn’t appear to be a specific limit on the number of logical volumes you can have.
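To illustrate how a striped (RAID 0) logical volume can span many storage servers, here is a generic address-mapping sketch. It is not Excelero’s actual on-disk layout: the 64 KiB stripe unit, the `Location` type and the `locate` function are all assumptions made up for this example.

```python
from typing import NamedTuple


class Location(NamedTuple):
    server: int  # index of the storage server holding the byte
    offset: int  # byte offset within that server's share of the volume


def locate(logical_offset: int, num_servers: int, stripe_unit: int) -> Location:
    """Map a logical volume byte offset to a server and local offset (RAID 0)."""
    stripe_index, within_unit = divmod(logical_offset, stripe_unit)
    row, server = divmod(stripe_index, num_servers)
    return Location(server, row * stripe_unit + within_unit)


STRIPE_UNIT = 64 * 1024  # hypothetical stripe unit size
SERVERS = 128            # the maximum span mentioned above

for offset in (0, STRIPE_UNIT, SERVERS * STRIPE_UNIT + 10):
    print(offset, "->", locate(offset, SERVERS, STRIPE_UNIT))
```

Consecutive stripe units land on consecutive servers, which is how a client-side volume manager can spread a single volume’s IO across every server in the span.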

They support two different protocols across the 40GbE/100GbE networks: standard NVMe over Fabric or RDDA (Excelero’s patented, proprietary Remote Direct Drive Access). RDDA is what mainly provides the almost non-existent target storage server CPU utilization. But even with standard NVMe over Fabric they support low target CPU utilization. One proviso: with NVMe over Fabric, they do add shared volume functionality to support RAID device locking and additional fault tolerance capabilities.

On Excelero’s roadmap are thin provisioning, snapshots, compression and deduplication. However, they did mention that adding advanced storage functionality like this will impede performance. Currently, their distributed volume locking and configuration metadata is not normally accessed during an IO, but when you add thin provisioning, snapshots and data reduction, this metadata needs to become more sophisticated and will necessitate some amount of access during and after an IO operation.

Excelero’s client software runs in Linux kernel mode, and they don’t currently support VMware or Hyper-V. But they do support KVM as a hypervisor and would be willing to support the others, if APIs were published or made available.

They also have an internal OpenStack Cinder driver but it’s not part of the OpenStack release yet. They’re waiting for snapshots to be available before they push this into the main code base. Ditto for Docker Engine, but this is more of a beta capability today.

Excelero customer experience

One customer (NASA Ames/Moffett Field) deployed a single 2TB NVMe SSD in each of 128 hosts and had a single 256TB logical volume shared and accessed by all 128 hosts.

Another customer configured Excelero behind a clustered file system and was able to generate 30M randomized IO/sec at 200µsec latencies but, more important, 140GB/sec of bandwidth. It turns out high bandwidth is important to many big data applications that have to roll lots of data into their analytics clusters, process it, output results, and then do it all over again. Bandwidth limitations can impact the success of these types of applications.

By being software only they can be used in a standalone storage server or as a hyper-converged solution where applications and storage can be co-resident on the same server. As noted above, they currently support Linux O/Ss for their storage and client software and support any x86 Intel processor, any RDMA capable NIC, and any NVMe SSD.

Excelero GTM

Excelero is focused on the top 200 customers, which includes the hyper-scale providers like Facebook, Google, Microsoft and others. But hyper-scale customers have huge software teams and really just one or a few very large/complex applications, for which they can create/optimize Tier 0 storage themselves.

It’s really the customers just below the hyper-scaler class that have similar needs for high IO/sec at low latency or high IO bandwidth (or both), but have 100s to 1000s of applications and can’t afford to optimize them all for Tier 0 flash. If Excelero solves sharing Tier 0 flash storage in a more general way, say as a block storage device, it solves it for any application. And if the customer insists, they could put a clustered file system or even object storage (who would want this) on top of this shared Tier 0 flash storage system.

These customers may currently be using NVMe SSDs within their servers as a DAS device. But with Excelero these resources can be shared across the data center. They think of themselves as a top of rack NVMe storage system.

On their website they have listed a few of their current customers and they’re pretty large and impressive.

NVMe competition

Aside from E8 Storage, there are few other competitors in Tier 0 storage. One recently announced a move to an NVMe flash storage solution and another killed their shipping solution. We talked about what all this means to them and their market at the end of the podcast. Suffice it to say, they’re not worried.

The podcast runs ~50 minutes. Josh and Yaniv were very knowledgeable about Tier 0, storage market dynamics and were a delight to talk with. Listen to the podcast to learn more.


Yaniv Romem CTO and Founder, Excelero

Yaniv Romem has been a technology evangelist at disruptive startups for the better part of 20 years. His passions are in the domains of high performance distributed computing, storage, databases and networking.
Yaniv has been a founder at several startups such as Excelero, Xeround and Picatel in these domains. He has served in CTO and VP Engineering roles for the most part.


Josh Goldenhar, Vice President Products, Excelero

Josh has been responsible for product strategy and vision at leading storage companies for over two decades. His experience puts him in a unique position to understand the needs of our customers.
Prior to joining Excelero, Josh was responsible for product strategy and management at EMC (XtremIO) and DataDirect Networks. Previous to that, his experience and passion was in large scale, systems architecture and administration with companies such as Cisco Systems. He’s been a technology leader in Linux, Unix and other OS’s for over 20 years. Josh holds a Bachelor’s degree in Psychology/Cognitive Science from the University of California, San Diego.