68: GreyBeards talk NVMeoF/TCP with Ahmet Houssein, VP of Marketing & Strategy @ Solarflare Communications

In this episode we talk with Ahmet Houssein, VP of Marketing and Strategic Direction at Solarflare Communications (@solarflare_comm). Ahmet’s been in the industry forever and has a unique view on where NVMeoF needs to go. Howard had talked with Ahmet at last year’s FMS. Ahmet will also be speaking at this year’s FMS (this week in Santa Clara, CA).

Solarflare Communications sells Ethernet communication gear, mostly to the financial services market and has developed a software plugin for the standard TCP/IP stack on Linux that supports both target and client mode NVMeoF/TCP. That is, their software plugin provides a complete implementation of NVMeoF across TCP Ethernet that extends the TCP protocol but doesn’t require RDMA (RoCE or iWARP) or data center bridging.

Implementing NVMeoF/TCP

Solarflare’s NVMeoF/TCP is a free plugin that, once approved by the NVMe(oF) standards committee, anyone can use to create an NVMeoF storage system and consume that storage from almost anywhere. The standards committee is expected to approve the protocol extension soon, and sometime after that the plugin will be added to the Linux kernel. After standards approval, maybe VMware and Microsoft will adopt it as well, but that may take more work.

For over a year now, most of the NVMeoF/Ethernet we’ve encountered has required sophisticated RDMA hardware. When we talked with Pavilion Data Systems, a month or so ago, they had designed a more networking-like approach to NVMeoF using RoCE and TCP [updated 8/8/18 after VR’s comment, the ed.]. When we talked with Attala Systems, they had a special purpose FPGA that’s used in RDMA NICs and Mellanox switches to support target & client mode NVMeoF/UDP [updated 8/8/18 after VR’s comment, the ed.].

Solarflare is taking a different tack.

One problem with the NVMeoF/Ethernet RDMA is compatibility. You can use either RoCE or iWARP RDMA NICs but at the moment you can’t use both. With TCP/IP plugins there’s no hardware compatibility issue. (Yes, there’s software compatibility at both ends of the pipe).

Solarflare recently measured latencies for their NVMeoF/TCP (Iometer/FIO), which show that running the protocol adds about 5-10% to latency versus running RDMA NVMeoF/UDP-RoCE-iWARP.

Performance measurements were taken using a server, running Red Hat Linux + their TCP plugin with NVMe SSDs on the storage side and a similar configuration on the client side without the SSDs.

If they add 10% latency to 10 microsec. IO (for Optane), latency becomes 11 microsec. Similarly for flash NVMe SSDs it moves from 100 microsec to 110 microsec.
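The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the percentages quoted in the episode; the function name and device labels are ours, not Solarflare’s, and the 5% case stands in for the low end of the measured range.

```python
def latency_with_overhead(base_usec: float, overhead_fraction: float) -> float:
    """Return the IO latency after adding a fractional protocol overhead."""
    return base_usec * (1.0 + overhead_fraction)


# Base device latencies quoted above (labels are illustrative).
base_latencies_usec = {"Optane NVMe SSD": 10.0, "flash NVMe SSD": 100.0}

for device, base in base_latencies_usec.items():
    # 5% and 10% bracket the measured NVMeoF/TCP overhead range.
    for overhead in (0.05, 0.10):
        total = latency_with_overhead(base, overhead)
        print(f"{device}: {base:.0f} usec + {overhead:.0%} = {total:.1f} usec")
```

Even at the high end of the range, the added microsecond on Optane or ten microseconds on flash is small next to the device latency itself.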

Ahmet did mention that their NICs have some hardware optimizations which bring this added latency down to something closer to 5%. And later we discuss the immense parallelism opportunities using the TCP stack in user space. Their hardware also better supports more threads doing IO in parallel.

Why TCP

Ahmet’s on a mission. He says there’s this misbelief that Ethernet RDMA hardware is required to achieve lightning fast response times using NVMeoF, but it’s not true. Standard TCP with proper protocol enhancements is more than capable of performing at very close to the same latencies as RDMA, without special NICs and DCB switch configurations.

Furthermore, TCP/IP already has multipathing support. So current high availability characteristics of TCP are readily applicable to NVMeoF/TCP.

Parallelism through user space

NVMeoF/TCP was the subject of the 1st half of our discussion, but we spent the 2nd half talking about scaling or parallelism. Even if you can do 11 or 110 microsecond latency, at some point, if you do enough of these IOs, the kernel overhead in processing blocks and transferring control from kernel space to user space will become a bottleneck.

However, there’s nothing stopping IT from running the TCP/IP stack in user space and eliminating any kernel control transfer whatsoever. By doing so, data centers could parallelize all this IO using as many cores as available.

Running the plugin in a TCP/IP stack in user space allows you to scale NVMeoF lightning fast IO to as many users as you have user spaces or cores, and the kernel doesn’t even break into a sweat.

Anyone could simply download Solarflare’s plugin, configure a white box server with Linux and 24 NVMe SSDs and support ~8.4M IOPS (350Kx24) at ~110 microsec latency. And with user space scaling, one could easily have 1000s of user spaces connected to it.

They’re going to need faster pipes!
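To put a rough number on those pipes, here is a back-of-the-envelope sketch in Python. The SSD count and per-drive IOPS come from the figures above; the 4 KiB IO size is our assumption (the episode doesn’t state a block size), and the result ignores TCP/IP and NVMeoF protocol overhead.

```python
SSD_COUNT = 24
IOPS_PER_SSD = 350_000      # per-drive figure quoted above
BLOCK_SIZE_BYTES = 4096     # ASSUMPTION: 4 KiB IOs, not stated in the episode


def aggregate_iops(ssd_count: int, iops_per_ssd: int) -> int:
    """Total IOPS if every SSD runs flat out and scaling is linear."""
    return ssd_count * iops_per_ssd


def required_gbits_per_sec(iops: int, block_size_bytes: int) -> float:
    """Payload bandwidth in Gbit/s (decimal), ignoring protocol overhead."""
    return iops * block_size_bytes * 8 / 1e9


total_iops = aggregate_iops(SSD_COUNT, IOPS_PER_SSD)
bandwidth = required_gbits_per_sec(total_iops, BLOCK_SIZE_BYTES)
print(f"{total_iops / 1e6:.1f}M IOPS needs about {bandwidth:.0f} Gbit/s of payload bandwidth")
```

Under that 4 KiB assumption, the payload alone is more than two 100GbE links can carry, which is why the network becomes the next bottleneck.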

The podcast runs ~39 minutes. Ahmet was very knowledgeable about NVMe, NVMeoF and TCP.  He was articulate and easy to talk with.  Listen to the podcast to learn more.

Ahmet Houssein, VP of Marketing and Strategic Direction at Solarflare Communications 

Ahmet Houssein is responsible for establishing marketing strategies and implementing programs to drive revenue growth, enter new markets and expand brand awareness to support Solarflare’s continuous development and global expansion.

He has over twenty-five years of experience in the server, storage, data center and networking industry, and held senior level executive positions in product development, marketing and business development at Intel and Honeywell. Most recently Houssein was SVP/GM at QLogic where he successfully delivered first to market with 25Gb Ethernet products securing design wins at HP and Dell.

One of the key leaders in the creation of the InfiniBand and PCI Express industry standards, Houssein is a recipient of the Intel Achievement Award and was a founding board member of the Storage Networking Industry Association (SNIA), a global organization of 400 companies in the storage market. He was educated in London, UK and holds an Electrical Engineering Degree equivalent.

67: GreyBeards talk infrastructure monitoring with James Holden, Sr. Prod. Mgr. NetApp

Sponsored by:

Howard and I first talked with James Holden, NetApp Senior Product Manager for OnCommand Insight and Cloud Insights, last month, at Storage Field Day 16 (SFD16) in Waltham, MA. At the time, we thought it would be great to also have him on the show.

James has been with the NetApp OnCommand Insight (OCI) team for quite a while now and is very knowledgeable about the product and its technology. NetApp Cloud Insights is a new SaaS offering that provides some of the same services as OCI without the footprint, focused on newer, non-traditional applications and available on a pay-as-you-go model.

NetApp OnCommand Insight (OCI)

NetApp OCI is sort of a stripped down, souped up enterprise SRM tool, without storage and server configuration/provisioning (see James’s introduction video from SFD15 for more info). It supports NetApp and just about anyone’s storage, including Dell EMC, IBM, Hitachi Vantara (HDS), HPE, Infinidat, and Pure Storage, as well as most major hypervisors and OSs such as VMware vSphere, Microsoft Hyper-V, RHEL, etc. Other storage can easily be added to OCI through a patch/minor update, which is typically done by customer request.

NetApp OCI currently runs in some of the biggest enterprises  in the world today, including top F500 companies and one of the world’s largest banks. OCI is agentless but does use a data collector server/VM onprem or in cloud that takes advantage of storage and system APIs to gather data.

OCI provides extensive end-to-end infrastructure monitoring and troubleshooting (see James’s SFD16 OCI monitoring & troubleshooting session). OCI monitors application workloads from VMs to the storage supporting them.

OCI also supplies extensive charge back capabilities (see his SFD16 OCI cost control/chargebacks session). In times like these when IT competes with public cloud offerings every day, charge backs can be very illuminating.

Also, OCI has extensive integration with ServiceNow and similar offerings (see SFD16 OCI ecosystem session). With this level of integration, OCI can provide seamless tracking of service requests from initiation to completion through verification.

In addition, OCI can monitor public cloud infrastructure as well as onprem. For example, with Amazon Web Services (AWS), customers can use OCI to monitor their EC2 instances’ EBS IO activity. OCI reports on AWS IOPS rates by EC2-EBS connection. Customers paying for EBS IOPS can use OCI to monitor and tailor their EBS costs. OCI also supports Microsoft Azure environments.

NetApp Cloud Insights

NetApp Cloud Insights is a new SaaS offering that is currently in Public Preview status but is expected to be released in October 2018 (check out his SFD16 Cloud Insights session video).

Customers can currently register to use the preview version at Cloud.netapp.com/Cloud Insights. There’s a registration wall but that’s all it takes to get started.

The minimum Cloud Insights instance is a single server and 5TB of storage. Unlike OCI, Cloud Insights is tailored to support smaller shops without significant infrastructure. However, Cloud Insights also offers standard onprem enterprise infrastructure monitoring.

Cloud Insights is also focused on modern, cloud-native applications, whether they operate onprem or in the cloud. The problem with cloud native, container apps is that they come and go in seconds, and there are thousands of them. Cloud Insights was designed specifically for container and other cloud native applications and, as such, should provide more accurate monitoring of operations for these systems.

We talked about Cloud Insights’ development cadence. James said that because it’s a SaaS offering, new Cloud Insights functionality can be released daily, if not more frequently. Contrast that with OCI, where they schedule 3-4 releases a year.

Cloud Insights currently supports the Kubernetes container ecosystem, but more are on the way. Again, customers determine which container or other cloud native ecosystems will be supported next.

The podcast runs ~22 minutes. James was very knowledgeable about OCI, Cloud Insights and infrastructure monitoring in general and he was easy to talk with. Howard and I had a great time at SFD16 and enjoyed our time talking with him again on the podcast.  Listen to the podcast to learn more.

James Holden, Senior Product Manager NetApp OCI and Cloud Insights

James Holden is a Senior Manager of Product Management at NetApp, and for the last 5 years has been building the infrastructure monitoring and reporting tool OnCommand Insight.

Today he is working across NetApp’s Cloud Analytics portfolio, including Cloud Insights, a new SaaS offering currently in preview.

Prior to NetApp, James worked for 14 years at CSC in both the US and the UK on their storage, compute and automation solutions.

66: GreyBeards talk Midrange storage part 2, with Sean Kinney, Sr. Dir. Midrange Storage Mkt, Dell EMC

Sponsored by:

Dell EMC Midrange Storage

In this episode we talk with Sean Kinney (@SeanRKinney14), senior director, midrange storage marketing at Dell EMC. Howard and I have both known Sean for a number of years now. Sean has had multiple roles in the IT industry, doing various marketing and management duties at multiple vendors. He’s back at Dell EMC now and wanted to take the opportunity to discuss Dell EMC midrange storage with us.

As you probably already know, Dell EMC midrange storage dominates their market and has done so for a number of years now. Currently, Dell EMC midrange storage has 2X the revenue of any other competitor.

This is the third time (Dell) EMC has been on our show (see our EMCWorld2015 summary podcast with Chad Sakac, and Talk with Pierluca Chiodelli sponsored podcast). Since our last podcast, plenty has happened at Dell EMC midrange storage.

Dell EMC Unity and SC storage news

Dell EMC Unity storage has recently added new file data reduction and file sync replication functionality. And a short time ago, Dell EMC came out with an AFA version of their SC Series storage.

With the two midrange product lines there’s been some cross-fertilization. That is, Dell EMC is starting to take some of the best features from one solution and apply them to the other.

For example,

  • SC series has had its Health Check offering since Compellent days. This is a professional service (PS), offered by Dell EMC, that reviews the health of your data center’s SC storage, DR plans, backup activity, IO performance, etc. and provides recommendations as to how to improve the overall storage environment. The Health Check PS is now also available for Unity storage.
  • Unity storage has had its CloudIQ management/monitoring solution since December of 2016. CloudIQ is a big data analytics-remote management, software-as-a-service offering, running in the cloud that allows customers to manage/monitor Unity storage from anywhere. With SC Series’ latest, 7.3 code update, SC storage is also supported under CloudIQ.

We also discussed some of the inherent advantages to SC Series storage, such as their forever software licensing, storage federation/scale out clusters and economical $/GB pricing.

Sean mentioned some of the Future Proof guarantees that Dell EMC offers on both Unity and SC series storage. These include hardware investment protection, data-in-place upgrades, data reduction guarantees, etc.

The podcast runs ~20 minutes. Sean has been around storage for a long time now and is very knowledgeable about Dell EMC Midrange storage as well as competitive solutions. Howard and I have talked with Sean at a number of industry events in the past and it was fun to talk with him again. Listen to the podcast to learn more.

Sean Kinney, Senior Director,  Dell EMC Midrange Storage Marketing

Sean Kinney is an industry leader in the storage and data protection market, with over 20 years of experience in the IT industry.

Currently, he is the Senior Director for midrange storage marketing at Dell EMC.  He spent the first 10 years of his career at EMC, and then held positions including VP and General Manager of online backup at Acronis and Senior Director, Storage Marketing at Hewlett-Packard Enterprise.

Sean has a B.A. from Dartmouth College and an M.B.A. from the University of Michigan.

65: GreyBeards talk new FlashSystem storage with Eric Herzog, CMO and VP WW Channels IBM Storage

Sponsored by:

In this episode, we talk with Eric Herzog, Chief Marketing Officer and VP of WorldWide Channels for IBM Storage about the FlashSystem 9100 storage series.  This is the 2nd time we have had Eric on the show (see Violin podcast) and the 2nd time we have had a guest from IBM on our show (see CryptoCurrency talk). However, it’s the first time we have had IBM as a sponsor for a podcast.

Eric’s a 32-year storage industry veteran who’s worked for many major storage companies, including Seagate, EMC and IBM, and 7 startups over his career. He’s been predominantly in marketing but was CFO at one company.

New IBM FlashSystem 9100

IBM is introducing a new FlashSystem 9100 storage series in a 2U appliance package. It uses new NVMe FlashCore Modules (FCMs) that have been re-designed to fit a small form factor (SFF, 2.5″) drive slot, and it also supports standard NVMe SFF SSDs. The new storage has dual active-active RAID controllers running the latest generation of IBM Spectrum Virtualize software, which runs on over 100K storage systems in the field today.

FlashSystem 9100 supports up to 24 NVMe FCMs or SSDs, which can be intermixed. The FCMs offer up to 19.2TB of usable flash and have onboard hardware compression and encryption.

With FCM media, the FlashSystem 9100 can sustain 2.5M IOPS at 100µsec response times with 34GB/sec of data throughput. Spectrum Virtualize is a clustered storage system, so one could cluster together up to 4 FlashSystem 9100s into a single storage system and support 10M IOPS and 136GB/sec of throughput.

Spectrum Virtualize just introduced block data deduplication within a data reduction pool. With thin provisioning, data deduplication, pattern matching, SCSI UNMAP support, and data compression, the FlashSystem 9100 can offer up to a 5:1 ratio of effective capacity to usable flash capacity. That means with 24 19.2TB FCMs, a single FlashSystem 9100 offers over 2PB of effective capacity.
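The capacity and clustering figures above work out as follows. This is a rough Python sketch using only the numbers quoted in this post; it assumes the best-case 5:1 reduction ratio, decimal units (1PB = 1000TB) and perfectly linear scaling across a 4-system cluster, and the constant names are ours.

```python
FCM_COUNT = 24
FCM_USABLE_TB = 19.2            # "up to" usable flash per FCM, as quoted
DATA_REDUCTION_RATIO = 5.0      # best-case effective:usable ratio, as quoted
MAX_CLUSTER_SIZE = 4            # FlashSystem 9100s per Spectrum Virtualize cluster
IOPS_PER_SYSTEM = 2_500_000
GB_PER_SEC_PER_SYSTEM = 34

# Capacity of a single, fully populated appliance.
usable_tb = FCM_COUNT * FCM_USABLE_TB
effective_pb = usable_tb * DATA_REDUCTION_RATIO / 1000  # decimal PB

# Cluster performance, ASSUMING linear scaling across systems.
cluster_iops = MAX_CLUSTER_SIZE * IOPS_PER_SYSTEM
cluster_gb_per_sec = MAX_CLUSTER_SIZE * GB_PER_SEC_PER_SYSTEM

print(f"Usable: {usable_tb:.1f}TB, effective: {effective_pb:.2f}PB")
print(f"Cluster: {cluster_iops / 1e6:.0f}M IOPS, {cluster_gb_per_sec}GB/sec")
```

Actual effective capacity depends on how reducible the data is, so the 5:1 figure should be read as a ceiling rather than a typical result.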

In addition to the appliance’s 24 NVMe FCMs or NVMe SSDs, FlashSystem 9100 storage can also attach up to 20 SAS SSD drive shelves for additional capacity. Moreover, Spectrum Virtualize offers storage virtualization, so customers can attach external storage arrays behind a FlashSystem 9100 solution.

With FlashSystem 9100, IBM has bundled additional Spectrum software, including

  • Spectrum Virtualize for Public Cloud – which allows customers to migrate  data and workloads from on premises to the cloud and back again. Today this only works for IBM Cloud, but plans are to support other public clouds soon.
  • Spectrum Copy Data Management – which offers a simple way to create and manage copies of data while enabling controlled self-service for test/dev and other users to use snapshots for secondary use cases.
  • Spectrum Protect Plus – which provides data backup and recovery for FlashSystem 9100 storage, tailor made for smaller, virtualized data centers.
  • Spectrum Connect – which allows Docker and Kubernetes container apps to access persistent storage on FlashSystem 9100.

To learn more about the IBM FlashSystem 9100, join the virtual launch experience on July 24, 2018.

The podcast runs ~43 minutes. Eric has always been knowledgeable on the enterprise storage market, past, present and future. He had a lot to talk about on the FlashSystem 9100 and seems to have mellowed lately. His grey mustache is forcing the GreyBeards to consider a name change: GreyHairsOnStorage, anyone? Listen to the podcast to learn more.

Eric Herzog, Chief Marketing Officer and VP of Worldwide Channels for IBM Storage

Eric’s responsibilities include worldwide product marketing and management for IBM’s award-winning family of storage solutions, software defined storage, integrated infrastructure, and software defined computing, as well as responsibility for global storage channels.

Herzog has over 32 years of product management, marketing, business development, alliances, sales, and channels experience in the storage software, storage systems, and storage solutions markets, managing all aspects of marketing, product management, sales, alliances, channels, and business development in both Fortune 500 and start-up storage companies.

Prior to joining IBM, Herzog was Chief Marketing Officer and Senior Vice President of Alliances for all-flash storage provider Violin Memory. Herzog was also Senior Vice President of Product Management and Product Marketing for EMC’s Enterprise & Mid-range Systems Division, where he held global responsibility for product management, product marketing, evangelism, solutions marketing, communications, and technical marketing with a P&L over $10B. Before joining EMC, he was vice president of marketing and sales at Tarmin Technologies. Herzog has also held vice president business line management and vice president of marketing positions at IBM’s Storage Technology Division, where he had P&L responsibility for the over $300M OEM RAID and storage subsystems business, and Maxtor (acquired by Seagate).

Herzog has held vice president positions in marketing, sales, operations, and acting-CFO roles at Asempra (acquired by BakBone Software), ArioData Networks (acquired by Xyratex), Topio (acquired by Network Appliance), Zambeel, and Streamlogic.

Herzog holds a B.A. degree in history from the University of California, Davis, where he graduated cum laude, studied towards a M.A. degree in Chinese history, and was a member of the Phi Alpha Theta honor society.

64: GreyBeards discuss cloud data protection with Chris Wahl, Chief Technologist, Rubrik

Sponsored by:

In this episode we talk with Chris Wahl, Chief Technologist, Rubrik. This is our second time having Chris on our show. The last time was about three years ago (see our Chris on agentless backup podcast). Talking with Chris again was great and there’s been plenty of news since we last spoke with him.

Rubrik now has three products: the Rubrik Cloud Data Protection suite (onprem, virtual or in the [AWS & Azure] cloud); Rubrik Datos IO (a recent acquisition) for NoSQL databases with semantic dedupe; and Rubrik Polaris GPS, a SaaS monitoring/trending/management solution for your data protection environment. Polaris GPS monitors and watches data protection trends for you, to ensure all your data protection SLAs are being met. But we didn’t spend much time on Polaris.

Datos IO was designed from the start to back up new databases based on NoSQL technologies and provides a semantic-based deduplication capability that’s unique in the industry. We talked with Datos IO before their acquisition by Rubrik (see our podcast with Tarun on 3rd generation data protection).

Cloud Data Protection

As for their Cloud Data Protection suite, one major differentiator is that all their functionality is available via RESTful APIs. Their GUI is completely built off their APIs. This means any customer could use their set of APIs to integrate Rubrik data protection with any application/workload on the planet.

Chris mentioned that Rubrik has 40+ specific application/system integrations that provide “strictly consistent” data protection. We assume this means application-consistent backup and recovery, but it goes beyond mere applications.

With the Cloud Data Protection solution, data resides on the appliance for only a short (customer specifiable) period and then is migrated off to cloud or onprem object storage. The object storage could be any onprem S3-compatible storage or object storage in the AWS or Azure cloud. It’s completely automatic. The data migrated to object storage is self-defining, meaning that metadata and data are all available in one spot and can be restored anywhere there’s a Rubrik Cloud Data Protection suite operating.

The Cloud Data Protection appliance also supports onboard search and analytics to search backup/recovery metadata/catalogs. As such, there’s no need to purchase other tools to uncover which backup files exist. Their solution also uses data deduplication to reduce the data stored.

Stored data is also encrypted with customer keys, and HTTPS is used to transfer it. So, data is secured at rest, secured in flight and deduped. Cloud Data Protection also offers data mobility. That is, it can move your VMs and data from onprem to the cloud and use Rubrik in the cloud to rehydrate the data and translate your VMs to run in AWS or Azure. It also works in reverse, translating AWS/Azure compute instances into VMs.

Rubrik’s major differentiator is simplicity. Traditionally, customers had been conditioned to thinking data protection took hours to maintain, fix and keep running. But with Rubrik Cloud Data Protection, a customer just points it to an application and selects an SLA, and Rubrik takes over from there.

The secret behind Rubrik’s simplicity is Cerebro. Cerebro is where they have put all the smarts to understand a data center’s infrastructure, applications/VMs, protected data and requested SLAs, and it just makes it all work.

The podcast runs ~27 minutes. Chris was great to talk with again and given how long it’s been since we last talked, he had much to discuss. Rubrik seems like an easy solution to adopt and if their growth is any indicator, customers agree. Listen to the podcast to learn more.

Chris Wahl, Chief Technologist, Rubrik

Chris Wahl, author of the award winning Wahl Network blog and host of the Datanauts Podcast, focuses on creating content that revolves around virtualization, automation, infrastructure, and evangelizing products and services that benefit the technology community.

In addition to co-authoring “Networking for VMware Administrators” for VMware Press, he has published hundreds of articles and was voted the “Favorite Independent Blogger” by vSphere-Land three years in a row (2013 – 2015). Chris also travels globally to speak at industry events, provide subject matter expertise, and offer perspectives to startups and investors as a technical adviser.