140: Greybeards talk data orchestration with Matt Leib, Product Marketing Manager for IBM Spectrum Fusion

As our listeners should know, Matt Leib (@MBleib) was a GreyBeards co-host But since then, Matt has joined IBM to become Product Marketing Manager on IBM Spectrum Fusion, a data orchestration solution for Red Hat OpenShift environments. Matt’s been in and around the storage and data management industry for many years which is why we tapped him for GreyBeards co-host duties.

IBM Fusion, in its previous incarnation, came as an OpenShift software defined storage or as an OpenShift (H)CI solution. But recently, Fusion has taken on more of a data orchestration role for OpenShift stateful containerized applications. Listen to the podcast to learn more.

Fusion can run in any OpenShift deployment whether (currently AWS, Azure, & IBM) clouds, under VMware (wherever it runs), or on (x86 or IBM Z) bare metal. It supplies NFS file or S3 compatible object storage for container applications running under OpenShift. But it does more than just storage.

Beyond storage, Fusion includes backup/recovery, site to site DR and global (file & object) data access. It’s almost like someone opened up the IBM Spectrum software pantry and took out the best available functionality and cooked it up in to an OpenShift solution. IBM’s Spectrum Fusion current website (linked to above (Dec.’22)) still refers only to the software defined storage and (H)CI solution, but today’s Fusion includes all of the functions identified above.

All Fusion facilities run as containers under OpenShift. Customers can elect to run all Fusion services or pick and chose which ones they want for their environment. IBM Fusion supports an API, an API backed GUI, and CLI for its storage & data management as well as REST access. Fusion is fully compatible with Red Hat Ansible.

IBM Fusion is intended to be storage agnostic. Which means it can support its data management services for any NFS file storage as well as anyone’s S3 compatible, object storage.

Now that Red Hat software defined CEPH and ODF are under IBM product management, CEPH and ODF options will become available under Fusion. And CEPH offers block as well as file and object. We’ve talked about CEPH before, packaged in a hardware appliance, see our SoftIron podcast.

One intriguing part of the Fusion solution is its global data access. With global access, any OpenShift application can access data from any Fusion data store, across clouds, across on prem installations, or just about anywhere OpenShift is running. Matt mentioned that compute could be on AWS OpenShift, Fusion’s data control plane could be running on prem OpenShift and the data storage could be running on Azure OpenShift. All this would be glued together by Fusion global access, so that AWS compute had access to data on Azure.

There’s some sophisticated caching magic to make global access happen seamlessly and with decent levels of performance, but customers no longer have to copy whole file systems over from one cloud to another in order to move compute or data. IBM Fusion would need to run in all those locations for global access.

Keith asked if it was directly available in the AWS marketplace. Matt said not yet but you can deploy OpenShift out of the marketplace and then deploy IBM Fusion onto that.

It took us sometime to get our heads wrapped around what Fusion has to offer and throughout it all, Keith and I had a bit of fun with Matt.

Matthew Leib, Product Marketing Manager, IBM Spectrum Fusion

Matt has spent years in IT, from Engineering, to Architecture, from PreSales to analyst work, and finally to Product Marketing at IBM.

He’s spent years trying to achieve both credibility in the space, as a podcaster, blogger, and community member.

In his spare time, he’s a dad, dog owner, and amateur guitar player..

127: Annual year end wrap up podcast with Keith, Matt & Ray

[Ray’s sorry about his audio, it will be better next time he promises, The Eds] This was supposed to be the year where we killed off COVID for good. Alas, it was not to be and it’s going to be with us for some time to come. However, this didn’t stop that technical juggernaut we call the GreyBeards on Storage podcast.

Once again we got Keith, Matt and Ray together to discuss the past year’s top 3 technology trends that would most likely impact the year(s) ahead. Given our recent podcasts, Kubernetes (K8s) storage was top of the list. To this we add AI-MLops in the enterprise and continued our discussion from last year on how Covid & WFH are remaking the world, including offices, data centers and downtowns around the world. Listen to the podcast to learn more.

K8s rulz

For some reason, we spent many of this year’s podcasts discussing K8s storage. TK8s was never meant to provide (storage) state AND as a result, any K8s data storage has had to be shoe horned in.

Moreover, why would any IT group even consider containerizing enterprise applications let alone deploy these onto K8s. The most common answers seem to be automatic scalability, cloud like automation and run-anywhere portability.

Keith chimed in with enterprise applications aren’t going anywhere and we were off. Just like the mainframe, client-server and OpenStack applications before them, enterprise apps will likely outlive most developers, continuing to run on their current platforms forever.

But any new apps will likely be born, live a long life and eventually fade away on the latest runtime environment. which is K8s.

Matt mentioned hybrid and multi-cloud as becoming the reason-d’etre for enterprise apps to migrate to containers and K8s. Further, enterprises have pressing need to move their apps to the hybrid- & multi-cloud model. AWS’s recent hiccups, notwithstanding, multi-cloud’s time has come.

Ray and Keith then discussed which is bigger, K8s container apps or enterprise “normal” (meaning virtualized/bare metal) apps. But it all comes down to how you define bigger that matters, Sheer numbers of unique applications – enterprise wins, Compute power devoted to running those apps – it’s a much more difficult race to cal/l. But even Keith had to agree that based on compute power containerized apps are inching ahead.

AI-MLops coming on strong

AI /MLops in the enterprise was up next. For me the most significant indicator for heightened interest in AI-ML was VMware announced native support for NVIDIA management and orchestration AI-MLops technologies.

Just like K8s before it and VMware’s move to Tanzu and it’s predecessors, their move to natively support NVIDIA AI tools signals that the enterprise is starting to seriously consider adding AI to their apps.

We think VMware’s crystal ball is based on

  • Cloud rolling out more and more AI and MLops technologies for enterprises to use. on their infrastructure
  • GPUs are becoming more and more pervasive in enterprise AND in cloud infrastructure
  • Data to drive training and inferencing is coming out of the woodwork like never before.

We had some discussion as to where AMD and Intel will end up in this AI trend.. Consensus is that there’s still space for CPU inferencing and “some” specialized training which is unlikely to go away. And of course AMD has their own GPUs and Intel is coming out with their own shortly.

COVID & WFH impacts the world (again)

And then there was COVID and WFH. COVID will be here for some time to come. As a result, WFH is not going away, at least not totally any time soon. And is just becoming another way to do business.

WFH works well for some things (like IT office work) and not so well for others (K-12 education). If the GreyBeards were into (non-crypto) investing, we’d be shorting office real estate. What could move into those millions of square feet (meters) of downtime office space is anyones guess. But just like the factories of old, cities and downtowns in particular can take anything and make it useable for other purposes.

That’s about it, 2021 was another “interesteing” year for infrastructure technology. It just goes to show you, “May you live in interesting times” is actually an old (Chinese) curse.

Keith Townsend, (@TheCTOadvisor)

Keith is a IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIN.

Matt Leib, (@MBLeib)

Matt Leib has been blogging in the storage space for over 10 years, with work experience both on the engineering and presales/product marketing. His blog is at Virtually Tied to My Desktop and he’s on LinkedIN.

Ray Lucchesi, (@RayLucchesi)

Ray is the host and co-founder of GreyBeardsOnStorage and is President/Founder of Silverton Consulting, and a prominent (AI/storage/systems technology) blogger at RayOnStorage.com. Signup for SCI’s free, monthly industry e-newsletter here, published continuously since 2007. Ray can also be found on LinkedIn

116: GreyBeards talk VCF on VxBlock 1000 with Martin Hayes, DMTS, Dell Technologies

Sponsored By:

This past week, we had a great talk with Martin Hayes (@hayes_martinf), Distinguished Member Technical Staff at Dell Technologies about running VMware Cloud Foundation (VCF) on VxBlock 1000 converged infrastructure (CI). It used to be that Cloud Foundation required VMware vSAN primary storage but that changed a few years ago. . When that happened, the Dell Technologies team saw it as a great opportunity to support VCF on VxBlock CI.

This is the first GreyBeards podcast for Martin, but he was extremely knowledgeable about VxBlock and Cloud Foundation technologies. He’s been a technical product manager on the VxBlock converged infrastructure at Dell Technologies for many years. He’s an expert on Cloud Foundation and he knows an awful lot more about VMware NSX-T networking than seems reasonable (good thing). In any case, Martin’s expertise covers the whole gamut of VCF services as well as VxBlock 1000 infrastructure. The podcast is a bit longer than our normal sponsored podcast but there was a lot of information to cover. Listen to the podcast to learn more.

With VCF enabling primary storage on networked storage systems, all the storage vendors in the world gave a mighty cheer. But VMware Cloud Foundation still requires the vSAN servers to run its management domain. Late in 2020, VxBlock 1000 from Dell Technologies released a new software defined version of its Advanced Management Platform (AMP) to run on vSAN Ready Nodes. AMP is VxBlock’s management platform but also runs management domains for VCF and NSX-T.

For workload domains, VxBlock 1000 offers Cisco UCS M5 rack and blade servers, that can be configured to support just about any workload needed by a data center.

Historically, VMware vSphere problems with DR weren’t as much storage replication issues as networking problems. But NSX-T and VCF seemed to have solved that problem.

And with vRealize Automation plugins and NSX-T APIs, customers can have 0 touch network provisioning which enables the use of IaaS or infrastructure as code for their data center.

VMware vVOLs are now available with Dell EMC PowerMax storage. So, now VxBlock 1000 customers can use vSphere storage policy-based management (SPBM) as well as automated vVOL replication for data on PowerMax.

VMware NSX-T implements Application Virtual Networks (AVNs) using a GENEVE overlay network, which make extensive use of encapsulation. But where there’s encapsulation, de-encapsulation must follow to access outside networks. All this (encapsulation on ingress, de-encapsulation on egress) is done through NSX-T Edge clusters.

The net result of all this is that VMware customers have more choice, i.e., now they can run VCF on HCI or CI. And with VxBlock 1000 CI, VCF customers can select a best of breed components for each level of their 3-tier infrastructure.

Martin Hayes, DMTS, Dell Technologies

Martin Hayes is a Technical Product Manager at Dell Technologies, where he develops and executes data center product strategies that incorporate virtualization, software-defined networking (SDN) and converged systems.

Previously, he served in network advisory and architect roles at Dell EMC, converged systems pioneer VCE and Irish broadband provider eircom.

112: GreyBeards annual year end wrap-up with Keith & Matt

It’s the end of the year, so time for our regular year end wrap up discussion with the GreyBeards. 2020 has been an interesting year to say the least. It started out just fine, then COVID19 showed up and threw a wrench in everyone’s plans and as the year closes, we were just starting to see some semblance of the new normal, when one of the largest security breaches in years shows up. Whew, almost glad that’s over and onto 2021.

As always the GreyBeards had a great discussion on these and other topics to highlight the year just past. The talk was wide ranging and hard to characterize but I did my best below. Listen to the podcast to learn more.

COVID19s impact on the enterprise

It will probably take some time before we learn the true, long term impacts of COVID19 on IT but one major change has to be the massive Work From Home (WFH) transition that took place overnight.

While WFH can be more productive for some, the lack of face2face interaction can be challenging for others. The fact that many of the GreyBeards have been working from home for decades now, left us a bit oblivious to how jarring this transition can be for newcomers.

There’s definitely some psychological changes that need to occur to be productive at WFH. Organization skills become even more important. Structured interactions (read conference calls, zoom/webex and other forms of communication become much more important. And then there’s security.

Turns out VMware and others have been touting VDI solutions for the past decade or so to better support remote work and at the same time providing corporate levels of security for remote work. While occasionally this doesn’t work quite as well as expected, it’s certainly much much better than having end users access corporate data without any security around that data or worse yet, the “bring your own device”. All these VDI solutions had a field day when WFH happened.

Many workers found they could be more productive at WFH, due the less distractions, no commute time and more flexible hours. What happens when COVID19 is vanquished to all these current WFHers is anyone’s guess.

We thought there might be less need for large office campuses/buildings. But there’s something to be said for more collaboration and random interactions through face2face meetings that can only occur in an office setting with workers present at the same time. Some organizations will take to this new way of work while others will try to dial WFH back to non-existent. Where your organization fits on this spectrum and why, will be telling across a number of dimensions.

The rise of ARM

There’s been a slow but steady improvement in ARM processors over the last almost half century. Nowadays it’s starting to make a place for itself in the enterprise. ARH has always been the goto microprocessor for low power solutions (like smartphones) but nowadays they are being deployed in the cloud and even the enterprise. These can be used as server processors but even outside servers, ARM cores are showing up in hardware accelerators as the brains behind SmartNICs, DPUs, SPUs, etc.

Keith made mention AWS 2nd generation Graviton 64-bit ARM processor EC2 instances. And yes there’s significant cost ( & power) savings that can be had using AWS Graviton ARM instances. So the cloud is starting to adopt them. Somewhere over the past couple of years I heard that VMware was porting ESX to work on ARM cores.

But apparently, it’s not just as simple as dropping an ARM multi-core processor into a server and recompiling your code and away you go. Applications need a certain amount of optimization to run effectively on ARM processors. And the speed up between non-optimized and optimized versions of an application running on ARM cores is significant.

As for SmartNICs and DPUs, these are data networking hardware accelerators that provide real time processing capabilities needed to keep up with higher speed networking, 100GbE and beyond. These DPUs perform deep packet inspection, data compression, encryption and other services all at wire speeds.. Yes you could devote 1 or more X86 cores to do this, but it’s much cheaper (and more effective) to do this outside the CPU core. Moreover, performing this activity at the network entry point to the server means that much of this data doesn’t have to be transferred back and forth through server memory. So not only does it save CPU core cycles but also memory size and memory & PCIe bus bandwidth. We published a recent podcast with Kevin Deierling, NVIDIA Networking discussing DPUs if you want to learn more.

Pat made mention at (virtual) VMworld their plans to port ESX to the DPU. Keith followed up on this and asked some other exec’s at VMware about this and they said VMware will more likely support DPUs as just another hardware accelerator in their cluster. In either case, CPU cycles should be freed up and this should help VMware use X86 cores more efficiently. And perhaps this will help them engage in more CPU constrained environments such as Telcom.

Then there’s computational storage. We have been watching this technology for a couple of years now and it’s seeing some success in being deployed to public cloud environments. They seem to be being used to provide outboard data compression. It’s unclear whether these systems depend on ARM processing or not but my bet is that they do. To learn more about computational storage check out these podcasts, FMS2020 wrap up with Jim Handy and our talk with Scott Shadley on NGD’s computational storage.

System security

At yearend, we are learning of a massive security breach throughout US government IT facilities. All based on what is believed to be a Russian hack to a software package that is embedded in a popular networking tool software solution, SolarWinds. They are calling this a software supply chain hack. Although we are mainly hearing about government agencies being hacked, SolarWinds is also pervasive in the enterprise as well.

There have been many hardware supply chain hacks in the past, where a board supplier used chips or logic that weren’t properly vetted. Over time, hardware suppliers have started to scrutinize their supply chains better and have reduced this risk.

And the US government have been lobbying for the industry to use a security chip with a backdoor or to supply back doors to smartphone encryption capabilities. Luckily, so far, none of these have been implemented by industry.

What Russia has shown us is that this particular hack is not limited to the hardware sphere. Software supply chain risk can’t be ignored anymore.

This means that any software application supplier will need to secure their supply chain or bring it all in house. Which may mean that costs for these packages will go up. It’s possible that using a pure open source supply chain may reduce this risk as well. At least that’s the promise of open source.

We said 2020 was an interesting year and it’s going out with a bang.

Matt Leib (@MBLeib), one of our co-hosts, has been blogging in the storage space for over 10 years, with work experience both on the engineering and presales/product marketing.. His blog is at Virtually Tied to My Desktop and he’s on LinkedIN.

Keith Townsend (@CTOAdvisor) is a IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIN.

109: GreyBeards talk SmartNICs & DPUs with Kevin Deierling, Head of Marketing at NVIDIA Networking

We decided to take a short break (of sorts) from storage to talk about something equally important to the enterprise, networking. At (virtual) VMworld a month or so ago, Pat made mention of developing support for SmartNIC-DPUs and even porting vSphere to run on top of a DPU. So we thought it best to go to the source of this technology and talk with Kevin Deierling (TechSeerKD), Head of Marketing at NVIDIA Networking who are the ones supplying these SmartNICs to VMware and others in the industry.

Kevin is always a pleasure to talk with and comes with a wealth of expertise and understanding of the technology underlying data centers today. The GreyBeards found our discussion to be very educational on what a SmartNIC or DPU can do and why VMware and others would be driving to rapidly adopt the technology. Listen to the podcast to learn more.

NVIDIA’s recent acquisition of Mellanox brought them Mellanox’s NIC, switch and router technology. And while Mellanox, and now NVIDIA have some pretty impressive switches and routers, what interested the GreyBeards was their SmartNIC technology.

Essentially, SmartNICS provide acceleration and offload of data handling needs required to move data around an enterprise network. These offload services include at a minimum, encryption/decryption, packet pacing (delivering gadzillion video streams at the right speed to insure proper playback by all), compression, firewalls, NVMeoF/RoCE, TCP/IP, GPU direct storage (GDS) transfers, VLAN micro-segmentation, scaling, and anything else that requires real time processing to perform at line speeds.

For those who haven’t heard of it, GDS transfers data from storage directly into GPU memory and from GPU memory directly to storage without any CPU cycles or server memory involvement, other than to set up the transfer. This extends NVMeoF RDMA tech to/from storage and server memory, to GPUs. That is, GDS offers a RDMA like path between storage and GPU memory. GPU to/from server memory direct interface already exists over the PCIe bus.

But even with all the offloads and accelerators above, they can also offer an additional a secure enclave outside the TPM in the CPU, to better isolate security sensitive functionality for a data center. (See DPU below).

Kevin mentioned multiple times that the new unit of computation is no longer a server but rather is now a data center. When you have public cloud, private cloud and other systems that all serve up virtual CPUs, NICs, GPUs and storage, what’s really being supplied to a user is a virtual data center. Cloud providers can carve up their hardware and serve it to you any way you want or need it. Virtual data centers can provide a multitude of VMs and any infrastructure that customers need to use to run their workloads.

Kevin mentioned by using SmartNics, IT or cloud providers can return 30% of the processor cycles (that were being spent doing networking work on CPUs) back to workloads that run on CPUs. Any data center can effectively obtain 30% more CPU cycles and increased networking speed and performance just by deploying SmartNICs throughout all the servers in their environment.

SmartNICs are an outgrowth of Mellanox technology embedded in their HPC InfiniBAND and high end Ethernet switches/routers. Mellanox had been well known for their support of NVMeoF/RoCE to supply high IOPs/low-latency IO activity for NVMe storage over Ethernet and before that their InfiniBAND RDMA technologies.

As Mellanox came out with their 2nd Gen SmartNIC they began to call their solution a “DPU” (data processing unit), which they see forming part of a “holy trinity” underpinning the new data center which has CPUs, GPUs and now DPUs. But a DPU is more than just a SmartNIC.

All NVIDIA SmartNICs and DPUs are based on Mellanox’s BlueField cards and chip technology. Their DPU uses BlueField2 (gen 2 technology) chips, which has a multi-core ARM engine inside of it and memory which can be used to perform computational processing in addition to the onboard offload/acceleration capabilities.

Besides adding VMware support for SmartNICs, PatG also mentioned that they were porting vSphere (ESX) to run on top of NVIDIA Networking DPUs. This would move the core VMware’s hypervisor functionality from running on CPUs, to running on DPUs. This of course would free up most if not all VMware Hypervisor CPU cycles for use by customer workloads.

During our discussion with Kevin, we talked a lot about the coming of AI-ML-DL workloads, which will require ever more bandwidth, ever lower latencies and ever more compute power. NVIDIA was a significant early enabler of the AI-ML-DL with their CUDA API that allowed a GPU to be used to perform DL network training and inferencing. As such, CUDA became an industry wide phenomenon allowing industry wide GPUs to be used as DL compute engines.

NVIDIA plans to do the same with their SmartNICs and DPUs. NVIDIA Networking is releasing the DOCA (Data center On a Chip Architecture) SDK and API. DOCA provides the API to use the BlueField2 chips and cards which are the central techonology behind their DPU. They have also announced a roadmap to continue enhancing DOCA, as they have done with CUDA, over the foreseeable future, to add more bandwidth, speed and functionality to DPUs.

It turns out the real problem which forced Mellanox and now NVIDIA to create SmartNics was the need to support the extremely low latencies required for NVMeoF and GDS IO.

It wasn’t clear that the public cloud providers were using SmartNICS but Kevin said it’s been sort of a widely known secret that they have been using the tech. The public clouds (AWS, Azure, Alibaba) have been deploying SmartNICS in their environments for some time now. Always on the lookout for any technology that frees up compute resources to be deployed for cloud users, it appears that public cloud providers were early adopters of SmartNICS.

Kevin Deierling, Head of Marketing NVIDIA Networking

Kevin is an entrepreneur, innovator, and technology executive with a proven track record of creating profitable businesses in highly competitive markets.

Kevin has been a founder or senior executive at five startups that have achieved positive outcomes (3 IPOs, 2 acquisitions). Combining both technical and business expertise, he has variously served as the chief officer of technology, architecture, and marketing of these companies where he led the development of strategy and products across a broad range of disciplines including: networking, security, cloud, Big Data, machine learning, virtualization, storage, smart energy, bio-sensors, and DNA sequencing.


Kevin has over 25 patents in the fields of networking, wireless, security, error correction, video compression, smart energy, bio-electronics, and DNA sequencing technologies.

When not driving new technology, he finds time for fly-fishing, cycling, bee keeping, & organic farming.

This image has an empty alt attribute; its file name is Subscribe_on_iTunes_Badge_US-UK_110x40_0824.png
This image has an empty alt attribute; its file name is play_prism_hlock_2x-300x64.png
This image has an empty alt attribute; its file name is Spotify_Logo_CMYK_Black-1024x307.png