148: GreyBeards talk software defined infrastructure with Anthony Cinelli and Brian Dean, Dell PowerFlex

Sponsored By:

This is one of a series of podcasts the GreyBeards are doing with Dell PowerFlex software defined infrastructure. Today, we talked with Anthony Cinelli, Sr. Director Dell Technologies and Brian Dean, Technical Marketing for PowerFlex. We have talked with Brian before but this is the first time we’ve met Anthony. They were both very knowledgeable about PowerFlex and the challenges large enterprises have today with their storage environments.

The key to PowerFlex’s software defined solution is its extreme flexibility, which comes mainly from its architecture which offers scale-out deployment options ranging from HCI solutions to a fully disaggregated compute-storage environment, in seemingly any combination (see technical resources for more info). With this sophistication, PowerFlex can help consolidate enterprise storage across just about any environment from virtualized workloads, to standalone databases, big data analytics, as well as containerized environments and of course, the cloud. Listen to the podcast to learn more.

To support this extreme flexibility, PowerFlex uses both client and storage software that can be configured together on a server (HCI) or apart, across compute and storage nodes to offer block storage. PowerFlex client software runs on any modern bare-metal or virtualized environment.

Anthony mentioned that one common problem for enterprises today is storage sprawl. Most large customers have an IT environment with sizable hypervisor based workloads, a dedicated database workload, a big data/analytics workload, a modern container based workload stack, an AI/ML/DL workload and more often than not, a vertical specific workload.

Each workload usually has its own storage system. And the problem with 4-7 different storage systems is cost, e.g., the cost of underutilized storage. Typically in these environments, each storage system might run at, say, 60% utilization on average, but this will vary a lot between silos, leading to stranded capacity.
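To make the stranded capacity point concrete, here is a back-of-the-envelope sketch in Python. The silo names, capacities and utilization figures are invented for illustration and did not come from the podcast; only the roughly 60% average echoes the discussion. The 80% target for a consolidated pool is likewise our assumption.

```python
# Hypothetical silos: (name, raw capacity in TB, fraction in use).
# Every figure here is made up for illustration.
SILOS = [
    ("hypervisor", 500, 0.80),
    ("database", 200, 0.45),
    ("analytics", 800, 0.65),
    ("containers", 100, 0.30),
    ("ai_ml", 400, 0.50),
]


def stranded_capacity(silos):
    """Return (total_tb, used_tb, free_tb, average_utilization)."""
    total = sum(cap for _, cap, _ in silos)
    used = sum(cap * util for _, cap, util in silos)
    return total, used, total - used, used / total


def consolidated_need(silos, target_utilization=0.80):
    """Capacity a single shared pool would need to hold the same data
    at an assumed (hypothetical) target utilization."""
    _, used, _, _ = stranded_capacity(silos)
    return used / target_utilization


if __name__ == "__main__":
    total, used, free, avg = stranded_capacity(SILOS)
    print(f"total={total} TB used={used:.0f} TB free={free:.0f} TB avg={avg:.0%}")
    print(f"one pool at 80% needs about {consolidated_need(SILOS):.0f} TB")
```

The free space in this model is spread across five systems that can't lend capacity to one another, which is what makes it stranded rather than merely unused.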

The main reason customers haven’t consolidated yet is because each silo has different performance characteristics. As a result, they end up purchasing excess capacity which increases cost and complexity, as a standard part of doing business.

To consolidate storage across these disparate environments requires a no-holds-barred approach to IO performance, second to none, which PowerFlex can deliver. The secret to its high levels of IO performance is RAID 10, deployed across a scale-out cluster. And PowerFlex clusters can range from 4 to 1000 or more nodes.

RAID 10 mirrors data and spreads mirrored data across all drives and servers in a cluster or some subset. As a result, as you add storage nodes, IO performance scales up, almost linearly.

Yes, there can be other bottlenecks in clusters like this, most often networking, but with PowerFlex storage, IO need not be one of them. Anthony mentioned that PowerFlex will perform as fast as your infrastructure will support. So if your environment has 25 Gig Ethernet, it will perform IO at that speed; if you use 100 Gig Ethernet, it will perform at that speed.

In addition, PowerFlex offers automated LifeCycle Management (LCM), which can make having a 1000 node PowerFlex cluster almost as easy as a 10 node cluster. However, to make use of this automated LCM, one must run its storage server software on Dell PowerEdge servers.

Brian said adding or decommissioning PowerFlex nodes is a painless process. Because data is always mirrored, customers can remove any node, at any time and PowerFlex will automatically rebuild data across other nodes and drives. When you add nodes, those drives become immediately available to support more IO activity. Another item to note, because of RAID 10, PowerFlex mirror rebuilds happen very fast, as just about every other drive and node in the cluster (or subset) participates in the rebuild process.
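A toy model helps show why rebuilds speed up as a cluster grows. The Python sketch below is not PowerFlex's placement algorithm, which wasn't described in detail on the podcast. It simply assumes each chunk's two copies land on two different, randomly chosen nodes, and then counts which surviving nodes would supply data after a node failure. The function names are ours.

```python
import random
from collections import Counter


def place_mirrors(num_nodes, num_chunks, seed=0):
    """Assign each chunk a (primary, mirror) pair of distinct nodes.

    Random placement is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    return [tuple(rng.sample(range(num_nodes), 2)) for _ in range(num_chunks)]


def rebuild_sources(layout, failed_node):
    """Count, per surviving node, how many chunks it must re-copy
    because the chunk's other copy lived on the failed node."""
    sources = Counter()
    for primary, mirror in layout:
        if primary == failed_node:
            sources[mirror] += 1
        elif mirror == failed_node:
            sources[primary] += 1
    return sources


if __name__ == "__main__":
    layout = place_mirrors(num_nodes=10, num_chunks=10_000)
    sources = rebuild_sources(layout, failed_node=3)
    print(f"{len(sources)} surviving nodes share {sum(sources.values())} chunk copies")
```

In this model the rebuild work is spread over essentially every surviving node, so each one only has to re-copy a small share of the lost data. Adding nodes shrinks that share further.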

PowerFlex supports Storage Pools. This partitions PowerFlex storage nodes and devices into multiple pools of storage used to host volume IO and data. Storage pools can be used to segregate higher performing storage nodes from lower performing ones so that some volumes can exclusively reside on higher (or lower) performing hardware.

Although customers can configure PowerFlex to use all nodes and drives in a system or storage pool for volume data mirroring, PowerFlex offers other data placement alternatives to support high availability.

PowerFlex supports Protection Domains which are subsets or collections of storage servers and drives in a cluster where volume data will reside. This will allow one protection domain to go down while others continue to operate. Realize that because volume data is mirrored across all devices in a protection domain, it will take lots of nodes or devices to go down before a protection domain is out of action.

PowerFlex also uses Fault Sets, which are a collection of storage servers and their devices within a Protection Domain, that will contain one half of a volume’s data mirror. PowerFlex will ensure that a primary and its mirror copy of a volume’s data will not both reside on the same fault set. A fault set could be a rack of servers, multiple racks, all PowerFlex storage servers in an AZ, etc. With fault sets, customer data will always reside across a minimum of two fault sets, and if any one goes down, data is still available.
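The fault set rule can be sketched in a few lines of Python. This is our own illustration of the constraint described above (primary and mirror never share a fault set), using a simple round-robin walk. It is not how PowerFlex actually chooses locations, and the rack and node names are hypothetical.

```python
def place_with_fault_sets(nodes_by_fault_set, num_chunks):
    """Round-robin placement where a chunk's primary and mirror
    always land in different fault sets.

    nodes_by_fault_set: dict mapping fault set name -> list of node names.
    Returns a list of ((fault_set, node), (fault_set, node)) pairs.
    """
    populated = [fs for fs, members in nodes_by_fault_set.items() if members]
    if len(populated) < 2:
        raise ValueError("need at least two populated fault sets")
    nodes = [(fs, n)
             for fs, members in sorted(nodes_by_fault_set.items())
             for n in members]
    layout = []
    for chunk in range(num_chunks):
        primary = nodes[chunk % len(nodes)]
        # Walk forward until we reach a node in a different fault set.
        offset = 1
        while nodes[(chunk + offset) % len(nodes)][0] == primary[0]:
            offset += 1
        mirror = nodes[(chunk + offset) % len(nodes)]
        layout.append((primary, mirror))
    return layout


def survives_fault_set_loss(layout, lost_fault_set):
    """True if every chunk keeps at least one copy outside the lost fault set."""
    return all(p[0] != lost_fault_set or m[0] != lost_fault_set
               for p, m in layout)


if __name__ == "__main__":
    racks = {"rack-a": ["n1", "n2"], "rack-b": ["n3", "n4"], "rack-c": ["n5"]}
    layout = place_with_fault_sets(racks, num_chunks=20)
    print(all(survives_fault_set_loss(layout, rack) for rack in racks))
```

Because no chunk ever has both copies in the same fault set, losing any single rack in this model leaves one copy of every chunk reachable.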

PowerFlex also operates in the cloud. In this case, customers bring their own PowerFlex software and deploy it over cloud compute and storage.

Brian mentioned that anything PowerFlex can do, such as reconfiguring servers, can be done through RESTful API calls. This can be particularly useful in cloud deployments as above, if customers want to scale IO performance up or down automatically.

Besides block services, PowerFlex also offers NFS/CIFS-SMB native file services using a File Node Controller. This frontends PowerFlex storage nodes to support customer NFS/SMB file access to PowerFlex data.

Anthony Cinelli, Sr. Director Global PowerFlex Software Defined & MultiCloud Solutions

Anthony Cinelli is a key leader for Dell Technologies helping drive the success of our software defined and multicloud solutions portfolio across the customer landscape. Anthony has been with Dell for 13 years and in that time has helped launch our HCI and Software Defined businesses from startup to the multi-billion dollar lines of business they now represent for Dell.

Anthony has a wealth of experience helping some of the largest organizations in the world achieve their IT transformation and multicloud initiatives through the use of software defined technologies.

Brian Dean, Dell PowerFlex Technical Marketing

Brian is a 16+ year veteran of the technology industry, and before that spent a decade in higher education. Brian has worked at EMC and Dell for 7 years, first as Solutions Architect and then as TME, focusing primarily on PowerFlex and software-defined storage ecosystems.

Prior to joining EMC, Brian was on the consumer/buyer side of large storage systems, directing operations for two Internet-based digital video surveillance startups.

When he’s not wrestling with computer systems, he might be found hiking and climbing in the mountains of North Carolina.

146: GreyBeards talk K8s cloud storage with Brian Carmody, Field CTO, Volumez

We’ve known Brian Carmody (@initzero), Field CTO, Volumez for over a decade now and he’s always been very technically astute. He moved to Volumez earlier this year and has once again joined a storage startup. Volumez is a cloud K8s storage provider with a new twist, K8s persistent volumes hosted on ephemeral storage.

Volumez currently works in public clouds (AWS & Azure (soft launch), with GCP coming soon) and is all about supplying high performing, enterprise class data services to K8s container apps, but doing this using transient (Azure ephemeral & AWS instance) storage and standard Linux. Hyperscalers offer transient storage as almost an afterthought with customer compute instances. Listen to the podcast to learn more.

It turns out that over the last decade or so, there has been a lot of time and effort devoted to maturing Linux’s storage stack and nowadays, with appropriate configuration, Linux can offer enterprise class data services and performance using direct attached NVMe SSDs. These services include thin provisioning, encryption, RAID/erasure coding, snapshots, etc., which on top of NVMe SSDs, provide IOPS, bandwidth and latency performance that boggles the mind.

However, configuring Linux for sophisticated, high performing data services is a hard problem to solve.

Enter Volumez. They have a SaaS control plane, client software plus CSI drivers that will configure Linux with ephemeral storage to support any performance and data service that can be obtained from NVMe SSDs.

Once installed on your K8s cluster, Volumez software profiles all ephemeral storage, and supplies that information to their SaaS control plane. Once that’s done your platform engineers can define specific storage class policies or profiles usable by DevOps to consume ephemeral storage.

These policies identify volume [IOPs, Bandwidth, Latency] X [read, write] performance specifications as well as data protection, resiliency and other data service requirements. DevOps engineers consume this storage using PVCs that call for these storage classes at some capacity. When it sees the PVC claim, Volumez SaaS control plane will carve out slices of ephemeral storage that can support the performance and other storage requirements defined in the storage class.
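The general shape of that policy-to-slices step can be sketched as below. To be clear, this is our own simplified guess at the idea, not Volumez's algorithm: the `Device`, `Policy` and `carve` names, the single IOPS-per-GiB requirement, and the greedy "most free IOPS first" choice are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """A profiled ephemeral device (hypothetical model)."""
    name: str
    free_gib: int
    free_iops: int


@dataclass
class Policy:
    """A storage class reduced to one requirement (hypothetical)."""
    name: str
    iops_per_gib: int


def carve(devices, policy, size_gib, slices=2):
    """Split a claim into equal slices on the devices with the most free IOPS.

    Returns [(device name, GiB, IOPS), ...] and deducts what was handed out.
    Raises RuntimeError if too few devices can satisfy the policy.
    """
    per_slice_gib = -(-size_gib // slices)  # ceiling division
    per_slice_iops = per_slice_gib * policy.iops_per_gib
    candidates = sorted(
        (d for d in devices
         if d.free_gib >= per_slice_gib and d.free_iops >= per_slice_iops),
        key=lambda d: d.free_iops,
        reverse=True,
    )
    if len(candidates) < slices:
        raise RuntimeError(f"cannot satisfy policy {policy.name!r}")
    chosen = candidates[:slices]
    for d in chosen:
        d.free_gib -= per_slice_gib
        d.free_iops -= per_slice_iops
    return [(d.name, per_slice_gib, per_slice_iops) for d in chosen]


if __name__ == "__main__":
    devices = [Device("a", 1000, 100_000), Device("b", 1000, 50_000),
               Device("c", 100, 200_000)]
    print(carve(devices, Policy("fast", 100), size_gib=400))
```

A real control plane would also weigh bandwidth, latency, read versus write targets and resiliency, as the policy description above makes clear. The sketch keeps only capacity and IOPS to show the bookkeeping.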

Once that’s done, their control plane next creates a network path from the compute instances with ephemeral storage to the worker nodes running container apps. After that it steps out of the picture and the container apps have a direct (network) data path to the storage they requested. Note, Volumez’s SaaS control plane is not in the container app storage data path at all.

Volumez supports multi-AZ data resiliency for PVCs. In this case, another mirror K8s cluster would reside in another AZ, with Volumez software active and similar if not equivalent ephemeral storage. Volumez will configure the container volume to mirror data between AZs. Similarly, if the policy requests erasure coding, Volumez SaaS software configures the ephemeral storage to provide erasure coding for that container volume.

Brian said they’ve done some amazing work to increase the speed of Linux snapshotting and restoring.

As noted above, the Volumez control plane SaaS software is outside the data path, so even if the K8s cluster running Volumez enabled storage loses access to the control plane, container apps continue to run and perform IO to their storage. This can continue until there’s a new PVC request that requires access to their control plane.

Ephemeral storage is accessed through special compute instances. These are not K8s worker nodes and they essentially act as a passthru or network attachment between worker nodes running apps with PVCs and the Volumez configured Linux Logical Volumes hosted on slices of ephemeral storage.

Volumez is gaining customer traction with data platform clients, DBaaS companies, and some HPC environments. But just about anyone needing high performing data services for cloud K8s container apps should give Volumez a try.

I looked at AWS to see how they price instance store capacities and found out it’s not priced separately, but rather instance storage is bundled into the cost of EC2 compute instances.

Volumez is priced based on the number of media devices (instance/ephemeral stores) and performance (IOPs) available. They also have different tiers depending on support level requirements (e.g., community, Business hrs, 7X24) which also offers different levels of enterprise security functionality.

Brian said they have a free tier that customers can easily sign up for and try out by going to their web site (see link above), or if you would like a guided demo, just contact him directly.

Brian Carmody, Field CTO, Volumez

Brian Carmody is Field CTO at Volumez. Prior to joining Volumez, he served as Chief Technology Officer of data storage company Infinidat where he drove the company’s technology vision and strategy as it ramped from pre-revenue to market leadership.

Before joining Infinidat, Brian worked in the Systems and Technology Group at IBM where he held senior roles in product management and solutions engineering focusing on distributed storage system technologies.

Prior to IBM, Brian served as a technology executive at MTV Networks Viacom, and at Novus Consulting Group as a Principal in the Media & Entertainment and Banking practices.

142: GreyBeards talk scale-out, software defined storage with Bjorn Kolbeck, Co-Founder & CEO, Quobyte

Software defined storage is a pretty full segment of the market these days. So, it’s surprising when a new entrant comes along. We saw a story on Quobyte in Blocks and Files and thought it would be great to talk with Bjorn Kolbeck (LinkedIn), Co-Founder & CEO, Quobyte. Bjorn got his PhD in scale out storage and went to work at Google on anything but storage. While there, he was amazed by Google’s vast infrastructure being managed by only a few people and thought this could be commercialized, so Quobyte was born. Listen to the podcast to learn more.

Quobyte is a scale out file and object storage system with mirrored metadata and data which is 3-way mirrored or erasure coded (EC). Minimum cluster is 4 nodes (fault tolerant for a single node failure). Quobyte has current customers with ~250 nodes and ~20K clients accessing a storage cluster.

Although they support NFSv3 and NFSv4 for file (and object) access, their solution is typically deployed using host client and storage services software accessing the files with Posix or objects via S3. Objects can also be accessed as files within the file system directories.

Host client software runs on Linux, Mac or Windows machines. Storage server software runs on Linux systems bare metal or under VMs in user space. Quobyte also supports containerized storage server software for K8s but their bare metal/VM storage server software option doesn’t require containers.

Quobyte is also available in the GCP marketplace and can run in AWS, Azure and Oracle Cloud.

Their metadata service is a mirrored key-value store distributed across any number of (customer configured, I believe) storage nodes. Metadata resides on flash and distribution is designed to eliminate the metadata service as a performance bottleneck.

Their data services support (any number of) storage tiers. Storage policies determine how tiering is used for files, directories, objects, etc. For example, with 3 tiers (NVMe Flash, SSD, and disk), file data could be first landed on NVMe Flash, but as it grows, it gets moved off to SSD, and as it grows even more, it’s moved to disk. This could also be triggered using time since last access.

Bjorn said anything in file system metadata could be used to trigger data movement across tiers. Each tier could be defined with different data protection policies, like mirroring or EC 8+3.
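A policy like the 3-tier example can be sketched as a small Python function. This is our illustration only, not Quobyte's policy engine or syntax: the size and idle-time thresholds are invented, and the pairing of 3-way mirroring with the NVMe tier and EC 8+3 with the other two is an assumption, chosen to show how protection overhead differs by tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    data_units: int    # fragments holding data (1 for mirroring)
    total_units: int   # data plus redundancy (3 for a 3-way mirror)

    @property
    def raw_per_usable(self):
        """Raw capacity consumed per unit of user data."""
        return self.total_units / self.data_units


# Hypothetical tier definitions.
NVME = Tier("nvme", 1, 3)    # 3-way mirror
SSD = Tier("ssd", 8, 11)     # EC 8+3
DISK = Tier("disk", 8, 11)   # EC 8+3


def pick_tier(size_bytes, idle_days,
              small=64 * 2**20, large=10 * 2**30, cold_days=30):
    """Choose a tier from file size and time since last access.

    All thresholds are made-up defaults for illustration.
    """
    if idle_days >= cold_days or size_bytes >= large:
        return DISK
    if size_bytes >= small:
        return SSD
    return NVME


if __name__ == "__main__":
    for size, idle in [(2**20, 0), (2**30, 1), (20 * 2**30, 0), (1024, 45)]:
        tier = pick_tier(size, idle)
        print(size, idle, tier.name, tier.raw_per_usable)
```

The overhead figures are plain arithmetic: a 3-way mirror stores 3 bytes for every byte of user data, while EC 8+3 stores 11 fragments for every 8 of data, or 1.375 bytes per byte.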

Backend storage is split up into Volumes. They also support thinly provisioned volumes for file creation.

It’s unclear how tiering and thin provisioning apply to objects, which have much richer metadata options, but as they can be mapped to files, we suppose that anything in the object file metadata could conceivably be used to trigger tiering as a bare minimum.

As for security:

  1. Quobyte supports end to end data encryption. This is done once and the customer owns the keys. They do support external key servers. I believe this is another option that is enabled by file based policy management. It seems like different files can have different keys to encrypt them.
  2. Quobyte supports TLS. Depending on customer requirements, data may go across open networks and this is where TLS could very well be used. And Quobyte supports X.509 certificates for user, device and system authentication.
  3. Quobyte supports file access controls. They support a subset of Windows capabilities but have full support for Linux and Mac access controls.

Quobyte also supports two forms of cluster to cluster replication. One is event driven, where an event occurrence (e.g., file close) signals data replication, and another is time driven (e.g., every 5 minutes), but both are asynchronous.

Quobyte was designed from the start to be completely API driven. But they do support a CLI and a GUI for those customers that want them.

They have a Free (forever) edition, a downloadable version of the software without 24/7 support and minus some enterprise capabilities (think encryption). This is gated at 150TB disk/30TB flash with limited number of clients and volumes.

The Infrastructure edition is their full featured solution with 7/24 enterprise support. It comes with a yearly service fee, priced by capacity with volume discounts.

Bjorn Kolbeck, Co-Founder & CEO, Quobyte

Bjorn Kolbeck, Co-Founder and CEO of Quobyte attended the Technical University of Berlin and Humboldt University of Berlin.

His PhD thesis dealt with fault-tolerant replication, but he gained several years’ experience in distributed and storage systems while developing the distributed research file system XtreemFS at the Zuse Institute Berlin.

He then spent time at Google working as a Software Engineer before he and fellow Co-Founder Felix Hupfeld decided to combine the innovative research from XtreemFS and the operations experience from Google to build a highly reliable and scalable enterprise-grade storage system now known as Quobyte.

141: GreyBeards annual 2022 wrap-up podcast

Well, it has been another year and it’s time for our annual year end wrap up. Since Covid hit, every year has certainly been interesting. This year we have seen the return of in-person conferences, which was a welcome change from the Covid lockdown. We are very glad to start seeing everybody again.

From the tech standpoint, the big news this year was CXL. As everyone should recall, CXL is a new-ish PCIe hardware and protocol that supports larger memory sitting out on a PCIe bus and in the future shared memory between servers. All this is to enable a new wave of memory based computing. We spent probably half our time discussing CXL and its impact on IT.

The other major topic was the Cloud Native ecosystem. In the past all we talked about was K8s but nowadays the ecosystem that surrounds it is almost as important as K8s itself. The final topic was a bit of a shock earlier this year, and yes, it was Broadcom’s acquisition of VMware. Jason and I spent our Explore podcast talking about it (see our 137: VMware Explore wrap-up). Keith has high hopes that the EU will shut it down but the jury’s still out on that one. Listen to the podcast to learn more.

As for CXL, it turns out that AMD has just released full support for CXL hardware and protocols with their latest round of CPU chips. But the new AMD CPUs only support DDR5 memory (something about there’s only so much logic one can fit on a chip…), which means all those DDR4 DIMMs out in the wild need somewhere to land. CXL could supply a new lease on life for DDR4 DIMMs.

And it’s not just about shared memory or increased memory sizes; CXL can also provide a tiered memory hierarchy, with gobs of flash behind memory DIMMs (see: 136: FMS2022 wrap up …). So, now it’s no longer a TB or ten of server memory but potentially 100s of TBs. What this means for SAP HANA, AWS Aurora and other heavy-memory solutions has yet to play out.

Cloud Native won. We see this in the increasing adoption of containers and K8s in the enterprise, cloud and just about anywhere IT happens these days. But the ecosystem surrounding K8s is chaos.

Over time, many of these ecosystem solutions will die off, be purchased, or consolidated but in the meantime, it’s entirely too confusing. Red Hat’s OpenShift is one answer and VMware’s Tanzu is another. And of course all the clouds have their own K8s packaged solution. But just to cover their bets, everyone also supports native K8s and just about every software package that works with it. So, K8s’s ecosystem is in a state of flux and may take time to become a stable set of tools usable by enterprise IT.

Finally, Broadcom’s acquisition of VMware has everyone up in arms. Customers are concerned the R&D juggernaut that VMware has been, since its very beginning, will be jettisoned in favor of profits. And HCI vendors that always felt Dell EMC had an unfair advantage will all look at Broadcom in a similar light.

Keith says there’s a major difference in how USA regulators view an acquisition and how EU regulators view one. According to Keith, EU regulators view acquisitions on how they help or hurt the customer. USA regulators view acquisitions on how they help or hurt the competition. We will have to wait and see how this all plays out for Broadcom-VMware.

On the other hand, speaking of competition, Nutanix seems to be feeling the heat as well. Rumors are it’s up for sale. Who will want it and how the regulators view both of these acquisitions may be an interesting story for 2023.

2023 looks to be another year of transition for enterprise IT. The cloud players all seem to be coming around to the view that they can’t be all things to all (IT) people. And the enterprise vendors are finally seeing some modicum of staying power in the face of a relentless push to the cloud. How this plays out over the next few years will be of major interest to everybody.

Happy New Year from the GreyBeards!

Keith Townsend, The CTO Advisor

Keith Townsend (@CTOAdvisor) is an IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIn.

Jason Collier, Principal Member of Technical Staff, AMD

Jason Collier (@bocanuts) is a long time friend, technical guru and innovator who has over 25 years of experience as a serial entrepreneur in technology. He was founder and CTO of Scale Computing and has been an innovator in the field of hyperconvergence and an expert in virtualization, data storage, networking, cloud computing, data centers, and edge computing for years. He’s on LinkedIn.

140: GreyBeards talk data orchestration with Matt Leib, Product Marketing Manager for IBM Spectrum Fusion

As our listeners should know, Matt Leib (@MBleib) was a GreyBeards co-host. But since then, Matt has joined IBM to become Product Marketing Manager on IBM Spectrum Fusion, a data orchestration solution for Red Hat OpenShift environments. Matt’s been in and around the storage and data management industry for many years, which is why we tapped him for GreyBeards co-host duties.

IBM Fusion, in its previous incarnation, came as an OpenShift software defined storage or as an OpenShift (H)CI solution. But recently, Fusion has taken on more of a data orchestration role for OpenShift stateful containerized applications. Listen to the podcast to learn more.

Fusion can run in any OpenShift deployment, whether in the cloud (currently AWS, Azure, & IBM), under VMware (wherever it runs), or on (x86 or IBM Z) bare metal. It supplies NFS file or S3 compatible object storage for container applications running under OpenShift. But it does more than just storage.

Beyond storage, Fusion includes backup/recovery, site to site DR and global (file & object) data access. It’s almost like someone opened up the IBM Spectrum software pantry and took out the best available functionality and cooked it up into an OpenShift solution. IBM’s Spectrum Fusion current website (linked to above (Dec.’22)) still refers only to the software defined storage and (H)CI solution, but today’s Fusion includes all of the functions identified above.

All Fusion facilities run as containers under OpenShift. Customers can elect to run all Fusion services or pick and choose which ones they want for their environment. IBM Fusion supports an API, an API backed GUI, and CLI for its storage & data management as well as REST access. Fusion is fully compatible with Red Hat Ansible.

IBM Fusion is intended to be storage agnostic, which means it can provide its data management services for any NFS file storage as well as anyone’s S3 compatible object storage.

Now that Red Hat software defined CEPH and ODF are under IBM product management, CEPH and ODF options will become available under Fusion. And CEPH offers block as well as file and object. We’ve talked about CEPH before, packaged in a hardware appliance, see our SoftIron podcast.

One intriguing part of the Fusion solution is its global data access. With global access, any OpenShift application can access data from any Fusion data store, across clouds, across on prem installations, or just about anywhere OpenShift is running. Matt mentioned that compute could be on AWS OpenShift, Fusion’s data control plane could be running on prem OpenShift and the data storage could be running on Azure OpenShift. All this would be glued together by Fusion global access, so that AWS compute had access to data on Azure.

There’s some sophisticated caching magic to make global access happen seamlessly and with decent levels of performance, but customers no longer have to copy whole file systems over from one cloud to another in order to move compute or data. IBM Fusion would need to run in all those locations for global access.

Keith asked if it was directly available in the AWS marketplace. Matt said not yet but you can deploy OpenShift out of the marketplace and then deploy IBM Fusion onto that.

It took us some time to get our heads wrapped around what Fusion has to offer and throughout it all, Keith and I had a bit of fun with Matt.

Matthew Leib, Product Marketing Manager, IBM Spectrum Fusion

Matt has spent years in IT, from Engineering, to Architecture, from PreSales to analyst work, and finally to Product Marketing at IBM.

He’s spent years trying to achieve credibility in the space as a podcaster, blogger, and community member.

In his spare time, he’s a dad, dog owner, and amateur guitar player.