148: GreyBeards talk software defined infrastructure with Anthony Cinelli and Brian Dean, Dell PowerFlex

Sponsored By:

This is one of a series of podcasts the GreyBeards are doing with Dell PowerFlex software defined infrastructure. Today, we talked with Anthony Cinelli, Sr. Director Dell Technologies and Brian Dean, Technical Marketing for PowerFlex. We have talked with Brian before but this is the first time we’ve met Anthony. They were both very knowledgeable about PowerFlex and the challenges large enterprises have today with their storage environments.

The key to PowerFlex’s software defined solution is its extreme flexibility, which comes mainly from an architecture offering scale-out deployment options ranging from HCI solutions to a fully disaggregated compute-storage environment, in seemingly any combination (see technical resources for more info). With this sophistication, PowerFlex can help consolidate enterprise storage across just about any environment, from virtualized workloads to standalone databases, big data analytics, containerized environments and, of course, the cloud. Listen to the podcast to learn more.

To support this extreme flexibility, PowerFlex uses both client and storage software that can be configured together on a server (HCI) or apart, across compute and storage nodes to offer block storage. PowerFlex client software runs on any modern bare-metal or virtualized environment.

Anthony mentioned that one common problem for enterprises today is storage sprawl. Most large customers have an IT environment with sizable hypervisor-based workloads, a dedicated database workload, a big data/analytics workload, a modern container-based workload stack, an AI/ML/DL workload and, more often than not, a vertical-specific workload.

Each workload usually has its own storage system. And the problem with 4-7 different storage systems is cost, e.g., the cost of underutilized storage. In these environments, each storage system might average, say, 60% utilization, but this varies a lot between silos, leading to stranded capacity.

The main reason customers haven’t consolidated yet is that each silo has different performance characteristics. As a result, they end up purchasing excess capacity, which increases cost and complexity, as a standard part of doing business.

To consolidate storage across these disparate environments requires a no-holds-barred, second-to-none approach to IO performance, which PowerFlex can deliver. The secret to its high levels of IO performance is RAID 10, deployed across a scale-out cluster. And PowerFlex clusters can range from 4 to 1000 or more nodes.

RAID 10 mirrors data and spreads the mirrored data across all drives and servers in a cluster, or some subset of it. As a result, as you add storage nodes, IO performance scales up almost linearly.
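To make the scaling intuition concrete, here is a toy Python sketch of distributed mirroring, not PowerFlex’s actual placement algorithm: each chunk’s two copies land on two different nodes, so when a node fails, nearly every surviving node holds some copy needed for the rebuild and the work parallelizes.

```python
import random

def place_mirrored_chunks(num_chunks, nodes):
    """Spread primary/mirror chunk pairs across all nodes,
    never putting both copies of a chunk on the same node."""
    placement = {}
    for chunk in range(num_chunks):
        primary, mirror = random.sample(nodes, 2)  # two distinct nodes
        placement[chunk] = (primary, mirror)
    return placement

def rebuild_sources(placement, failed_node):
    """When a node fails, the surviving copy of each affected chunk
    lives on a different peer, so the rebuild reads in parallel."""
    sources = set()
    for primary, mirror in placement.values():
        if failed_node == primary:
            sources.add(mirror)
        elif failed_node == mirror:
            sources.add(primary)
    return sources

nodes = [f"node{i}" for i in range(16)]
placement = place_mirrored_chunks(1000, nodes)
peers = rebuild_sources(placement, "node0")
print(f"{len(peers)} surviving nodes participate in the rebuild")
```

With 1000 chunks on 16 nodes, the rebuild draws from almost all 15 surviving peers rather than from a single mirror drive, which is why these rebuilds finish so quickly.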

Yes, there can be other bottlenecks in clusters like this, most often networking, but with PowerFlex storage, IO need not be one of them. Anthony mentioned that PowerFlex will perform as fast as your infrastructure will support. So if your environment has 25 Gig Ethernet, it will perform IO at that speed, if you use 100 Gig Ethernet, it will perform at that speed.

In addition, PowerFlex offers automated Lifecycle Management (LCM), which can make running a 1000-node PowerFlex cluster almost as easy as a 10-node cluster. However, to make use of this automated LCM, one must run its storage server software on Dell PowerEdge servers.

Brian said adding or decommissioning PowerFlex nodes is a painless process. Because data is always mirrored, customers can remove any node, at any time and PowerFlex will automatically rebuild data across other nodes and drives. When you add nodes, those drives become immediately available to support more IO activity. Another item to note, because of RAID 10, PowerFlex mirror rebuilds happen very fast, as just about every other drive and node in the cluster (or subset) participates in the rebuild process.

PowerFlex supports Storage Pools, which partition PowerFlex storage nodes and devices into multiple pools of storage used to host volume IO and data. Storage pools can be used to segregate higher performing storage nodes from lower performing ones, so that some volumes can reside exclusively on higher (or lower) performing hardware.

Although customers can configure PowerFlex to use all nodes and drives in a system or storage pool for volume data mirroring, PowerFlex offers other data placement alternatives to support high availability.

PowerFlex supports Protection Domains, which are subsets or collections of storage servers and drives in a cluster where volume data will reside. This allows one protection domain to go down while others continue to operate. Realize that because volume data is mirrored across all devices in a protection domain, it would take many nodes or devices going down before a protection domain is out of action.

PowerFlex also uses Fault Sets, which are collections of storage servers and their devices within a Protection Domain that will contain one half of a volume’s data mirror. PowerFlex will ensure that a primary and its mirror copy of a volume’s data never both reside in the same fault set. A fault set could be a rack of servers, multiple racks, all PowerFlex storage servers in an AZ, etc. With fault sets, customer data will always reside across a minimum of two fault sets, and if any one goes down, data is still available.
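The fault-set placement constraint can be sketched in a few lines of Python. This is an illustration of the rule, not PowerFlex’s implementation: given a mapping of nodes to fault sets (racks here), a mirror copy is only ever placed in a different fault set than its primary.

```python
import random

# Each storage node belongs to a fault set (e.g., a rack of servers).
fault_set_of = {
    "node0": "rackA", "node1": "rackA",
    "node2": "rackB", "node3": "rackB",
    "node4": "rackC", "node5": "rackC",
}

def pick_mirror_pair(nodes, fault_set_of):
    """Pick primary and mirror nodes from different fault sets,
    so losing an entire rack never loses both copies of the data."""
    primary = random.choice(nodes)
    candidates = [n for n in nodes
                  if fault_set_of[n] != fault_set_of[primary]]
    return primary, random.choice(candidates)

nodes = list(fault_set_of)
for _ in range(100):
    primary, mirror = pick_mirror_pair(nodes, fault_set_of)
    assert fault_set_of[primary] != fault_set_of[mirror]
print("mirror copies always land in different fault sets")
```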

PowerFlex also operates in the cloud. In this case, customers bring their own PowerFlex software and deploy it over cloud compute and storage.

Brian mentioned that anything PowerFlex can do such as reconfiguring servers, can be done through RESTful/API calls. This can be particularly useful in cloud deployments as above, if customers want to scale up or down IO performance automatically.
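As a hedged sketch of what driving such reconfiguration from automation could look like, the snippet below builds (but does not send) a REST call. The endpoint path (`/api/v1/scale`) and payload fields are illustrative placeholders, not PowerFlex’s documented API.

```python
import json
import urllib.request

def build_scale_request(base_url, pool, extra_nodes, token):
    """Build (but don't send) an HTTP request asking a storage
    control plane to add capacity to a pool. Endpoint and payload
    field names here are hypothetical, for illustration only."""
    payload = json.dumps({"storagePool": pool,
                          "addNodes": extra_nodes}).encode()
    return urllib.request.Request(
        f"{base_url}/api/v1/scale",  # hypothetical endpoint
        data=payload,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_scale_request("https://powerflex.example.com",
                          "pool1", 2, "TOKEN")
print(req.get_method(), req.full_url)
```

A monitoring script could build and send such requests on a schedule or in response to load metrics, which is the kind of automated scale-up/scale-down Brian described.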

Besides block services, PowerFlex also offers NFS/CIFS-SMB native file services using a File Node Controller. This frontends PowerFlex storage nodes to support customer NFS/SMB file access to PowerFlex data.

Anthony Cinelli, Sr. Director Global PowerFlex Software Defined & MultiCloud Solutions

Anthony Cinelli is a key leader for Dell Technologies helping drive the success of our software defined and multicloud solutions portfolio across the customer landscape. Anthony has been with Dell for 13 years and in that time has helped launch our HCI and Software Defined businesses from startup to the multi-billion dollar lines of business they now represent for Dell.

Anthony has a wealth of experience helping some of the largest organizations in the world achieve their IT transformation and multicloud initiatives through the use of software defined technologies.

Brian Dean, Dell PowerFlex Technical Marketing

Brian is a 16+ year veteran of the technology industry, and before that spent a decade in higher education. Brian has worked at EMC and Dell for 7 years, first as Solutions Architect and then as TME, focusing primarily on PowerFlex and software-defined storage ecosystems.

Prior to joining EMC, Brian was on the consumer/buyer side of large storage systems, directing operations for two Internet-based digital video surveillance startups.

When he’s not wrestling with computer systems, he might be found hiking and climbing in the mountains of North Carolina.

147: GreyBeards talk ransomware protection with Jonathan Halstuch, Co-Founder and CTO, RackTop Systems

Sponsored By:

This is another in our series of sponsored podcasts with Jonathan Halstuch (@JAHGT), Co-Founder and CTO of RackTop Systems. You can hear more in Episode 145.

We asked Jonathan what was wrong with ransomware protection today. Jonathan started by mentioning that bad actors had been present, on average, 277 days in an environment before being detected. With that much dwell time, they could have easily corrupted most backups and snapshots, stolen copies of all your most sensitive/proprietary data and, of course, encrypted all your storage.

Backup-based ransomware protection works OK if dwell time is a couple of days or even a week, but not multiple months or longer. The only real solution to this level of ransomware sophistication is real-time monitoring of IO, looking for illegal activity. Listen to the podcast to learn more.

Often, any data corruption, when discovered, is just notification to an unsuspecting IT organization that they have been compromised and lost control over their systems. Sort of like having a thief ring the door bell to tell you they stole all your stuff after the fact.

The only real solution to data breaches and ransomware attacks with significant dwell time, one that protects both your data and your reputation, is something like RackTop Systems and their BrickStor SP storage system. BrickStor offers an ongoing, real-time, active defense against ransomware that’s embedded in your data storage, continuously looking for bad actors and their activities during IO activity, all day, every day.

When BrickStor detects ransomware in progress, it shuts it down by halting any further access for that user/application and snapshotting the data, before corruption, to immutable snapshots. That way admins have a known-good copy of the data.

In addition, RackTop BrickStor SP supplies run book like recovery procedures that tell IT how to retrieve good data from snapshots, without wasting valuable time searching for the “last good backup”, which could be months old.

I asked whether data at rest encryption could offer any help. Jonathan said data encryption can thwart only some types of attacks. But it’s not that useful for ransomware, as bad actors who infiltrate your system masquerade as valid users/admins and by doing so, gain access to decrypted data.  

RackTop Systems uses AI in its labs to create ransomware “assessors”, automated routines embedded in their storage data path, which execute continuously, looking for bad-actor IO patterns. It’s these assessors that provide the first line of defense against ransomware.

In addition to assessors, RackTop Systems supplies many reports which depict data access permissions, user/admin access permissions, data being accessed, etc. All of these help IT and security teams better understand how data is being used and provide the visibility needed to support better cyber security.

When ransomware is detected, RackTop BrickStor offers a number of different notification features that range from web-hooks and slack channels to email notices and just about everything in between to notify IT and security teams that a breach is occurring and where.

RackTop Systems BrickStor SP is available in many deployments. One new option, from HPE, uses their block storage to present LUNs to BrickStor SP. Jonathan mentioned that other enterprise class block storage vendors are starting to use BrickStor SP to supply secure NAS services for their customers as well.

Jonathan mentioned that RackTop attended the HIMSS conference in Chicago last week and will be attending many others throughout the year. So check them out at a conference near you if you get a chance.

Jonathan Halstuch, Co-Founder & CTO RackTop Systems

Jonathan Halstuch is the Chief Technology Officer and co-founder of RackTop Systems. He holds a bachelor’s degree in computer engineering from Georgia Tech as well as a master’s degree in engineering and technology management from George Washington University.

With over 20-years of experience as an engineer, technologist, and manager for the federal government he provides organizations the most efficient and secure data management solutions to accelerate operations while reducing the burden on admins, users, and executives.

146: GreyBeards talk K8s cloud storage with Brian Carmody, Field CTO, Volumez

We’ve known Brian Carmody (@initzero), Field CTO, Volumez for over a decade now and he’s always been very technically astute. He moved to Volumez earlier this year and has once again joined a storage startup. Volumez is a cloud K8s storage provider with a new twist, K8s persistent volumes hosted on ephemeral storage.

Volumez currently works in public clouds (AWS and Azure (soft launch), with GCP coming soon) and is all about supplying high performing, enterprise-class data services to K8s container apps. But it does this using transient (Azure ephemeral and AWS instance) storage and standard Linux. Hyperscalers offer transient storage almost as an afterthought with customer compute instances. Listen to the podcast to learn more.

It turns out that over the last decade or so, there has been a lot of time and effort devoted to maturing Linux’s storage stack and nowadays, with appropriate configuration, Linux can offer enterprise class data services and performance using direct attached NVMe SSDs. These services include thin provisioning, encryption, RAID/erasure coding, snapshots, etc., which on top of NVMe SSDs, provide IOPS, bandwidth and latency performance that boggles the mind.

However, configuring Linux’s sophisticated, high performing data services is a hard problem to solve.

Enter Volumez. They have a SaaS control plane, client software and CSI drivers that will configure Linux with ephemeral storage to support any performance and data service that can be obtained from NVMe SSDs.

Once installed on your K8s cluster, Volumez software profiles all ephemeral storage and supplies that information to their SaaS control plane. Once that’s done, your platform engineers can define specific storage class policies or profiles usable by DevOps to consume ephemeral storage.

These policies identify volume [IOPS, bandwidth, latency] X [read, write] performance specifications as well as data protection, resiliency and other data service requirements. DevOps engineers consume this storage using PVCs that call for these storage classes at some capacity. When it sees the PVC, the Volumez SaaS control plane carves out slices of ephemeral storage that can support the performance and other storage requirements defined in the storage class.
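As a sketch, a DevOps engineer’s PVC might look like the following standard Kubernetes manifest, where the `storageClassName` is a hypothetical policy name that a platform engineer would have defined, not an actual Volumez class:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-db-volume
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: volumez-high-iops   # hypothetical class defined by platform engineers
  resources:
    requests:
      storage: 100Gi
```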

Once that’s done, their control plane next creates a network path from the compute instances with ephemeral storage to the worker nodes running container apps. After that it steps out of the picture and the container apps have a direct (network) data path to the storage they requested. Note, Volumez’s SaaS control plane is not in the container app storage data path at all.

Volumez supports multi-AZ data resiliency for PVCs. In this case, another mirror K8s cluster would reside in another AZ, with Volumez software active and similar if not equivalent ephemeral storage. Volumez will configure the container volume to mirror data between AZs. Similarly, if the policy requests erasure coding, Volumez SaaS software configures the ephemeral storage to provide erasure coding for that container volume.

Brian said they’ve done some amazing work to increase the speed of Linux snapshotting and restoring.

As noted above, the Volumez control plane SaaS software is outside the data path, so even if the K8s cluster running Volumez enabled storage loses access to the control plane, container apps continue to run and perform IO to their storage. This can continue until there’s a new PVC request that requires access to their control plane.

Ephemeral storage is accessed through special compute instances. These are not K8s worker nodes and they essentially act as a passthru or network attachment between worker nodes running apps with PVC’s and the Volumez configured Linux Logical Volumes hosted on slices of ephemeral storage.

Volumez is gaining customer traction with data platform clients, DBaaS companies, and some HPC environments. But just about anyone needing high performing data services for cloud K8s container apps should give Volumez a try.

I looked at AWS to see how they price instance store capacities and found out it’s not priced separately, but rather instance storage is bundled into the cost of EC2 compute instances.

Volumez is priced based on the number of media devices (instance/ephemeral stores) and performance (IOPS) available. They also have different tiers depending on support-level requirements (e.g., community, business hours, 24X7), which also offer different levels of enterprise security functionality.

Brian said they have a free tier that customers can easily signup for and try out by going to their web site (see link above), or if you would like a guided demo, just contact him directly.

Brian Carmody, Field CTO, Volumez

Brian Carmody is Field CTO at Volumez. Prior to joining Volumez, he served as Chief Technology Officer of data storage company Infinidat where he drove the company’s technology vision and strategy as it ramped from pre-revenue to market leadership.

Before joining Infinidat, Brian worked in the Systems and Technology Group at IBM where he held senior roles in product management and solutions engineering focusing on distributed storage system technologies.

Prior to IBM, Brian served as a technology executive at MTV Networks Viacom, and at Novus Consulting Group as a Principal in the Media & Entertainment and Banking practices.

145: GreyBeards talk proactive NAS security with Jonathan Halstuch, CTO & Co-Founder, RackTop Systems

Sponsored By:

We’ve known about RackTop Systems since episode 84 and have been watching them ever since. On this episode we, once again, talk with Jonathan Halstuch (@JAHGT), CTO and Co-Founder, RackTop Systems.

RackTop was always very security oriented but lately they have taken this to the next level. As Jonathan says on the podcast, historically security has been mostly a network problem but since ransomware has emerged, security is now often a data concern too. The intent of proactive NAS security is to identify and thwart bad actors before they impact data, rather than after the fact. Listen to the podcast to learn more.

Proactive security for NAS storage includes monitoring user IO and administrator activity and looking for anomalies. RackTop has the ability (via config options) to halt IO activity when things look wrong, that is, when user/application IO looks different than what has been seen in the past. They also examine admin activity, a popular vector for ransomware attacks. RackTop IO/admin activity scanning is done in real time, as IO is processed and admin commands are received.

The customer gets to decide how far to take this. The challenge with automatically halting access is false positives, when, say, a new application starts taking off. Security admins must have an easy way to see and understand what was anomalous and what was not, and to quickly let that user/application return to normal activities, or take it out.

In addition to just stopping access, they can also just report it to admins/security staff. Moreover, the system can also automatically take snapshots of data when anomalous behavior is detected, to give admins and security a point-in-time view into the data before bad behavior occurs.

RackTop Systems has a number of assessors that look for specific anomalous activity, used to detect and act to thwart malware. For example, an admin assessor looks at all admin operations to determine whether they are considered normal or not.
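To illustrate the general shape of such an assessor, here is a deliberately simple Python sketch that flags a user whose write rate jumps far above their recent baseline. This is a toy illustration of the concept only; RackTop’s real assessors are AI-derived and far more sophisticated.

```python
from collections import deque

class WriteRateAssessor:
    """Toy anomaly assessor: flag a user whose write rate jumps far
    above their own recent baseline (illustrative only)."""
    def __init__(self, window=10, threshold=5.0):
        self.history = deque(maxlen=window)  # recent writes/sec samples
        self.threshold = threshold           # multiple of baseline to flag

    def observe(self, writes_per_sec):
        anomalous = False
        if len(self.history) == self.history.maxlen:
            baseline = sum(self.history) / len(self.history)
            anomalous = writes_per_sec > self.threshold * max(baseline, 1.0)
        self.history.append(writes_per_sec)
        return anomalous

a = WriteRateAssessor()
# Ten samples of ordinary activity establish a baseline...
normal = [a.observe(r) for r in [10, 12, 9, 11, 10, 13, 9, 10, 11, 12]]
# ...then a ransomware-style mass-encryption burst stands out.
burst = a.observe(500)
print(any(normal), burst)  # prints: False True
```

A production system would run many such assessors in the IO path, each tuned to a different signal (entropy of written data, admin command patterns, deletion rates, etc.).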

RackTop also supports special time-period access permissions. These provide temporary, time-dependent, unusual access rights to data for admins, users or applications that would normally be considered a breach, such as an admin copying lots of data or moving and deleting data. These are for situations that crop up where mass data deletion, movement or copying would be valid. When the time-period access permission elapses, the system goes back to monitoring for anomalous behavior.

We talked about the overhead of doing all this scanning and detection in real time and how that may impact system IO performance. For other storage vendors, these sorts of activities are often done with standalone appliances, which of course add additional IO to a storage system to do offline scans.

Jonathan said, with recent Intel Xeon multi-core processors, they can readily afford the CPU cycles/cores required to do their scanning during IO processing, without sacrificing IO performance.

RackTop also supports a number of reports to show system configured data/user/application access rights as well as what accesses have occurred over time. Such reports offer admin/security teams visibility into data access rights and usage.

RackTop can be deployed in hybrid disk-flash solutions, as storage software in public clouds, in an HCI solution, or in edge environments that replicate back to core data centers. And they can also be used as a backup/archive data target for backup systems. RackTop Systems NAS supports CIFS 1.0 through SMB 3.1.1, and NFSv3 through NFSv4.2.

RackTop Systems have customers in national government agencies, security sensitive commercial sectors, state gov’t, healthcare, and just about anyone subject to ransomware attacks on a regular basis. Which nowadays, is pretty much every IT organization on the planet.

Jonathan Halstuch, CTO & Co-Founder, RackTop Systems

Jonathan Halstuch is the Chief Technology Officer and co-founder of RackTop Systems. He holds a bachelor’s degree in computer engineering from Georgia Tech as well as a master’s degree in engineering and technology management from George Washington University.

With over 20-years of experience as an engineer, technologist, and manager for the federal government he provides organizations the most efficient and secure data management solutions to accelerate operations while reducing the burden on admins, users, and executives.

143: GreyBeards talk Chia crypto with Jonmichael Hands, VP Storage at Chia Project

Today we interview Jonmichael Hands (@LebanonJon, LinkedIn), VP Storage at Chia Project, who has been in and around the storage business forever, mostly with Intel and their SSD team, before it was sold. He did technical marketing for NVMe. He also ran the security and crypto track at FMS2022. He recently worked on sustainability, helping to create a circular economy for disk and SSD storage. Moreover, he assisted IEEE with their new (media) sanitization standard to make reusing/recycling storage easier.

Chia was born to provide a way to take advantage of storage media for blockchains in a government compliant way so that it could be spun off as a public company someday. Chia is a crypto currency that depends on proof of space (storage space exists) and proof of time (storage space is reserved for a period of time). There have been many crypto coins based on proof of work (running hard cryptographic algorithms to come up with some specific bit pattern). And ETH was forked last year to support proof of stake (where one stakes some amount of ETH for a defined period). But few, if any, have been based on proof of space and time.
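A grossly simplified Python sketch of the proof-of-space idea follows. It is illustrative only: real Chia plots and proofs are far more elaborate, and proof of time involves verifiable delay functions not modeled here. The essence is that a farmer spends disk space up front ("plotting"), then answers network challenges with cheap lookups into that stored data.

```python
import hashlib

def make_plot(seed, size):
    """'Plotting': precompute and store many hashes up front.
    This consumes disk space, which is the resource being proven."""
    return sorted(
        hashlib.sha256(seed + i.to_bytes(4, "big")).hexdigest()
        for i in range(size)
    )

def best_proof(plot, challenge):
    """Answer a challenge with the stored hash closest to it.
    A bigger plot means better odds of holding a winning answer."""
    return min(plot, key=lambda h: abs(int(h, 16) - int(challenge, 16)))

seed = b"farmer-key"
plot = make_plot(seed, 10_000)          # cheap one-time work, kept on disk
challenge = hashlib.sha256(b"block-42").hexdigest()
proof = best_proof(plot, challenge)     # cheap lookup, no heavy compute
print(proof in plot)  # the proof demonstrably came from stored space
```

Unlike proof of work, answering a challenge here costs almost no energy; the scarce resource is the storage capacity holding the plot.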

Disk and SSD commands already exist to provide “Secure Erase” (multiple passes of different bit patterns overwriting the same blocks) and cryptographic erasure (for encrypted drives, the encryption key is changed). Both approaches ensure that customer/organization data is no longer retained on media leaving an organization’s control. And yet, many companies use secure erase/cryptographic erasure and still shred disk drives and SSDs, just to be sure that no data is retained. This is a vast waste of energy and resources.
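The cryptographic-erasure idea can be shown in a small Python sketch. This is conceptual only: a self-encrypting drive uses AES in hardware, whereas here a SHA-256-based keystream stands in for the cipher. Once the drive’s internal media key is replaced, every block of old ciphertext becomes unrecoverable without a single overwrite pass.

```python
import hashlib
import secrets

def keystream(key, n):
    """Illustrative keystream derived from SHA-256 (a real self-
    encrypting drive uses AES in hardware, not this construction)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def encrypt(key, data):
    """XOR stream cipher: the same operation encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

media_key = secrets.token_bytes(32)          # drive's internal media key
on_disk = encrypt(media_key, b"customer records")
assert encrypt(media_key, on_disk) == b"customer records"  # readable

# Cryptographic erase: replace the media key. The old ciphertext is
# instantly unrecoverable, with no need to overwrite any data blocks.
media_key = secrets.token_bytes(32)
garbage = encrypt(media_key, on_disk)
print(garbage != b"customer records")  # prints: True
```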

Jonmichael said that both disk and SSD drives typically have another 5 years beyond their guaranteed (5-year) production life where they can function perfectly well as storage devices (though performance may not match current drives). And after being used for another 5 years, they are much easier to recycle if left un-shredded and returned to manufacturers, who can dismantle them to reuse expensive components and rare earth materials.

We didn’t spend much time on the technical underpinnings of Chia so if you are interested in that we suggest you check out Jonmichael’s FMS2022 presentation video.

But if you’re interested in a high level understanding of Chia and what one can do with it we did cover that. For example, Chia has farmers (not miners). Farmers create (~100GB) Chia plot files and store these on media.

Plot files take some amount of CPU power and memory to create, but once created can stay on storage forever. What makes Chia work is that the network periodically checks whether you have a certain plot file, and if you do, you get rewarded for it. Jonmichael said that with a typical Chia crypto setup, one could make $0.50/TB/month farming Chia.

The Chia project currently has about 24EB of plots online and at their peak had over 300EB. They also have 130K farmers in their current network. Bitcoin, at its peak, had about 60K miners. Jonmichael thinks Chia crypto coin may be the most distributed crypto coin in existence today.

A couple of years back Chia accounted for a significant amount of new disk drive purchases but that has died down considerably since then. As discussed earlier, Jonmichael is working to create a circular economy for storage that could lead to media reuse for Chia farming.

Jonmichael mentioned that Chia has matured significantly since peak use. It used to be that creating Chia plot files required high end CPUs and lots of technical skills, but today Jonmichael said you can be a farmer with an RPi. He did say that they have moved to making better use of available memory in the plotting process and have reduced the write load on the storage media.

Another aspect to Chia’s maturation is that they now support Chia smart coins or smart contracts. They have created ChiaLisp, a Turing complete language, as their language to implement Chia smart coins. It turns out that Lisp and other functional languages provide a natural way to implement secure code. Jonmichael mentioned that other crypto coins are starting to move towards using ChiaLisp.

Some recent innovations in Chia smart coins include:

  • Chia Offer Management – anything you wish to trade can be digitally tracked and traded using Chia Offer Management smart coins.
  • Chia NFT (non-fungible token) Management – NFTs have been used by other blockchains to sell digital rights to assets; Chia’s support for NFTs opens Chia up to this as well. The reference implementation for Chia’s NFT management is Chia Friends, where all proceeds are being donated to the Marmot Recovery Foundation.
  • Chia Data Layer Management, a federated database – here the Chia blockchain is being used to support a K-V store, where the blockchain stores the key and a hash of the value. Users can use the Chia Data Layer to store any key-hash(value) database they wish. It’s important to realize that the actual data, or value, is stored external to the Chia blockchain.
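The Data Layer pattern above, on-chain hash, off-chain value, can be modeled in a few lines of Python. This is a toy stand-in (two dictionaries, not a blockchain), but it shows why storing only the hash on-chain still lets anyone verify the off-chain value.

```python
import hashlib

on_chain = {}    # stand-in for blockchain state: key -> hash(value)
off_chain = {}   # external storage holding the actual values

def put(key, value):
    """Store the value off-chain, and only its hash on-chain."""
    off_chain[key] = value
    on_chain[key] = hashlib.sha256(value).hexdigest()

def get_verified(key):
    """Fetch the off-chain value and verify it against the on-chain hash."""
    value = off_chain[key]
    if hashlib.sha256(value).hexdigest() != on_chain[key]:
        raise ValueError("off-chain value was tampered with")
    return value

put("carbon-credit-123", b'{"tons_co2": 10}')
print(get_verified("carbon-credit-123"))

off_chain["carbon-credit-123"] = b'{"tons_co2": 9999}'  # tampering
try:
    get_verified("carbon-credit-123")
except ValueError:
    print("tamper detected")  # the on-chain hash exposes the change
```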

The Data Layer solution is currently being used to develop a way to track carbon credits by the World Bank (see: the Climate Action Data Trust).

Chia has come a long way. In its heyday it was a significant consumer of new disk media, but what Jonmichael and others have planned for it now is to take advantage of the longer-term life of storage media and use it for the benefit of all humanity.

Jonmichael Hands, VP Storage at Chia Project

Jonmichael Hands partners with the storage vendors for Chia optimized product development, market modeling, and Chia blockchain integration.

Jonmichael spent the last ten years at Intel in the Non-Volatile Memory Solutions group working on product line management, strategic planning, and technical marketing for the Intel data center SSDs.

In addition, he served as the chair for NVM Express (NVMe), SNIA (Storage Networking Industry Association) SSD special interest group, and Open Compute Project for open storage hardware innovation.

Jonmichael started his storage career at Sun Microsystems designing storage arrays (JBODs) and holds an electrical engineering degree from the Colorado School of Mines.