Administrator

December 29, 2023

159: GreyBeards Year End 2023 Wrap Up

Jason and Keith joined Ray for our annual year end wrap up and look ahead to 2024. I planned to discuss infrastructure technical topics but was overruled. Once we started talking AI, we couldn’t stop.

It’s hard to realize that Generative AI, and ChatGPT in particular, haven’t been around that long. We discussed some practical uses Keith and Jason had found for the technology.

Keith mentioned its primary skill is language expertise. He has used it to help write up proposals. He often struggles to convince CTO Advisor non-sponsors of the value they can bring and found that using GenAI has helped do this better.

Jason mentioned he uses it to create Bash, Perl, and PowerShell scripts. He says it’s not perfect but can get ~80% there and, with a few tweaks, he is able to have something a lot faster than if he had to do it completely by hand. He also mentioned its skill in translating from one scripting language to others and how well the code it generates is documented (- that hurt).

I was the odd GreyBeard out, having not used any GenAI, proprietary or not. I’m still working to get a reinforcement learning task to work well and consistently. I figured once I mastered that, I’d train an LLM on my body of (text and code) work (assuming of course someone gifts me a gang of GPUs).

I agreed GenAI is good at (English) language and some coding tasks (where lots of source code exists, such as Java, scripting languages, Python, etc.).

However, I was on an MLOps Slack channel and someone asked if GenAI could help with IBM RPG II code. I answered, probably not. There’s just not a lot of RPG II code publicly accessible on the web and the structure of RPG was never line of text/commands oriented.

We had some heated discussion on where LLMs get the data to train with. Keith was fine with them using his data. I was not. Jason was neutral.

We then turned to what this means to the white collar workers who are coding and writing text. Keith made the point that this has been a concern throughout history, at least since the industrial revolution.

Machines come along, displace work that was done by hand, increase production immensely, reduce costs. Organizations benefit, but people doing those jobs need to up level their skills, to take advantage of the new capabilities.

Easy for us to say, as we, except for Jason, in his present job, are essentially entrepreneurs and anything that helps us deliver more value, faster, easier or less expensively, is a boon for our businesses.

Jason mentioned that Stephen Wolfram wrote a great blog post discussing LLM technology (see What is ChatGPT doing … and why does it work). Both Jason and Keith thought it did a great job of explaining the science and practice behind LLMs.

We moved on to a topic harder to discuss but of great relevance to our listeners, GenAI’s impact on the enterprise.

It reminds me of when Cloud became most prominent. Then “C” suites tasked their staff to adopt “the cloud” any way they could. Today, “C” suites are tasking their staff to determine what their “AI strategy” is and when it will be implemented.

Keith mentioned that this is wrongheaded. The true path forward (for the enterprise) is to focus on what the business problems are and how (Gen)AI can address (some of) them.

AI is so varied, and its capabilities across so many fields are so good nowadays, that organizations should really look at AI as a new facility that can recognize patterns, index/analyze/transform images, summarize/understand/transform text/code, etc., in near real-time and see where in the enterprise that could help.

We talked about how enterprises can size AI infrastructure needed to perform these activities. And it’s more than just a gaggle of GPUs.

MLCommons’ MLPerf benchmarks can help show the way for some cases. They are not exhaustive, but it’s a start.

The consensus was maybe deploy in the cloud first and when the workload is dialed in there, re-home it later. With the proviso that hardware needed is available.

Our final topic was the Broadcom VMware acquisition. Keith mentioned their recent subscription pricing announcements vastly simplified VMware licensing, which had grown way too complex over the decades.

And although everyone hates the expense of VMware solutions, they often forget the real value VMware brings to enterprise IT.

Yes, hyperscalers and their clutch of coders can roll their own hypervisor services stacks, using open source virtualization. But the enterprise has other needs for their developers. And the value of VMware virtualization services, now that 128 Core CPUs are out, is even higher.

We mentioned the need for hybrid cloud and how VCF can get you part of the way there. Keith said that dev teams really want something like “AWS software” services running on GCP or Azure.

Keith mentioned that IBM Cloud is the closest he’s seen so far to doing what Dev wants in a hybrid cloud.

We all thought when DNNs came out and became trainable, and reinforcement learning started working well, that AI had turned a real corner. Turns out, that was just a start. GenAI has taken DNNs to a whole other level and DeepMind and others are doing the same with reinforcement learning.

This time AI may actually help advance mankind, if it doesn’t kill us first. On the latter topic you may want to check out my RayOnStorage AGI series of blog posts (latest … AGI part-8).

Jason Collier, Principal Member Of Technical Staff at AMD, Data Center and Embedded Solutions Business Group

Jason Collier (@bocanuts) is a long time friend, technical guru and innovator who has over 25 years of experience as a serial entrepreneur in technology.

He was founder and CTO of Scale Computing and has been an innovator in the field of hyperconvergence and an expert in virtualization, data storage, networking, cloud computing, data centers, and edge computing for years.

He’s on LinkedIn. He’s currently working with AMD on new technology and he has been a GreyBeards on Storage co-host since the beginning of 2022.

Keith Townsend, President of The CTO Advisor a Futurum Group Company

Keith Townsend (@CTOAdvisor) is an IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations.

Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIn.

December 7, 2023 / December 6, 2023

158: GreyBeards talk software defined storage with Brian Dean, Tech. Mkt., Dell PowerFlex

Brian Dean, Dell PowerFlex Technical Marketing

Brian is a 16+ year veteran of the technology industry, and before that spent a decade in higher education. Brian has worked at EMC and Dell for 7 years, first as Solutions Architect and then as TME, focusing primarily on PowerFlex and software-defined storage ecosystems.

Prior to joining EMC, Brian was on the consumer/buyer side of large storage systems, directing operations for two Internet-based digital video surveillance startups.

When he’s not wrestling with computer systems, he might be found hiking and climbing in the mountains of North Carolina.

November 21, 2023

157: GreyBeards talk commercial cloud computer with Bryan Cantrill, CTO, Oxide Computer

Bryan Cantrill (@bcantrill), CTO, Oxide Computer was a hard man to interrupt once started but the GreyBeards did their best to have a conversation. Nonetheless, this is a long podcast. Oxide are making a huge bet on rack scale computing and have done everything they can to make their rack easy to unbox, set up and deploy VMs on.

They use commodity parts (AMD EPYC CPUs) and package them in their own designed hardware (server) sleds, which blind mate to networking and power in the back of their own designed rack. They use their own OS, Helios (OpenSolaris derivative), with their own RTOS, Hubris, for system bringup, monitoring and the start of their hardware root of trust. And of course, to make it all connect easier, they designed and developed their own programmable networking switch. Listen to the podcast to learn more.

Podcast duration: 56:24 (77.5MB)


Oxide essentially provides rack hardware which supports EC2-like compute and EBS-like storage to customers. It also has Terraform plugins to support infrastructure as code. In addition, all their software is completely API driven.

Bryan said time and time again, developing their own hardware and software made everything easier for them and their customers. Customers pay for hardware but there’s absolutely NO SOFTWARE LICENSING FEEs, because all their software is open source.

For example, the problem with AMI BIOS and UEFIs is their opacity. There’s really no way to understand what packages are included in the root of trust because it’s proprietary. Bryan said one company’s UEFI they examined had URLs embedded in firmware. It seemed odd to have another vendor’s web pages linked to their root of trust.

Bryan said they did their own switch to reduce integration and validation test time. The Oxide rack supports all internal networking, compute sled to compute sled, and ToR switch (with no external cabling) and has 32 networking ports to connect the rack to the data center’s core networking.

As for storage, Bryan said each of the 10 U.2 NVMe drives in their compute sled is a separate ZFS file system and customer data is 3-way mirrored across any of them. ZFS also provides end-to-end checksumming across all customer data for IO integrity.
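The combination of 3-way mirroring and checksumming can be illustrated with a small sketch. This is not Oxide’s or ZFS’s implementation; the `MirroredStore` class, the `place_block` policy (a seeded random choice of three drives) and the use of SHA-256 are all assumptions made up for illustration. It only shows the general idea: write each block to three distinct drives, record a checksum, and on read return the first copy whose checksum still matches.

```python
import hashlib
import random

DRIVES_PER_SLED = 10  # drive count taken from the text above
REPLICAS = 3          # 3-way mirroring, as described above


def checksum(data: bytes) -> str:
    """Checksum used to detect a corrupted copy (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def place_block(block_id: str, drive_ids: list, replicas: int = REPLICAS) -> list:
    """Pick `replicas` distinct drives for a block.

    Hypothetical placement policy: a random choice seeded by the block id,
    so the same block id always maps to the same drives.
    """
    rng = random.Random(block_id)
    return sorted(rng.sample(drive_ids, replicas))


class MirroredStore:
    """Toy in-memory model of drives holding mirrored, checksummed blocks."""

    def __init__(self, drive_count: int = DRIVES_PER_SLED):
        self.drives = {d: {} for d in range(drive_count)}
        self.index = {}  # block_id -> (expected checksum, drives holding a copy)

    def write(self, block_id: str, data: bytes) -> None:
        targets = place_block(block_id, list(self.drives))
        for d in targets:
            self.drives[d][block_id] = data
        self.index[block_id] = (checksum(data), targets)

    def read(self, block_id: str) -> bytes:
        expected, targets = self.index[block_id]
        for d in targets:
            data = self.drives[d].get(block_id)
            # Skip missing or corrupted copies and fall through to the next mirror.
            if data is not None and checksum(data) == expected:
                return data
        raise IOError(f"no intact copy of {block_id!r} found")
```

With this sketch, a read still succeeds if one or two of the three copies are lost or corrupted, and fails loudly only when no copy matches its recorded checksum.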

Bryan said Oxide Computer rack bring up is 1) plug it in to core networking and power, 2) power it on, 3) attach a laptop to their service processor, 4) SSH into it, 5) run a configuration script and you’re ready to assign VMs. He said that the time from when an Oxide Rack hits your dock until you are up and firing up VMs could be as short as an HOUR.

The Rust programming language is the other secret to Oxide’s success. More to the point, their company is named after Rust (oxide, get it?). Apparently just about any software they developed is written in Rust.

The question for Oxide and every other computer and storage vendor is: do you believe that on premises computing will continue for the foreseeable future? The GreyBeards and Oxide believe yes, not only for compliance and better latency but also because it often costs less.

Bryan mentioned they have their own podcast, Oxide and Friends. On their podcast, they did a board bring up series (Tales from the Bring-Up Lab) and a series on taking their rack through FCC compliance (Oxide and the Chamber of Mysteries).

Bryan Cantrill, CTO, Oxide Computer

Bryan Cantrill is a software engineer who has spent over a quarter of a century at the hardware/software interface. He is the co-founder and CTO of Oxide Computer Company, the creator of the world’s first commercial cloud computer.

Prior to Oxide, he spent nearly a decade at Joyent, a cloud computing pioneer; prior to Joyent, he spent 14 years at Sun Microsystems.

Bryan received the Sc.B. magna cum laude with honors in Computer Science from Brown University, and is an MIT Technology Review 35 Top Young Innovators alumnus.

You can learn more about his work with Oxide at oxide.computer, or listen in on their weekly live show, Oxide and Friends (link above), on Discord or anywhere you get your podcasts.

October 12, 2023

156: GreyBeards talk data security with Jonathan Halstuch, Co-Founder and CTO, RackTop Systems

Jonathan Halstuch, Co-Founder and CTO, RackTop Systems

Jonathan Halstuch is the Chief Technology Officer and Co-Founder of RackTop Systems. He holds a bachelor’s degree in computer engineering from Georgia Tech as well as a master’s degree in engineering and technology management from George Washington University.

With over 20 years of experience as an engineer, technologist, and manager for the federal government, he provides organizations the most efficient and secure data management solutions to accelerate operations while reducing the burden on admins, users, and executives.

October 9, 2023 (updated October 25, 2023)

155: GreyBeards SDC23 wrap up podcast with Dr. J Metz, Technical Dir. of Systems Design AMD and Chair of SNIA BoD

Dr. J Metz (@drjmetz, blog), Technical Director of Systems Design at AMD and Chair of SNIA BoD, has been on our show before discussing SNIA research directions. We decided this year to add an annual podcast to discuss highlights from their Storage Developers Conference 2023 (SDC23).

Dr. J is working at AMD to help raise their view from a pure components perspective to a systems perspective. On the other hand, at SNIA, we can see them moving out of just storage interface technology into memory (of all things) and, real long term, storage archive technologies.

SDC is SNIA’s main annual conference, which brings storage developers together with storage users to discuss all the technologies underpinning storing the data we all care so much about. Listen to the podcast to learn more.

Podcast duration: 48:15 (66.3MB)


SNIA is trying to get their hands around trends impacting the IT industry today. These days, storage, compute and networking are all starting to morph into one another and the boundary lines, always tenuous at best, seem to be disappearing.

Aside from the industry standards work that SNIA has always been known for, they are also deeply involved in education. One of their more popular artifacts is the SNIA Dictionary (recently moved online only), which provides definitions for probably over 1,000 storage terms. But SDC also has a lot of tutorials and other educational sessions worthy of time and effort. And all SDC sessions will be available online, at some point. (Update 10/25/23: they are all available now at Sessions | SDC 2023 website)

SNIA also presented at SFD26, while SDC23 was going on. At SFD26, SNIA discussed DNA data storage, which is a recent technical affiliate, and the new Smart Data Accelerator Interface (SDXI), a software defined interface to perform memory to memory DMA.

First up, DNA storage. The DNA team said that they are pretty much able to store and access GB of DNA data storage today, without breaking a sweat, and are starting to consider how to scale that up to TB of DNA storage. We’ve discussed DNA data storage before on GBoS podcasts (see: 108: GreyBeards talk DNA storage... )

The talk at SFD26 was pretty detailed. Turns out the DNA data storage team has to re-invent a lot of standard storage technologies (catalogs/indexes, metadata, ECC, etc.) in order to support a DNA data soup of unstructured data.

For example, ECC for DNA segments (snippets) would be needed to correctly store and retrieve DNA data segments. And these segments could potentially be replicated 1000s of times in a DNA storage cell. And all DNA data segments would be tagged with file oriented metadata indicating (segment) address within file, file name or identifier, date created, etc.
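As a rough sketch of why per-segment metadata matters in an unordered “soup,” the Python below splits a file into tagged segments and rebuilds it from a shuffled pool containing duplicates, other files and a damaged copy. Everything here is hypothetical: the `Segment` fields simply mirror the metadata listed above, the 16 byte segment size is arbitrary, and a CRC32 stands in for real ECC (it only detects a bad copy so another replica can be used, it does not correct errors). It is not how the DNA data storage team encodes data.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    file_id: str    # file name or identifier
    offset: int     # (segment) address within the file
    created: str    # date created
    payload: bytes  # the data carried by this segment
    crc: int        # CRC32 of payload; a stand-in for real ECC (detection only)


def make_segments(file_id: str, data: bytes, created: str, size: int = 16) -> list:
    """Split `data` into fixed-size segments, each tagged with its own metadata."""
    segments = []
    for off in range(0, len(data), size):
        chunk = data[off:off + size]
        segments.append(Segment(file_id, off, created, chunk, zlib.crc32(chunk)))
    return segments


def reassemble(soup, file_id: str) -> bytes:
    """Rebuild one file from an unordered pool of (possibly duplicated) segments."""
    good = {}
    for seg in soup:
        if seg.file_id != file_id:
            continue  # segment belongs to another file
        if zlib.crc32(seg.payload) != seg.crc:
            continue  # damaged copy; rely on another replica
        good.setdefault(seg.offset, seg.payload)
    out = bytearray()
    for off in sorted(good):
        # Offsets must be contiguous. A lost trailing segment is not detected
        # here, since this sketch does not record the total file length.
        if off != len(out):
            raise ValueError(f"missing segment at offset {len(out)}")
        out += good[off]
    return bytes(out)
```

Because each segment carries its own address and file identifier, the order in which segments come back does not matter, and extra replicas simply give the reader more chances to find an undamaged copy.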

As far as what an application for DNA storage would look like, Dr. J mentioned write once and read VERY infrequently. It turns out while making 1000s of copies of DNA data segments is straightforward, inexpensive and trivial, reading it is another matter entirely. And as was discussed at SFD26, reading DNA storage, as presently conceived, is destructive. (So maybe having lots of copies is a good and necessary idea.)

But the DNA gurus really have to come up with methods for indexing, searching, and writing/reading data quickly. Today’s disks have file systems that are self-defining. If you hand someone an HDD, it’s fairly straightforward to read information off of it and determine the file system used to create it. These days, with LTFS, the same could be said for LTO tape.

DNA is intended to be used to store data for 1000s of years. They have retrieved intact DNA from a number of organisms that are over 50K years old. Retaining applications that can access, format and process data after 1,000 years is yet another serious problem someone will need to solve.

Next up was SDXI, a software defined DMA solution, that any application can use to move data from one memory to another without having to resort to 20 abstraction layers to do it. SDXI is just about moving data between memory banks.

Today, this is all within one system/server, but as CXL matures and more and more hardware starts supporting CXL 2 and 3, shared memory between servers will become more pervasive all on a CXL memory interface.

Keith tried bringing it home to moving data between containers or VMs. All that’s possible today within the same memory, and sometime in the future it will be possible between shared memory and local memory.

Memory to memory transfers have to be done securely. It’s not like accessing memory from some other process hasn’t been fraught with security exposures in the past. And Dr. J assured me that SDXI was built from the ground up with security considerations front and center.

To bring it all back home, SNIA has always been and always will be concerned with data, whether that data resides on storage, in memory or, god forbid, in transit somewhere over a network. Keith went as far as to say that the network was storage; I felt that was a step too far.

Dr. J Metz, Technical Director of Systems at AMD, Chair of SNIA BoD

J is the Chair of SNIA’s (Storage Networking Industry Association) Board of Directors and Technical Director for Systems Design for AMD where he works to coordinate and lead strategy on various industry initiatives related to systems architecture. Recognized as a leading storage networking expert, J is an evangelist for all storage-related technology and has a unique ability to dissect and explain complex concepts and strategies. He is passionate about the inner workings and application of emerging technologies.

J has previously held roles in both startups and Fortune 100 companies as a Field CTO, R&D Engineer, Solutions Architect, and Systems Engineer. He has been a leader in several key industry standards groups, sitting on the Board of Directors for the SNIA, Fibre Channel Industry Association (FCIA), and Non-Volatile Memory Express (NVMe). A popular blogger and active on Twitter, his areas of expertise include NVMe, SANs, Fibre Channel, and computational storage.

J is an entertaining presenter and prolific writer. He has won multiple awards as a speaker and author, writing over 300 articles and giving presentations and webinars attended by over 10,000 people. He earned his PhD from the University of Georgia.


Author: Administrator