173: GreyBeards Year End 2025 podcast

Well this year went fast. Keith, Jason and I sat down to try to make some sense of it all.

AI is still on a tear and shows no end in sight. Questions abound on whether we are seeing signs of a bubble or not; our answer – maybe. We see it in GPU pricing, in AI startup valuations, and in enterprise interest. Some question whether the enterprise is seeing any return from its investments in AI, but there's no doubt it is investing. Inferencing on prem, with training/fine tuning done in neo-clouds, has become the new norm. I thought we'd be mostly discussing agentic AI, but it's too early for that yet.

In other news, the real Broadcom VMware play is starting to emerge (if it was ever in doubt): an all-out focus on the (highly profitable) high end enterprises, abandoning the rest. And of course the latest weirdness to hit IT is DRAM pricing, but in reality it's the price of anything going into AI mega-data centers that's spiking. Listen to the podcast to learn more.

AI

GPU pricing is still high, although we are starting to see some cracks in NVIDIA’s moat.

AMD GPUs made a decent splash in the latest MLPerf Training results and Google TPUs are starting to garner some interest in the enterprise. And NVIDIA's latest GPU offerings are becoming less about raw compute monsters and more about optimizing for low precision compute, FP2 anyone, rather than just increasing FLOPs. It seems memory bandwidth (in GPUs) is becoming more of a bottleneck than anything else, IMHO.
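To see why low precision helps, here's a back-of-envelope sketch of the token-rate ceiling for a memory-bandwidth-bound LLM decode. All numbers below are illustrative assumptions, not vendor specs:

```python
# Rough, illustrative estimate of why low-precision formats matter when
# GPU memory bandwidth, not raw compute, limits LLM token generation.
# All numbers are assumptions for illustration, not vendor specs.

PARAMS = 70e9        # assumed 70B-parameter model
BANDWIDTH = 8e12     # assumed 8 TB/s of HBM bandwidth

def max_tokens_per_sec(bytes_per_param: float) -> float:
    """Decode ceiling: each generated token must stream all weights once."""
    return BANDWIDTH / (PARAMS * bytes_per_param)

for name, size in [("FP16", 2.0), ("FP8", 1.0), ("FP4", 0.5)]:
    print(f"{name}: ~{max_tokens_per_sec(size):,.0f} tokens/s ceiling")
```

Halving the bytes per weight roughly doubles the ceiling, which is exactly the incentive to push precision ever lower.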

But NVIDIA's CUDA is still an advantage. Grad students grew up on it, trained on it, and are so familiar with it that it will take a long time to displace. Yeah, ROCm helps, but IT needs more. Open sourcing all the CUDA code and its derivatives could be an answer, if anybody's listening.

Jason talked about AI rack and data center power requirements going through the roof and mentioned SMR (small modular [nuclear] reactors) as one solution. When buying a nuclear power plant is just not an option, SMRs can help. They can be trucked and installed (mostly) anywhere. Keith saw a truckload of SMRs on the highway on one of his road trips.

And last but not least, Apple just announced RDMA over Thunderbolt. And the (YouTube) airwaves have been lighting up with Mac Studios being clustered together with sufficient compute power to rival a DGX. Of course it's Apple's MLX running rather than CUDA, and only so many models work on MLX, but it's a start at democratizing AI.
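For the curious, MLX is Apple's open source array framework for Apple silicon. A minimal sketch of what running on MLX (on a single Mac, not a Thunderbolt cluster) looks like:

```python
# A minimal MLX sketch; MLX is Apple's open source array framework for
# Apple silicon (runs on the Mac GPU via Metal, no CUDA involved).
import mlx.core as mx

a = mx.random.normal((4096, 4096))
b = mx.matmul(a, a)   # computation is lazy...
mx.eval(b)            # ...eval() forces it to run on the GPU
print(b.shape)        # (4096, 4096)
```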

VMware

Broadcom's moves remind Jason of what IBM did with Z: abandon the low end, milk the high end forever. If you want vSphere, better think about purchasing VCF.

Keith mentioned that if a company has a $100M cloud spend, it could save some serious money (~20%) going to VCF. But it's not a lift and shift. Running a cloud on prem requires a different mindset than running apps in the cloud. Welcome back to the pre-cloud era, where every IT shop did it all.

Component Pricing

Jason said that DRAM pricing has gone up 600% in a matter of weeks. Our consensus view is it's all going to AI data centers. With servers having a TB of DRAM, GPUs with 160GB of HBM apiece, and LPDDR being gobbled up for mobile/edge compute everywhere, is there any doubt that critical server (sub-)components are in high demand?

Hopefully, the fabs will start to produce more. But that assumes fabs have spare capacity and that DRAM demand is a function of price. There are hints that neither of these is true anymore. Mega data centers are not constrained by capital, yet, and most fabs are operating flat out, producing as many chips as they can. So DRAM pricing may continue to be a problem for some time to come.

Speaking of memory, there was some discussion of memory tiering startups taking off with high priced memory. One enabler for that is the new UALink interconnect. It's essentially an open, chip-to-chip interconnect technology, over PCIe or Ethernet. UALink solutions can connect very high speed components beyond the server itself to support a scale-out network of accelerators, memory and CPUs in a single rack. It's early yet, but Meta's spec for an OCP wide form factor rack has been released, and AMD's Helios OCP 72-GPU rack uses UALink tech today. More to come, we're sure.

Keith Townsend, The CTO Advisor, Founder & Executive Strategist | Advisor to CIOs, CTOs & the Vendors Who Serve Them

Keith Townsend (@CTOAdvisor) is an IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIn.

Jason Collier, Principal Member Of Technical Staff at AMD, Data Center Solutions Group

Jason Collier (@bocanuts) is a long time friend, technical guru and innovator who has over 25 years of experience as a serial entrepreneur in technology.

He was founder and CTO of Scale Computing and has been an innovator in the field of hyperconvergence and an expert in virtualization, data storage, networking, cloud computing, data centers, and edge computing for years.

He's on LinkedIn. He's currently working with AMD on new technology and he has been a GreyBeards on Storage co-host since the beginning of 2022.

172: Greybeards talk domain specific AI with Dr. Arun Subramaniyan, Founder & CEO, Articul8 AI

Keith and I attended AIFD7 a couple of weeks back and Articul8 AI presented at one session (see videos of their session here). Given all the press on LLMs and GenAI, there are only a few non-GenAI solutions in the market today. Articul8 is one of these and represents a different way of deploying AI for industry. Dr. Arun Subramaniyan, Founder and CEO of Articul8 (LinkedIn), discussed their approach to AI for industries at their session.

With all the press on GenAI, agentic AI and LLMs, it's hard to remember that AI has had a long history of helping various verticals address their challenges. Articul8 AI was founded only 2 years ago, but already has a significant footprint in a number of industry sectors, such as aerospace, telecom, (electrical) energy, etc. Articul8 AI is all about deploying domain specific models trained to focus on select industry challenges. Listen to the podcast to learn more.

Articul8 AI can operate on prem, in your VPC, or in their own cloud. The solution is deployed on infrastructure sized to your organization's specific requirements and starts ingesting corporate data the moment it's enabled. It can run on anything from something as small as an 8-GPU server to large clusters with many (1000s of) GPUs.

Articul8 doesn't host or store any corporate data on this infrastructure, just metadata describing the data and the relationships between them. Arun said that within 24 hours, Articul8 AI has enough of an organization's knowledge map, what they call the shape of data, to process up to 95% of an organization's requests.

Their AIFD7 demo shows a sort of 3D visual of a knowledge map. And it’s interesting that every query or request changes the knowledge map in subtle ways.
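A purely illustrative sketch of that metadata-only idea: a knowledge map records what data exists and how pieces relate, without holding the data itself. Every name below is invented:

```python
# Purely illustrative: a metadata-only "knowledge map" records what data
# exists and how pieces relate, without holding the data itself.
# Every name below is invented for illustration.
knowledge_map = {
    "nodes": {
        "erp_orders":   {"kind": "table",      "location": "s3://corp/erp/orders/"},
        "maint_logs":   {"kind": "timeseries", "location": "s3://corp/plant/logs/"},
        "design_specs": {"kind": "documents",  "location": "sharepoint://eng/specs/"},
    },
    "edges": [
        ("erp_orders", "design_specs", "orders reference part specs"),
        ("maint_logs", "design_specs", "failures correlate with spec revisions"),
    ],
}

# A query planner would traverse edges like these to decide which sources
# (and which domain models) a request needs, fetching data only at run time.
for src, dst, why in knowledge_map["edges"]:
    print(f"{src} -> {dst}: {why}")
```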

For the industries it supports, Articul8 AI is typically embedded into "systems of record". There are very few AI solutions like this today, perhaps coding agents for software development firms and recommendation engines for online retailers, but that's about it.

For Articul8 AI to support a new domain or vertical takes significant domain expertise and data. In some cases, they have partnered with industry associations to gain expertise and data. For organizations that have contributed data or IP to support a new domain, Articul8 AI can share revenue from other organizations that adopt their solutions.

One can see the current verticals Articul8 AI supports. One item of interest is their cross-domain models. They have one cross-domain model trained to interpret and understand tables/spreadsheets/"structured image data", another to understand logs or time series data, and a third focused on converting text to database queries. Most GenAI/LLMs struggle to understand tables and spreadsheet data well.

The other thing about tables and spreadsheets is that most corporations could not exist without them. By providing a cross-domain table understanding model, they have opened up vast troves of corporate data that was previously too inscrutable for LLM AI to understand and process.

Finally, Articul8 AI has two offerings currently available on AWS Marketplace, one of which is an LLM evaluation tool and the other a network topology log analyzer tool. The LLM evaluator, when provided a prompt, will return which current LLM could handle that prompt best, and is callable via API. The topology service can analyze time series logs from networking and other gear and show network topology from logs alone.
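As a hypothetical sketch of what calling such an evaluator via API might look like (the endpoint URL, request fields, and response shape below are all assumptions for illustration; consult the AWS Marketplace listing for the real interface):

```python
# Hypothetical sketch of calling an LLM-evaluator service over HTTP.
# The endpoint URL, request fields, and response shape are assumptions
# for illustration, not Articul8's actual API.
import requests

resp = requests.post(
    "https://api.example.com/v1/evaluate",   # placeholder endpoint
    json={"prompt": "Summarize anomalies in this maintenance log..."},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # e.g., a ranking of which LLM best handles this prompt
```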

Dr. Arun Subramaniyan, Founder & CEO, Articul8 AI

Arun Subramaniyan is the founder & CEO of Articul8, where he is building a domain-specific GenAI Platform. Previously, he led the Cloud & AI Strategy team at Intel where he was responsible for establishing and driving the overall AI strategy globally, and was focused on democratizing AI in a sustainable fashion.

Arun joined Intel from Amazon Web Services (AWS), where he led the Extreme-scale computing solution team spanning Machine Learning, Quantum Computing, High Performance Computing (HPC), Autonomous Vehicles, and Autonomous Computing. His team was responsible for developing solutions across all areas of HPC, quantum computing and large-scale machine learning applications, spanning a $1B+ portfolio, and he grew the businesses 2-3x over two years.

Arun’s primary areas of research focus are Bayesian methods, global optimization, probabilistic deep learning for large scale applications, and distributed computing. He is an Executive Fellow at Harvard Business School, where he teaches courses on Generative AI for Business Leaders. He enjoys working at the intersection of massively parallel computing and modeling large-scale systems.

Arun is a prolific researcher with a Ph.D. in Aerospace Engineering from Purdue University, with 34 granted patents (60+ filed) and 50+ international publications that have been cited more than 1600 times, with an h-index of 16. He is also a recipient of the Hull Award from GE, which honors technologists for their outstanding technical impact.

168: GreyBeards Year End 2024 podcast

It's time once again for our annual YE GBoS podcast. This year we have Howard back, making a guest appearance alongside our usual cast of Jason and Keith. And the topic du jour seemed to be AI rolling out to the enterprise and everywhere else in the IT world.

We led off with last year's topic, AI (again), but back then it was all about new announcements, new capabilities and new functionality. This year it's all about taking AI tools and functionality and making them available to help optimize how organizations operate.

We talked some about RAG and chatbots, but these seemed almost old school.

Agentic AI

Keith mentioned agentic AI, which purports to improve businesses by removing or optimizing intermediate steps in business processes. If one can improve human and business productivity by 10%, the impact on the US and world economies would be staggering.
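Some rough arithmetic on that claim (the GDP figures are approximate magnitudes, used only for scale):

```python
# Back-of-envelope arithmetic on the "10% productivity" claim.
# GDP figures are rough, approximate magnitudes, used only for scale.
WORLD_GDP = 105e12   # ~$105T world GDP
US_GDP = 27e12       # ~$27T US GDP
GAIN = 0.10

print(f"World: ~${WORLD_GDP * GAIN / 1e12:.1f}T per year")   # ~$10.5T
print(f"US:    ~${US_GDP * GAIN / 1e12:.1f}T per year")      # ~$2.7T
```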

And we’re not just talking about knowledge summarization, curation, or discussion, agentic AI takes actions that would have been previously done by a human, if done at all.  

Manufacturers could use AI agents to forecast sales, allowing the business to optimize inventory positioning to better address customer needs. 

Most, if not all, businesses have elaborate procedures that require a certain amount of human hand holding. Reducing that hand holding, even a little bit, with AI agents that never sleep and can occasionally be trained to do better, could seriously help the bottom and top lines of any organization.

We can see evidence of agentic AI proliferating in SaaS solutions: Salesforce, SAP, Oracle and all the others are spinning out agentic AI services.

I think it was Jason who mentioned that GEICO, a US insurance company, is refactoring, redesigning and re-implementing all their applications to take advantage of agentic AI and other AI options.

AI’s impact on HW & SW infrastructure

The AI rollout is having dramatic impacts on both software and hardware infrastructure. For example, customers are building their own OpenStack clouds to support AI training and inferencing.

Keith mentioned that AWS just introduced S3 Tables, a fully managed service meant to store and analyze massive amounts of tabular data for analytics. Howard mentioned that AWS's S3 Tables had to make a number of tradeoffs to use immutable S3 object storage. VAST's Parquet database provides the service without using immutable objects.
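For context, S3 Tables is built around Apache Iceberg tables stored as Parquet files. A minimal sketch of that underlying columnar format using pyarrow (the table contents are invented for illustration):

```python
# S3 Tables is built around Apache Iceberg tables stored as Parquet files.
# A minimal sketch of that columnar format using pyarrow; the table
# contents here are invented for illustration.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "order_id": [1, 2, 3],
    "amount":   [19.99, 5.25, 102.00],
})
pq.write_table(table, "orders.parquet")            # columnar, write-once file
print(pq.read_table("orders.parquet").num_rows)    # 3
```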

Software impacts are immense as AI becomes embedded in more and more applications and system infrastructure. But AI’s hardware impacts may be even more serious.

Howard made mention of the power zero-sum game, meaning that most data centers have a limited amount of power they can support. Any power saved from other IT activities is immediately put to use supplying more power for AI training and inferencing.

Most IT racks today support equipment that consumes 10-20kW of power. AI servers will require much more.

Jason mentioned one 6U server with 8 GPUs that costs on the order of 1 Ferrari ($250K US) and draws 10kW of power, with each GPU having two 400GbE links, not to mention the server itself having two 400GbE links. So a single 6U (GPU) server has 18 400GbE links, or could need 7.2Tb/s of bandwidth.

It's unclear how many of these one could put in a rack, but my guess is racks won't be fully populated. Six of these servers would need over 43Tb/s of bandwidth and over 60kW of power, and that's not counting the networking and other infrastructure required to support all that bandwidth.
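Checking the back-of-envelope numbers above in a few lines of Python:

```python
# Checking the back-of-envelope numbers for the 6U GPU server above.
GPUS = 8
LINKS_PER_GPU = 2      # two 400GbE links per GPU
SERVER_LINKS = 2       # plus two 400GbE links for the server itself
LINK_GBPS = 400

links = GPUS * LINKS_PER_GPU + SERVER_LINKS        # 18 links
server_tbps = links * LINK_GBPS / 1000             # 7.2 Tb/s per server

servers, kw_per_server = 6, 10
print(f"{links} links -> {server_tbps} Tb/s per server")
print(f"{servers} servers: {servers * server_tbps:.1f} Tb/s, "
      f"~{servers * kw_per_server} kW")            # 43.2 Tb/s, ~60 kW
```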

Speaking of other infrastructure, cooling is the other side of this power problem. It's just thermodynamics: power use generates heat, and that heat needs to be disposed of. And with 10kW servers we are talking a lot of heat. Jason mentioned that at this year's SC24 conference, the whole floor was showing off liquid cooling. Liquid cooling was also prominent at OCP.

At the OCP summit this year, Microsoft was talking about deploying 150kW racks near term and 1MW racks down the line. AI's power needs are why organizations around the world are building out new data centers in out-of-the-way places that just so happen to have power and cooling nearby.

Organizations have an insatiable appetite for AI training data. And good (training) data is getting harder to find. Solidigm's latest 122TB SSD may be coming along just when the data needs for AI are starting to take off.

SCI is pivoting

We could have gone on for hours on AI’s impact on IT infrastructure, but I had an announcement to make.

Silverton Consulting will be pivoting away from storage to a new opportunity based in space. I discuss this on SCI's website, but the opportunities for LEO and beyond services are just exploding these days and we want to be a part of that.

What that means for GBoS is TBD. But we may be transitioning to something broader than just storage. But heck, we have been doing that for years.

Stay tuned, it's going to be one hell of a ride.

Jason Collier, Principal Member Of Technical Staff at AMD, Data Center and Embedded Solutions Business Group

Jason Collier (@bocanuts) is a long time friend, technical guru and innovator who has over 25 years of experience as a serial entrepreneur in technology.

He was founder and CTO of Scale Computing and has been an innovator in the field of hyperconvergence and an expert in virtualization, data storage, networking, cloud computing, data centers, and edge computing for years.

He's on LinkedIn. He's currently working with AMD on new technology and he has been a GreyBeards on Storage co-host since the beginning of 2022.

Howard Marks, Technologist Extraordinary and Plenipotentiary at VAST Data

Howard Marks is Technologist Extraordinary and Plenipotentiary at VAST Data, where he explains engineering to customers and customer requirements to engineers.

Before joining VAST, Howard was an independent consultant, analyst, and journalist, writing three books and over 200 articles on network and storage topics since 1987 and, most significantly, a founding co-host of the Greybeards on Storage podcast.

Keith Townsend, President of The CTO Advisor, a Futurum Group Company

Keith Townsend (@CTOAdvisor) is an IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIn.

167: GreyBeards talk Distributed S3 storage with Enrico Signoretti, VP Product & Partnerships, Cubbit

Long time friend Enrico Signoretti (LinkedIn), VP Product and Partnerships at Cubbit, used to be a regular participant at Storage Field Day (SFD) events, and I've known him since we first met there. Since then, he's worked for a startup and a prominent analyst firm. But he's back at another startup, and this one looks like it's got legs.

Cubbit offers distributed, S3-compatible object storage with geo-distribution and geo-fencing for object data, in which the organization owns the hardware and Cubbit supplies the software. There's a management component, the Coordinator, which can run on your hardware or as a SaaS service they provide, but other than that, IT controls the rest of the system hardware. Listen to the podcast to learn more.

Cubbit comes in 3 components:

  • One or more Storage nodes, which include their agent software running on top of a Linux system with direct attached storage.
  • One or more Gateway nodes, which provide S3 protocol access to the objects stored on storage nodes. A typical S3 access point, https://s3.company_name.com/…, points to either a load balancer front end or directly to one or more Gateway nodes. Gateway nodes provide the mapping between the bucket name/object identifier and where the data currently resides or will reside.
  • One Coordinator node, which provides the metadata to locate the data for objects, manages the storage nodes and gateways, and monitors the service. The Coordinator node can be a SaaS service supplied by Cubbit or a VM/bare metal node running Cubbit Coordinator software. Metadata is protected internally within the Coordinator node.

With these three components one can stand up a complete, geo-distributed/geo-fenced, S3 object storage system which the organization controls.
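Because the gateways speak standard S3, any stock S3 client should work when pointed at them. A minimal sketch using boto3 (the endpoint URL, bucket name and credentials below are placeholders, not real Cubbit values):

```python
# Sketch of accessing an S3-compatible gateway with a standard S3 client.
# The endpoint URL, bucket name and credentials are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example-corp.com",  # your gateway/load balancer
    aws_access_key_id="YOUR_KEY",
    aws_secret_access_key="YOUR_SECRET",
)
s3.put_object(Bucket="backups", Key="db/dump.gz", Body=b"...")
print(s3.list_objects_v2(Bucket="backups")["KeyCount"])
```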

Cubbit encrypts data as it's ingested at the gateway and decrypts data when accessed. Sign-on to the system uses standard security offerings. Security keys can be managed by Cubbit or by standard key management systems.

All data for an object is protected by nested erasure codes. That is, 1) erasure coding within a data center/location, over its storage drives, and 2) erasure coding across geographical locations/data centers.

With erasure coding across locations, a customer with, say, 10 data center locations can have their data stored in such a fashion that as long as at least 8 data centers are online they still have access to their data; that is, the Cubbit storage system can still provide data availability.

Similarly, for erasure coding within the data center/location, across storage drives, say with 12 drives per stripe, one could configure, let's say, 9+3 erasure coding, where as long as 9 of the drives still operate, data will be available.

Please note the customer decides the number of locations to stripe across for erasure coding, and likewise the number of storage drives. The k+m arithmetic behind both examples is sketched below.
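```python
# The k+m erasure-coding arithmetic behind both examples: data stays
# available as long as at least k of the n = k+m pieces survive.
def erasure_profile(k: int, m: int) -> tuple[int, int, float]:
    n = k + m
    overhead = n / k   # raw capacity consumed per unit of user data
    return n, m, overhead

for label, k, m in [("8+2 across 10 sites", 8, 2), ("9+3 across 12 drives", 9, 3)]:
    n, tolerated, overhead = erasure_profile(k, m)
    print(f"{label}: tolerates {tolerated} of {n} failures, "
          f"{overhead:.2f}x raw capacity")
```

Wider stripes lower the capacity overhead; more parity pieces raise the failure tolerance. The customer picks the tradeoff.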

The customer supplies all the storage node hardware. Some customers start with re-purposed servers/drives for their original configuration and then upgrade to higher performing storage, servers and networking as performance needs change. Storage nodes can be on prem, in the cloud or at the edge.

For adequate performance, gateways and storage nodes (and Coordinator nodes) should be located close to one another. Although Coordinator nodes are not in the data path, they are critical to initial object access.

Gateways can provide a cache for faster local data access. Cubbit has recommendations for Gateway server hardware. And similar to storage nodes, Gateways can operate at the edge, in the cloud or on prem.

Use cases for the Distributed S3 storage include:

  • As a backup target for data elsewhere
  • As a geographically distributed/fenced object store.
  • As locally controlled object storage to feed AI training/inferencing activity.

Most backup solutions support S3 object storage as a target for backups.

Geographically distributed S3 storage means that customers control where object data is located. This could be split across a number of physical locations, the cloud or at the edge.

Geographically fenced S3 storage means that the customer controls which of its many locations store an object. For GDPR countries, with multi-nation data center locations, this could satisfy the compliance requirement to keep customer data within country.

Cubbit's distributed S3 object storage is strongly consistent, in that an object loaded into the system at any location is immediately available to any user accessing it through any other gateway. Access times vary, but the data will be the same regardless of where you access it from.

The system starts up through an Ansible playbook which asks a bunch of questions and then loads and sets up the agent software for storage nodes, gateway nodes and, where applicable, the Coordinator node.

At any time, customers can add more gateways or storage nodes or retire them. The system doesn't perform automatic load balancing for new nodes, but customers can migrate data off storage nodes and onto other ones through API calls/UI requests to the Coordinator.

Cubbit storage supports multi-tenancy, so MSPs can offer their customers isolated access.

Cubbit charges for their service based on data storage under management. Note it has no egress charges, and you don't pay for redundancy. But you do supply all the hardware used by the system. They offer a discount for M&E customers, as the metadata-to-data ratio is much smaller (lots of large files) than for most other S3 object stores (a mix of small and large files).

Cubbit is presently available only in Europe but will be coming to the USA next year. So, if you are interested in geo-distributed/geo-fenced S3 object storage that you control and that can be had for much cheaper than hyperscaler object storage, check it out.

Enrico Signoretti, VP Products & Partnerships

Enrico Signoretti has over 30 years of experience in the IT industry, having held various roles including IT manager, consultant, head of product strategy, IT analyst, and advisor.

He is an internationally renowned visionary author, blogger, and speaker on next-generation technologies. Over the past four years, Enrico has kept his finger on the pulse of the evolving storage industry as the Head of Research Product Strategy at GigaOm. He has worked closely and built relationships with top visionaries, CTOs, and IT decision makers worldwide.

Enrico has also contributed to leading global online sites (with over 40 million readers) for enterprise technology news.

162: GreyBeards talk cold storage with Steffen Hellmold, Dir. Cerabyte Inc.

Steffen Hellmold, Director, Cerabyte Inc., is extremely knowledgeable about the storage device business. He has worked for WDC in storage technology and possesses an in-depth understanding of tape and disk storage technology trends.

Cerabyte, a German startup, is developing cold storage. Steffen likened Cerabyte storage to a ceramic version of the punch cards that dominated IT (and pre-IT) over much of the last century. Once cards were punched, they created near-WORM storage that could be obliterated or shredded but was very hard to modify. Listen to the podcast to learn more.

Cerabyte uses a unique combination of semiconductor (lithographic) technology, ceramic coated glass, LTO tape (form factor) cartridge and LTO automation in their solution. So, for the most part, their critical technologies all come from somewhere else.

Their main technology uses a laser-lithographic process to imprint a data page (block?) onto a sheet of ceramic-coated glass. There are multiple sheets in each cartridge.

Their intent is to offer a robotic system (based on LTO technology) to retrieve and replace their multi-sheet cartridges and mount them in their read-write drive.

As mentioned above, the write operation is akin to laser-imprinting a lithographic, data-encoded mask onto the glass. Once written, the data cannot be erased. But it can be obliterated, by something akin to writing all ones, or the sheet can be shredded and recycled as glass.

The read operation uses a microscope and camera to take scans of the sheet’s imprint and convert that into data.
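Purely as an illustration of that write/read cycle, here's a toy encoding of bytes into a 2D bit matrix and back, loosely analogous to imprinting a data-encoded mask and then decoding a microscope/camera scan. This is not Cerabyte's actual data format:

```python
# Toy illustration of the write/read cycle: encode bytes into a 2D bit
# matrix ("imprint") and decode it back ("scan"). NOT Cerabyte's format.
def imprint(data: bytes, width: int = 16) -> list[list[int]]:
    bits = [(byte >> i) & 1 for byte in data for i in range(7, -1, -1)]
    bits += [0] * (-len(bits) % width)                 # pad the last row
    return [bits[i:i + width] for i in range(0, len(bits), width)]

def scan(matrix: list[list[int]]) -> bytes:
    bits = [b for row in matrix for b in row]
    return bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8))

sheet = imprint(b"hello")
print(scan(sheet)[:5])   # b'hello'
```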

Cerabyte's solution is cold or ultra-cold (frozen) storage. If LTO robotics are any indication, a Cerabyte cartridge with multiple sheets can be presented to a read-write drive in a matter of seconds. However, extracting the appropriate sheet from a cartridge and mounting it in a read-write drive will take more time. But this may be similar in time to an LTO tape leader being threaded through a tape drive, again a matter of seconds.

Steffen didn’t supply any specifications on how much data could be stored per sheet other than to say it’s on the order of many GB. He did say that both sides of a Cerabyte sheet could be recording surfaces.

With their current prototype, an LTO form factor cartridge holds fewer than 5 sheets of media, but they are hoping to get this to 100 or more in time.

We talked about the history of disk and tape storage technology. Steffen is convinced (as are many in the industry) that disk-tape capacity increases have slowed over time and that this is unlikely to change. I happen to believe that storage density increases tend to happen in spurts, as new technology is adopted, and then trail off as that technology matures. We agreed to disagree on this point.

Steffen predicted that Cerabyte will be able to cross over disk cost/capacity this decade and LTO cost/capacity sometime in the next decade.

We discussed the market for cold and frozen storage. Steffen mentioned that the Office of the Director of National Intelligence (ODNI) has tasked the National Academies of Sciences, Engineering, and Medicine to conduct a rapid expert consultation on large-scale cold storage archives. And that most hyperscalers have use for cold and frozen storage in their environments and some even sell this (Glacier storage) to their customers.

The Library of Congress and similar entities in other nations are also interested in digital preservation that cold and frozen technology could provide. He also thinks that medical is a prime market that is required to retain information for the life of a patient. IBM, Cerabyte, and Fujifilm co-sponsored a report on sustainable digital preservation.

And of course, the media libraries for some entertainment companies represent a significant asset that if on tape has to be re-hosted every 5 years or so. Steffen and much of the industry are convinced that a sizeable market for cold and frozen storage exists.

I mentioned that long archives suffer from data format drift (data formats are no longer supported). Steffen mentioned there's also software version drift (the software that processed the data is no longer available/runnable on current OSs). And of course the current problem with tape is media drift (LTO drives can read media only 2 versions back).

Steffen seemed to think format and software drift are industry-wide problems that are being worked on. Cerabyte seems to have a great solution for media drift, as its media can be read with a microscope. And the (ceramic glass) media has a predicted life of 100 years or more.

I mentioned the "new technology R&D" problem. Historically, as new storage technologies have emerged, they have always ended up being left behind (in capacity), because disk-tape-NAND R&D ($Bs each) outspends them. Steffen said it's certainly NOT $Bs of R&D for tape and disk.

Steffen countered by saying that all storage technology R&D spending pales in comparison to semiconductor R&D spending focused on reducing feature size. And as Cerabyte uses semiconductor technologies to write data, sheet capacity is directly a function of semiconductor technology. So, Cerabyte's R&D budget should not be a problem. In fact, they have been able to develop their prototype with just $7M in funding.

Steffen mentioned there is an upcoming Storage Technology Showcase conference in early March, where Cerabyte will be in attendance.

Steffen Hellmold, Director, Cerabyte Inc.

Steffen has more than 25 years of industry experience in product, technology, business & corporate development as well as strategy roles in semiconductor, memory, data storage and life sciences.

He served as Senior Vice President, Business Development, Data Storage at Twist Bioscience and held executive management positions at Western Digital, Everspin, SandForce, Seagate Technology, Lexar Media/Micron, Samsung Semiconductor, SMART Modular and Fujitsu.

He has been deeply engaged in various industry trade associations and standards organizations including co-founding the DNA Data Storage Alliance in 2020 as well as the USB Flash Drive Alliance, serving as their president from 2003 to 2007.

He holds an economic electrical engineering degree (EEE) from the Technical University of Darmstadt, Germany.