173: GreyBeards Year End 2025 podcast

Well this year went fast. Keith, Jason and I sat down to try to make some sense of it all.

AI is still on a tear and shows no end in sight. Questions abound on whether we are seeing signs of a bubble or not; our answer – maybe. We see it in GPU pricing, in AI startup valuations, and in enterprise interest. Some question whether the enterprise is seeing any return from its investments in AI, but there's no doubt they are investing. Inferencing on prem, with training/fine tuning done in neo-clouds, has become the new norm. I thought we'd be mostly discussing agentic AI, but it's too early for that yet.

In other news, the real Broadcom VMware play is starting to emerge (if it was ever in doubt). It's an all-out focus on the (highly profitable) high-end enterprises, abandoning the rest. And of course the latest weirdness to hit IT is DRAM pricing, but in reality it's the price of anything going into AI mega-data centers that's spiking. Listen to the podcast to learn more.

AI

GPU pricing is still high, although we are starting to see some cracks in NVIDIA’s moat.

AMD GPUs made a decent splash in the latest MLPerf Training results, and Google TPUs are starting to garner some interest in the enterprise. And NVIDIA GPUs are becoming less of a compute monster, with their latest offerings focusing more on optimizing for low-precision compute (FP2 anyone?) rather than just increasing raw compute. It seems memory bandwidth (in GPUs) is becoming more of a bottleneck than anything else IMHO.

But NVIDIA CUDA is still an advantage. Grad students grew up on it, trained on it, and are so familiar with it that it will take a long time to displace. Yeah, ROCm helps, but IT needs more. Open sourcing all the CUDA code and its derivatives could be an answer, if anybody's listening.

Jason talked about AI rack and data center power requirements going through the roof and mentioned SMR (small modular [nuclear] reactors) as one solution. When buying a nuclear power plant is just not an option, SMRs can help. They can be trucked and installed (mostly) anywhere. Keith saw a truckload of SMRs on the highway on one of his road trips.

And last but not least, Apple just announced RDMA over Thunderbolt. And the (YouTube) airwaves have been lighting up with Mac Studios being clustered together with sufficient compute power to rival a DGX. Of course it's Apple's MLX running rather than CUDA, and only so many models work on MLX, but it's a start at democratizing AI.

VMware

Broadcom's moves remind Jason of what IBM did with Z: abandon the low end, milk the high end forever. If you want vSphere, you'd better think about purchasing VCF.

Keith mentioned that if a company has a $100M cloud spend, it could save some serious money (~20%) by going to VCF. But it's not a lift and shift. Running a cloud on prem requires a different mindset than running apps in the cloud. Welcome back to the pre-cloud era, where every IT shop did it all.

Component Pricing

Jason said that DRAM pricing has gone up 600% in a matter of weeks. Our consensus view is it's all going to AI data centers. With servers having a TB of DRAM, GPUs with 160GB of HBM apiece, and LPDDR being gobbled up for mobile/edge compute everywhere, is there any doubt that critical server (sub-)components are in high demand?

Hopefully, the fabs will start to produce more. But that assumes fabs have spare capacity and that DRAM demand is a function of price. There are hints that neither of these is true anymore. Mega data centers are not constrained by capital, yet, and most fabs are operating flat out, producing as many chips as they can. So DRAM pricing may continue to be a problem for some time to come.

Speaking of memory, there was some discussion of memory tiering startups taking off with today's high-priced memory. One enabler for that is the new UALink interconnect. It's essentially an open-standard, chip-to-chip interconnect technology, running over PCIe or Ethernet. UALink solutions can connect very high speed components beyond the server itself to support a scale-out network of accelerators, memory, and CPUs in a single rack. It's early yet, but Meta's spec for an OCP wide form factor rack was realized in the AMD Helios OCP 72-GPU rack, which uses UALink tech today. More to come, we're sure.

Keith Townsend, The CTO Advisor, Founder & Executive Strategist | Advisor to CIOs, CTOs & the Vendors Who Serve Them

Keith Townsend (@CTOAdvisor) is an IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIn.

Jason Collier, Principal Member Of Technical Staff at AMD, Data Center Solutions Group

Jason Collier (@bocanuts) is a long-time friend, technical guru, and innovator who has over 25 years of experience as a serial entrepreneur in technology.

He was founder and CTO of Scale Computing and has been an innovator in the field of hyperconvergence and an expert in virtualization, data storage, networking, cloud computing, data centers, and edge computing for years.

He's on LinkedIn. He's currently working with AMD on new technology, and he has been a GreyBeards on Storage co-host since the beginning of 2022.
