163: GreyBeards talk Ultra Ethernet with Dr J Metz, Chair of UEC steering committee, Chair of SNIA BoD, & Tech. Dir. AMD

Dr J Metz (@drjmetz, blog) has been on our podcast before, mostly in his role as SNIA spokesperson and BoD Chair, but this time he's here discussing some of his latest work at the Ultra Ethernet Consortium (UEC) (LinkedIn: @ultraethernet, X: @ultraethernet).

The UEC is a full stack re-think of what Ethernet could do for large single application environments. UEC was originally focused on HPC, with 400-800 Gbps networks and single applications like simulating a hypersonic missile or airplane. But with the emergence of GenAI and LLMs, UEC could also be very effective for large AI model training with massive clusters doing a single LLM training job over months. Listen to the podcast to learn more.

The UEC is outside the realm of normal enterprise environments. But as AI training becomes more ubiquitous, who knows, UEC may yet find a place in the enterprise. However, it's not intended for mixed network environments with multiple applications. It's a single-application network.

One wouldn't think HPC would be a big user of Ethernet for its main network. But Dr J pointed out that the top 3 systems on the HPC Top500 all use Ethernet, and more are looking to use it in the future.

UEC is essentially an optimized software stack and hardware for networking used by single-application environments. These types of workloads are constantly pushing the networking envelope. And by taking advantage of the "special networking personalities" of these workloads, UEC can significantly reduce networking overheads, boosting bandwidth and speeding up workload execution.

The scale of these networks is extreme. The UEC is targeting up to a million endpoints, across >100K servers, with each network link >100Gbps and more likely 400-800Gbps. With the new (AMD and others) networking cards coming out that support four 400/800Gbps network ports, having a pair of these on each server in a 100K-server cluster gives you 800K endpoints. A million is not that far away when you think of it at that scale.
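A quick back-of-the-envelope check on that endpoint math, using the illustrative numbers from the discussion (100K servers, a pair of 4-port NICs per server; these are not UEC sizing rules, just the example cited above):

```python
# Illustrative endpoint arithmetic -- assumed server/NIC counts, not a UEC sizing rule.
servers = 100_000
nics_per_server = 2      # a pair of networking cards per server
ports_per_nic = 4        # each card with four 400/800Gbps ports

endpoints = servers * nics_per_server * ports_per_nic
print(f"{endpoints:,} network endpoints")   # 800,000 -- a million isn't far off
```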

Moreover, LLM training and HPC work are starting to look more alike these days. Yes, there are differences, but the scale of their clusters is similar, and the way work is sometimes fed to them is similar, which leads to similar networking requirements.

UEC is attempting to handle a 5% problem. That is, 95% of users will not have 1M endpoints in their LAN, but maybe 5% will, and for that 5%, support for a mixed networking workload is unnecessary. In fact, a mixed network becomes a burden, slowing down packet transmission.

UEC is finding that with a few select networking parameters, almost like workload fingerprints, network stacks can be optimized well beyond current Ethernet, thereby reducing packet overheads and increasing bandwidth.

AI and HPC networks share a very limited set of characteristics which can be used as fingerprints. These characteristics are things like reliable or unreliable transport, ordered or unordered delivery, multi-path packet spraying or not, etc. With a set of these types of parameters, selected for an environment, UEC can optimize a network stack to better support a million networking endpoints.
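To make the "fingerprint" idea concrete, here's a minimal sketch of such a parameter set expressed as a profile object. The field names and the two example profiles are our own illustration, not the UEC specification's actual profile definitions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TransportProfile:
    """Illustrative 'workload fingerprint' -- not the UEC spec's actual profiles."""
    reliable_delivery: bool    # must the fabric guarantee delivery, or can the app tolerate loss?
    ordered_delivery: bool     # must packets arrive in order, or can the endpoint reorder?
    multipath_spraying: bool   # spread a flow's packets across all available paths?

# Two hypothetical fingerprints. The point is that fixing a handful of such
# parameters up front lets the stack skip per-packet work it would otherwise do.
AI_TRAINING = TransportProfile(reliable_delivery=True, ordered_delivery=False, multipath_spraying=True)
HPC_BULK    = TransportProfile(reliable_delivery=False, ordered_delivery=False, multipath_spraying=True)
```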

We asked where CXL fits in with UEC. Dr J said it could potentially be an entity on the network, but he sees CXL more as a solution within a server, or between a tight (limited) cluster of servers, rather than something on a UEC network.

Just 12 months ago the UEC had 10 members or so and this past week they were up to 60. UEC seems to have struck a chord.

The UEC plans to release a 1.0 specification near the end of this year. UEC 1.0 is intended to operate on current (>100Gbps) networking equipment with firmware/software changes.

Considering the UEC was just founded in 2023, putting out their 1.0 technical spec within 1.5 years is astonishing. It also speaks volumes about the interest in the technology.

The UEC has a blog post which talks more about UEC 1.0 specification and the technology behind it.

Dr J Metz, Chair of UEC Steering Committee, Chair of SNIA BoD, Technical Director of Systems Design, AMD

J works to coordinate and lead strategy on various industry initiatives related to systems architecture. Recognized as a leading storage networking expert, J is an evangelist for all storage-related technology and has a unique ability to dissect and explain complex concepts and strategies. He is passionate about the inner workings and application of emerging technologies.

J has previously held roles in both startups and Fortune 100 companies as a Field CTO,  R&D Engineer, Solutions Architect, and Systems Engineer. He has been a leader in several key industry standards groups, sitting on the Board of Directors for the SNIA, Fibre Channel Industry Association (FCIA), and Non-Volatile Memory Express (NVMe). A popular blogger and active on Twitter, his areas of expertise include NVMe, SANs, Fibre Channel, and computational storage.

J is an entertaining presenter and prolific writer. He has won multiple awards as a speaker and author, writing over 300 articles and giving presentations and webinars attended by over 10,000 people. He earned his PhD from the University of Georgia.

139: GreyBeards talk HPC file systems with Marc-André Vef and Alberto Miranda of GekkoFS

In honor of the SC22 conference this month in Dallas, we thought it time to check in with our HPC brethren to find out what's new in storage for their world. We happened to see that IO500 had some recent (ISC22) results using a relative newcomer, GekkoFS (@GekkoFS). So we reached out to the team to find out how they managed to crack into the top 10. We contacted Marc-André Vef (@MarcVef), a Ph.D. student at Johannes Gutenberg University Mainz, and Alberto Miranda (@amiranda_hpc), Ph.D., of the Barcelona Supercomputing Center, two of the authors of the GekkoFS paper.

GekkoFS is a new burst file system that is tailor-made to create, process and tear down scratch data sets for HPC workloads. It turns out that HPC does lots of work using scratch files as working data sets. Burst file systems typically use another parallel file system to (stage) read (permanent) data into the scratch files and write (permanent) result data out. But during processing, the burst file system handles all scratch data access. Listen to the podcast to learn more.

We had never heard of a burst file system before, but it's been around for a while now in HPC. For example, BeeGFS provides one (check out our GreyBeards podcast on BeeGFS). BeeGFS supports both a PFS (parallel file system) and a burst file system; GekkoFS only offers a burst file system.

GekkoFS is a distributed burst file system which operates across nodes to stitch together a single global file system. GekkoFS is strictly open source at the moment and can be downloaded (see: GekkoFS GitLab) and used by anyone.

They are considering offering professional support in the future, but at the moment, if you have an issue, Marc and Alberto suggest you use the GekkoFS GitLab issue tracking system to tell them about it.

Turns out Lustre, IBM Spectrum Scale, DAOS and other HPC file systems take gobs of overhead to create scratch files. And even though it takes a lot of IO to load scratch file data and write out results, there’s a whole lot more IO that gets done to scratch files during HPC jobs.

This sort of IO also occurs for AI/ML/DL, where training data is staged into a sort of scratch area (typically in memory, depending on size) and then repeatedly (re-)processed there. GekkoFS can offer significant advantages to AI/ML/DL work when training data is very large. Normally, without a burst file system, one would need to shard this data across nodes and then deal with the partial training that results. But with GekkoFS, all you need to do is stage it into the burst file system and read it from there.

GekkoFS is partially POSIX compliant. They install a client-side interposer library that intercepts those POSIX requests destined for GekkoFS files.

GekkoFS has no central metadata server, which means that all nodes in the GekkoFS cluster provide metadata services. Filenames are hashed to tell GekkoFS which node holds their (metadata and) data.
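A minimal sketch of that hash-based placement, assuming a fixed list of nodes: every client computes the same owner for a given path, so no central metadata server is needed. (GekkoFS's actual hash function and distribution scheme are its own; this just shows the general technique.)

```python
import hashlib

NODES = ["node00", "node01", "node02", "node03"]   # hypothetical cluster membership

def owning_node(path: str, nodes=NODES) -> str:
    """Hash the file path to pick the node that holds its metadata (and data)."""
    digest = hashlib.sha256(path.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

# Every client computes the same answer without asking a central server.
print(owning_node("/scratch/job42/checkpoint.0001"))
```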

GekkoFS stores its data and metadata on local disks, SSDs, or in-memory (tmpfs) storage. All local node storage in the cluster is stitched together into a single global file system.

GekkoFS supports strict consistency for IO and file creation/deletion within nodes. They use an internal transaction database to enforce this strict consistency.

Across nodes, they support eventual consistency, which means files created on one node may not be immediately viewable/accessible by other nodes in the cluster for a short period of time while (meta)data updates are propagated across the cluster.

As part of their consistency paradigm, GekkoFS doesn't support directory locking. Jason mentioned that HPC "ls" (directory listing) commands can sometimes take forever due to directory locking. No directory locking makes ls commands faster but may show inconsistent results (due to eventual consistency).

We had some discussion on this lack of directory locking and eventual consistency in file systems, but we agreed to disagree. They did say that for HPC (and probably AI/ML/DL) workloads, their approach seems appropriate, as those workloads are way more read intensive than write intensive.

In any case, they must be doing something right as they have a screaming scratch file system for HPC work.

Marc will be attending SC22 in Dallas this month, so if you're attending please look him up and say hello from us.

Marc-André Vef, Ph.D. student

Marc-André Vef is a Ph.D. candidate at the Johannes Gutenberg University Mainz. He started his Ph.D. in 2016 after receiving his B.Sc. and M.Sc. degrees in computer science from the Johannes Gutenberg University Mainz. His master’s thesis was in cooperation with IBM Research about analyzing file create performance in the IBM Spectrum Scale parallel file system (formerly GPFS).

During his Ph.D., he has worked on several projects focusing on file system tracing (in collaboration with IBM Research) and distributed file systems, among others. Most notably, he designed two ad-hoc distributed file systems: DelveFS (in collaboration with OpenIO), which won the Best Paper in its category, and GekkoFS (in collaboration with the Barcelona Supercomputing Center). GekkoFS placed fourth in its first entry in the 10-node challenge of the IO500 benchmark. The file system is actively developed in the scope of the EuroHPC ADMIRE project.

His research interests focus on file systems and system analytics.

Alberto Miranda, Ph.D., Senior Researcher, Barcelona Supercomputing Center

Dr. Eng. Alberto Miranda is a Senior Researcher in advanced storage systems in the Computer Science Department of the Barcelona Supercomputing Center (BSC) and co-leader of the Storage Systems Research Group since 2019. Dr. Eng. Miranda received a diploma in Computer Engineering (2004), an M.Sc. degree in Computer Science (2006), and an M.Sc. degree in Computer Architectures, Networks and Systems (2008) from the Technical University of Catalonia (UPC-BarcelonaTech). He later received a Ph.D. degree Cum Laude in Computer Science from the Technical University of Catalonia in 2014 with his thesis "Scalability in Extensible and Heterogeneous Storage Systems".

His current research interests include efficient file and storage systems, operating systems, distributed system architectures, as well as information retrieval systems. Since he started his work at BSC in 2007, he has published 14 papers in international conferences and journals, as well as 5 white papers and technical reports and 1 book chapter. Dr. Eng. Miranda is currently involved in several European and national research projects and has participated in competitively funded EU projects XtreemOS, IOLanes, Prace2IP, IOStack, Mont-Blanc 2, EUDAT2020, Mont-Blanc 3, and NEXTGenIO.

122: GreyBeards talk big data archive with Floyd Christofferson, CEO StrongBox Data Solutions

The GreyBeards had a great discussion with Floyd Christofferson, CEO, StrongBox Data Solutions on their big data/HPC file and archive solution. Floyd is very knowledgeable about the problems of extremely large data repositories and has been around the HPC and other data-intensive industries for decades.

StrongBox's StrongLink solution offers a global namespace file system that virtualizes NFS, SMB, S3 and POSIX file environments and maps these to a software-only, multi-tier, multi-site data repository that can span onsite flash, disk, S3-compatible or Azure object, and LTFS tape library storage, as well as offsite versions of all the above tiers.

Typical StrongLink customers range from 10s to 100s of PB, ingesting or processing PBs a day. 200TB is a minimum StrongLink configuration, though Floyd said any shop with over 500TB has problems with data silos and other issues, even if they may not realize it yet. StrongLink manages data placement and movement throughout this hierarchy to better support data access and economical storage. In the process, StrongLink eliminates any data silos due to limitations of NAS systems while providing the most economical placement of data to meet user performance requirements.
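To make "policy-driven placement across the hierarchy" a bit more concrete, here's a minimal sketch of what an age-based placement policy could look like. The tier names and thresholds are purely our own illustration, not StrongLink's actual policy language:

```python
# Hypothetical tiering policy -- illustrative only, not StrongLink's policy syntax.
PLACEMENT_POLICY = [
    # (maximum days since last access, tier that should hold the data)
    (7,    "onsite-flash"),   # hot working set stays on flash
    (90,   "onsite-disk"),    # warm data moves to disk
    (365,  "s3-object"),      # cool data lands on object storage
    (None, "ltfs-tape"),      # everything older goes to LTFS tape
]

def target_tier(days_since_access: int) -> str:
    """Return the tier the policy says this data should live on."""
    for max_age, tier in PLACEMENT_POLICY:
        if max_age is None or days_since_access <= max_age:
            return tier
    return PLACEMENT_POLICY[-1][1]

print(target_tier(30))    # -> "onsite-disk"
print(target_tier(1000))  # -> "ltfs-tape"
```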

Floyd said that StrongLink first installs in the customer's environment and then operates in the background to discover and ingest metadata from the customer's primary file storage environment. At some point later, the customer reconfigures their end-users' share and mount points to the StrongLink servers, and it's up and running.

The minimal StrongLink HA environment consists of 3 nodes. They use a NoSQL metadata database which is replicated and sharded across the nodes. It's sharded for performance/load balancing and fully replicated (2-way or 3-way) across all the StrongLink server nodes for HA.

The StrongLink nodes create a cluster, called a star in StrongBox vernacular. Multiple clusters onsite can be grouped together to form a StrongLink constellation. And multiple data center sites can be grouped together to form a StrongLink galaxy. Presumably, if you have a constellation or a galaxy, the same metadata is available to all the star clusters across all the sites.

They support any tape library and any NFS, SMB, S3- or Azure-compatible object or file storage. StrongLink can move or copy data from one tier/cluster to another based on policies AND end users never see any difference in their workflow or mount/share points.

One challenge with typical tape archives is that they can make use of proprietary tape data formats which are not accessible outside those systems. StrongLink has gone with a completely open-source, LTFS file format on tape, which is well documented and is available to anyone.

Floyd also made it a point of saying they don’t use any stubs, or soft links to provide their data placement magic. They only use standard file metadata.

File data moves across the hierarchy based on policies or by request. One of the secrets to StrongLink success is all the work they have done to ensure that any data movement can occur at line rate speeds. They heavily parallelize any data movement that’s required to support data placement across as many servers as the customer wants to throw at it. StrongBox services will help right-size the customer deployment to support any data movement performance that is required.
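As a rough illustration of the general technique of parallelizing bulk data movement, here's a toy worker-pool copier. This is only a sketch of the idea (more concurrent streams, more aggregate throughput), not StrongLink's actual mover engine:

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def move_files(files, dest_dir: str, workers: int = 16):
    """Copy many files concurrently so the aggregate transfer can approach line rate."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    def copy_one(src: str) -> str:
        target = dest / Path(src).name
        shutil.copy2(src, target)   # copy2 also preserves file metadata
        return str(target)

    # More workers (or, in a real system, more mover nodes) -> more parallel streams.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(copy_one, files))
```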

StrongLink supports up to 3-way replication of a customer’s data archives. This supports a primary archive and 3 more replicas of data.

Floyd mentioned a couple of big customers:

  • One autonomous automobile supplier was downloading 2PB of data from cars in the field, processing this data, and then moving it off their servers to get ready for the next day's data load.
  • Another weather science research organization had 150PB of data in an old tape archive, and they brought in StrongLink to migrate all this data off and onto LTFS tape format, as well as to support their research activities, which entail staging a significant chunk of file data on research servers to do a climate run/simulation.

NASA, another StrongLink customer, operates slightly differently than the above, in that they have integrated StrongLink functionality directly into their applications by making use of StrongBox’s API.

StrongLink can work in three ways.

  • Using normal file access services where StrongLink virtualizes your NFS, SMB, S3 or POSIX file environment. For this service, StrongLink is in the data path and you can use policy-based management to have data moved or staged as the need arises.
  • Using StrongLink CLI to move or copy data from one tier to another. Many HPC customers use this approach through SLURM scripts or other orchestration solutions.
  • Using StrongLink API to move or copy data from one tier to another. This requires application changes to take advantage of data placement.

StrongBox customers can, of course, use all three modes of operation at the same time for their StrongLink data galaxy. StrongLink is billed by CPU/vCPU level and not by the amount of data customers throw into the archive. This gives customers a flat expense once StrongLink is deployed, at least until they decide to modify their server configuration.

Floyd Christofferson, CEO StrongBox Data Solutions

As a professional involved in content management and storage workflows for over 25 years, Floyd has focused on methods and technologies needed to manage massive volumes of data across many different storage types and use cases.

Prior to joining SBDS, Floyd worked with software and hardware companies in this space, including over 10 years at SGI, where he managed storage and data management products. In that role, he was part of the team that provided solutions used in some of the largest data environments around the world.

Floyd’s background includes work at CBS Television Distribution, where he helped implement file-based content management and syndicated content distribution strategies, and Pathfire (now ExtremeReach), where he led the team that developed and implemented a satellite-based IP-multicast content distribution platform that manages delivery of syndicated content to nearly 1,000 TV stations throughout the US.

Earlier in his career, he ran Potomac Television, a news syndication and production service in Washington DC, and Manhattan Center Studios, an audio, video, graphics, and performance facility in New York.

117: GreyBeards talk HPC file systems with Frank Herold, CEO of ThinkParQ, makers of BeeGFS

We return to our storage thread with a discussion of HPC file systems with Frank Herold, (@BeeGFS) CEO of ThinkParQ GmbH, the makers of BeeGFS. I've seen BeeGFS start to show up in some IO500 top storage benchmark results, and as more and more data keeps coming online every day, we thought it time to start finding out how our friends in the HPC world handle their data deluge.

Frank's a former rocket scientist who's been in and around the storage industry for years and was very knowledgeable about BeeGFS's software-defined, parallel file system. He seemed to have a great grasp of the IO requirements in HPC, Life Sciences and other HPC-like applications. Listen to the podcast to learn more.

Turns out that ThinkParQ is a spinoff of the German research institute that originally developed the BeeGFS parallel file system. There are apparently two versions of their product: one which is publicly available (downloadable from their website) and another with commercial support. It's not quite 100% open source, but it's got a lot of open source in it, and their Git repository is available.

BeeGFS was primarily focused on HPC workloads, but as this type of work has become more mainstream, they have moved beyond HPC and now have significant installations in Life Sciences, Oil & Gas and many other big data environments.

It runs on x86/AMD, OpenPOWER, and ARM CPUs. BeeGFS comes as a number of services, one of which is a storage service that uses a ZFS or XFS file system as its backend. It also uses (POSIX compliant) host client software to access the system. There are also metadata and monitoring services. Most of the time these services run on separate servers, but BeeGFS also supports a "converged mode", where all these services run on a single server. And you can have multiple converged-mode servers in a cluster.

BeeGFS is a parallel file system. This means that it intrinsically supports multiple metadata services/servers and multiple storage servers, which allows it to scale storage bandwidth and performance considerably beyond single-appliance systems. Data is automatically distributed across all the storage servers in the configuration, unless you specify that data reside on specific (say, all-flash) storage servers. Similarly, metadata is automatically distributed across all metadata servers in the system.
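To picture how distributing a file's data across multiple storage servers raises aggregate bandwidth, here's a toy round-robin chunk-placement sketch. The chunk size and target names are made up; BeeGFS's real striping (chunk size, number of targets per file, target choice) is configurable and more sophisticated:

```python
CHUNK_SIZE = 1 << 20   # 1 MiB chunks -- illustrative, not BeeGFS's default
STORAGE_TARGETS = ["storage01", "storage02", "storage03", "storage04"]

def chunk_placement(file_size: int, targets=STORAGE_TARGETS, chunk=CHUNK_SIZE):
    """Round-robin each chunk of a file across the storage targets."""
    placement = []
    for i in range((file_size + chunk - 1) // chunk):   # number of chunks, rounded up
        placement.append((i, targets[i % len(targets)]))
    return placement

# A 4 MiB file ends up with one chunk on each of the four targets,
# so reads and writes can hit all four servers in parallel.
print(chunk_placement(4 * (1 << 20)))
```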

They don't support any specific RAID protection other than mirroring, and that's really there to speed up read throughput. Rather, they depend on the underlying XFS/ZFS file system to provide drive failure protection (RAID5/6).

One of BeeGFS’s selling points is that it has few tuning parameters that a customer needs to fiddle with. Frank said it runs quite well right out of the box.

BeeGFS offers a single name space that spans the cluster (of metadata servers/storage servers). But customers can elect to split this name space across a subset of these metadata and storage servers, and by doing so they create multiple BeeGFS clusters.

There's no inherent support for NFS or SMB, but customers can configure NFS or Samba servers that use BeeGFS as backend storage. Also, there's no data reduction built into BeeGFS and no automatic data tiering across the backend storage (file systems).

But as noted above, customers can direct which backend storage is used to hold their data. And they do offer a CLI data movement primitive, which customers can use in conjunction with other software to implement storage tiering themselves.

Metadata performance is extremely important for small files and for large, multi-billion-object file systems. BeeGFS uses extensive metadata caching to provide faster access to this information.
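As a minimal illustration of the metadata-caching idea (the general technique only, not how BeeGFS actually implements its cache), consider memoizing stat() lookups so repeated hits on hot paths skip the round trip to a metadata service:

```python
from functools import lru_cache
import os

@lru_cache(maxsize=1_000_000)
def cached_stat(path: str) -> os.stat_result:
    """Cache stat() results; os.stat here stands in for a metadata-server round trip."""
    return os.stat(path)

# First call pays the lookup cost; repeats are served from the local LRU cache.
# (A real file system cache also has to handle invalidation when metadata changes.)
info = cached_stat(".")
print(info.st_size, cached_stat.cache_info())
```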

Speaking of small file performance, we had a decent discussion on the tradeoffs involved between small and large file performance. And although BeeGFS has decent small file performance, it's not a be-all for every small-file-intensive application. According to Frank, not every small file workload is optimal for BeeGFS.

They offer BeeOND, which is BeeGFS On Demand. This is an integration with the Slurm workload scheduler (an HPC job scheduler) that allows customers to spin up a scratch BeeGFS parallel file system across compute servers with storage.

Slurm's BeeOND integration brings all BeeGFS services up and deploys them on the compute nodes you specify. At that point you have a fully installed BeeGFS (scratch) parallel file system. Customers may use this scratch file system to support any compute/data-intensive workload they need to run. When it's no longer needed, Slurm can be directed to automatically dismantle the BeeGFS file system.

We talked about BeeGFS partners. They have a number of regional partners that provide installation and onsite support and a number of technical partners, such as NetApp, Dell, HPE and INSPUR, that supply BeeGFS configured servers and systems for deployment/installation.

Frank Herold, CEO ThinkparQ

Frank Herold is the CEO of ThinkParQ GmbH – the company behind BeeGFS. He actively leads the company and the product strategy of BeeGFS as a global player for parallel high-performance file systems.

Prior to joining ThinkParQ, he held various senior management positions within ADIC and Quantum Corporation, responsible for market segments within the academic and scientific research, oil and gas, broadcast and video surveillance sectors, focusing on large scale, high-performance and enterprise accounts within EMEA. 

Frank has over 25 years of experience in the IT industry and holds a master’s degree in engineering (Dipl. -Ing.) in rocket science.