117: GreyBeards talk HPC file systems with Frank Herold, CEO of ThinkParQ, makers of BeeGFS

We return back to our storage thread with a discussion of HPC file systems with Frank Herold, (@BeeGFS) CEO of ThinkParQ GmbH, the makers of BeeGFS. I’ve seen BeeGFS start to show up in some IO500 top storage benchmark results and as more and more data keeps coming online every day, we thought it time to start finding out how our friends in the HPC world handle their data deluge.

Frank’s a former rocket scientist, that’s been in and around the storage industry for years, and was very knowledgeable about BeeGFS’s software defined, parallel file system. He seemed to have a great grasp of the IO requirements in HPC, Life Sciences and other HPC-like applications. Listen to the podcast to learn more.

Turns out that ThinkParQ is a spinoff of the research institute in Germany that originally developed BeeGFS parallel file system. There are apparently two version of their product one which is publicly available (downloadable from their website) and another with commercial support. It’s not quite 100% open source but it’s got a lot of open source in it and their GIT repository is available

BeeGFS was primarily focused on HPC workloads but as this type of work has become more mainstream, they have moved beyond HPC and now have significant installations in Life Sciences, Oil&Gas and many other big data environments.

It runs on x86/AMD, OpenPower, and ARM CPUs. BeeGFS comes as a number of services, one of which is a storage service which uses a backend with ZFS or XFS file system. It also uses (POSIX compliant) host client software to access their system. There’s also a metadata and monitoring service. Most of the time these services run on separate servers but BeeGFS also supports a “converged mode”, where all these services run on a single server. And you can have multiple converged mode servers in a cluster.

BeeGFS is a parallel file system. This means that it intrinsically supports multiple metadata services/servers and multiple storage servers which allow it to scale up storage bandwidth and performance considerably beyond single appliance systems. Data is automatically distributed across all the storage servers in the configuration, unless you specify that data reside on specific, say all flash storage servers. Similarly, metadata is automatically distributed across all metadata servers in the system.

They don’t support any specific RAID protection other than mirroring and that really to speed up read throughput. Rather they depend on the underlying XFS/ZFS file system to provide drive failure protection (RAID5/6).

One of BeeGFS’s selling points is that it has few tuning parameters that a customer needs to fiddle with. Frank said it runs quite well right out of the box.

BeeGFS offers a single name space that spans the cluster (of metadata servers/storage servers). But customers can elect to split this name space across a subset of these metadata and storage servers, and by doing so they create multiple BeeGFS clusters.

There’s no inherent support for NFS or SMB but customers can configure NFS or SAMBA servers that use BeeGFS as backend storage. Also, there’s no data reduction built into BeeGFS and no automatic data tiering across the backend storage (file systems).

But as noted above, customers can direct which backend storage to use to hold their data. And they do offer a CLI data movement primitive and customers can use this in conjunction with other software to implement storage tiering or do it themselves.

Metadata performance is extremely important for small files and for large multi Billion object file systems. BeeGFS uses extensive metadata caching to provide faster access to this information.

Speaking of small file performance, we had a decent discussion on the tradeoffs involved between small and large file performance. And although BeeGFS has decent small file performance it’s not a be all for every small file intensive application. According to Frank, not every small file workload is optimal for BeeGFS.

They offer BeeOND which is BeeGFS on demand. This is an integration with Slurm workload scheduler (HPC work scheduler) that allows customers to spin up a scratch BeeGFS parallel file system across compute servers with storage.

Slurm’s BeeOND integration brings all BeeGFS services up and deploys them on compute nodes you specify. At this point you have a fully installed BeeGFS (scratch) parallel file system. Customers may use this scratch file system to support any compute-data intensive workload theyneed to run. When no longer needed, Slurm can be directed to automatically dismantle the BeeGFSl file system.

We talked about BeeGFS partners. They have a number of regional partners that provide installation and onsite support and a number of technical partners, such as NetApp, Dell, HPE and INSPUR, that supply BeeGFS configured servers and systems for deployment/installation.

Frank Herold, CEO ThinkparQ

Frank Herold is the CEO of ThinkParQ GmbH – the company behind BeeGFS. He actively leads the company and the product strategy of BeeGFS as a global player for parallel high-performance file systems.

Prior to joining ThinkParQ, he held various senior management positions within ADIC and Quantum Corporation, responsible for market segments within the academic and scientific research, oil and gas, broadcast and video surveillance sectors, focusing on large scale, high-performance and enterprise accounts within EMEA. 

Frank has over 25 years of experience in the IT industry and holds a master’s degree in engineering (Dipl. -Ing.) in rocket science.