119: GreyBeards talk distributed cloud file systems with Glen Shok, VP Alliances,Panzura

This month we turn to distributed (cloud) filesystems as we talk with Glen Shok (@gshok), VP of Alliances for Panzura. Panzura uses backend (cloud or onprem, S3 compatible) object store with a ring of software (VMs) or hardware (appliance) gateways that provides caching for local files as well as managing and maintaining metadata which creates a global NFS and SMB file system with near local access times.

Glen is an industry (without the grey beard) veteran with the knowledge to back that up. He’s been in the industry so long that we could probably have spent an hour just talking about where people are that we both know. Listen to the podcast to learn more

The interesting part about Panzura is their gateway ring. It not only manages local file caching and metadata maintenance/access, but it provides an out-of-(data path)-band file (byte range) lock coordination service, cache coherency (via delta block changes) and other services. All the metadata (and data) is backed up on backend object storage, but it’s the direct access to the metadata and its out of band control path as well as its caching service that supplies the near local access times for data.

Panzura supports any public (AWS, Azure, GCP & IBM) cloud object storage for backend data storage as well as a few, on prem, solutions (I think Glen mentioned IBM COS & Cloudian and their website mentions Wasabi, Scality and NetApp StorageGrid). Glen said they are on each of the public cloud’s marketplaces and with virtual gateways, its very easy to spin up and try.

Their system provides global (local, at the gateway) dedupe to reduce backend storage footprint and (both out of band and from backend storage) delta block changes for local cache updates. So in the event that an old version of the file happens to be present in their local cache gateway, it only needs retrieve the changed data from the object storage backend (or another gateway). All this local caching, dedupe and changed block tracking, helps to reduce cloud egress charges.

Data written to backend storage is immutable and versioned. So customers can retrieve any version of any file that was ever destaged to their backend. Glen said they write huge objects, presumably to help reduce storage footprint, IO overhead and API calls.

Glen claimed what with 3-way replication within a cloud region and 1-way replication outside the cloud region, customers no longer have to backup data. I respectively disagreed. He believes over time, customers will come to realize their use of backups for restores, becomes so rare that they can reduce backup frequency, if not eliminate it altogether. Some follow on discussion ensued, but in the end we seemed to agree to disagree on this topic.

Panzura also supports cross cloud mirroring. So, one could have their data mirrored from one cloud to another. One of these clouds will be used as a primary and only in the event that a majority of the gateway rings agree that the primary is DOWN and the secondary is UP, will they all automatically cut over to using the secondary storage cloud. While failover is automated, fail back requires operator intervention.

Panzura is charged for on managed data capacity. But cloud or on prem object storage is in addition to this and is charged for separately by the object storage provider.

As far what size file systems they support, Glen mentioned that they are ZFS internally, so any size imaginable. But he did concede, that at some point, metadata management becomes a problem and that they often suggest splitting apart 20PB file systems into 2 10PB (gateway rings) file systems to deal with this issue.

As for other solutions offered by Panzura, they have a K8s container block storage for persistent volumes that scales in capacity/performance using K8s services/resources.

Glen Shok, VP Alliances, Panzura

Glen Shok has been in the data center and storage industry for over 20 years.

Starting his career at Cisco in the late 90s. Moving to a few startups which were acquired by Brocade and Oracle. Glen has held positions in sales, sales leadership, product management and marketing, and Office of the CTO at Zones, prior to coming to Panzura.

He can’t decide what he likes to do, but at Panzura, he’s the VP of Strategic Alliances.

117: GreyBeards talk HPC file systems with Frank Herold, CEO of ThinkParQ, makers of BeeGFS

We return back to our storage thread with a discussion of HPC file systems with Frank Herold, (@BeeGFS) CEO of ThinkParQ GmbH, the makers of BeeGFS. I’ve seen BeeGFS start to show up in some IO500 top storage benchmark results and as more and more data keeps coming online every day, we thought it time to start finding out how our friends in the HPC world handle their data deluge.

Frank’s a former rocket scientist, that’s been in and around the storage industry for years, and was very knowledgeable about BeeGFS’s software defined, parallel file system. He seemed to have a great grasp of the IO requirements in HPC, Life Sciences and other HPC-like applications. Listen to the podcast to learn more.

Turns out that ThinkParQ is a spinoff of the research institute in Germany that originally developed BeeGFS parallel file system. There are apparently two version of their product one which is publicly available (downloadable from their website) and another with commercial support. It’s not quite 100% open source but it’s got a lot of open source in it and their GIT repository is available

BeeGFS was primarily focused on HPC workloads but as this type of work has become more mainstream, they have moved beyond HPC and now have significant installations in Life Sciences, Oil&Gas and many other big data environments.

It runs on x86/AMD, OpenPower, and ARM CPUs. BeeGFS comes as a number of services, one of which is a storage service which uses a backend with ZFS or XFS file system. It also uses (POSIX compliant) host client software to access their system. There’s also a metadata and monitoring service. Most of the time these services run on separate servers but BeeGFS also supports a “converged mode”, where all these services run on a single server. And you can have multiple converged mode servers in a cluster.

BeeGFS is a parallel file system. This means that it intrinsically supports multiple metadata services/servers and multiple storage servers which allow it to scale up storage bandwidth and performance considerably beyond single appliance systems. Data is automatically distributed across all the storage servers in the configuration, unless you specify that data reside on specific, say all flash storage servers. Similarly, metadata is automatically distributed across all metadata servers in the system.

They don’t support any specific RAID protection other than mirroring and that really to speed up read throughput. Rather they depend on the underlying XFS/ZFS file system to provide drive failure protection (RAID5/6).

One of BeeGFS’s selling points is that it has few tuning parameters that a customer needs to fiddle with. Frank said it runs quite well right out of the box.

BeeGFS offers a single name space that spans the cluster (of metadata servers/storage servers). But customers can elect to split this name space across a subset of these metadata and storage servers, and by doing so they create multiple BeeGFS clusters.

There’s no inherent support for NFS or SMB but customers can configure NFS or SAMBA servers that use BeeGFS as backend storage. Also, there’s no data reduction built into BeeGFS and no automatic data tiering across the backend storage (file systems).

But as noted above, customers can direct which backend storage to use to hold their data. And they do offer a CLI data movement primitive and customers can use this in conjunction with other software to implement storage tiering or do it themselves.

Metadata performance is extremely important for small files and for large multi Billion object file systems. BeeGFS uses extensive metadata caching to provide faster access to this information.

Speaking of small file performance, we had a decent discussion on the tradeoffs involved between small and large file performance. And although BeeGFS has decent small file performance it’s not a be all for every small file intensive application. According to Frank, not every small file workload is optimal for BeeGFS.

They offer BeeOND which is BeeGFS on demand. This is an integration with Slurm workload scheduler (HPC work scheduler) that allows customers to spin up a scratch BeeGFS parallel file system across compute servers with storage.

Slurm’s BeeOND integration brings all BeeGFS services up and deploys them on compute nodes you specify. At this point you have a fully installed BeeGFS (scratch) parallel file system. Customers may use this scratch file system to support any compute-data intensive workload theyneed to run. When no longer needed, Slurm can be directed to automatically dismantle the BeeGFSl file system.

We talked about BeeGFS partners. They have a number of regional partners that provide installation and onsite support and a number of technical partners, such as NetApp, Dell, HPE and INSPUR, that supply BeeGFS configured servers and systems for deployment/installation.

Frank Herold, CEO ThinkparQ

Frank Herold is the CEO of ThinkParQ GmbH – the company behind BeeGFS. He actively leads the company and the product strategy of BeeGFS as a global player for parallel high-performance file systems.

Prior to joining ThinkParQ, he held various senior management positions within ADIC and Quantum Corporation, responsible for market segments within the academic and scientific research, oil and gas, broadcast and video surveillance sectors, focusing on large scale, high-performance and enterprise accounts within EMEA. 

Frank has over 25 years of experience in the IT industry and holds a master’s degree in engineering (Dipl. -Ing.) in rocket science.

113: GreyBeards talk storage for next gen. workloads with Liran Zvibel, Co-Founder & CEO WekaIO

Sponsored By:

I’ve known Liran Zvibel, Co-founder and CEO of Weka IO for many years now and it’s the second time he’s been on our show, (see: Episode 56: GreyBeards talk high performance file storage...). In those days, WekaIO was just coming out and hitting the world with this extremely high-performing, scale out unstructured data solution. Well since then, they’ve just gotten better.

Keith and I had a great time talking with Liran again. Liran has deep knowledge about unstructured data and how enterprises use it these days. WekaIO’s story, over the last two years has gone beyond great performance to real world, hybrid cloud offerings e as well as going after the cloud native app’s (read Kubernetes [K8S]) persistent storage. Listen to the podcast to learn more.

We started with a history lesson on WekaIO. Back in those days (which persists today, I might add) there were many IO workloads that required companies to purchase different solutions for different work. For example, they needed DAS or SAN for performance, NAS for ease of access and object for scale. WekaIO came out with an answer to all these problems in a single, scaleable storage system. That is, they performed IO as fast as DAS or SAN block, had all the ease of access of NAS, and could scale as much as object.

However, the real culprit holding the world back was “NFS”. At the outset NFS was designed (back in the 1990s) with the then current networking speeds available (10-100Mbps), which performed just fine at those speeds. But when 10-100GbE came out in the 2000’s, NFS’s metadata overhead was too chatty to support wire speeds. Thus, any storage that depended on NFS protocols couldn’t supply (small) files fast enough for modern applications.

This is why WekaIO has moved to not only support NFS and SMB but also POSIX and NVIDIA® GPUDirect® Storage interfaces. By offering POSIX, WekaIO is able to plug into standard Linux and Windows server systems and provide excellent small file performance. Of course applications that demand small file performance today are mostly data analytics and AI/ML/DL workloads.

Consequently., NVIDIA came out with their GPUDirect Storage protocol to address getting small file (data) into GPUs faster. With GPUDirect, storage systems can RDMA data directly from storage to GPU memory and vice versa, with no OS intervention (other than to set up the transfer). If you happen to have a small file, high performing storage system attached to your fabric that supports GPUDirect , like WekaIO, you can significantly speed up your AI/ML/DL workloads.

Next we started talking K8S storage. WekaIO usestheir POSIX interface in their CSI plugin to support K8S container persistent storage. Again, supplying high performance for small files seems to be tailor made for K8S container applications that exist today and will for the foreseeable future.

Enter the cloud. Almong other things, WekaIO is a AWS primary storage vendor. It also offers snap to cloud. And with both of these in tandem, it’s just become a lot easier to move and access your unstructured data in the cloud. Liran mentioned that WekaIO primary storage in AWS operates across AZ’s. This means it can be configured to support better availability than EBS.

Large BioPharma companies are using WekaIO in AWS to store and process field data and research data, so that this work can be done around the world. Some companies have run out of compute in a single AZ (unbelievable I know but it’s COVID). By offering multi-AZ support unstructured data access with WekaIO, these companies can spread their compute across AZ’s and region and still access their data. And when their products are ready for gov’t certification, having all this data in the cloud, can make provide an easy way to have gov’t access this same data.

Liran Zvibel, Co-founder and CEO WekaIO

As Co-Founder and CEO, Mr. Liran Zvibel guides long term vision and strategy at WekaIO. Prior to creating the opportunity at WekaIO, he ran engineering at social startup and Fortune 100 organizations including Fusic, where he managed product definition, design, and development for a portfolio of rich social media applications.

Liran also held principal architectural responsibilities for the hardware platform, clustering infrastructure and overall systems integration for XIV Storage System, acquired by IBM in 2007.

Mr. Zvibel holds a BSc.in Mathematics and Computer Science from Tel Aviv University.

111: GreyBeards talk data analytics with Matthew Tyrer, Sr. Mgr. Solutions Mkt & Competitive Intelligence, Commvault

Sponsored by:

I’ve known Matthew Tyrer, Senior Manager Solutions Marketing and Competitive Intelligence, Commvault for quite awhile now and he’s always been knowledgeable about the problems the enterprise has in supporting and backing up large file data repositories. But lately he’s been focused on Commvault Activate their data analytics solution.

We had a great talk with Matthew. He was easy to talk to and knew a lot about how data analytics can ease the operational burden of the enterprise growing file data environments. .Remind me not to have two Matthew’s on the same program ever again. Listen to the podcast to learn more.

Matthew mentioned that their Activate was built on the Commvault platform software stack, which has had a rich and long history of development and customer deployments. It seems that Activate data analytics had been an early part of the platform but recently was split out as a separate solution.

One capability that Activate has that many other data analytics solutions do not, is the ability to examine both online data as well as data in backups. Most analytics solution can do one or the other, only a few do both. But if a solution only has access to online or backup data, they are missing half the story.

In addition, Activate can operate across multiple data centers as well as across multiple public cloud environments to provide analytics for an enterprise’s file data where it may reside.

Given the proliferation of file data these days, data analytics has become a necessity to most large IT shops. In the past, an admin could track some data over time but with the volumes of file data today, this is no longer tenable. At PB or more of file data, located in on prem data centers as well as across multiple clouds, there’s just too much file data to keep track of manually anymore.

Activate also indexes file content to provide more visibility and tracking of the different types of data under management in the enterprise. This is in addition to the extensive metadata that is collected and analyzed so it can better understand data access rights, copies and physical locations around the enterprise.

Activate can help organizations govern their data flows in support of industry as well as government data compliance requirements. Activate Data Governance, one of the three Activate solutions, is focused exclusively on providing enterprises the tools needed to manage any and all data that exists under compliance regulation environments.

Mat Leib had worked in eDiscovery before and it had always been a pain to extract “legally relevant” data from online and backup repositories. With the Activate eDiscovery solution and Activate’s content indexing of all file data, legal can perform their own relevant data searches to create eDiscovery data sets in support of litigation activities. Self service legal extracts like this vastly reduces the admin time and cost needed for eDiscovery.

The Activate File Space Optimization solution was deployed in one environment that had ~20PB of data online. By using File Space Optimization, the customer was able to cut 20PB down to 10PB. Any customer could benefit from such a reduction but customers doing data migration would see even more benefit.

At the end of the podcast, Matthew mentioned some videos that show Activate solution use cases.

Matthew Tyrer, Senior Solutions Marketing and Competitive Intelligence

Having worked at Commvault for over twelve years, after 8 years as a Sales Engineer Matt took that technical knowledge and transitioned to marketing where he is currently serving as a Senior Manager in Commvault’s Solution Marketing team. He is also heavily involved in Competitive Intelligence initiatives, and actively participates in field enablement programs.

He brings over 20 years’ experience in the IT industry, including within the fields of data and information management, cloud, data governance, enterprise storage, disaster recovery, and ultimately both implementing and supporting those projects and endeavours for public and private sector clients across Canada and around the globe.

Matt’s passion, deep product knowledge, and broad field experiences have enabled him to translate Commvault technology and vision such that their value is easily understood in the market and amongst client and partner families.

A self-described geek-dad, Matt is an avid boardgame enthusiast, firmly believes that Han shot first, and enjoys tormenting his girls with bad dad jokes.

108: GreyBeards talk DNA storage with David Turek, CTO, Catalog DNA

The Greybeards get off the beaten (enterprise) path this month, to see what lies ahead with a discussion on DNA storage. David Turek, CTO, Catalog DNA (@CatalogDNA) is a long time IBMer that had been focused on HPC systems at IBM but left and went to Catalog DNA to pursue the commercialization of DNA storage, an “emerging” technology. CatalogDNA is a company out of Boston that had recently closed a round of funding and are focused on bringing DNA storage out into the world of IT.

David was a pleasure to talk and has lot’s of knowledge on HPC and enterprise data center solutions. He also has a good grasp of what it will take to bring DNA storage to market. Keith has had some prior experience with DNA technologies in BioPharma so could talk in more detail about the technology and its ecosystem. [We’re trying out a new format, let us know what you think; The Eds.]

Ray has written about DNA storage in his RayOnStorage Blog, most recently in April of this year and May of last year. It’s been an ongoing blog topic of his for almost a decade now. When Ray was interviewed about the technology he thought it interesting but had serious obstacles with read and write latencies and throughput as well as the size of the storage device.

Well CatalogDNA seems to have got a good handle on write throughput and are seriously working on the rest.

However, DNA storage’- volumetric density was always of exceptional. Early on in the podcast, David mentioned that DNA storage was 6 orders of magnitude (1 million times) more dense in bytes/mm**3 than magnetic tape today. An LTO8 tape device stores 12TB (uncompressed) in a tape cartridge, 14.2 in**3 (230.3 cm**3) or roughly 845GB/in**3 (52GB/cm**3). One million times this, would be 12EB in the same volume.

The challenge with LTO8, disk or SSD storage today is at some point you have to move the data from one device to a more modern device. This could be every 3-5 years (for disk or SSD) or 25-30 years for tape. In either case, at some point IT would need to incur the cost and time to move the data. Not much of a problem for 100TB or so but when you start talking PB or EB of data, it can be a never ending task.

DNA storage

David mentioned Catalog uses “synthetic DNA” in their storage. This means the DNA it uses is designed to be incompatible with natural DNA such that it wouldn’t work in a cell. It has stops or other biological mechanisms to inhibit it’s use in nature. Yes it uses the same sugars, backbones, and other chemistry of biologically active DNA, but it has been specifically modified to inhibit its use by normal cellular machinery. 

DNA storage has a number of unique capabilities :

  • It can be made to last forever, by being dried out (dessicated) and encased in a crystal and takes 0 power/energy to be stored for eons.
  • It can be cheaply and easily replicated, almost an infinite number of times, for only the cost of chemical feedstock, chemical interactions and energy. Yes, this may take time but the process scales up nicely. One could make 2 copies in first cycle, 4 in the 2nd, 8 in the 3rd, etc and by doing this it would only take 20 cycles to create a million copies. If each cycle takes 10 minutes, in 3:20, you could have a million copies of 1EB of data.
  • It can be easily searched for target information. This involves fabricating a DNA search molecule and inserting it into the storage solution. Once there it would match up with the DNA segment that held your key. And of course, the search molecule and the data could be replicated to speed up any search process.
  • We already mentioned the extreme density advantage above.

Speed of DNA storage access

David said they can already write Catalog DNA storage in MB/sec.

The process they use to write is like a conveyer belt which starts off with a polyethylene sheet (web actually). Somewhere, the digital data comes in, is chunked and transformed into DNA strand (25-50 base pairs) molecules or dots. The polyethylene sheet rolls into a machine that uses multiple 3D print heads to deposit dots (the DNA strand data chunks) at web points. This machine/process deposits 100K or more of these dots onto the web. The sheet then moves to the next stage where the DNA molecules are scraped off and drained into a solution. Then a wet process occurs which uses chemistry to make the DNA more readable and enables the separate DNA molecules to connect into a data strand. Then this data strand goes into another process where it gets reduced in volume and so that it is more stable.

If needed, one can add another step that dries out or desiccates the data strand into even a smaller volume which can then be embedded into a crystalline structure which could last for centuries.

David compared the DNA molecules (data chunks) to Legos, only they are the same pieces in a million different colors Each piece represents some segment of data bits/bytes. Using chemistry and proprietary IP each separate DNA molecule self organizes (connects) into a data strand, representing the information you want to store.

Reading DNA involves, off the shelf, DNA sequencers. The one Catalog currently uses is the Oxford NanoPore device, but there are others. David didn’t say how fast they could read DNA data. But current DNA reading devices destroy the data. So making replicas of the data would be required to read it.

David said their current write device is L shaped with one leg about 14’ (4.3m) long and the other about 12’ (3.7m) long with each leg being about 3’ (0.9m) wide.

Searching EB of data in minutes?!

DNA strands can be searched (matched) using a search molecule and inserting this into the storage solution (that holds the data strands). Such a molecule will find a place in the data that has a matching (DNA) data element and I believe attach itself to the data strand.

For example, lets say you had recorded all of a country’s emails for a month or so and you wanted to search them for the words, “bomb”, “terrorist”, “kill”, etc. One could create a set of search molecules, replicate them any number of times (depending on how quickly you wanted to search the data and how many matches you expected), and insert them into a data pool with multiple data strands that stored the email traffic.

After some time, you’d come back and your search would be done. You’d need to then extract the search hits, and read out the portion of the data strands (emails) that matched. I’m guessing extraction would involve some sort of (wet) chemical process or filtration.

State of Catalog DNA storage

David mentioned that as a publicity stunt they wrote the whole Wikipedia onto Catalog DNA storage. The whole Wikipedia fit into a cylinder about the height of a big knuckle on your hand and in a width smaller than a finger. The size of the whole Wikipedia, with complete edit history is 10TB uncompressed and if they stored all the edit versions plus its media such as images, videos, audio and other graphics, that would add another 23TB (as of end of 2014), so ~33TB uncompressed.

David believes in 18 months they could have a WORM (write once, read many times) data storage solution that could be deployed in customer data centers which would supply immense data repositories in relatively small solution containers.

CatalogDNA is currently in a number of PoCs with major corporations (not labs or universities) to show how DNA storage technology can be used to solve problems.

David believes that at some point they will be able to make compute engines entirely of DNA. At that point, one could have a combined compute and storage (HCI-like) DNA server using the same technology in a solution. And as mentioned previously, one could replicate from one DNA server & storage to a million DNA servers & storage in just 20 cycles. How’s that for scale out.


David Turek, CTO Catalog DNA

Dave Turek is Catalog’s Chief Technology Officer. He comes to Catalog from IBM where he held numerous executive positions in High Performance Computing and emerging technologies.

He was the development executive for the IBM SP program which produced the first commercially successful massively parallel system; he started IBM’s Linux Cluster business; launched an early offering in Cloud computing called Deep Computing Capacity on Demand; produced the Roadrunner system, the world’s first petascale computer; and was responsible for IBM’s exascale strategy which led to the deployment of the Summit and Sierra systems at Oak Ridge and Lawrence Livermore National Laboratories respectively.

David has been invited to testify to Congress on numerous occasions regarding the future of computing in the US and has helped establish technical collaborations with universities, businesses, and government agencies around the world.

This image has an empty alt attribute; its file name is Subscribe_on_iTunes_Badge_US-UK_110x40_0824.png
This image has an empty alt attribute; its file name is play_prism_hlock_2x-300x64.png
This image has an empty alt attribute; its file name is Spotify_Logo_CMYK_Black-1024x307.png