Ray was at SFD19 a few weeks ago and the last session of the week (usually dead) was with MinIO and they just blew us away (see videos of MinIO’s session here). Ray thought Anand Babu (AB) Periasamy (@ABPeriasamy), CEO MinIO, who was the main presenter at the session, would be a great invite for our GreyBeards podcast. Keith and I had a ball talking with AB.
Why object store
There’s something afoot in object storage space over the last year or so. It seems everybody is looking to deploy object store whether that be on prem, in CoLo facilities and in the cloud. It could be just the mass of data coming online but that trend has remained the same for years no. No it’s something else.
It all starts with AWS and S3. Over the last couple of years AWS has been rolling out new functionality that only works with S3 and this has been driving even more adoption of S3 as well as other object storage solutions.
S3 compatible object stores are available in just about every cloud service, available from major (and minor) storage vendors and in open source from MinIO.
Why S3 is so popular
Because object store is accessed via RestFUL interfaces, traditionally most implementations used their own API to access it. But when AWS created S3 (simple storage service) with their own API/SDK to access it, it somehow became the de-facto standard interface for all other object stores. S3 compatibility became a significant feature that all object stores had to support.
Sometime after that MinIO came into existence. MinIO provides a 100% open source, fully AWS S3 compatible object store that you can run anywhere on prem, in CoLo facilities and indeed in the cloud. In fact, there exist customers that run MinIO in AWS AB says this is probably just customers using a packaged software solution which happens to include MinIO but it’s nonetheless more expensive than AWS S3 as it uses EC2 instances and EBS storage to create an object store
Customers can access MinIO object stores with the AWS S3 SDK or the MinIO SDK. and you can access AWS S3 storage with AWS S3 SDK or use MinIO SDK. Occosionally, AWS S3 updates have broken MinIO’s SDK but these have been later fixed by AWS. It seems AWS and MinIO are on good terms.
AB mentioned that as customers get up to a few PBs of AWS S3 storage they often find the costs to be too high. It’s at this point that they start looking at other object storage solutions. But because MinIO is 100% S3 compatible and it’s open source many of these customers deploy it in their own data center facilities or in colo environments.
For those customers that want it, MinIO also offers an S3 gateway. With the gateway on prem customers can use S3 or standard file services to access S3 object storage located in the cloud. The gateway also works in the public cloud and can support both AWS s3 as well as Microsoft Blob storage as a backend.
MinIO matches AWS S3 features
AWS S3 has a number of great features and MinIO has matched or exceeded them all, step by step. AWS S3 has cross region replication options where customers can replicate S3 data from one region to another. MinIO supports both asynchronous replication of S3 data and synchronous replication (using RADIO).
But MinIO adds support for erasure coding within a fault domain. Default is Nx2 erasure coding which duplicates all your data so as long as 1/2 of your servers and storage are available you continue to have access to all your data. But this can be configured down like 12+4 where data is split accross 16 servers any four of which can fail and you can still access data.
AWS customers can use a Snowball (standalone storage device) to transfer data to or from S3 storage. AWS Snowball implements a subset of S3 API and requires a NAS staging area of equivalent size to migrate data out of S3. MinIO has support for Snowball’s limited S3 API and as such, Snowball’s can be used to migrate data into or out of MinIO. MinIO has a blog post which describes their support for AWS Snowball.
AWS also offers S3 Lambda services or server less computing services where compute services can be invoked when data is loaded in a bucket and then turned off when no longer needed. AWS Lambda depends on AWS messaging and other services to work properly. But MinIO supports Lambda like functionality using other open source services. AB mentions MQTT and Kafka services. MinIO has another blog post discussing their Lambda like services based on Kafka.
AWS recently implemented Snowflake a SQL database server for unstructured data that uses S3 storage to hold data. Ray and Keith almost choked on that statement as unstructured data and databases never used to be uttered in the same breath. But what AWS has shown was that you can use object store for database data as long as you are willing to load the table into memory and process it there and then unload any modified table data back into the object store. Indexing of the object data seems to be done as the data is being loaded and is also being done in a (random IO) cache or in memory and once done can also be unloaded into the object store.
Now Snowflake uses S3 but it’s not available on prem. MinIO has a number of data base partners that make use of their object store as a backend to host a Snowflake like service onprem. AB mentioned Spark and Splunk but there are others as well.
We ended up the discussion with what does it mean to have 20K stars on GitHub. AB said if you did a java script getting 20K stars would be easy but you just don’t see this sort of open source popularity for storage systems. He said the number is interesting but the growth rate is even more interesting.
The podcast runs ~47 minutes. AB was a great to talk tech with. Keith and I could have talked all afternoon with AB. It was very hard to stop the recording as we could have talked with him for another hour or more. AB said he doesn’t like to do podcasts or videos but he had no problem with us firing away questions. Listen to the podcast to learn more.
Podcast: Play in new window | Download (Duration: 46:56 — 64.5MB) | Embed
Subscribe: Apple Podcasts | Google Podcasts | Spotify | Stitcher | Email | RSS
Anand Babu Periasamy, CEO MinIO
AB Periasamy is the CEO and co-founder of MinIO. One of the leading thinkers and technologists in the open source software movement, AB was a co-founder and CTO of GlusterFS which was acquired by RedHat in 2011. Following the acquisition, he served in the office of the CTO at RedHat prior to founding MinIO in late 2015. AB is an active angel investor and serves on the board of H2O.ai and the Free Software Foundation of India.
He earned his BE in Computer Science and Engineering from Annamalai University.