41: Greybeards talk time shifting storage with Jacob Cherian, VP Product Management and Strategy, Reduxio

In this episode, we talk with Jacob Cherian (@JacCherian), VP of Product Management and Product Strategy at Reduxio. They have produced a unique product that merges some characteristics of CDP storage with the best of today’s hybrid and deduplicating storage into a new primary storage system. We first saw Reduxio at VMworld a couple of years back, and this is the first chance we have had to talk with them.

Backdating data

Many of us have needed to go back to previous versions of files, volumes and storage, but few systems provide an easy way to do this. Reduxio is the first storage system that makes this nearly effortless.

Reduxio’s storage system splits an IO write operation into data and meta-data. The IO meta-data includes the volume/LUN id, the offset into the volume, and the data length. The data is chunked, compressed, hashed, and then sent to NVRAM cache. The IO meta-data and a system-wide time stamp, together with the data chunk hash(es), are sent to a separate key-value (K-V) meta-data store.
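To make the split concrete, here is a minimal Python sketch of that write path. It is not Reduxio's implementation: the 4 KB chunk size, SHA-256, zlib, and the names `write`, `chunk_store` and `metadata_log` are all our assumptions for illustration, with plain in-memory structures standing in for the NVRAM cache and the K-V meta-data store.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed chunk size; the podcast does not state one

chunk_store = {}   # hash -> compressed chunk (stands in for the NVRAM cache)
metadata_log = []  # list of records (stands in for the K-V meta-data store)


def write(volume_id, offset, data, timestamp):
    """Split one write into stored data chunks plus a single meta-data record."""
    hashes = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        compressed = zlib.compress(chunk)                # compress the chunk
        digest = hashlib.sha256(compressed).hexdigest()  # then hash it
        # Deduplication: an identical chunk is only stored once.
        chunk_store.setdefault(digest, compressed)
        hashes.append(digest)
    # The meta-data record carries no user data, only where and when it was written.
    metadata_log.append({
        "volume": volume_id,
        "offset": offset,
        "length": len(data),
        "timestamp": timestamp,
        "hashes": hashes,
    })
    return hashes
```

Writing two identical 4 KB chunks in one IO leaves a single entry in `chunk_store` but two hashes in the meta-data record, which is the property the rest of the design leans on.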

What Reduxio supplies is an easy way to go back, for any data volume, to any second in its past. Yes, there are limits as to how far back one can go with a data volume, such as saving every second for the last 8 hours, every hour for the last week, every week for the last month, every month for the last year, etc., all of which can be established at volume configuration time. But all this does is tell Reduxio when to discard old data.
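A retention policy like the example above can be sketched as a table of age limits and granularities. This is only an illustration of the idea, not how Reduxio configures or enforces it: the `RETENTION_TIERS` table, the 30-day month, and the choice to keep points aligned to multiples of the granularity are all assumptions on our part.

```python
HOUR = 3600
DAY = 24 * HOUR

# (maximum age in seconds, granularity in seconds), mirroring the example tiers
RETENTION_TIERS = [
    (8 * HOUR, 1),          # every second for the last 8 hours
    (7 * DAY, HOUR),        # every hour for the last week
    (30 * DAY, 7 * DAY),    # every week for the last month
    (365 * DAY, 30 * DAY),  # every month for the last year
]


def is_retained(point_ts, now):
    """Return True if the restore point at point_ts is still kept at time now."""
    age = now - point_ts
    if age < 0:
        return False
    for max_age, granularity in RETENTION_TIERS:
        if age <= max_age:
            # Keep only points that fall on this tier's granularity boundary.
            return point_ts % granularity == 0
    return False  # older than the last tier: eligible for discard
```

Anything for which `is_retained` returns False is data the system is free to garbage collect.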

With all this in place, re-establishing a volume to some instant in its past is simply a query to the meta-data K-V store with the appropriate time stamp. The query returns all the hashes and other IO meta-data, in sequence, for all the data chunks that made up the volume at that point in time. With that information, the system can easily reconstruct the volume as it was at that moment.

By keeping the data and the meta-data tag, time stamp and hash(es) information separate, Reduxio can reconstruct the data at any time (to one second granularity) in the past where data is still available to the system.
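The point-in-time query can be sketched in a few lines of Python. Again, this is our own illustration under stated assumptions, not Reduxio's code: the record layout (`volume`, `offset`, `timestamp`, `hashes`), the function name `volume_at`, and the fixed chunk size are all hypothetical.

```python
def volume_at(metadata_log, chunk_store, volume_id, as_of, chunk_size):
    """Return {offset: chunk} for the volume as it stood at time as_of."""
    # Select only this volume's writes at or before the requested instant.
    records = sorted(
        (r for r in metadata_log
         if r["volume"] == volume_id and r["timestamp"] <= as_of),
        key=lambda r: r["timestamp"],
    )
    view = {}
    for record in records:
        for i, digest in enumerate(record["hashes"]):
            # Replaying in time order lets later writes replace earlier ones.
            view[record["offset"] + i * chunk_size] = chunk_store[digest]
    return view
```

Nothing is copied or rolled back here. The older chunks are still in the store, so a past view of the volume is just a different set of hashes.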

Performance

In the past, this sort of time shifting storage functionality was limited to a separate CDP backup appliance. What Reduxio has done is integrate all this functionality with a deduplicating, compressing, auto-tiering primary storage system. So every IO involves chunking, deduplicating and compressing data, and splitting the meta-data, time stamps and hashes from the data chunks. There is no IO performance penalty for doing any of this; it’s all part of the normal IO path of the Reduxio primary storage system.

However, there is some garbage collection activity that needs to go on in order to deal with data that’s no longer needed. Reduxio does this mostly in real time, as the data actually expires.

Deduplication, compression and all the other characteristics of the storage system that enable its time shifting capabilities cannot be turned off.

Auto storage tiering

Reduxio optimized their auto-tiering beyond what is normally done in other hybrid storage systems. Data is chunked and moved to cache and ultimately destaged to flash. Hot vs. cold data is analyzed in real time, not sometime later as with other hybrid storage systems. Also, when data is deemed cold and needs to be moved to disk, Reduxio takes another step to analyze its meta-data K-V store and other information to see what other data was referenced at the same time as this data. This way it can attempt to demote a “group” of data chunks that will likely all be referenced together. Then, when one chunk of this “group” is referenced, the rest can be promoted to flash/cache at the same time.

Their auto-tiering group algorithm is used every time they demote data. And every time they promote data to a faster tier, they can start to record any data that is referenced together. This way, the next time they demote data chunks, the group definition can be further refined.
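The podcast did not go into how groups are actually computed, so the following is only a rough Python sketch of the general idea of grouping chunks by co-access. Bucketing accesses into fixed time windows, and the names `co_access_groups` and `chunks_to_promote`, are our assumptions, not Reduxio's algorithm.

```python
from collections import defaultdict


def co_access_groups(access_log, window):
    """Group chunk ids referenced within the same time window.

    access_log is a list of (timestamp, chunk_id) pairs; window is in seconds.
    """
    buckets = defaultdict(set)
    for timestamp, chunk_id in access_log:
        buckets[timestamp // window].add(chunk_id)
    # Return the groups in time order.
    return [group for _, group in sorted(buckets.items())]


def chunks_to_promote(groups, referenced_chunk):
    """When one chunk is referenced, promote every chunk that shares a group with it."""
    promote = set()
    for group in groups:
        if referenced_chunk in group:
            promote |= group
    return promote
```

Each new round of promotions adds to the access log, so recomputing the groups before the next demotion is one simple way the group definition could be refined over time.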

Reduxio storage system

Reduxio provides a hybrid (disk-SSD) iSCSI primary storage system that holds 40TB of storage today. With an average compression-dedupe ratio of >4:1 (over their 2PB of field data), 40TB should equate to over 160TB of usable data storage. Some of that usable storage would be for current volume data and some would be used for historical data.

There was a Slack discussion the other week on what to do about ransomware. It seems to me that Reduxio, with its time traveling storage, could be used as an effective protection against ransomware.

Although snapshots have been around for a long time (one of the Greybeards worked on a snapshotting storage system back in the early 90s), Reduxio has taken the idea to new heights. The podcast runs ~41 minutes. Listen to the podcast to learn more.

Jacob Cherian, VP Product Management and Product Strategy, Reduxio

Jacob is responsible for Reduxio’s product vision and strategy. Jacob has overall ownership for defining Reduxio’s product portfolio and roadmap.

Prior to joining Reduxio, Jacob spent 14 years at Dell in the Enterprise Storage Group leading product development and architectural initiatives for host storage, NAS, SAN, RAID and other data center infrastructure. As a member of Dell’s storage architecture council he was responsible for developing Dell’s strategy for unstructured data management, and drove its implementation through organic development efforts and technology acquisitions such as Ocarina Networks and Exanet. In his last role as a Dell expatriate in Israel he oversaw Dell’s FluidFS development.

Jacob started his career in Dell as a development engineer for various SAN, NAS and host-side solutions, then served as the Architect and Technologist for Dell’s MD series of external storage arrays.

Jacob was named a Dell Inventor of the Year in 2005. He holds 30 patents and has 20 patents pending in the areas of storage and networking. He holds a Bachelor of Science (B.S.) in Electrical Engineering from the Cochin University of Science and Technology, a Master of Science (M.S.) in Computer Science from Oklahoma State University, and a Master of Business Administration (MBA) from the Kellogg School of Management, Northwestern University.