42: GreyBeards talk next gen, tier 0 flash storage with Zivan Ori, CEO & Co-founder E8 Storage.

In this episode, we talk with Zivan Ori (@ZivanOri), CEO and Co-founder of E8 Storage, a new storage startup out of Israel. E8 Storage provides a tier 0, next generation all flash array storage solution for HPC and high-end environments that need extremely high IO performance with high availability and modest data services. We first saw E8 Storage at last year's Flash Memory Summit (FMS 2016) and have wanted to talk with them since.

Tier 0 storage

The Greybeards discussed new tier 0 solutions in our annual yearend industry review podcast. As we saw it then, tier 0 provides lightning fast (~100s of µsec) read and write IO operations and millions of IO/sec. There are not a lot of applications that need this level of speed and quantity of IOs, but for those that do, Tier 0 storage is their only solution.

In the past, Tier 0 was essentially SSDs sitting on a PCIe bus, isolated to a single server. But today, with the emergence of the NVMe protocol and NVMe SSDs, 40/50/100GbE NICs and switches, and RDMA protocols, this sort of solution can be shared across racks of servers.

There were a few shared Tier 0 solutions available in the past, but their challenge was that they all used proprietary hardware. With today's new hardware and protocols, these new Tier 0 systems often perform as well as or much better than the old generation, but with off-the-shelf hardware.

E8 came to market (emerged from stealth and GA'd in September of 2016) after NVMe protocols, NVMe SSDs and RDMA were available in commodity hardware, and has taken advantage of all these new capabilities.

E8 Storage system hardware & software

E8 Storage offers a 2U HA appliance with 24 hot-pluggable NVMe SSDs and support for 8 client (host) ports. The hardware appliance has two controllers, two power supplies, and two batteries. The batteries hold up a DRAM write cache during a power failure until it can be flushed to internal storage. They don't do any DRAM read caching because the performance of the NVMe SSDs is more than fast enough.

The 24 NVMe SSDs are all dual ported for fault tolerance and are hot-plug replaceable for easier servicing in the field. One E8 Storage system can supply up to 180TB of usable, shared NVMe flash storage.

E8 Storage uses RDMA (RoCE) NICs between client servers and their storage system, which support 40GbE, 50GbE or 100GbE networking.

E8 does not do data reduction (thin provisioning, data deduplication or data compression) on their storage, so usable capacity = effective capacity. Their belief is that these services consume a lot of compute and IO, limiting IO/sec and increasing response times, and as the price of NVMe SSD capacity comes down over time, these services become less useful.

They also have client software that provides a fault tolerant initiator for their E8 storage. This client software supports MPIO and failover across controllers in the event of a controller outage. The client software currently runs on just about any flavor of Linux available today and E8 is working to port this to other OSs based on customer requests.

Storage provisioning and management are done through a RESTful API, a CLI or a web-based GUI management portal. Hardware support is supplied by E8 Storage, and they offer a 3-year warranty on their system with the ability to extend it to 5 years, if needed.
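
E8's actual management API isn't detailed in the podcast, but provisioning through a RESTful interface generally looks something like the sketch below; the endpoint path, payload fields and auth scheme here are hypothetical, not E8's.

```python
import requests

# Hypothetical management endpoint and token -- E8's real API paths,
# field names and auth scheme are not documented in this post.
E8_MGMT = "https://e8-appliance.example.com/api/v1"
HEADERS = {"Authorization": "Bearer <api-token>", "Content-Type": "application/json"}

def create_volume(name: str, size_gb: int, raid_level: int = 5) -> dict:
    """Provision a shared NVMe volume on the appliance (illustrative only)."""
    payload = {"name": name, "size_gb": size_gb, "raid_level": raid_level}
    resp = requests.post(f"{E8_MGMT}/volumes", json=payload,
                         headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()

# Example: carve out a 4TB RAID 5 volume for a database host
# vol = create_volume("oltp-log-vol", size_gb=4096, raid_level=5)
```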

One problem with today's standard NVMe over Fabrics solutions is that they lack any failover capabilities and really have no support for data protection. By developing their own client software, E8 provides fault tolerance and data protection for Tier 0 storage. They currently support RAID 0 and RAID 5 on E8 Storage, and RAID 6 is in development.
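
For background, RAID 5's protection comes down to XOR parity across a stripe: lose any single chunk and it can be rebuilt from the survivors plus parity. A minimal sketch of that idea (illustrative only, not E8's implementation):

```python
from functools import reduce

def xor_parity(chunks: list[bytes]) -> bytes:
    """Compute RAID 5-style parity by XORing equal-sized data chunks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), chunks)

def rebuild_missing(surviving_chunks: list[bytes], parity: bytes) -> bytes:
    """Recover a single lost chunk by XORing the parity with the survivors."""
    return xor_parity(surviving_chunks + [parity])

stripe = [b"AAAA", b"BBBB", b"CCCC"]   # data chunks in one stripe
parity = xor_parity(stripe)            # stored on a rotating parity device
assert rebuild_missing([stripe[0], stripe[2]], parity) == stripe[1]
```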

Performance

Everyone wants native DAS NVMe SSD storage, but unlike server Tier 0 solutions, E8 Storage's 180TB of NVMe capacity can be shared across up to 100 servers (one customer currently has 96 servers talking to a single E8 Storage appliance). By moving this capacity out to a shared storage device, it can be made more fault tolerant, more serviceable and be amortized over more servers. However, the problem with doing this has always been the lack of DAS-like performance.

Talking to Zivan, he revealed that a single E8 Storage system is capable of 5M IO/sec, and at that rate, the system delivers an average response time of 300µsec; at a more reasonable 4M IO/sec, the system can deliver ~120µsec response times. He said they can saturate a 100GbE network by operating at 10M IO/sec. He didn't say what the response time was at 10M IO/sec, but with network saturation, response times probably go exponentially higher.
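
As a sanity check on the saturation claim, wire bandwidth puts a hard ceiling on IO/sec for a given transfer size. A rough back-of-the-envelope helper (the IO sizes below are our assumptions, and Ethernet/RDMA protocol overhead is ignored, so real numbers would be somewhat lower):

```python
def max_iops(link_gbits: float, io_size_bytes: int) -> float:
    """Upper bound on IO/sec a link can carry for a given IO size,
    ignoring protocol overhead."""
    link_bytes_per_sec = link_gbits * 1e9 / 8
    return link_bytes_per_sec / io_size_bytes

# A 100GbE link moves ~12.5 GB/s, so roughly 3M 4KB IO/sec tops;
# hitting 10M IO/sec implies IOs on the order of ~1.25KB or smaller.
print(max_iops(100, 4096))   # ~3.05e6
print(max_iops(100, 1280))   # ~9.8e6
```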

The other thing that Zivan mentioned was that the system delivered these response times with very small variance (standard deviation). I believe he mentioned 1.5 to 3% standard deviations, which at 120µsec is only 1.8 to 3.6µsec and even at 300µsec is just 4.5 to 9µsec. We have never seen this level of response time, response time variance and IO/sec in a single shared storage system before.

E8 Storage

Zivan and many of his team previously came from IBM XIV storage. As such, they have been involved in developing and supporting enterprise-class storage systems for quite a while now. So, E8 Storage knows what it takes to create products that can survive in 24×7, high-end, highly active and demanding environments.

E8 Storage currently has customers in production in the US. They are seeing primary interest in their system from the HPC, FinServ, and Retail industries, but any large customer could have the need for something like this. They sell their storage for $2 to $3/GB.

The podcast runs ~42 minutes, and Zivan was easy to talk with and has a good grasp of the storage industry technologies.  Listen to the podcast to learn more.

Zivan Ori CEO & Co-Founder, E8 Storage

Mr. Zivan Ori is the co-founder and CEO of E8 Storage. Before founding E8 Storage, Mr. Ori held the position of IBM XIV R&D Manager, responsible for developing the IBM XIV high-end, grid-scale storage system, and served as Chief Architect at Stratoscale, a provider of hyper-converged infrastructure.

Prior to IBM XIV, Mr. Ori headed Software Development at Envara (acquired by Intel) and served as VP R&D at Onigma (acquired by McAfee).

41: Greybeards talk time shifting storage with Jacob Cherian, VP Product Management and Strategy, Reduxio

In this episode, we talk with Jacob Cherian (@JacCherian), VP of Product Management and Product Strategy at Reduxio. They have produced a unique product that merges some characteristics of CDP storage with the best of today's hybrid and deduplicating storage into a new primary storage system. We first saw Reduxio at VMworld a couple of years back and this is the first chance we have had to talk with them.

Backdating data

Many of us have had the need to go back to previous versions of files, volumes and storage, but few systems provide an easy way to do this. Reduxio is the first storage system that makes this effortless to do.

Reduxio's storage system splits apart an IO write operation into data and meta-data. The IO meta-data includes the volume/LUN id, the offset into the volume, and the data length. The data is chunked, compressed, hashed, and then sent to NVRAM cache. The IO meta-data and a system-wide time stamp, together with the data chunk hash(es), are sent to a separate key-value (K-V) meta-data store.
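
To make the write path concrete, here's a minimal sketch of splitting a write into compressed, hashed data chunks plus a time-stamped meta-data record bound for the K-V store. The chunk size, hash function and record layout are our assumptions; Reduxio hasn't published theirs.

```python
import hashlib, time, zlib

CHUNK_SIZE = 8192  # assumed fixed chunk size; Reduxio's actual chunking isn't disclosed

def write_io(volume_id: str, offset: int, data: bytes, data_cache: dict, kv_store: list):
    """Split one write IO into (1) compressed chunks keyed by hash (the data side)
    and (2) a time-stamped meta-data record (the K-V store side)."""
    hashes = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha1(chunk).hexdigest()
        hashes.append(digest)
        # store each compressed chunk once; identical chunks dedupe to the same key
        data_cache.setdefault(digest, zlib.compress(chunk))
    kv_store.append({
        "volume": volume_id,
        "offset": offset,
        "length": len(data),
        "timestamp": int(time.time()),   # system-wide, one-second granularity
        "hashes": hashes,
    })
```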

What Reduxio supplies is an easy way to go back, for any data volume, to any second in its past. Yes, there are limits as to how far back one can go with a data volume, e.g., every second for the last 8 hours, every hour for the last week, every week for the last month, every month for the last year, etc., all of which can be established at volume configuration time. But all this does is tell Reduxio when to discard old data.

With all this in place, re-establishing a volume to some instant in its past is simply a query to the meta-data K-V store with the appropriate time stamp. The K-V store returns all the hashes and other IO meta-data for all the data chunks, in sequence, for the volume at that point in time. With that information the system can easily fabricate the volume at that moment in its past.
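
Given that layout, rebuilding a volume as of some past second amounts to replaying, for each offset, the most recent meta-data record at or before the requested time stamp. A sketch, reusing the toy structures from the write-path example above (and ignoring partially overlapping writes):

```python
import zlib

def volume_as_of(volume_id: str, as_of_ts: int, data_cache: dict, kv_store: list) -> dict:
    """Return {offset: raw_bytes} for the volume as it looked at time as_of_ts."""
    latest = {}  # offset -> most recent meta-data record at or before as_of_ts
    for rec in kv_store:
        if rec["volume"] == volume_id and rec["timestamp"] <= as_of_ts:
            prev = latest.get(rec["offset"])
            if prev is None or rec["timestamp"] >= prev["timestamp"]:
                latest[rec["offset"]] = rec
    image = {}
    for offset, rec in latest.items():
        chunks = [zlib.decompress(data_cache[h]) for h in rec["hashes"]]
        image[offset] = b"".join(chunks)[:rec["length"]]
    return image
```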

By keeping the data and the meta-data tag, time stamp and hash(es) information separate, Reduxio can reconstruct the data at any time (to one second granularity) in the past where data is still available to the system.

Performance

In the past, this sort of time shifting storage functionality was limited to a separate CDP backup appliance. What Reduxio has done is integrate all this functionality into a deduplicating, compressing, auto-tiering primary storage system. So every IO is chunked, deduplicated and compressed, with the meta-data, time stamps and hashes split off from the data chunks. There is no IO performance penalty for doing any of this; it's all part of the normal IO path of the Reduxio primary storage system.

However, there is some garbage collection activity that needs to go on in order to deal with data that’s no longer needed. Reduxio does this mostly in real time, as the data actually expires.

Deduplication, compression and all the other characteristics of the storage system that enable its time shifting capabilities cannot be turned off.

Auto storage tiering

Reduxio optimized their auto-tiering beyond what is normally done in other hybrid storage systems. Data is chunked, moved to cache and ultimately destaged to flash. Hot vs. cold data is analyzed in real time, not sometime later as with other hybrid storage systems. Also, when data is deemed cold and needs to be moved to disk, Reduxio takes another step and analyzes its meta-data K-V store and other information to see what other data was referenced at the same time as this data. This way it can attempt to demote a "group" of data chunks that will likely all be referenced together. That way, when one chunk of this "group" is referenced, the rest can be promoted to flash/cache at the same time.

Their auto-tiering group algorithm is continuously refined: every time they demote data and every time they promote data to a faster tier, they record which data is referenced together, so that the next time they demote data chunks the group definitions can be further refined.
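
A rough sketch of the co-reference grouping idea: remember which chunks get accessed within the same time window, and treat chunks that show up together often enough as a group to demote and promote as a unit. The window length and threshold below are our simplifications, not Reduxio's actual algorithm.

```python
from collections import defaultdict

class AccessGrouper:
    """Track chunks referenced within the same time window so a whole
    'group' can be demoted to disk, and re-promoted to flash, together."""
    def __init__(self, window_secs: int = 5):
        self.window = window_secs
        self.co_refs = defaultdict(lambda: defaultdict(int))  # chunk -> {peer: count}
        self.recent = []  # (timestamp, chunk_id) seen within the current window

    def record_access(self, ts: float, chunk_id: str):
        # keep only accesses inside the window, then count co-references
        self.recent = [(t, c) for t, c in self.recent if ts - t <= self.window]
        for _, peer in self.recent:
            if peer != chunk_id:
                self.co_refs[chunk_id][peer] += 1
                self.co_refs[peer][chunk_id] += 1
        self.recent.append((ts, chunk_id))

    def group_for(self, chunk_id: str, min_count: int = 3) -> set:
        """Chunks seen together with chunk_id often enough to move as a unit."""
        return {peer for peer, n in self.co_refs[chunk_id].items() if n >= min_count}
```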

Reduxio storage system

Reduxio provides a hybrid (disk-SSD) iSCSI primary storage system that holds 40TB of storage today, and with an average compression-dedupe ratio (over their 2PB of field data) of >4:1, that 40TB should equate to over 160TB of effective data storage. Some of that storage would be used for current volume data and some would be used for historical data.

There was a Slack discussion the other week on what to do about ransomware. It seems to me that Reduxio, with its time traveling storage, could be used as effective protection against ransomware.

The podcast runs ~41 minutes. Although snapshots have been around for a long time (one of the Greybeards worked on a snapshotting storage system back in the early 90s), Reduxio has taken the idea to new heights. Listen to the podcast to learn more.

Jacob Cherian, VP Product Management and Product Strategy, Reduxio

Jacob is responsible for Reduxio’s product vision and strategy. Jacob has overall ownership for defining Reduxio’s product portfolio and roadmap.

Prior to joining Reduxio, Jacob spent 14 years at Dell in the Enterprise Storage Group leading product development and architectural initiatives for host storage, NAS, SAN, RAID and other data center infrastructure. As a member of Dell’s storage architecture council he was responsible for developing Dell’s strategy for unstructured data management, and drove its implementation through organic development efforts and technology acquisitions such as Ocarina Networks and Exanet. In his last role as a Dell expatriate in Israel he oversaw Dell’s FluidFS development.

Jacob started his career in Dell as a development engineer for various SAN, NAS and host-side solutions, then served as the Architect and Technologist for Dell’s MD series of external storage arrays.

Jacob was named a Dell Inventor of the Year in 2005, holds 30 patents, and has 20 patents pending in the areas of storage and networking. He holds a Bachelor of Science (B.S.) in Electrical Engineering from the Cochin University of Science and Technology, a Master of Science (M.S.) in Computer Science from Oklahoma State University, and a Master of Business Administration (MBA) from the Kellogg School of Management, Northwestern University.

40: Greybeards storage industry yearend review podcast

In this episode, the Greybeards discuss the year in storage, and naturally we kick off with the consolidation trend in the industry and the big one last year, the Dell-EMC acquisition. How the high-margin EMC storage business is going to work in a low-margin company like Dell is the subject of much speculation. That, and which of the combined companies' storage products will make it through the transition, makes for interesting discussion. And finally, what exactly Dell's long-term strategy is remains another question.

We next turn to the coming of age of object storage. A couple of years ago, object storage was being introduced to a wider market, but few wanted to code to RESTful interfaces. Nowadays, that seems to be less of a concern, and the fact that one can have onsite/offsite/cloud-based object storage repositories from open source, proprietary solutions and everything in between is making object storage a much more appealing option for enterprise IT.

Finally, we discuss the new Tier 0. What with NVMe SSDs and the emergence of NVMe over Fabrics last year, Tier 0 has never looked so promising. You may recall that Tier 0 was hot about 5 years ago, with TMS, Violin and others coming out with lightning fast storage IO. But with Dell-EMC DSSD; startups (E8 Storage, Mangstor, Apeiron Data Systems, and others); NVDIMMs, Crossbar, and Everspin coming out with denser offerings; and other SCM technologies (Micron, HPE, IBM, others?) on the horizon, Tier 0 has become red hot again.

Sorry about the occasional airplane noise and other audio anomalies. The podcast runs  over 47 minutes. Howard and I could talk for hours on what’s happening in the storage industry. Listen to the podcast to learn more.

Ray Lucchesi is the President and Founder of Silverton Consulting, a prominent blogger at RayOnStorage.com, and can be found on twitter @RayLucchesi.

Howard Marks is the Founder and Chief Scientist of DeepStorage, a prominent blogger at Deep Storage Blog and can be found on twitter @DeepStorageNet.


39: Greybeards talk deep storage/archive with Matt Starr, CTO Spectra Logic

In this episode, we talk with Matt Starr (@StarrFiles), CTO of Spectra Logic, the deep storage experts. Matt has been around a long time and Ray's shared many a meal with Matt as we're both in NW Denver. Howard has a minor quibble with Spectra Logic over the use of his company's name (DeepStorage) in their product line, but he's also known Matt for a while now.

The Pearl

Matt and Spectra Logic have a number of customers with multi-PB to over an EB of data repository problems, and how to take care of these ever-expanding storage stashes is an ongoing concern. One of the solutions Spectra Logic offers is Black Pearl Deep Storage, which provides an object storage, RESTful interface front end to a storage tiering/archive backend that uses flash, (spin-down) disk, (LTFS) tape (libraries) and the (AWS) cloud as backend storage.

Major portions of the Black Pearl are open sourced and available on GitHub. I see several (DS3) SDKs for Java, Python, C, and others. Open sourcing the product provides an easy way for client customization. In fact, one customer was using Ceph, and they modified their Ceph backup client to send a copy of data off to the Pearl.

We talk a bit about the Black Pearl's data integrity. It uses a checksum, computed over the object at creation time, which is then verified anytime the object is retrieved, copied, moved or migrated, and which can be validated periodically (scrubbed), even when the object has not been touched.
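
The integrity scheme is easy to picture: hash the object once at ingest, store the digest alongside it, and re-hash on every retrieval, copy, migration or periodic scrub. A minimal sketch (SHA-256 is our choice here; Spectra doesn't say which checksum Black Pearl actually uses):

```python
import hashlib

def ingest(obj_bytes: bytes) -> dict:
    """Store an object together with a checksum computed at creation time."""
    return {"data": obj_bytes, "sha256": hashlib.sha256(obj_bytes).hexdigest()}

def verify(stored: dict) -> bool:
    """Re-compute the checksum on retrieval/copy/migration or a periodic scrub."""
    return hashlib.sha256(stored["data"]).hexdigest() == stored["sha256"]

archive = ingest(b"telescope scan 0001")
assert verify(archive)          # clean read
archive["data"] = b"corrupted"  # simulate bit rot on the backend media
assert not verify(archive)      # scrub catches it before a bad copy propagates
```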

Super Computing’s interesting (storage) problems

Matt just returned from SC16 (the Supercomputing Conference 2016) in Salt Lake City last month. At the conference there were plenty of multi-PB customers looking for better storage alternatives.

One customer Matt mentioned was the Square Kilometre Array, the world's largest radio telescope, which will be transmitting 700TB/hour, over 1EB per year. All that data has to land somewhere, and for this quantity (>EB) of data, tape becomes a necessary choice.

Matt likened Spectra’s  archive solutions to warehouses vs. factories. For the factory floor,  you need responsive (AFA or hybrid) primary storage but for the warehouse, you just want cheap, bulk storage (capacity).

The podcast runs long, over 51 minutes, and reveals a different world from the GreyBeards' everyday enterprise environments: specifically, customers that have extra large data repositories and how they manage to survive under the data deluge. Matt's an articulate spokesperson for Spectra Logic and their archive solutions, and we could have talked about >EB data repositories for hours. Listen to the podcast to learn more.

Matt Starr, CTO, Spectra Logic

Matt Starr’s tenure with Spectra Logic spans 24 years and includes experience in service, hardware design, software development, operating systems, electronic design and management. As CTO, he is responsible for helping define the company’s product vision, and serves as the executive representative for the voice of the market. He leads Spectra’s efforts in high-performance computing, private cloud and other vertical markets.

Matt served as the lead engineering architect for the design and production of Spectra’s TSeries tape library family. Spectra Logic has secured more than 50 patents under Matt’s direction, establishing the company as the innovative technology leader in the data storage industry. He holds a BS in electrical engineering from the University of Colorado at Colorado Springs.

38: GreyBeards talk with Rob Peglar, Senior VP and CTO, Symbolic IO

In this episode, we talk with Rob Peglar (@PeglarR), Senior VP and CTO of Symbolic IO, a computationally defined storage vendor. Rob has been around almost as long as the GreyBeards (~40 years) and most recently was with Micron and prior to that, EMC Isilon. Rob is also on the board of SNIA.

Symbolic IO emerged from stealth earlier this year and intends to be shipping products by late this year/early next. Rob joined Symbolic IO in July of 2016.

What’s computational storage?

It’s all about symbolic representation of bits. Symbolic IO has  come up with a way to encode bit streams into unique symbols that offer significant savings in memory space, beyond standard data compression techniques.

All that would be just fine if it sat at the end of a storage interface; we would probably just call it a new form of data reduction. But Symbolic IO also incorporates persistent memory (NVDIMMs, and in the future 3D XPoint, ReRAM and others) and provides this symbolic data inside a server, directly through its processor data cache, in (decoded) raw data form.

Symbolic IO provides a translation layer between persistent memory and processor cache: it decodes the symbolic representation of data held in persistent memory into raw data on the way into the data cache (reads), and encodes raw data into its symbolic representation on the way out of cache to persistent memory (writes).

Rob says that the mathematics are there to show that Symbolic IO’s data reduction is significant and that the decode/encode functionality can be done in a matter of a few clock cycles per cache (line) access on modern (Intel) processors.

The system continually monitors the data it sees to determine what the optimum encoding should be and can change its symbolic table to provide more memory savings for new data written to persistent memory.

All this reminds the GreyBeards of Huffman encoding algorithms for data compression (which one of us helped deploy on a previous [unnamed] storage product). Huffman encoding transforms ASCII (8-bit) characters into variable-length bit streams.
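
For anyone who hasn't touched Huffman coding since school, a compact refresher: build the code table by repeatedly merging the two least frequent nodes, so frequent symbols end up with short codes and rare ones with long codes. A small Python sketch:

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(data: bytes) -> dict:
    """Build a Huffman code table: frequent bytes get short bit strings."""
    freq = Counter(data)
    tie = count()  # unique tie-breaker so the heap never has to compare dicts
    heap = [(n, next(tie), {b: ""}) for b, n in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:               # degenerate single-symbol input
        return {b: "0" for b in freq}
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {b: "0" + code for b, code in c1.items()}
        merged.update({b: "1" + code for b, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

codes = huffman_codes(b"aaaaaabbbc")
encoded = "".join(codes[b] for b in b"aaaaaabbbc")   # variable-length bit stream
print(codes, f"{len(encoded)} bits vs {8 * 10} bits for plain 8-bit characters")
```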

Symbolic IO will offer 3 products:

  • IRIS™ Compute, which provides persistent memory storage, accessed using something like the Linux pmem library, and includes Symbolic StoreModules™ (persistent memory hardware);
  • IRIS Vault, an appliance with its own IRIS-infused Linux OS (Symbolic's SymCE™) plus Symbolic IO StoreModules, that can run any Linux application without change while accessing the persistent memory; it offers full data security, next generation snapshot-/clone-like capabilities with BLINK™ full storage backups, and enhanced physical security via the removable IRIS Advanced EYE ASIC; and
  • IRIS Store, which extends IRIS Vault and IRIS Compute above with more tiers of storage, using Symbolic IO StoreModules as Tier 1, PCIe (flash) storage as Tier 2 and external SSD storage as Tier 3.

For more information on Symbolic IO's three products, we would encourage you to read their website (linked above).

The podcast runs long, over 47 minutes, and was wide ranging, discussing some of the history of processor/memory/information technologies. It was very easy to talk with Rob and both Howard and I have known Rob for years, across multiple vendors & organizations.  Listen to the podcast to learn more.

Rob Peglar, Senior VP and CTO, Symbolic IO

Rob Peglar is the Senior Vice President and Chief Technology Officer of Symbolic IO. Rob is a seasoned technology executive with 39 years of data storage, network and compute-related experience, is a published author and is active on many industry boards, providing insight and guidance. He brings a vast knowledge of strategy and industry trends to Symbolic IO. Rob is also on the Board of Directors for the Storage Networking Industry Association (SNIA) and an advisor for the Flash Memory Summit. His role at Symbolic IO will include working with the management team to help drive the future product portfolio, executive-level forecasting and customer/partner interaction from early-stage negotiations through implementation and deployment.

Prior to joining Symbolic IO, Rob was the Vice President, Advanced Storage at Micron Technology, where he led next-generation technology and architecture enablement efforts of Micron’s Storage Business Unit, driving storage solution development with strategic customers and partners. Previously he was the CTO, Americas for EMC where he led the entire CTO functions for the Americas. He has also held senior level positions at Xiotech Corporation, StorageTek and ETA Systems.

Rob's extensive experience in data management, analytics, high-performance computing, non-volatile memory, distributed cluster architectures, filesystems, I/O performance optimization, cloud storage, replication and archiving, networking, and virtualization makes him a sought-after industry expert and board member. He was named an EMC Elect in 2014, 2015 and 2016. He was one of 25 senior executives worldwide selected for the CRN 'Storage Superstars' Award in 2010.