75: GreyBeards talk persistent memory IO with Andy Grimes, Principal Technologist, NetApp

Sponsored By:  NetApp
In this episode we talk new persistent memory IO technology with Andy Grimes, Principal Technologist, NetApp. Andy presented at the NetApp Insight 2018 TechFieldDay Extra (TFDx) event (video available here). If you get a chance, we encourage you to watch the videos, as Andy did a great job describing their new MAX Data persistent memory IO solution.

The technology for MAX Data came from NetApp’s Plexistor acquisition. Prior to the acquisition, Plexistor had also presented at SFD9 and TFD11.

Unlike NVMeoF storage systems, MAX Data does not share NVMe SSDs across servers. Instead, it supplies an application-neutral way to use persistent memory as a new, ultra-fast storage tier in front of a backing store.

MAX Data performs a write or an “active” (persistent memory tier) read in single-digit µseconds on a single-core/single-thread server. Their software runs in user space and, as such, can take up to 40 µseconds on multi-core servers. Access times for backend storage reads are the same as for NetApp AFF, but once read, data is automatically promoted to persistent memory, where subsequent reads are ultra fast.

One of the secrets of MAX Data is that they have completely replaced the Linux POSIX file IO stack with their own software. Their software is streamlined and bypasses a lot of the overhead present in today’s Linux file stack. For example, MAX Data doesn’t support metadata journaling.

MAX Data works with many different types of (persistent) memory, including DRAM (non-persistent memory), NVDIMMs (DRAM+NAND persistent memory) and Optane DIMMs (Intel 3D XPoint memory, slated to be GA by the end of this year). We suspect it would work with anyone else’s persistent memory as soon as it comes on the market.

Even though the (Optane and NVDIMM) memory is persistent, server failures can still make it inaccessible. In order to provide data availability across server outages, MAX Data also supports MAX Snap and MAX Recovery.

With MAX Snap, MAX Data uploads all persistent memory data to ONTAP backing storage and takes an ONTAP snapshot of it. This way you have a complete version of MAX Data storage that can then be backed up or SnapMirrored to other ONTAP storage.

With MAX Recovery, MAX Data synchronously replicates persistent memory writes to a secondary MAX Data system. This way, if the primary MAX Data system goes down, you still have an RPO=0 copy of the data on another MAX Data system that can be used to restore the original data, if needed. Synchronous mirroring adds 3-4 µseconds to the write access times quoted above.

Given the extreme performance of MAX Data, it’s opening up a whole new set of customers to conversations with NetApp, specifically high frequency trading (HFT) and high performance computing (HPC) shops. HFT companies are attempting to reduce their stock transaction times to as fast as humanly possible. HPC shops have lots of data, and processing all of it in a timely manner is almost impossible. Anything that can be done to improve throughput/access times should be very appealing to them.

To configure MAX Data, one uses a 1:25 ratio of persistent memory capacity to backing store capacity. MAX Data also supports multiple LUNs.
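As a rough illustration of that sizing rule, here’s a minimal sketch (the function name and units are our own; the only fact taken from the podcast is the 1:25 ratio):

```python
def max_data_backing_store_gib(pmem_gib: float, ratio: int = 25) -> float:
    """Given a persistent memory tier size, return the backing store
    capacity implied by MAX Data's stated 1:25 pmem-to-backing-store ratio."""
    return pmem_gib * ratio

# e.g., a server with 1 TiB (1024 GiB) of persistent memory would be
# paired with roughly 25 TiB of ONTAP backing store.
print(max_data_backing_store_gib(1024))  # 25600.0
```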

MAX Data only operates on Linux and currently supports (IBM) Red Hat and CentOS, but Andy said it’s not that difficult to add support for other Linux distros, and customer demand will dictate which ones are supported over time.

As discussed above, MAX Data works with NetApp ONTAP storage, but it also works with SSDs/NVMe SSDs as backend storage. In addition, MAX Data has been tested with NetApp HCI (with SolidFire storage; see our prior podcasts on NetApp HCI with Gabriel Chapman and Adam Carter) as well as E-Series storage. The Plexistor application was already available on AWS Marketplace for use with EC2 DRAM and EBS backing store. It’s not much of a stretch to replace this with MAX Data.

MAX Data is expected to be GA released before the end of the year.

A key ability of the MAX Data solution is that it requires no application changes to use persistent memory for ultra-fast IO. This should help accelerate persistent memory adoption in data centers as the hardware becomes more available. Speaking to that, at Insight 2018, Lenovo, Cisco and Intel were all on stage when NetApp announced MAX Data.

The podcast runs ~25 minutes. Andy’s an old storage hand (although no grey beard) and talks the talk, walks the walk of storage religion. Andy is new to TFD but we doubt it will be the last time we see him there. Andy was very conversant on the MAX Data technology and the market that it apparently is opening up for this new technology.  Listen to our podcast to learn more.

Andy Grimes, Principal Technologist, NetApp

Andy has been in the IT industry for 17 years, working in roles spanning development, technology architecture, strategic outsourcing and healthcare.

For the past 4 years Andy has worked with NetApp on taking the NetApp flash business from #5 to #1 in the industry (according to IDC). During this period NetApp also became the fastest growing flash and SAN vendor in the market and regained leadership in the Gartner Magic Quadrant.

Andy also works on NetApp’s product vision, competitive analysis and future technology direction, and is part of the team bringing the MAX Data PMEM product to market.

Andy has a BS degree in psychology, a BPA in management information systems, and an MBA. He currently works as a Principal Technologist for the NetApp Cloud Infrastructure Business Unit with a focus on PMEM, HCI and cloud strategy. Andy lives in Apex, NC with his beautiful wife and has 2 children, a 4 year old and a 22 year old (yes, don’t let this happen to you). For fun Andy likes to mountain bike, rock climb, hike and scuba dive.

50: Greybeards wrap up Flash Memory Summit with Jim Handy, Director at Objective Analysis

In this episode we talk with Jim Handy (@thessdguy), Director at Objective Analysis, a semiconductor market research organization. Jim is an old friend and was on last year to discuss Flash Memory Summit (FMS) 2016. Jim, Howard and I all attended FMS 2017 last week in Santa Clara, and Jim and Howard were presenters at the show.

NVMe & NVMeF to the front

Although, unfortunately, the show floor was closed due to a fire, there were plenty of sessions and talks about NVMe and NVMeF (NVMe over fabrics). Howard believes NVMe & NVMeF are being adopted much more quickly than anyone had expected. It’s already evident inside storage systems like Pure’s new FlashArray//X, Kaminario and E8 Storage, which are already shipping block storage with NVMe and NVMeF.

Last year PCIe expanders and switches seemed like the wave of the future, but ever since then, NVMe and NVMeF have taken off. Historically, there’s been a reluctance to add capacity shelves to storage systems because of the complexity of (FC and SAS) cable connections. But with NVMeF, RoCE and RDMA, it’s now just a (40GbE or 100GbE) Ethernet connection away, which is considerably easier and less error prone.

3D NAND take off

Both Samsung and Micron are talking up their 64-layer 3D NAND, and the rest of the industry is following. The NAND shortage has led to fewer price reductions, but eventually, when process yields turn up, the shortage will collapse and price reductions should return en masse.

The reason that vertical (3D) NAND is taking over from planar (2D) NAND is that planar NAND can’t be shrunk much further, and 15nm is where it’s going to stay for a long time to come. So the only way to increase capacity/chip and reduce $/GB is up.

As with any new process technology, 3D NAND is having yield problems. But whenever the last yield issue is solved, which seems close, we should see prices drop precipitously and (3D) NAND storage become much more plentiful.

One thing that has made increasing 3D NAND capacity that much easier is string stacking. Jim describes string stacking as creating a unit of, say, 32 layers, which you can fabricate as one piece, and then layering an insulating layer on top of it. Now you can start again, stacking another 32-layer block on top, and just add another insulating layer.

The problem with more than 32-48 layers is that you have to create (dig) holes connecting all the layers, which have to be (atomically) very straight and coated with special materials. Anyone who has dug a hole knows that the deeper you go, the harder it is to keep the hole walls straight. With current technology, 32 layers seems just about as far as they can go.
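The capacity math behind string stacking is simple multiplication; here’s a small sketch (names are ours) of how fixed-height decks add up without ever etching a hole deeper than one deck:

```python
def total_layers(decks: int, layers_per_deck: int = 32) -> int:
    """Total NAND layers when string-stacking fixed-height decks,
    each deck separated from the next by an insulating layer."""
    return decks * layers_per_deck

# A 64-layer part can be built as two 32-layer decks, so each etched
# hole only has to span 32 layers, not 64.
print(total_layers(2))  # 64
```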

3DX and similar technologies

There’s been quite a lot of talk the last couple of years about 3D XPoint (3DX) and what it means for the storage and server industry. Intel has released Optane client SSDs, but there are no enterprise-class 3DX SSDs as of yet.

The problem is similar to 3D NAND above: current yields suck. There’s a chicken-and-egg problem with any new chip technology. You need volume to get yields up, and you need yields up to generate the volume you need. And volume with good yields generates the profits to re-invest in the cycle for the next technology.

Intel can afford to subsidize (lose money on) 3DX technology until they get the yields up, knowing full well that when they do, it will become highly profitable.

The key is to price the new technology somewhere between levels in the storage hierarchy; for 3DX that means between NAND and DRAM. This means that 3DX will be more of a tier between memory and SSD than a replacement for either DRAM or SSDs.

The recent emergence of NVDIMMs has provided the industry a platform (based on NAND and DRAM) where it can create the software and other OS changes needed to support this mid tier as a memory level. So when 3DX comes along as a new memory tier, they will be ready.

NAND shortages, industry globalization & game theory

Jim has an interesting take on how and when the NAND shortage will collapse.

It’s a cyclical problem seen before in DRAM, and it’s a question of investment. When there’s an oversupply of a chip technology (like NAND), suppliers cut investments, or rather don’t grow investments as fast as they were. Ultimately this leads to a shortage, which then leads to over-investment to catch up with demand. When this new capacity starts to produce chips, the bottleneck will collapse and prices will come down hard.

Jim believes that as 3D NAND suppliers start driving yields up and $/GB down, 2D NAND fabs will turn to DRAM or other electronic circuitry, which will lead to a price drop there as well.

Jim used game theory to explain how the fab industry has globalized over time. As emerging countries build fabs, they must seek partners to provide the technology to produce product. They offer these companies guaranteed supplies of low-priced product for years to help get the fabs online. Once this period is over, the fabs never return to home base.

This approach has led to Japan taking over DRAM & other chip production, then Korea, then Taiwan and now China. It will move again. I suppose this is one reason IBM got out of the chip fab business.

The podcast runs ~49 minutes, but Jim is a very knowledgeable chip industry expert and a great friend from multiple events. Howard and I had fun talking with him again. Listen to the podcast to learn more.

Jim Handy, Director at Objective Analysis

Jim Handy of Objective Analysis has over 35 years in the electronics industry including 20 years as a leading semiconductor and SSD industry analyst. Early in his career he held marketing and design positions at leading semiconductor suppliers including Intel, National Semiconductor, and Infineon.

A frequent presenter at trade shows, Mr. Handy is known for his technical depth, accurate forecasts, widespread industry presence and volume of publication. He has written hundreds of market reports, articles for trade journals, and white papers, and is frequently interviewed and quoted in the electronics trade press and other media. He posts blogs at www.TheMemoryGuy.com and www.TheSSDguy.com.

GreyBeards on Storage year end 2015 podcast

It’s our annual year-end podcast, the Ray and Howard show, talking about storage futures, industry trends and some storage world excitement of the past year.

We start the discussion by deconstructing recent reductions in year-over-year revenues at major storage vendors. It seems that with the advent of all flash arrays (AFA), which all major vendors and most startups now have, customers no longer feel the need to refresh old storage hardware with similarly (over-)configured new systems. Instead, most can get by with AFA storage at smaller capacities that provides the same, if not better, performance. Further, because AFAs are available from so many vendors and startups, customers no longer have to buy performance storage exclusively from major vendors. This is leading to a decline in major vendor storage revenues, which should play itself out over the next 1-2 years as most enterprise storage systems are refreshed.

Recent and future acquisitions also came up for discussion. NetApp’s purchase of SolidFire was a surprise, but SolidFire had carved out a good business with service providers and web-scale customers, which should broaden NetApp’s portfolio. In the meantime, the Dell-EMC acquisition takes them out of the competition for new technology acquisitions, at least until it closes. NetApp’s new CEO, George Kurian, appears more willing than his predecessor to go after good storage technology, wherever it comes from.

Software delivered (defined) storage came up as well. With the compute available in today’s microprocessors, there’s very little a software delivered storage system can’t do. And with scale-out storage, there are even more cores to work with. Software delivered storage and scale-out will continue to play a spoiler role in the storage market, at least in the low to mid-range, throughout the next year.

Nonetheless, hardware still has some excitement left. Intel’s recent acquisition of Altera now makes Xeon/x86 processing available for embedded applications that previously had to rely on ARM and MIPS processors. Now, there’s nothing an FPGA hardware based system can’t do. Look for lots more activity here over the long term.

We talked about recent SMR disks coming out and how they could be used in storage systems today. There was some adjacent discussion on the flash-disk crossover, and we conclude it’s unlikely over the next 3-5 years, at least for capacity drives. Although there are plenty of analysts who say it’s already happened, on a pure $/GB basis there’s still no comparison.
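The $/GB comparison behind that conclusion is straightforward arithmetic; here’s a hedged sketch (the prices and capacities below are made-up placeholders for illustration, not figures from the podcast):

```python
def cost_per_gb(drive_price_usd: float, capacity_gb: float) -> float:
    """Raw $/GB for a drive: price divided by capacity."""
    return drive_price_usd / capacity_gb

# Hypothetical numbers, for illustration only:
ssd_cost = cost_per_gb(3000.0, 7680.0)    # a large-capacity SSD
hdd_cost = cost_per_gb(400.0, 10000.0)    # a capacity (SMR/nearline) disk

# Even generous SSD pricing leaves a large multiple vs. capacity disk.
print(f"SSD ${ssd_cost:.3f}/GB vs HDD ${hdd_cost:.3f}/GB "
      f"({ssd_cost / hdd_cost:.1f}x)")
```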

We then turned to 3D TLC NAND and the reliability capabilities available from current controller technologies. Raw planar NAND available today is much less reliable than what we had 1-2 generations back, but the drives, if anything, have gotten more reliable. This is due to the reliability technology inherent in today’s SSD controllers.

We had an aside on SSD overprovisioning and how this should become a customer-level option. Reducing overprovisioning would decrease drive endurance, but it’s a tradeoff that the vendors/distributors make for customers today. We feel that at least some customers could make this decision just as well, especially if drive replacement were a customer maintenance activity, with replacement SSDs shipped in a just-in-time manner.
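Overprovisioning is commonly quoted as the extra raw flash held back relative to the user-visible capacity; a minimal sketch of that arithmetic (the function name and example figures are ours):

```python
def overprovisioning_pct(raw_gb: float, usable_gb: float) -> float:
    """Overprovisioning as a percentage of usable capacity:
    OP% = (raw - usable) / usable * 100."""
    return (raw_gb - usable_gb) / usable_gb * 100.0

# A drive built from 1024 GB of raw NAND that exposes 960 GB usable
# is holding back about 6.7% for wear leveling and garbage collection.
print(round(overprovisioning_pct(1024, 960), 1))  # 6.7
```

Letting customers dial `usable_gb` up (less endurance, more capacity) or down (more endurance) is exactly the tradeoff discussed above.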

We conclude with 3D XPoint (3DX) non-volatile memory. We both agreed 3DX adoption depends on pricing, which will change over time. In the long term, we see the potential for a new storage system with 3DX or other new non-volatile memory as a top performing storage/caching/non-volatile memory tier, 3D TLC NAND as a middle tier and SMR disk as the bottom tier. When is another question.

Our year end discussion always wanders a bit, from high end business trends to in the weeds technologies and everything in-between. This one is no exception and runs over 49 minutes. We tried to do another Year End video this time but neither of our video recording systems worked out, but we had a good audio recording, so we went with the podcast this year. Next year should be back to video.  Listen to the podcast to learn more.

Howard Marks

Howard Marks is the Founder and Chief Scientist of DeepStorage, a prominent blogger at Deep Storage Blog, and can be found on twitter @DeepStorageNet.

 

Ray Lucchesi

Ray Lucchesi is the President and Founder of Silverton Consulting, a prominent blogger at RayOnStorage.com, and can be found on twitter @RayLucchesi.