79: GreyBeards talk AI deep learning infrastructure with Frederic Van Haren, CTO & Founder, HighFens, Inc.

We’ve talked with Frederic before (see: Episode #33 on HPC storage) but since then, he has worked for an analyst firm and now he’s back on his own again, at HighFens. Given all the interest of late in AI, machine learning and deep learning, we thought it would be a great time to catch up and have him shed some light on deep learning and what it needs for IT infrastructure.

Frederic has worked for HPC / Big Data / AI / IoT solutions in the speech recognition industry, providing speech recognition services for some of the largest organizations in the world. As I understand it, the last speech recognition AI application he worked on implemented deep learning.

A brief history of AI

Frederic walked the Greybeards through the history of AI from the dawn of computing (1950s) until the recent emergence of deep learning (2010).

He explained that, early on one could implement a chess playing program, using hand coded rules based on a chess expert’s playing technique. Later when machine learning came out, one could use statistical analysis on multiple games and limited rule creation to teach a AI machine learning system how to play chess. With deep learning (DL), all you have to do now is to feed a DL model all the games you have and it learns how to play chess well all by itself. No rule making needed.

AI DL training and deployment infrastructure

Frederic described some of the infrastructure and data needs for various phases of an industrial scale, AI DL workflow.

Training deep learning models takes data and the more, the better. Gathering/saving large amounts of data used for DL training is a massive write workload and at the end of that process, hopefully you have PB of data to work with.

Selecting DL training data from all those PBs, involves a lot of mixed read and write IO. In the end, one has selected and extracted the data to use to train your DL models.

During DL training, IO needs are all about heavy data read throughput. But there’s more, in the later half of the talk, Frederic talked about the need to keep expensive GPU cores busy and that requires sophisticated caching or Tier 0 storage supporting low latency IO.

Ray’s been doing a lot of blogging and other work on AI machine and deep learning (e.g., see Learning machine learning – parts 1, 2, & 3) so it was great to hear from Frederic, a real practitioner of the art. Frederic (with some of Ray’s help) explained the deep learning training process. But it wasn’t detailed enough for Howard, so per Howard’s request, we went deeper into how it really works.

Once you have a DL model trained and working within specifications (e.g., prediction accuracy), Frederic said deploying DL models into production involves creating two separate clusters. One devoted to deep learning model inferencing, which takes in data from the world and performs inferencing (prediction, classification, interpretations, etc.) and the other uses that information for model adaption to fine tune DL models for specific instances.

Adaption and inferencing were both read and write IO workloads and the performance of this IO was dependent on a specific model’s use

Model adaption would personalize model predictions for each and every person, car, genotype, etc. This would be done periodically (based on SLAs, e.g. every 4 hrs). After that, a new, adapted model could be introduced into production, adapted for that specific person/car/genotype.

If the adaption applied more generally, that data and its human-machine validated/vetted prediction, classification, interpretation, etc. would be added back into the DL model training set to be used the next time a full model training pass was to be done. Frederic said AI DL model training is never done.

Sometime later, all this DL training, production and adaption data needs to be archived for long term access.

We then discussed the recent offerings from NVIDIA and major storage vendors that package up a solution for AI deep learning. It seems we are seeing another iteration of Converged Infrastructure, only this time for AI DL.

Finally, over the course of Ray’s AI DL education, he had come to the belief that AI deep learning could be applied by anyone. Frederic corrected Ray stating that AI deep learning should be applied by anyone.

The podcast runs ~44 minutes. Frederic’s been an old friend of Howard’s and Ray’s, since before the last podcast. He’s one of the few persons in the world that the GreyBeards know that has real world experience in deploying AI DL, at industrial scale. Frederic’s easy to talk with and very knowledgeable about the intersection of Ai DL and IT infrastructure. Howard and I had fun talking with him again on this episode. Listen to the podcast to learn more. .

Frederic Van Haren

Frederic Van Haren is the Chief Technology Officer @ HighFens. He has over 20 years of experience in high tech and is known for his insights in HPC, Big Data and AI from his hands-on experience leading research and development teams. He has provided technical leadership and strategic direction in the Telecom and Speech markets.

He spent more than a decade at Nuance Communications building large HPC and AI environments from the ground up and is frequently invited to speak at events to provide his vision on the HPC, AI, and storage markets. Frederic has also served as the president of a variety of technology user groups promoting the use of innovative technology.

As an engineer, he enjoys working directly with engineering teams from technology vendors and on challenging customer projects.

Frederic lives in Massachusetts,  USA but grew up in the northern part of Belgium where he received his Masters in Electrical Engineering, Electronics and Automation.