(Automated transcription)
Jay Marshall: Lots of people talk about democratizing AI, but you can't democratize AI if you stop at the hardware. And lots of folks are starting to realize that deploying AI in production, at scale, can be challenging and very expensive if they wait until the end to consider their hardware options.
Eric Korman: At Striveworks, we say our goal is to disappear MLOps. What we mean by that is there are all these things that data scientists do that aren't actually data science.
Things such as provisioning compute, setting up GPU device drivers, managing model checkpoints and artifacts, and managing data sets. It really just should all be taken care of for them and automated.
And that's what our platform, Chariot, aims to do. So really just allowing data scientists to focus on data science.
And then the other big aspect of Chariot is we want it to go wherever our clients are. So if we have a client that, say, is already hooked into a cloud provider such as GCP, or if they're multi-cloud or hybrid or completely on-prem, our platform can deploy there and has the same low-code, no-code interface, regardless of the underlying infrastructure.
Now with this stack, in Chariot, our customers can train, deploy, and monitor models in a very low-code, no-code way. And that deployment piece is taken care of by NeuralMagic and C3D.
So models are running on CPU, and especially with the C3D instances compared to the earlier generations, you're not sacrificing any speed by moving those models off of GPU.
And so that really just saves our customers money and allows them to train more models.
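To make that CPU deployment path concrete, here is a minimal sketch using NeuralMagic's open-source DeepSparse Python API; the task and the SparseZoo model stub are illustrative assumptions, not Chariot internals:

```python
# Minimal sketch: CPU-only inference with DeepSparse (no GPU required).
# The SparseZoo stub below is illustrative (a sparsified BERT sentiment model);
# substitute any sparsified ONNX model you have.
from deepsparse import Pipeline

pipeline = Pipeline.create(
    task="sentiment-analysis",
    model_path="zoo:nlp/sentiment_analysis/bert-base/pytorch/huggingface/sst2/pruned90_quant-none",
)

# Run inference on ordinary x86 cores.
print(pipeline(sequences=["Deploying on CPU kept our costs down."]))
```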
Jay Marshall: At its core, NeuralMagic is an ML optimization company. We help customers optimize their machine learning models and then run those models as performantly as possible on the underlying hardware.
Now the latest Google Cloud C3D instances, powered by AMD EPYC™ processors, bring even more performance than the prior-generation N2D instances at, ultimately, a lower cost.
And Google Cloud's own internal testing found 2x gains on NLP models like BERT and up to 3x gains on computer vision models like YOLO and ResNet in gen-over-gen comparisons.
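One way to sanity-check a gen-over-gen comparison like that is to time the same ONNX model with the DeepSparse engine on an N2D instance and again on a C3D instance; the model path and iteration count below are placeholders:

```python
# Rough latency-comparison sketch: run this unchanged on each instance type.
# "model.onnx" is a placeholder for any sparsified model (e.g. ResNet or BERT).
import time
from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs

onnx_filepath = "model.onnx"  # placeholder path
batch_size = 1

inputs = generate_random_inputs(onnx_filepath, batch_size)
engine = compile_model(onnx_filepath, batch_size)

# Warm up once, then time a fixed number of runs.
engine.run(inputs)
start = time.perf_counter()
for _ in range(100):
    engine.run(inputs)
elapsed = time.perf_counter() - start
print(f"avg latency: {elapsed / 100 * 1000:.2f} ms")
```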
NeuralMagic's DeepSparse can be deployed as a virtual machine image, as a container on Google Kubernetes Engine, or even on physical hardware, like Google Distributed Cloud.
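For the container path, an in-cluster client simply POSTs to the DeepSparse Server's HTTP endpoint; the Kubernetes service hostname, port, and route in this sketch are assumptions for illustration, not a documented deployment:

```python
# Hypothetical client for a DeepSparse Server pod behind a GKE Service.
# The hostname, port, and /predict route are illustrative assumptions;
# check your server's configuration for the actual endpoint.
import requests

SERVER_URL = "http://deepsparse.default.svc.cluster.local:5543/predict"

resp = requests.post(
    SERVER_URL,
    json={"sequences": ["CPU inference without the GPU bill."]},
)
resp.raise_for_status()
print(resp.json())
```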
And by bringing this kind of performance to CPU-based architectures like the C3D instances on Google Cloud, NeuralMagic is giving customers choice: deployment on the ubiquitous x86 processors that they already know how to manage.
This choice, and the operational efficiencies that come with it, move us one step closer to the true democratization of AI.