Video
Striveworks and Neural Magic Partner to Disappear MLOps
Discover how seamless AI workflows and efficient resource management are transforming model deployment across cloud, on-prem, and edge environments.
Is MLOps Disappearing? (Featuring Striveworks)
Transcript
(Automated transcription)
Jay Marshall:
So, starting with something that's sparse and then training in a way to preserve the sparsity. And then the data scientist, you know, selects their hyperparameters: what batch size they want, what optimizer, sampling strategy, the dataset to train on, obviously, and so forth.
Hey, everybody. I'm Jay Marshall with Neural Magic. Today I'm joined by Eric Korman, one of the cofounders of Striveworks, another great startup that we partnered with earlier this year. Thanks for joining me today, Eric.
Eric Korman:
Yeah, thanks for having me.
Jay Marshall:
So, at Neural Magic we've talked a lot lately about what we call the three F's: fast, frictionless, flexible. And for us, we're usually talking very specifically about optimizing models and then running those models in a very efficient way on regular CPU architectures. I know for you all at Striveworks, and Chariot specifically, you're delivering those same value props but extending them across the entire life cycle. So why don't you take a second and tell our audience here a little bit more about what you all do at Striveworks.
Eric Korman:
Yeah, sure. So, our core offering is Chariot, which is an end-to-end MLOps platform. And it's especially well-suited for deep learning.
So, think tasks in computer vision like object detection or image classification, and also natural language processing tasks. It takes you, in a very low-code, no-code way, through the whole model development life cycle: from getting data into the system, centrally stored and versioned, to getting it annotated for supervised training tasks, to training models on that data, to then cataloging, deploying, monitoring, and refining those models. And we do that with a very strong emphasis on your second 'F,' frictionless. To us, frictionless is the idea of 'Disappearing MLOps.' All that infrastructure is under the hood, so data scientists don't need to worry about setting up device drivers or getting their libraries in order. They can just say, in a very declarative way, how they want to train something in terms of model architecture and datasets, and that kicks off this whole process-as-code.
Jay Marshall:
Yeah, and I love that process-as-code term. My background is in enterprise architecture and cloud architecture with some of the mega Cloud providers, so I lived through that whole explosion in the 2010s around infrastructure-as-code and, obviously, CI/CD and DevOps. I love that 'Disappear MLOps' tagline.
Why don't you share some of the specific challenges, or even examples, that you've seen with folks in terms of helping that happen? That sounds like it'd be really exciting for folks in this space, especially right now.
Eric Korman:
Yeah, definitely. And it's definitely taken a lot of engineering work on our side to make this happen, to make this disappearance happen. One of the biggest issues is actually resource management.
So, one of the benefits of our platform is the diverse set of environments it can be deployed to. Yes, we deploy to the major Cloud providers, but we can also deploy on-prem. We have a lot of customers that are interested in that because, for example, their data is sensitive. And there, you're dealing with a very finite number of resources, and an easy way for MLOps to not be disappeared, to be very apparent, is if your model can't train because there are no GPUs available, right? So what's key for us in this partnership is that we can offload all of the model inference to CPU, which Neural Magic's software allows us to do. Then we can reserve the GPUs mostly for training.
And that really efficient use of resources really helps in the disappearing of MLOps.
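To make that CPU offload concrete, here is a minimal sketch of running inference with Neural Magic's DeepSparse runtime on a sparsified model pulled from the SparseZoo. The task and SparseZoo stub are illustrative assumptions for this example, not the exact models or settings Chariot uses.

```python
# Minimal sketch: CPU inference with the DeepSparse runtime.
# The task and SparseZoo stub below are illustrative assumptions,
# not the exact configuration Chariot ships with.
from deepsparse import Pipeline

# Create an inference pipeline backed by a sparsified ResNet-50 from the SparseZoo.
pipeline = Pipeline.create(
    task="image_classification",
    model_path="zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95-none",
)

# Run inference on ordinary CPU hardware; no GPU is required.
predictions = pipeline(images=["sample.jpg"])
print(predictions.labels, predictions.scores)
```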
Jay Marshall:
And again, this is also why we were looking forward to doing this short-but-sweet video, because I think a lot of times when we talk about getting the performance that we get on CPUs, it's really not about not having GPUs. It is, again, that flexibility and the ability, whether you're on the public Cloud, in your private data center, or, as you say, at the Edge, anywhere there's x86 or ARM compute, to squeeze that performance out and run it.
And so I love the fact that you're automating, again, the rest of that stack. Maybe you can give a quick example of what that end-to-end looks like. We have these optimized models that we build here at Neural Magic and offer up in what we call our SparseZoo; how does that show up in Striveworks? How does that work?
Eric Korman:
Yeah, so the standard process of training a model in a way that can be deployed by Neural Magic's software begins, in a very declarative way, with the data scientist just saying what they want to happen. There's a dropdown; they select what architecture to use. These are all architectures available through your SparseML library.
Then clicking go/start kicks off the process. The system will see what compute is available to run the training job and go ahead and train that model. Metrics like live training loss are reported back to the user so they can monitor the progress of training. Once the data scientist is satisfied with a model checkpoint, they can elevate it to Chariot's Model Catalog. That's now usable as an inference endpoint you can just post data to, and the main mechanism we look to deploy that through is Neural Magic, namely the DeepSparse runtime, so it can be deployed on CPUs without having to sacrifice any inference speed.
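For a rough sense of what that recipe-driven, sparsity-aware training looks like at the library level, here is a minimal SparseML PyTorch sketch. The model, toy data, and recipe path are placeholders for illustration; in Chariot this is all wired up automatically behind the declarative UI.

```python
# Rough sketch of recipe-driven, sparsity-aware training with SparseML's
# PyTorch integration. The model, toy data, and recipe path are placeholders.
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet50
from sparseml.pytorch.optim import ScheduledModifierManager

# Toy data stands in for the dataset a Chariot user would select.
data = TensorDataset(torch.randn(8, 3, 224, 224), torch.randint(0, 1000, (8,)))
train_loader = DataLoader(data, batch_size=4)

model = resnet50()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# The recipe declaratively describes the pruning/quantization schedule;
# "recipe.yaml" is a placeholder path.
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=len(train_loader))

for epoch in range(int(manager.max_epochs)):
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(inputs), labels)
        loss.backward()
        optimizer.step()

# Remove training-time hooks before exporting the model for deployment.
manager.finalize(model)
```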
Jay Marshall:
So, I know we're going to be doing a lot of this stuff over the upcoming months and quarters. For both of us, we're doing things as it pertains to Edge device compute, and obviously LLMs are all the rage. So I'm looking forward to doing a lot more work together.
Eric Korman:
Yeah, likewise.
Jay Marshall:
Thanks so much for joining us again today. Reach out to us at www.neuralmagic.com or www.striveworks.com to keep up with everything we're doing together. Thanks so much.
Video Summary
In this video, Jay Marshall of Neural Magic interviews Eric Korman, cofounder of Striveworks, to discuss their collaboration and shared focus on efficient AI model deployment. They highlight Striveworks' Chariot, an end-to-end MLOps solution designed for deep learning tasks like computer vision and natural language processing. The incorporation of Neural Magic into Chariot means data scientists can offload all model inference to CPU, reserving GPUs for training. This resource efficiency is another step toward the goal of “disappearing MLOps.”
The Striveworks MLOps Ecosystem
Striveworks has relationships and deep integrations with many of the best organizations in the technology landscape. See our MLOps ecosystem.
Related Resources
Google Cloud C3D VMs Powered by AMD Make AI Accessible for Everyone
Watch Striveworks cofounder Eric Korman discuss Valor, the first-of-its-kind, open-source AI model evaluation service.
Striveworks and Neural Magic Mentioned in AMD's Advancing AI 2024
Striveworks and Neural Magic are running inference workloads on CPUs, saving money without sacrificing speed.