Video

Striveworks and Neural Magic Partner to Disappear MLOps

Discover how seamless AI workflows and efficient resource management are transforming model deployment across cloud, on-prem, and edge environments.

Is MLOps Disappearing? (Featuring Striveworks)

Transcript
(Automated transcription)
 
Jay Marshall:
Hey, everybody. I'm Jay Marshall with Neural Magic. Today I'm joined by Eric Korman, one of the cofounders of Striveworks, another great startup that we partnered with earlier this year. Thanks for joining me today, Eric.
 
Eric Korman:
Yeah, thanks for having me.
 
Jay Marshall:
So, you know, at Neural Magic we've talked a lot lately about what we call, kind of, the three F's. So, fast, frictionless, flexible. And, for us, we're usually talking very specifically about optimizing models and then running those models in a very efficient way on regular CPU architectures. I know for you all at Striveworks, and Chariot specifically, you're kind of delivering those same value props, but extending them across the entire life cycle. So why don't you take a second and maybe tell our audience, here, a little bit more about what you all do at Striveworks.
 
Eric Korman:
Yeah, sure. So, our core offering is Chariot, which is an end-to-end MLOps platform. And it's especially well-suited for deep learning. 
 
So, think tasks in computer vision like object detection or image classification, and also natural language processing tasks. And it takes you, in a very low-code, no-code way, through the whole model development life cycle: from getting data into the system, centrally stored and versioned, to getting it annotated for supervised training tasks, to training models on that data, to then cataloging, deploying, monitoring, and refining those models. And so, with a very strong emphasis on your second 'F,' frictionless. To us, frictionless is the idea of kind of 'Disappearing MLOps.' All that infrastructure piece is kind of, you know, under the hood, and data scientists don't need to worry about, you know, setting up device drivers and getting their libraries in order. They can just, in a very declarative way, you know, say how they want to train something in terms of, you know, model architecture and data sets, and that kicks off this whole, sort of, process-as-code.
 
Jay Marshall:
Yeah, and I love that process-as-code term. I know for myself, my background is kind of enterprise architecture/cloud architectures with some of the mega cloud providers, and, you know, that whole explosion in the 2010s around infrastructure-as-code and obviously CI/CD and DevOps. I love that 'Disappear MLOps' tagline.
 
Why don't you, maybe, share some of those specific challenges or even examples that you've seen with folks in terms of helping that happen, because that sounds like it'd be really exciting for folks in this space. Especially right now.
 
Eric Korman:
Yeah, definitely. And there have definitely been a lot of, you know, engineering challenges on our side to make this happen, to make this disappearance happen. And, you know, one of the biggest issues is actually resource management.
 
So, one of the benefits of our platform is the diverse range of environments it can be deployed to. So, yeah, we deploy to the, you know, major cloud providers, but we also can deploy on-prem. We have a lot of customers that are interested in that because, you know, their data is sensitive, for example. And so, there, you know, you're dealing with a very finite number of resources, and so an easy way to have MLOps not be disappeared, to be very apparent, is if, you know, your model can't train because there are no GPUs available, right? And so, you know, what's key for us in...
So, starting with something that's sparse and then training in a way to preserve the sparsity. And then the data scientist, you know, selects their hyperparameters: what batch size they want, what optimizer, sampling strategy, the data set, obviously, to train on, and so forth.
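As a rough illustration of training in a way that preserves sparsity, the sketch below fine-tunes an already-pruned PyTorch model while masking out the zeroed weights so they stay zero. This is a generic, minimal example, not Chariot's or Neural Magic's actual training code; the model, data, and hyperparameters are placeholders.

# Minimal sketch: fine-tune a pruned model while keeping pruned weights at zero.
# Generic PyTorch only; not the platform's actual training loop.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Assume the model arrives already sparsified; record a 0/1 mask per weight matrix.
masks = {
    name: (param != 0).float()
    for name, param in model.named_parameters()
    if param.dim() > 1  # weight matrices only, skip biases
}

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    inputs = torch.randn(64, 128)              # placeholder batch
    targets = torch.randint(0, 10, (64,))      # placeholder labels

    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()

    # Re-apply the masks so weights pruned to zero stay zero after each update.
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])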
 
And then clicking, you know, go/start, that starts the process. And so the system will go see what available compute there is to run the training job and go ahead and train that model, and, you know, metrics like live training loss, things like that, are reported back to the user so they can monitor the progress of the training process. And then, once the data scientist is satisfied, you know, with the model checkpoint, they can elevate that to Chariot's Model Catalog. And so, that's now usable as an inference endpoint that you can just post data to, and the main mechanism we look to deploy that through is Neural Magic, namely the DeepSparse runtime, so that it will be deployed on CPUs without having to sacrifice any inference speed.
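For the deployment side, below is a minimal sketch of serving a sparsified model on CPU with the DeepSparse runtime. It assumes the deepsparse Python package and its Pipeline interface; the SparseZoo model stub and image path are illustrative placeholders, not a specific Chariot deployment.

# Minimal sketch: run CPU inference on a sparsified model with DeepSparse.
# Assumes `pip install deepsparse`; the model stub and image path are placeholders.
from deepsparse import Pipeline

# Load a pruned image-classification model (here, a SparseZoo stub for illustration;
# in practice this would be the model promoted from the platform's model catalog).
pipeline = Pipeline.create(
    task="image_classification",
    model_path="zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95-none",
)

# Post data to the pipeline much like hitting an inference endpoint.
result = pipeline(images=["sample.jpg"])
print(result.labels, result.scores)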