Video
Disappearing MLOps: Striveworks and AWS Webinar
Discover how the Striveworks Chariot platform, combined with AWS, allows data science teams to seamlessly integrate model retraining and monitoring into existing production systems, eliminating the slowdowns caused by ongoing updates and maintenance in the model development cycle.
Disappearing MLOps: Seamless Integration of Striveworks' Chariot and AWS

Transcript
(Automated transcription)
Host:
Hello. Good afternoon and welcome to today’s session, “Disappearing MLOps: Seamless Integration of Striveworks’ Chariot and AWS,” presented by the AWS US Fed Partners and Solutions Architect team. Please note that all attendees are in listen-only mode. Feel free to drop your questions into the questions panel, not the chat box, and those questions will be addressed during the Q&A session at the end. Lastly, this session is being recorded, and links will be provided at a later date for on-demand viewing. To get the session kicked off, please welcome our moderator, Fazal.
Fazal Mohammed (AWS):
Thanks, John. Good afternoon, everyone. Welcome to this exciting session on disappearing MLOps. I am Fazal Mohammed, a principal solutions architect with AWS in the Growth Strategy and Partners Group, providing thought leadership on strategy and leading partner transformation and modernization efforts. I have 25+ years of technical and management experience leading and delivering complex enterprise architectures. I have led several large projects implementing IT strategy, architecture advice, implementation, and operations management across civilian and DOD agencies. And with that, I have a couple of esteemed guests with me today, James Rebesco from Striveworks and James Skelton from Maxar, and I would like to pass on the mic to James Rebesco first to give a brief intro, James.
Jim Rebesco (Striveworks):
Yeah, thanks. Appreciate it. And I guess we’re not making it easy on you by having the same name here, but I’m Jim Rebesco. I’m the CEO and one of the cofounders of Striveworks. We’re an MLOps company, and our vision really is to find ways of disappearing MLOps to make it the kind of boring plumbing that helps end users get machine learning to assist them in making timely, accurate, and effective decisions. Before my time here at Strive, I worked at a FinTech company, which we took public in 2016, and prior to that I got a PhD in computational neuroscience, which I really enjoyed but really no longer use in my day-to-day life.
Fazal Mohammed (AWS):
Awesome, thanks. James or JimBob?
JimBob Skelton (Maxar):
Yeah, we’ll just go for JimBob for distinction. I’m James “JimBob” Skelton. I am a product manager for integrated analytics with Maxar Technologies. I’ve been with Maxar for about 11 and a half years now, most of that time spent on site within a special operations command supporting various Maxar efforts or projects in that space. But now, as a product manager for integrated analytics, a lot of my focus, my main baby, is our maritime monitoring capability, but also pulling together a lot of different Maxar capabilities and projects to tailor them to specific customer use cases, utilizing what we do best as a company, which is not our satellite imagery but how we interact with that imagery to provide valuable content insights to those who need it.
Fazal Mohammed (AWS):
Awesome. Thanks, James. JimBob is what we’ll go with, and Jim for James Rebesco here, so we’ll deconflict that for our webinar. So with that, I think we can jump straight in. And just for the audience here, please get your questions into the Q&A box, and we’ll be monitoring those so that we can interject some of them in the middle or at the end. We’ll leave some time to answer those. We definitely want to hear back from you throughout the session, as this topic is going to go through a few topic areas, and we want to do justice in the next one hour and cover them all. But as you can imagine, it’s going to be hard, so we’d definitely like to keep the conversation going from here on. So, all right, jumping straight in here, Jim, we have the title disappearing MLOps. Before we disappear it... what is MLOps, and why is it important for the current landscape of machine learning?
Jim Rebesco (Striveworks):
Yeah, absolutely, and that’s a great way to start the question. What is it that we’re trying to disappear? And so when we talk about MLOps, we’re talking about managing the life cycle of the development, deployment, and management of AI/ML models, or statistical models in general. And so, you’re talking about, kind of, three things. You’re talking about people, you’re talking about process, you’re talking about the kind of technology platform that really enables those first two elements to operate at speed. So, when we talk about MLOps, the fundamental “so what” that we’re really addressing is that models are statistical descriptions of the world, and as the world changes, those models need to change or be modified to keep pace and ensure performance. And so, it borrows a lot of the same concepts from the DevOps world of software, in the sense that we need to have a structured process and flow for developing software or developing models, but then on top of it, we’re adding this new wrinkle. Where models and software really deviate is that once I put a piece of software out there into the world, it has a fairly deterministic behavior, especially if I’ve written it well, hopefully.
Whereas with models, ultimately at the end of the day, they’re statistical entities, and so as the world and as the data that’s being put through that model change, I expect to have a change in performance, and quite often that’s a degradation. So, we need to embrace, once we put that model into production, when we talk about MLOps, we also need to embrace that notion that as the world changes, my model is going to need to change with it.
Fazal Mohammed (AWS):
That’s awesome. I think you brought up DevOps, and that’s leading into my next question. So, as you said, the software doesn’t change as often as maybe the data does. With that tieback, what other differences are there between DevOps practices and MLOps practices?
Jim Rebesco (Striveworks):
Yeah, that’s a great question, and from my perspective, left of deployment, as we’re building and training the model, there’s a lot of similarity I think between how we think about DevOps and how we think about MLOps. Where things get really interesting, I think, is post-deployment. So, when I deploy a piece of software, obviously I’ll want to add features, I’ll want to fix bugs, but really my core monitoring of that software is going to be focused on what I’ll call essentially infrastructure and resource management. Really, my monitoring of a software application is going to be focused on: is its heart beating, is it alive, is it behaving? Whereas with a model, I’ve got to ask I think a much more complicated question, which is not just is it functioning, but is it functioning in an environment that matches my expectations, when I trained the model, of what that environment would be?
So, if I create a piece of software, I don’t really need to ask as rich of a question in terms of what data’s flowing into it from a statistical perspective and what data’s flowing out of it. But that’s really, I think, one of the core disciplines of MLOps. Am I monitoring my inputs? Are my inputs matching the world that this model was trained on? Am I monitoring my outputs? Are they matching what I expect to come out of this model? When that’s no longer the case, what actions do I need to take? Do I need to retrain the model? Do I need to fix inferences that have been coming out of this model? Do I need to alert a human being? That kind of post-production element, I think, is one of the key differentiators between an MLOps practice and a more traditional software DevOps practice.
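To make that monitoring question concrete, here is a minimal, hedged sketch of an input-distribution check of the kind Jim describes. It is not Chariot’s implementation; the single numeric feature, the two-sample Kolmogorov–Smirnov test, and the alert threshold are all assumptions chosen for illustration.

```python
# A minimal, illustrative input-drift check of the kind described above.
# Assumes a single numeric feature; the test and threshold are placeholders,
# not any specific platform's monitoring implementation.
import numpy as np
from scipy.stats import ks_2samp

def input_drift_alert(train_feature: np.ndarray,
                      prod_feature: np.ndarray,
                      p_threshold: float = 0.01) -> bool:
    """Return True when production inputs look statistically different from training inputs."""
    _statistic, p_value = ks_2samp(train_feature, prod_feature)
    return p_value < p_threshold  # small p-value -> the two distributions likely differ

# Example with synthetic data standing in for real training and production pipelines:
rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=5000)  # the world the model was trained on
prod = rng.normal(loc=0.7, scale=1.0, size=1000)   # the world after it has changed
if input_drift_alert(train, prod):
    print("Input drift detected: consider retraining, fixing inferences, or alerting a human.")
```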
Fazal Mohammed (AWS):
Copy. So, just to paraphrase that a little bit, so MLOps is going to stand on top of all the best practices that DevOps has developed. So, it’s not trying to reinvent the wheel from scratch, but it’s an addition to all the goodness that you built with your DevOps practices. Is that a fair assumption?
Jim Rebesco (Striveworks):
Yeah, I think that’s just kind of a wonderful synopsis of it, where, as we’re writing this code, we’re going to follow the same process of development, versioning, monitoring, what’s going on within that. We’re going to deploy code through automated pipelines and processes. We’re going to monitor that code in production, but we’re also going to ask ourselves the questions that we talked about there from a modeling perspective. That’s right.
Fazal Mohammed (AWS):
Nice. So, now that we’ve talked about MLOps as a pipeline, what are the key stages of an MLOps pipeline?
Jim Rebesco (Striveworks):
So, to double click on it, what we’re really talking about is how do we turn data into a model? How do we put that model into production, and then how do we put new real-world data against that model and get an output that impacts our organization, our business? So, you’ve got to support things like the ability to ingest data, to prepare and clean data. Many folks on the line have probably heard about the ideas of labeling or annotating data if they’re going to build a supervised learning model or a deep learning model. Then you’ve got to train it. You’ve got to manage the infrastructure and resources, whether it’s GPUs or other pieces of infrastructure, and bring them together to train the model. You want to run model experiments. You want to be able to say, “Hey, I’m going to make a few tweaks.”
Which of these five models is probably going to perform the best? Once you get that, you’re going to probably need to validate the model, say, “Hey, whether it’s some holdout data, some kind of business metrics or something else, before I put this model into production, does it really do what I want? Can I trust it, and is it going to perform at a level that meets the outcomes that I need out of my business?” Then, once you’ve done that, how do you deploy it? I’m a data scientist by training, so for me, I’ve always wanted to just dust my hands off and walk away once the model’s trained, but the reality is the hard work is actually just beginning. You need to be able to put that model into a production environment, be able to monitor how it’s performing from a resource and infrastructure perspective, as well as from a statistical or modeling perspective as I’ve touched on previously.
Once it’s there, then you need to be able to make sure that you can maintain audit, explainability, and governance. Is this model being used in an appropriate manner? Is it touching the right data? And then, once all that’s said and done, the last component of this life cycle is that remediation. The world will change, and if you are doing the stuff that JimBob and his team at Maxar are doing, the world can change very rapidly. And when it does, you need to have the automated processes and systems in place to put a hand up and say, “Hey, this model is no longer performing at an optimal level. What do we need to do? Do we need to retrain it? Do we need to fix inferences? Who do we need to notify within our organization?” That was a bit of a long answer, but all those elements really truly are important as we think about this full MLOps life cycle.
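For readers who want to see those stages laid end to end, here is a purely schematic sketch of that life cycle. Every function name and placeholder body below is an assumption made for illustration, not any platform’s actual API.

```python
# A schematic sketch of the life-cycle stages listed above: ingest, prepare and label,
# train, validate, deploy, monitor, and remediate. All functions are named placeholders.

def ingest(source):            return f"raw data from {source}"
def prepare_and_label(raw):    return f"labeled({raw})"
def train(dataset):            return {"name": "ship-detector", "version": 1, "data": dataset}
def validate(model):           return True   # e.g., holdout metrics meet business thresholds
def deploy(model):             print(f"serving {model['name']} v{model['version']}")
def monitor(model):            return {"healthy": True, "drift": False}  # infra + statistical checks

def lifecycle(source):
    dataset = prepare_and_label(ingest(source))
    model = train(dataset)
    if not validate(model):
        raise RuntimeError("model failed validation; do not deploy")
    deploy(model)
    status = monitor(model)
    if status["drift"]:
        # Remediation: retrain on fresh data, fix downstream inferences, or notify a human.
        lifecycle(source)

lifecycle("s3://example-bucket/imagery")  # hypothetical data source
```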
Fazal Mohammed (AWS):
That’s great, actually. Now I definitely want it to disappear based on that.
Yeah, it seems like a lot of work.
Just kind of pulling the thread a little bit before we make it disappear: it seems like with DevSecOps, we have software builders mostly operating at that level, but from what I heard you say, there are a lot of other personas that are part of the MLOps pipeline, like data scientists, who might not have the same skill sets that a DevSecOps engineer or a software developer might have, because they’re dealing with data. And as you said, the full life cycle of iterative loops, with people all over the enterprise doing this, needs a little bit of structure so that things aren’t falling through the cracks and so that your process for the final inference is up to industry benchmarks and your enterprise needs. So any comments on that?
Jim Rebesco (Striveworks):
No, I think that’s really a terrific point, and you hit the nail on the head as far as the very multidisciplinary team that really needs to come together to make an effective AI or ML practice within your organization. But I’d add one more person to that list, and that’s the subject matter expert or the end user. So, data science is very much, in my view, an empirically driven world. If I’m here as a physicist saying F = ma, and you can give me one example where that’s not true, that’s a fundamental challenge to this law of nature that I’ve tried to put forth to you. But if I say, “Hey, I built a statistical model,” and you show me a data point that doesn’t agree with it, I say, “That’s an outlier,” and how we behave or how we react to that outlier is going to be very dependent on the context and the business objectives.
If we have an outlier in a system that tries to count the number of empty parking spaces in a parking lot, we may just shrug and say, “Hey, we were off by one, but that’s okay. That happens.” If we have that same outlier in a self-driving car, we’re probably going to say that’s simply unacceptable, and this vehicle shouldn’t go on the road until that’s dealt with or fixed in some way. The “so what” of that is that, in addition to your software developers, your data scientists, your ML engineers, I think it’s really important in the AI life cycle to get those end users into that team early and often so they can give you that critical context and steer: not just “Hey, is this an accurate model?” but “Is this a model that’s really going to do what we specifically need to do within this business context?”
Fazal Mohammed (AWS):
Yeah, that’s awesome. I think, just in practice, as you’re going through this, the different personas that are going to get plugged into this ecosystem are just going to keep growing all the way until it reaches the end user. So, that leads us to our next question around best practices. Now that we have seen it could get labor intensive, it could get real hairy as a lot of these things happen, how do you control this, right? This is where you’re going with this whole topic around disappearing MLOps: you’re going to provide structure and predictability around this process. So now that we’ve seen the guts of it, let’s see how to abstract it.
Jim Rebesco (Striveworks):
Yeah, a hundred percent. And this is one of those things where there’s no such thing as a free lunch, but I think we come awfully darn close to it, because you’ve got these two kinds of complementary challenges that you just touched on there. One is how do we maintain control, observability, and oversight over this process? And then the other part of it is how do we make it as efficient as possible? And again, I think one of the good news stories is that both those things kind of converge to the same pain points and ultimately the same solution. So what is it? I like to use the example of YouTube, and so we’ve all used it, and I like to joke and say, “Hey, do you remember when you logged onto YouTube for the first time and they had you annotate a dataset for 15 hours before they started serving you videos?” Of course not.
What’s really effective in that kind of vignette and that user story is the fact that a capability like YouTube is serving an end user content: a consumer. And it’s also, kind of behind the scenes, able to keep custody and track of what content is being served to that user and what behaviors he or she took. What videos did you click on? How long do you watch it for? What didn’t you click on? And so, when you look at the MLOps process, a lot of times where you see the real pain happening is not in kind of putting a model into production or training a model. It’s as the world changes, trying to provide that incremental update and you’ve got to go into production data systems, and now you’ve got data scientists who are kind of trying to hand jam or wrangle data to figure out how to create a new training set to incrementally improve the model.
And that’s where the stresses come in. That’s where you’ve got very skilled data scientists spending a lot of time doing manual data manipulation and data entry, where not only are you kind of losing custody over what they’re up to, but it’s probably not the best use of their time. And so, to tie this back to the analogy with YouTube, I think the way you can resolve this first and foremost is by putting a really heavy emphasis on being able to capture, on platform, how users interact with and how users consume the outputs of the inferences from a model, and using that to really guide that incremental update or that incremental modification of that model. And so the good news story here is that when you pull that off, as the folks at Maxar have done, what you end up getting is a real AI-powered set of applications where, while there’s a lot of MLOps going on under the hood, that’s not something that you’re really challenging or tasking your end users to be involved in day to day. They just see a system that works better, and they don’t have to worry about how it got there. And I think that’s one of the good news stories: as you automate those data pipelines and those processes to turn production data into training data, you maintain custody, you maintain observability, but you also do it for free, and that’s kind of a win-win.
Fazal Mohammed (AWS):
Awesome. I know we’ll delve a little deeper in the next section when we talk about your platform, but before we get there: you talked about this whole life cycle and how Chariot specifically is disappearing MLOps. We’ll talk about it, but what are some of the main issues that you are seeing with keeping a model long term in production? How should companies think about that? What is even long term? How do they manage data quality? What are your thoughts around that whole topic, now that the model is out there solving the needs of the customer?
Jim Rebesco (Striveworks):
Yeah, that’s a great question. And so we’ve kind of hinted at it, but let’s pull the thread on this notion that, hey, the world changes, and when that does, your models need to too. So, what we’ve seen is, I think, two core stories. It basically comes down to use case, one, and data, two. So, let’s talk about the second one first. So, data: that’s pretty simple. I’m looking at the Suez Canal, and I need to know how many big container ships are stacked up there, unable to get through because of, I forget the name of the vessel, but the one that got caught sideways there.
Great. So, once I need to be able to count something like that or perform something like that, that may become interesting somewhere else. Maybe that’s interesting, hopefully not, but maybe the same thing happens in the Panama Canal three months later. And so, now am I looking at the same types of ships? Is the background sufficiently different that it’s degrading my model performance? The world’s changed slightly, and so my use case has changed slightly, and now I need to kind of modify that model, right? That’s one big one. The other big one is when you kind of give people a taste of this, they realize now, hey, there’s a lot more data probably in the world than I as a human being have been able to analyze. And once they see the ability of an AI system to be able to truly augment their workflows and kind of scale that up, that kind of second and third and fourth question tends to flow really rapidly.
Like, “Hey, now that this ship thing is working in the Suez Canal, can I count transits in, I don’t know, the port of LA? Can I look at semi trucks that are waiting to offload from a railroad or whatever it is?” So, what you get is this really positive feedback loop where, now that I’m not stuck constantly fixing this first model, that also lets you say, now I can build a second, and a third, and a fourth, and you can really scale horizontally across your organization efficiently.
Fazal Mohammed (AWS):
No, that’s—
JimBob Skelton (Maxar):
Awesome. I think…
Fazal Mohammed (AWS):
Go ahead.
JimBob Skelton (Maxar):
Yeah, just to dive in on the use case, and the use case tied to geography from a computer vision machine learning concept: a potential problem or issue that has to be resolved, and resolved quickly, and that could be addressed through disappearing MLOps is, say, we’re successfully executing this model, looking at this particular use case in the Caribbean or off the Galapagos Islands, for instance. But if that use case changes, if it shifts and we need to look somewhere up in the Arctic, more than likely it will require some retooling. The model won’t necessarily perform as well as it did before. So, how do we build the ecosystem, the environment, where that could be addressed very quickly and at that scale? Because there are different satellite or imagery conditions that you could expect in the Arctic versus what you would off the coast of the Galapagos. So, those are the types of scenarios that I think go into your question: they have to be adaptable because things like this will change. Your different customers will have different use cases. Their use case might be the same but in different geographies. So you’ve got to be able to pivot and adapt quickly with your AI/ML infrastructure and organizational flow.
Fazal Mohammed (AWS):
Yeah, I think that’s a great point, JimBob. I think a lot of organizations struggle: maybe the prototype is successful, but thereafter, scaling it out and keeping it fresh for the next versions to come, or even extending those use cases, is really hard to do. It comes down to the people who did it the first time, and that’s not how you scale. You need to democratize this and automate this. So, I think this is a great juncture to switch over to describing what Chariot does. I’ll put that slide up here, Jim, if you want to spend some time describing the Chariot platform and how it achieves end-to-end MLOps.
Jim Rebesco (Striveworks):
Yeah, absolutely. And so we’ve talked about, I think, the core MLOps workflow and value proposition, which is that notion of how do you clean data, how do you label it, how do you prepare it for training, training models, serving models, all that good stuff. And so, what I’d really like to double click on, in addition to that, maybe going a click deeper for the technical folks in the audience, is being able to capture the way users interact with an AI system and piping that back into your MLOps framework. In our view, that’s where the magic happens. And so, going a click deeper there I think might be interesting to the folks on the line. And so, what does that mean? First and foremost, I think it really means that we’ve got to talk about three things.
We got to talk about the ability to integrate. So, if you are simply locked into a very prescriptive data science environment, you’re not going to be able to integrate neatly with where customers want to deploy applications. And that can be in cloud, like AWS. That can also be augmenting that by going to the edge, whether it’s with a Snowball or something like that or everything in between. The really cool part is once you get that, then you can start to build logic on top of that. So, every time a user interacts with an inference, whether it’s, hey, here’s a movie you’d like to watch, I think this is a ship, or everything in between, can you kind of automatically collect that behavior? Can you implement business logic that lets you say, hey, by behavior, this person liked that movie, by behavior, this person validated that that was a ship or not. Collect that because that data is gold for your model retraining process; automate that retraining.
So, you don’t need a data scientist to come back. It’s not a Jira ticket; it’s just simply a process that occurs through the life cycle of the model. And then you get to the last really interesting thing, I think, about Chariot, which is that we care a lot about the outputs of the models. It’s not just that we put a model into production and from there it’s good luck; we want to know where those inferences are, because we tie that to a data lineage system and a governance capability with which you can now say, “Hey, Jim’s ship model was doing great for a couple weeks, but then the world changed, and it hasn’t been doing great anymore.” So great, we’ll fix the model. All that’s happening behind the scenes. But now I’ve got one more question. And the question is: What did that model spit out over the last two weeks?
Who’s been using it? Has that been an input to other models? If I’m kind of building a bit of a knowledge graph here of the kind of downstream impacts of this model that hasn’t been performing, can I pull all that data out, or am I now looking at a table or some other kind of structured data store saying, you know what, there’s a bunch of stuff in here that’s probably wrong because I know that the model that had been writing to it wasn’t performing well, but I can’t figure out what that is, and now I’m stuck with a bunch of very difficult choices in terms of trashing the table, manual audit or manual modification of that, or just shrugging our shoulders and moving on. And so, when we built Chariot, we really focused on those kinds of post-production pain points because having lived it as a data scientist myself, that was the stuff where I found like, hey, I’m not building new models. I’m not trying new architectures.
I’m not doing any of the stuff that I, as a data scientist, have really been trained for and am expert in, because I’m constantly dealing with all this other stuff. And that was the thing that we really focused on trying to automate. So, I’ll close on this topic by just making one more point. When we talk about disappearing MLOps, we don’t mean that you don’t need data scientists or that you can automate away the ML process. Far from it. What we actually mean is we want to put that scarce, highly trained, highly skilled labor at the point of maximum efficacy and maximum impact within the organization. And typically that isn’t having them hand curate or move data files around in an ad hoc manner. That’s having them build new models, attack new use cases, and build out that AI practice.
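Circling back to the lineage question Jim posed a moment ago (what did that model spit out over the last two weeks, and where did it go), here is a small, hedged sketch of the idea. The record schema and field names are hypothetical, not Chariot’s data model; it simply shows how tying each inference to a model version and a timestamp lets you pull the suspect outputs back out instead of trashing a whole downstream table.

```python
# A hedged sketch of the lineage question above: which inferences did a degraded model
# version produce over a given window, and where did they land? Schema is hypothetical.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class InferenceRecord:
    model_name: str
    model_version: int
    produced_at: datetime
    output_table: str   # the downstream store the inference was written to
    inference_id: str

def suspect_inferences(records, model_name, model_version, since):
    """Return every inference the suspect model version produced after `since`,
    so downstream tables can be audited or repaired instead of trashed wholesale."""
    return [r for r in records
            if r.model_name == model_name
            and r.model_version == model_version
            and r.produced_at >= since]

# Example: everything "ship-detector" v3 wrote over the last two weeks.
two_weeks_ago = datetime.now() - timedelta(days=14)
records = [
    InferenceRecord("ship-detector", 3, datetime.now() - timedelta(days=3), "detections", "abc123"),
    InferenceRecord("ship-detector", 2, datetime.now() - timedelta(days=30), "detections", "def456"),
]
print(suspect_inferences(records, "ship-detector", 3, two_weeks_ago))
```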
Fazal Mohammed (AWS):
Awesome. Now, I think this is where the tiebacks come in to how AWS has helped accelerate, or could accelerate, some of our customer journeys. Our customers already have a lot of resources built on top of AWS, right? They have their applications, front-end applications, back-end data lakes. There’s a lot of data that is going to be the input to the model training. So, how can our customers who are already in AWS leverage the managed services that come with AWS, like Glue and Athena, take some of our zero-ETL approaches from these databases, and provide that to a platform like Chariot to get the best of both worlds, right? Can you talk a little bit about that aspect of synergy?
Jim Rebesco (Striveworks):
Yeah, absolutely. I mean, there’s the stuff that just kind of makes sense the minute you say it, whether it’s AWS container registries, model catalogs, or obviously the ability to just autoscale blob storage with S3, or RDS. Those things are, I hate to say industry standard, but they almost are at this point. They’re wonderful general purpose tools, and that helps anybody who’s trying to build an application or leverage a platform like Chariot do so in a really fast and efficient way. Actually, now that I say efficiency, that brings to mind one other thing that may be a little bit less familiar to the audience, which is that AWS has this wonderful feature—I think it’s called the Cost Explorer—but what’s really nice for us and for our customers is that, as we’re all aware, building models, GPU-intensive training and inferencing, and other elements of the MLOps life cycle can be expensive if you’re not measured and thoughtful in your approach.
And what’s really nice is that when we can integrate Chariot with something like that, it gives our customers a very transparent look within the MLOps framework to understand not just where they’re doing things but where they’re spending money, and whether there are things that they can do more efficiently. It really makes it a very easy click-through so that folks who are using Chariot can be very confident that they’re doing it in a cost-efficient manner. And that’s, I think, a really terrific thing that maybe folks in the audience might not be as aware of, versus something like RDS, in terms of what AWS brings to the table there.
Fazal Mohammed (AWS):
Yeah, that’s a great call out. I think, especially with our resources, that’s one of the reasons why AWS services are so popular: the undifferentiated heavy lifting of managing servers or patching them is all taken care of in AWS, right? So, now they’re already riding on top of that scalable blob storage or an RDS cluster that they don’t have to manage anymore. Now, with zero-ETL services and ETL services like Glue, you’re able to extract that data and provide it to a platform like Chariot, which can then take that and run the life cycle around the training, building, and deploying aspects of the model. I think there’s a good synergy based on all the types of workloads that our customers run on AWS.
Jim Rebesco (Striveworks):
Yeah, and I think it also really helps because it allows folks to maintain basically a competitive handle over their vendor landscape: the more you leverage those common services, the easier it is for you to work with best-in-class capabilities on top of those, with minimal friction in terms of switching or other kinds of organizational challenges that would come if you were rolling your own on those core services. Yeah.
Fazal Mohammed (AWS):
Definitely. So just in the interest of time, I definitely want to give you time to do a demo. I know this is one of the key things we wanted to highlight, the partnership between Striveworks and Maxar and how Maxar was able to leverage Chariot to build the mission capability. So, I’m going to turn over the screen to you here so you can present your screen, Jim.
Jim Rebesco (Striveworks):
Terrific. And while we’re doing that, what I’ll try to do is just give a real light scene setter and then turn it over to JimBob. Is everyone able to see my screen now?
Fazal Mohammed (AWS):
Yeah, I can see it.
Jim Rebesco (Striveworks):
Yes.
Host:
Yeah, you’re good. Excellent.
Jim Rebesco (Striveworks):
Okay, so it’s a little bit of a scene setter here, and I’m not the kind of geospatial expert JimBob is, so I’ll try to talk as little as possible and then kind of get off the X. But as I understand the core challenges, and I hit play here a bit early, whether it’s in the Pacific or elsewhere, the world is a big place, and Maxar customers are generating and consuming just an absolutely massive amount of imagery data, to the point where, as a human being, your ability or a team’s ability to consume all that data with the mark one eyeball is extremely challenging. And so, the question then becomes, if, as a human being, I want answers and I don’t want pixels, because I simply can’t look at enough pixels, how do you square the circle where these analysts who are using this data for highly consequential, highly important, mission-critical tasks have both confidence in what’s coming out of that system as well as the ability to rapidly iterate and make it a bit more focused to their need? So, JimBob, how did I do in terms of framing it? And then if you’d like, I can just hit play here and allow you to talk over the use case and the customer requirement, and then once we do that, we’ll probably loop it again, and I can double-click on some of the features. How’s that sound?
JimBob Skelton (Maxar):
Yeah, yeah, no, that’s fine. I’ll try to recall what I can from memory. I believe a lot of what we did here, and really this partnership, this marriage, is very compelling in terms of how we’re taking two different effective ecosystems for this kind of work and building them together, building that nexus. And so, what we’re looking at here, if I recall correctly, is utilizing Maxar’s maritime monitoring capability called Crow’s Nest. What we’ve done here with Chariot is integrate our API tipping function into Chariot, so the customer use case here was monitoring these vessels and testing this capability in terms of latency: not just how quickly the image could be collected once that tip was submitted out of Chariot to our satellites, but how quickly the derivative insights, the vessels here in this case, were subsequently delivered back into the Chariot platform for consumption by the end users.
We actually ran several tests, API tips, at different places in the world, a lot around Pearl Harbor. I believe there were some other locations as well that had more real-world implications in terms of what the vessels are doing, waiting for vessels or certain assets to depart particular areas. But this is one of our main functions: to provide quick, automated access to our satellites to respond to emerging events, in this case in the maritime domain. And the Chariot platform is a perfect environment for this to take place. And because of how we’re designed at Maxar, our ability to configure and work with Striveworks, in this case, to set this up was seamless, and really it took a matter of days for us to get to the point where tipping to Maxar EO satellites could occur for a subsequent collection.
Jim Rebesco (Striveworks):
Yeah, thanks for that. What I’ll double-click on as we talk through and walk through the demo is, what you’re seeing here, and I think what’s so cool about this to me, is this very interesting melding of two things. As JimBob just said, a very geospatial intelligence focused tipping and cueing platform, what y’all have done with Crow’s Nest, and then you’ve got these kind of small bumps in the wire in that process to enable a machine learning capability. And so, as we’re looking through these detections and streaming, I’ll talk a little bit about this. Obviously, I think most of the folks in the audience are familiar with these bounding boxes here as the traditional output of an object-detection computer vision algorithm, or also how a human analyst would be marking and annotating this data for downstream processing.
And so, by trying to make that notion of, hey, are you labeling data or are you doing what you need to do as an analyst anyway, by providing that integration, what you get is not just the core tipping and cueing capability, but, as we highlight here in a moment, when you go into an image like this, great, you got one wrong and you missed one. So, how does an analyst clean this data up quickly for his or her downstream tasks while also piping that back to a model? And that’s what you’re seeing here. Mark the boat that was missed, validate the one that was correct, and then ding or reject the one that is not what you’re looking for. And then when you hit that submit updates button, in a second, that’s all I need to do. There’s no more data science that’s going to be facing the end user, but Chariot’s logging that information. And as that feedback accumulates, you as a customer can pre-configure the business logic you want. Hey, 10 up votes, three down votes—that’s worth the retrain, or whatever else it is that matters to you and your business. And now you’ve gotten your data scientists off the ship detector and on to the next task while your end users are seeing improving, not decreasing, performance and efficacy over time. So, that’s really the special sauce, and I really appreciate, JimBob, you and everybody at Maxar partnering in putting that together. It’s really cool.
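As a purely illustrative sketch of that pre-configured business logic, here is one way a feedback-driven retrain trigger could look. The thresholds, the reset behavior, and the retrain hook are assumptions made for the example, not Chariot’s actual configuration.

```python
# An illustrative sketch of the retrain logic described above ("10 up votes, three down
# votes, that's worth the retrain"). Thresholds, reset behavior, and the retrain hook
# are placeholders for the example, not a specific product's configuration.
class FeedbackRetrainTrigger:
    def __init__(self, min_upvotes=10, min_downvotes=3, retrain_fn=None):
        self.min_upvotes = min_upvotes
        self.min_downvotes = min_downvotes
        self.retrain_fn = retrain_fn or (lambda: print("kicking off automated retraining job"))
        self.upvotes = 0
        self.downvotes = 0

    def record(self, validated: bool):
        """Log one analyst interaction: a validated detection (upvote) or a rejected one (downvote)."""
        if validated:
            self.upvotes += 1
        else:
            self.downvotes += 1
        if self.upvotes >= self.min_upvotes and self.downvotes >= self.min_downvotes:
            self.retrain_fn()                  # no Jira ticket, just part of the model's life cycle
            self.upvotes = self.downvotes = 0  # start a fresh feedback window

# Example: an analyst validates nine boxes, rejects three, then validates one more.
trigger = FeedbackRetrainTrigger()
for ok in [True] * 9 + [False] * 3 + [True]:
    trigger.record(ok)
```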
JimBob Skelton (Maxar):
Yeah, absolutely. It’s definitely compelling bringing these ecosystems together, I think, because we try to execute and employ similar practices with our computer vision model development and our model catalog. And then I know there’s a lot we could do with integrating with Chariot beyond what we’re doing in the maritime domain.
Fazal Mohammed (AWS):
Yeah, that’s really interesting, and thanks for the awesome demo. And as you alluded to at the end, JimBob, it was pretty quick to get started with, you got the performance that you needed out of it, and now you’re trying to take it to other domains. So, the expertise of what was proven here is now scalable. I think that’s one of the key points that we definitely want to hit home on: just like how DevSecOps practices and tools built around the ecosystem made real-time changes to code possible, that is now becoming possible at the model level. And we see some of that here as an example in action. So, that’s awesome work.
So, now that we’ve talked about a real-life use case where Chariot has been used with Maxar, integrated with their existing capability and adding functionality, there are always questions, Jim and JimBob, and maybe you’ve seen this too, around data security. So, now we are introducing another layer that is a statistical model, and the data that was used to train it is the original data. What are the best practices for securing data that’s used in training, and how do we evaluate that this data is what should have been used for training for the use case that you’re pursuing? So, thoughts on that?
Jim Rebesco (Striveworks):
Yeah, I mean, I’d say that the first and foremost thing is, like we do with production data pipelines elsewhere that we need to secure, whether that’s credit card transactions or satellite imagery, the extent to which you can automate those pipelines is one of the first critical prerequisites for security. When there is a notion that a data scientist or a group of human beings has to manually move and manipulate data at some point within your MLOps framework, I think it’s like cybersecurity, right? Human error always has to be at the top of the queue in terms of security threats and vulnerabilities. And the more you can do to automate and mitigate that, the better posture you’re going to be in as a starting point there.
Fazal Mohammed (AWS):
Gotcha. And is there any talk about MLOps with regard to privacy regulations like GDPR, HIPAA, or CCPA for California consumer protection? How is that being done?
Jim Rebesco (Striveworks):
It’s a massive question. I think it’s a rapidly evolving space. So we can walk the dog in terms of what are your entitlements to a piece of data. As a data scientist, are you entitled to access a piece of health care data or some other piece of data to train a model? The data governance world has done a phenomenal job in providing a solution there. Where things I think are going to get really interesting over the next couple years, and we’re already seeing the initial signals of this in the large language model or LLM world, is where there may be a piece of data or some pieces of data that, one, are sensitive but are a very small fraction of the overall data. And the question is to what extent does that model have the exact same sensitivities? Does it inherit all the sensitivities of the data? Some? None?
Or, I think even more challenging, if the input data individually are relatively non-sensitive pieces of data, when aggregated into a model of sufficient explanatory and predictive power, can that model generate outputs that are sensitive and might run afoul of some of those privacy concerns? That’s going to be the big challenge, I think, in data governance over the next five or 10 years. And, at the risk of being a bit of a broken record on this topic, I think one of the key elements you need to maintain there is integrity and observability through your MLOps pipeline, because as regulators, as legislators, as policy makers are figuring out what this really means, it’s going to be a shifting landscape, and we’re going to need the ability to be responsive and compliant in quick order.
Fazal Mohammed (AWS):
Gotcha. And since you touched on the topic of LLMs: I know we talked about, or showed, the use case of computer vision models being trained and used for a mission use case. So, Chariot obviously is being used for other models as well, especially as we get into fine-tuning—probably not everybody’s going to retrain an LLM from scratch unless they want to. But is that what you’re seeing: use cases of taking an existing model, from Hugging Face possibly or even somewhere else, that they want to fine-tune for their use case?
Jim Rebesco (Striveworks):
Yeah, yeah. We’re seeing this across the enterprise: a ton of people are very interested, as it relates to LLMs, in being able to maintain some overall custody of that model, whether it’s self-hosting it or hosting it through a trusted cloud provider or something else like that. But also, very clearly, where we’re at in this, the ability to fine-tune those large language and other models is, I think, going to be one of the most robust use cases, and it’s certainly something that we’re seeing. And I think that’ll maintain its centrality in the LLM space for quite some time. There’s a ton of promise there, and people are just really getting after it to see how they can tweak these models and put them into practice in their business.
Fazal Mohammed (AWS):
And just from a use case perspective, I know we have a lot of customers who use commercial AWS regions and government regions like the GovCloud regions, and then we have our higher regions, Secret and Top Secret. So, can Chariot run in all of those regions? How does that work?
Jim Rebesco (Striveworks):
Yeah, that’s right. We can and are currently running in all of those, including some of those kinds of tougher ones to get into that you mentioned at the end of that list.
Fazal Mohammed (AWS):
Awesome. I think that’s another point to reinforce here in this talk track: it is portable in that regard; you can take it to all of these regions and for different use cases that require the different data classification needs that our customers might come with.
Jim Rebesco (Striveworks):
Yeah, and I was talking about it a little while ago before we hopped on this, but I think one of the other really cool and interesting emerging use cases is that the more you can do to disappear this model management framework, the more powerful it gets in edge, disconnected, or expeditionary use cases. Previously, you had this challenge of, gosh, I really need the ability to analyze data in real time in maybe a more disconnected or resource-constrained environment than I would ideally hope for. But at the same time, when I move into that kind of edge environment, my ability to control what it is I’m going to be interested in tends to disappear too. So it’s a tough challenge: am I going to have the right model when I’m out there is ultimately what that boils down to. And it’s been great working with AWS because you’ve got those offerings and things like the Snowball where, in our view, it really lets you hit the best of both worlds, where you can take some pretty good compute horsepower and infrastructure out there, and that gives you the ability to be as dynamic with your data processing capabilities as that operating environment demands.
Fazal Mohammed (AWS):
Yeah, I think that’s a great call, even mentioning the edge side, which is growing and actually going to get more and more complex, both for military use and non-military use, especially in disaster recovery scenarios where there are models trained to find, with a drone, whether there are people alive in a disaster zone. I mean, those are really, really critical use cases to go forward and conquer, because it’s a matter of life and death there, where you’re now able to take these quick models that are fine-tuned and trained for different types of terrain that can detect, just like in your ship use case, now detect life forms so that we can have an adequate disaster response. I think that will be an awesome use case for your platform as well. So, with that, we are coming up on time. We have 10 minutes, but I definitely want to give you some time to talk about future trends. So, what do you see just based on your engagements with various customers? We saw one such engagement with Maxar, and then even possibly end customers like government customers. What are you seeing as where they’re headed? And on a macro level, what do you see as future trends in MLOps?
Jim Rebesco (Striveworks):
Yeah, I think we’re seeing two big macro trends. The first one is democratization. As AI/ML becomes more mainstream, more performant, and more closely tied to business impacts, the demand for people to be able to self-serve that capability, for some definition of self-serve, is just rapidly growing. And the complementary theme we’re seeing is that, as people move into that space where you’ve got maybe not one or two but now 50 or a hundred users and consumers of AI products and capabilities, how do you maintain governance and audit and observability over the top of that, especially in highly regulated spaces and in the government? I think people are both very excited about the returns on investment they’re seeing and now asking the question, as we go from 50 to 50,000 models in production: Am I going to be able to be confident, from a CISO role, that I’ve got good custody and good observability on what’s going on in my data science process? So, I think you’re going to see a convergence of those two things, and the way you’re going to see that convergence is through a renewed emphasis on tightly coupled production pipelines and really embracing a whole life cycle approach.
Fazal Mohammed (AWS):
And from a future trend perspective, Jim, we see a lot of leaderboards for models and a lot of measurement metrics that give us a little bit of introspection on the lineage of a model as well as how the model is performing for a use case. What do you see there as trends going forward? How open is it, and how can I know, because I’m not going to open the guts and see everything, how can I be confident going forward that these models are doing what they’re supposed to? And, as you said, it’s just going to proliferate more and more, not only from a security perspective, but also from an end user perspective. I should have some confidence in using the results coming out of models.
Jim Rebesco (Striveworks):
Yeah, that’s a great question. And the whole industry, I think, of validation and verification of models is really going to explode. And you’re seeing a ton of great performers really building very nice point solutions there in terms of data observability and stress testing models. Those are going to be really critical capabilities. But in the same way that if you had asked me, 20 or 30 years ago, “How is cybersecurity going to evolve?” and I had said, “Firewalls are important and network traffic monitoring is important and email encryption is important,” I probably would have been missing the point a little bit, in the sense that what you really saw was an emergence of folks like Splunk, people who can sit on top of all those point solutions and provide that holistic life cycle view. And so, when I say that, I don’t mean to say that those things won’t be important; they’re going to be very important as you embed those capabilities in the context of your business. Meaning that validating models and making it easy to validate models is going to be very useful to the enterprise. But if I’m building an autonomous car and you’re building a parking lot monitor, the way we each need to validate a computer vision algorithm that’s ultimately interested in detecting an automobile is profoundly different. And it’s unlikely that, if we can’t provide that business context, a solution will just work out of the box for both of us. That’s kind of my view.
Fazal Mohammed (AWS):
Yeah. Yeah, I think that’s a great perspective. So, especially with large language models becoming almost a Swiss Army knife, there’s still a lot of cases where very curated models are going to perform much better for that use case. So, you’re going to use them—
Jim Rebesco (Striveworks):
Yeah, I can learn a lot about appendectomies from ChatGPT, but I would really hope that my surgeon didn’t, right?
Fazal Mohammed (AWS):
A hundred percent. A hundred percent. So cool. I know we are in the last five minutes, and I see one question in the chat. Before that, just in general: we talked a lot about MLOps, the Chariot platform, and some of the future, but how do organizations get started? Would they be able to trial Chariot, or what is the best way to engage and get hands on with Chariot?
Jim Rebesco (Striveworks):
Yeah, absolutely. So, for folks who are interested, it is really easy, through our team and, quite frankly, working with partners like AWS, to get on platform quickly, try before you buy, take it for a spin. And from there, we always like to ask that question, like we did in this conversation with Maxar: hey, what systems do you really need this to disappear inside of? How are you processing data today? And how can this platform help you do it in a more efficient manner? So, working with a partner like AWS, that’s always the easy button.
Fazal Mohammed (AWS):
Awesome. And for the attendees, if you have questions, definitely type them into the questions panel, and also we’ll have a survey link come up in your chat. Thanks, Don. I think it was just posted; it doesn’t take more than a minute. Definitely fill it out. At AWS, we are very feedback hungry. We want to know how we did and how we can improve. That’s how all our products are built around here. So, appreciate your help in filling out the survey to let us know how we can perform better. The link is in the chat. And with that, there’s one question here, Jim, JimBob: Can you discuss challenges relating to handling imbalanced data or biased algorithms in MLOps? How can you solve these with MLOps?
Jim Rebesco (Striveworks):
Yeah, those are great questions. And at the end of the day, again, this is something where you need your business logic to be very tightly coupled to your monitoring. You need to be aware that those things are going on. You need to be aware that those things are happening both early in the model development process as well as late in it. And you really can’t do that unless you’re either asking a data scientist to be very, very careful or you’ve got observability over how that data goes in. So, I hate to keep saying it’s a whole-systems approach, but as we, and by we I mean industry, continue to develop automated solutions, like, hey, check whether this data is imbalanced; hey, check whether you’re using this model in an environment where you’re really putting emphasis on an underrepresented class of data, those capabilities are out there, and they’re straightforward. But where the seams occur in that process is when you’ve got human beings hand jamming and manipulating data, and you lose observability over the top. Yeah.
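As a small, hedged illustration of the kind of automated imbalance check Jim mentions, here is one possible sketch. The warning threshold and the class names are arbitrary placeholders for illustration, not a specific product feature.

```python
# A small sketch of an automated class-imbalance check like the one mentioned above.
# The 5% warning threshold and the class names are arbitrary placeholders.
from collections import Counter

def class_balance_report(labels, warn_below=0.05):
    """Report each class's share of the data and flag badly underrepresented classes."""
    counts = Counter(labels)
    total = sum(counts.values())
    report = {}
    for cls, n in counts.items():
        share = n / total
        report[cls] = share
        if share < warn_below:
            print(f"Warning: class '{cls}' is only {share:.1%} of the training data.")
    return report

# Toy example: ships dominate, oil platforms and buoys are rare.
class_balance_report(["ship"] * 950 + ["oil_platform"] * 45 + ["buoy"] * 5)
```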
JimBob Skelton (Maxar):
I think what Jim mentioned earlier about closer integration of the subject matter experts into this process and into the MLOps does a lot to mitigate any kind of biases or anything like that. So, especially when you get those SMEs that have a great understanding of this type of science and what’s taking place behind it, then they’re better poised to take their inherent knowledge and apply it to avoid those types of issues.
Fazal Mohammed (AWS):
A hundred percent. JimBob, I think there’s nothing like SME validation. I mean, the more validation from the SMEs that we can get in an automated way, the better. But till then, we are going to have to rely on SMEs to corroborate the evidence there a little bit. So, awesome. I know we are at time. For any other questions, we’ll definitely respond if you leave us an email. Again, thanks for attending, and reiterating that there’s a link in the chat to do the survey. Please complete that; it’s really going to be helpful for us. And with that, I’d like to thank Jim and JimBob here. Amazing session, great products, and a lot of topics we talked about. Appreciate your time here in talking to us about this important and ever-growing topic. As we delve into the new space of AI/ML, it’s going to be more and more important that we get the foundations right, and this is one of the foundational principles on which you’re going to stand and scale out solutions in the future. So, amazing talk, and looking forward to more of these sessions.
Jim Rebesco (Striveworks):
Yeah, thanks for having us. Appreciate it.
Fazal Mohammed (AWS):
Awesome.
Jim Rebesco (Striveworks):
Thank you for the invite.
Video Summary
Explore the challenge of maintaining an efficient feedback loop for updating machine learning models, which is often perceived as laborious and challenging. This webinar discusses the integration of machine learning operations (MLOps) technology with Amazon Web Services (AWS) data systems, known as "disappearing" MLOps.
By seamlessly integrating with the Striveworks Chariot platform, AWS customers can leverage existing production data systems to achieve further scalability and operational improvement.
Learning Objectives:
- Explore an innovative, integrated, and automated business model for MLOps.
- Understand the benefits of deep integration between the MLOps process and production data systems.
- Discover the distinctive capabilities offered by Striveworks to empower data analytic teams.
“Models are statistical descriptions of the world. And as the world changes, those models need to change or be modified to keep pace and ensure performance.”
—Jim Rebesco, Striveworks CEO
Striveworks has relationships and deep integrations with many of the best organizations in the technology landscape. See our MLOps ecosystem.
Related Resources

Navigating AI's 'Messy Middle' in 2024
Get an inside look at what drove Striveworks' growth and success in 2023, and the potential of navigating the "messy middle" of AI in 2024.
