MLOps is a Machine Learning (ML) engineering culture and practice that aims to unify ML system development (Dev) and ML system operations (Ops). It’s the application of DevOps practices to the machine learning field. Practicing MLOps means that we advocate for automation and monitoring at all steps of ML system construction, including integration, testing, releasing, deployment, and infrastructure management.
A few quotes to understand more about MLOps:
Building your own MLOps pipeline is not one day's work. Before you get started, it's important to understand the different stages or levels of an MLOps pipeline, and plan accordingly to gradually build up a full MLOps pipeline. In this article, we will look at the different maturity levels of an MLOps pipeline and what it takes to get to each level.
MLOps maturity level is a good way to measure the level of automation in a machine learning pipeline. The exact definition of the maturity levels may differ slightly between industry MLOps leaders such as Google and Microsoft. According to Microsoft Azure’s MLOps Levels definition (strongly recommended reading) and Google’s architecture review of MLOps levels, they can be summarized as:
Let’s do a deep dive into the MLOps maturity levels and see what characteristics each maturity level has.
Maturity level 0 is heavily manual. In this level, there is no DevOps for model release, and no MLOps for model training and deployment. A data scientist manually extracts and experiments with data, then manually trains and creates ML models, and then manually evaluates and validates the models. The ML models are then handed off to a software engineer to manually deploy to the ML service in production. Below is a diagram of a typical MLOps Maturity Level 0 pipeline:
Source: Google Cloud MLOps level diagram: Manual process
In this level, there are a few obvious characteristics to call out:
In summary, it's fine to be on this level when an ML project first starts. But if the team stays on this level, the manual effort quickly adds up and takes a toll on the overall efficiency and progress of the project, as well as on scientist and engineer happiness.
Maturity level 1 is still manual but has taken a step forward by adding automation to the system. The ML workflow performed by the data scientist is still manual, but once the ML model is handed over to the engineer (or handled by the scientist themselves), the release, deployment, and monitoring of the ML service are fully automated. The pipeline that performs ML service release and deployment is considered a CI/CD pipeline, which ensures proper DevOps practices for ML service and code management.
It's important to note that DevOps in this level refers to ML service release, not ML model release. The ML model, once created, is not automatically released and deployed to production. An engineer or scientist still needs to manually trigger the ML service pipeline to deploy the ML model to production.
In this level, there are a few characteristics to call out:
A team usually advances to this level fairly quickly, often because there are more engineers than scientists on the team, so the engineers have the capacity to quickly build a DevOps pipeline for ML service deployment. However, the team and the project as a whole still feel the pain of infrequent release cycles, due to the manual effort involved in the ML model creation workflow, which often takes longer, particularly if model training involves deep learning and large amounts of data.
Maturity level 2 makes significant progress on ML workflow automation. In this level, the steps of data extraction, data processing, and model training are fully automated. A scientist uses this automation to collect data and train ML models quickly, shortening the overall ML workflow and release cycle. However, the trained model still needs to be manually validated and manually handed over to the engineers to deploy to the ML service in production.
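To make the idea of an automated training pipeline concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names (`extract_data`, `process_data`, `train_model`, `run_training_pipeline`), the toy data, and the simple linear model are all hypothetical and stand in for whatever data store and training framework a real team uses.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[float, float]  # (feature, label); a stand-in for real training data


@dataclass
class Model:
    slope: float
    intercept: float

    def predict(self, x: float) -> float:
        return self.slope * x + self.intercept


def extract_data() -> List[Row]:
    # Hypothetical extraction step; a real pipeline would query a data store.
    return [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def process_data(rows: List[Row]) -> List[Row]:
    # Hypothetical processing step: drop rows with negative features.
    return [(x, y) for x, y in rows if x >= 0]


def train_model(rows: List[Row]) -> Model:
    # Ordinary least squares for a single feature.
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in rows)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = cov_xy / var_x
    return Model(slope=slope, intercept=mean_y - slope * mean_x)


def run_training_pipeline() -> Model:
    # Level 2: the three steps run end to end with no manual hand-offs.
    # Validation and deployment are deliberately absent; they remain manual.
    return train_model(process_data(extract_data()))
```

Calling `run_training_pipeline()` returns a trained model without any manual step in between, which is the defining trait of this level. What happens to the model afterwards is still up to a person.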
A few characteristics to call out in this level:
In this level, the team has made great progress in automating the ML workflow and clearly feels the efficiency and speed improvement it brings to the overall release cycle. At this point, it only makes sense to keep moving forward to the next maturity level with more automation.
Maturity level 3 tackles the last step of the ML workflow: model deployment. In this level, the trained model is automatically validated against a predefined metric threshold. Once model validation passes, the ML model is automatically sent to the ML service pipeline, which automatically triggers a deployment of the ML model to production. As you have probably already guessed, in this level all steps from data collection all the way to model deployment in production are automated. The ML training pipeline and the ML service pipeline are connected and work together to turn raw data into a served ML model in production.
It's also important to realize that we are not at full MLOps yet. The job is not done when the ML model is deployed to production. Model monitoring and especially model retraining are very important parts of a full MLOps system, which we don't have yet.
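The validation gate described above can be sketched as follows. This is a minimal illustration, not a reference implementation: `validate_model`, `validate_and_deploy`, the metric names, and the `trigger_deployment` callback are hypothetical, and the sketch assumes that higher metric values are better.

```python
from typing import Callable, Dict


def validate_model(metrics: Dict[str, float], thresholds: Dict[str, float]) -> bool:
    # A model passes only if every thresholded metric is present and
    # meets its minimum value (assumes higher is better).
    return all(
        metrics.get(name, float("-inf")) >= minimum
        for name, minimum in thresholds.items()
    )


def validate_and_deploy(
    model_id: str,
    metrics: Dict[str, float],
    thresholds: Dict[str, float],
    trigger_deployment: Callable[[str], None],
) -> bool:
    # Level 3: validation and the hand-off to the ML service pipeline
    # happen without a person in the loop.
    if not validate_model(metrics, thresholds):
        return False
    trigger_deployment(model_id)  # hypothetical hook into the service pipeline
    return True
```

For example, `validate_and_deploy("m1", {"accuracy": 0.93}, {"accuracy": 0.9}, deploy)` would call `deploy("m1")`, while a model scoring 0.85 would be rejected and never reach the service pipeline.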
A few characteristics to call out in this level:
The team now has a working and efficient system that quickly turns raw data into a served model in production. Productivity and speed are significantly improved for the ML project. The team may be tempted to rejoice and call the job done. However, to achieve full MLOps, there is a final maturity level to go after.
Maturity level 4 is the final level: full MLOps. The noticeable difference between level 4 and level 3 is the capability of model retraining. ML models are heavily data driven. After an ML model is deployed to production and starts running inference on real-world data that it was never trained on, its performance can degrade over time. It's important to set up performance metrics to monitor the ML model in production and, more importantly, to automatically trigger model retraining when those metrics drop below a certain threshold. Below is a diagram of a typical MLOps Maturity Level 4 pipeline:
Source: Google Cloud MLOps level diagram: ML Pipeline
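The monitor-then-retrain loop at this level can be sketched roughly as below. Everything here is an assumption made for illustration: the `RetrainingMonitor` class, the rolling-average rule, the window size, and the `trigger_retraining` callback are hypothetical, and a real system would choose its own metrics and triggering policy.

```python
from collections import deque
from typing import Callable, Deque


class RetrainingMonitor:
    """Tracks a rolling window of production scores and triggers retraining
    when the window average drops below a threshold (hypothetical policy)."""

    def __init__(
        self,
        threshold: float,
        window: int,
        trigger_retraining: Callable[[], None],
    ) -> None:
        self.threshold = threshold
        self.window = window
        self.scores: Deque[float] = deque(maxlen=window)
        self.trigger_retraining = trigger_retraining

    def record(self, score: float) -> bool:
        # Returns True if this observation triggered retraining.
        self.scores.append(score)
        if len(self.scores) < self.window:
            return False  # not enough data to judge yet
        average = sum(self.scores) / len(self.scores)
        if average < self.threshold:
            self.trigger_retraining()
            self.scores.clear()  # start fresh for the retrained model
            return True
        return False
```

Averaging over a window rather than reacting to a single low score is one way to avoid retraining on noise; a stricter or looser policy may suit other workloads.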
The Level 4 MLOps pipeline incorporates continuous integration (CI), continuous delivery (CD), and continuous training (CT) processes into one pipeline. Compared to the pipelines of previous levels, a few new features are added around continuous training:
The Level 4 full MLOps pipeline creates a unified ML pipeline with modularized components, instead of isolated components with manual transitions. It changes how we look at ML development and deployment processes. It’s no longer a separate workflow where scientists create models and engineers deploy them, like the old days when the development team created applications and the operations team deployed and monitored them. A unified ML pipeline simplifies scientists’ workflow and empowers them to create, test, deploy, and monitor ML models with fast iterations and confidence.
At this level, the team finds itself running a very efficient and robust MLOps pipeline. The ML feature release cycle is significantly shortened, and more features are delivered to customers more quickly and safely. Moreover, because the architecture and platform of the MLOps pipeline are highly reusable, the team can now focus on scaling to more ML-based products, without having to start from scratch and go through the slow and painful process of Level 0 again.
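As a recap, the progression described in this article can be condensed into a small lookup. This is only a sketch: the capability labels are hypothetical names chosen here to mirror what each level adds, and they are not official terms from Google or Microsoft.

```python
from typing import Set

# Capabilities in the order this article introduces them.
# The labels are hypothetical and exist only for this illustration.
LEVEL_REQUIREMENTS = [
    "automated_service_release",   # added at level 1
    "automated_training",          # added at level 2
    "automated_model_deployment",  # added at level 3
    "automated_retraining",        # added at level 4
]


def maturity_level(capabilities: Set[str]) -> int:
    # Levels are cumulative: a team is at level N only if it has every
    # capability introduced at levels 1 through N.
    level = 0
    for requirement in LEVEL_REQUIREMENTS:
        if requirement not in capabilities:
            break
        level += 1
    return level
```

A team with automated training but no automated service release would still score 0 here, which reflects the cumulative way the levels are presented above.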