Modern MLOps: Simplifying and automating ML pipelines using Databricks and Kubernetes in AWS
A challenging problem in modern MLOps is reducing technical debt. Because each environment has its own requirements, the problem often becomes complex very quickly, so choosing the right toolset is of fundamental importance. Traditional DevOps has established axioms for the software development life cycle (SDLC), and MLOps attempts to transfer these axioms into an ML context. However, model products have different needs and requirements from software products. The main differences include the need for Continuous Training (CT), Continuous Monitoring (CM) and model version control (MVC). Databricks ML is an integrated machine learning environment that significantly reduces technical debt. You can use it as an end-to-end machine learning solution, or use only the parts that suit your needs. In particular, MLflow is a well-known open-source platform, integrated into Databricks ML, that allows Data Scientists to apply MVC at scale. This enhances collaboration, automation and visibility, and fits nicely into any modern CI/CD/CT/CM pipeline. During this talk, we will discuss the significance of choosing the right MVC topology and how it affects the entire design. We will also explain how MLflow fits into the CI/CD/CT/CM pipelines of ITV’s ecosystem, and cover release strategies for model artefacts on Kubernetes.
Christos Sarakasidis is an experienced Machine Learning engineer with a keen interest in modern DevOps, software engineering and cloud computing. He has previously worked in research, developing algorithms to solve problems in algebra and topology, and has helped major tech-driven organisations build robust ML solutions in the cloud. Christos joined ITV in February 2022 and is currently the Lead Machine Learning engineer. His role includes creating automated ML solutions that enable data-driven decisions across various ITV departments.