Vinay Narayana

The current situation at most companies can be summarized as follows:

· Every team has its own way of testing and productionizing a model

· Lack of a centralized feature store

· Severe data quality issues

· Little to no data or model monitoring in production (or test)

· Little to no operational readiness

· Fragmented collaboration with partner teams

This presentation uses a typical data science organization as a case study and shows how applying software engineering principles can address each of the problems above.

A vision that all data science teams could aspire to involves the following:

· access reliable data (with SLOs),

· automate data processing, model training, evaluation, and validation,

· productionize the model either for batch or online serving,

· continuously monitor data and models in production,

· use a trigger-based mechanism to automatically train, deliver, and deploy to production
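As a rough illustration of the last two points, monitoring can feed a retraining trigger. The sketch below is only one possible shape and is not taken from the presentation: the function names, the mean-shift drift score, and the 0.5 threshold are all assumptions chosen for brevity.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence


def drift_score(reference: Sequence[float], live: Sequence[float]) -> float:
    """Absolute shift of the live mean, in units of the reference std dev.

    Hypothetical drift metric; real systems often use richer statistics.
    """
    ref_std = pstdev(reference)
    if ref_std == 0:
        # Degenerate reference: any change in the mean counts as drift.
        return 0.0 if mean(live) == mean(reference) else float("inf")
    return abs(mean(live) - mean(reference)) / ref_std


def maybe_retrain(
    reference: Sequence[float],
    live: Sequence[float],
    retrain: Callable[[], None],
    threshold: float = 0.5,  # assumed value, tune per feature
) -> bool:
    """Call `retrain` when the drift score exceeds the threshold."""
    if drift_score(reference, live) > threshold:
        retrain()
        return True
    return False
```

In practice the `retrain` callback would kick off a training pipeline rather than run in process, and the check would run on a schedule or on data arrival.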

Achieving this vision requires putting several goals in place. Some of them are listed below:

· Transform and standardize how we do MLOps across all teams

· Leverage a centralized feature store and eliminate training-serving skew

· Treat all data produced as a product

· Enable comprehensive data and model monitoring capabilities

· Follow a standard tiered model for implementing operational readiness

· Lastly, nurture relationships and collaborate with data engineering, central infrastructure teams, etc.
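To make the training-serving skew goal concrete: skew typically appears when training and serving compute the same feature with different code. A minimal sketch of the idea behind a shared feature definition follows. The feature names and the record fields (`amount`, `items`) are hypothetical, and a real feature store adds storage, versioning, and point-in-time correctness on top of this.

```python
from typing import Dict, List


def compute_features(raw: Dict[str, float]) -> Dict[str, float]:
    """Single source of truth for feature logic (hypothetical features)."""
    return {
        # Guard against division by zero for empty orders.
        "amount_per_item": raw["amount"] / max(raw["items"], 1),
        "is_large_order": float(raw["amount"] > 100.0),
    }


def build_training_rows(records: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Batch path: apply the shared logic to historical records."""
    return [compute_features(r) for r in records]


def build_serving_row(request: Dict[str, float]) -> Dict[str, float]:
    """Online path: apply the same shared logic to one live request."""
    return compute_features(request)
```

Because both paths call `compute_features`, a given record yields identical features at training time and at serving time.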

The rest of the presentation goes into detail on how to implement each of the above goals, along with a few high-level architectural patterns.
