A Unified ML Data Pipeline for Real-Time Features: From Training to Serving
On a global marketplace like Etsy, where buyers come to find unique, varied items from sellers around the globe, the inventory of items is constantly changing. Users' preferences also change in real time as they discover the latest selection on offer. In such a dynamic environment, Machine Learning models for different applications (including search, recommendations, and computational advertising) need to collect different real-time data signals, process them, and finally leverage them to make the most relevant predictions.
In this talk, we will detail how we use real-time feature logging and streaming systems to capture in-session and trending activity, compute features from it for our different ML models, and use those features in downstream applications such as a Bandit or Reinforcement Learning system.
Aakash is a Senior Engineering Manager in Etsy's Machine Learning Infrastructure group. His team focuses on building scalable and efficient real-time ML systems that allow Etsy to leverage its vast quantities of marketplace data for ML applications such as search, advertising, and recommendations. Aakash has worked at startups since the start of his career, including Ooyala, Platfora, Quantifind, and finally Blackbird, which was acquired by Etsy. At all of these companies, his work has been at the intersection of Data Science, Machine Learning, and Distributed Systems. Aakash holds a degree in Computer Science from Carnegie Mellon.