Arnau Ramisa

Fashion Product Retrieval with Real World Images

Online retail stores have vast catalogs with hundreds of thousands or even millions of products. Finding the right product in this space with basic text search would already be a daunting task, but it gets worse: more often than not, product descriptions are inadequate or do not exist at all. Differences in language and vocabulary between retailers and geographical regions make the problem harder still. Images, on the other hand, are a universal language that, with current deep learning techniques, can be leveraged to search for similar products without the limitations of text. They come, however, with their own set of difficulties: catalog images are taken by professional photographers in ideal conditions, often against a white background, while "query pictures" taken by consumers are much more diverse and can have many undesirable characteristics such as bad illumination, motion blur, complex backgrounds or unusual viewpoints. To bridge this gap, we can use a type of model called a Siamese network, which is trained on pairs of corresponding shop and consumer pictures and learns a common embedding for both styles of image.
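As a rough illustration of the idea, the sketch below shows the two ingredients of a Siamese setup: a single embedding function shared by shop and consumer inputs, and a pairwise loss that pulls matching pairs together and pushes non-matching pairs apart. The abstract does not say which loss or architecture is used, so the contrastive loss, the margin value, the linear map standing in for a deep network, and all function names here are assumptions made for illustration only.

```python
import math

def embed(weights, features):
    """Shared embedding: the SAME weights are applied to shop and consumer inputs.
    A linear map plus L2 normalization stands in for a deep network (assumption)."""
    out = [sum(w * x for w, x in zip(row, features)) for row in weights]
    norm = math.sqrt(sum(v * v for v in out)) or 1.0
    return [v / norm for v in out]

def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(d, same_product, margin=1.0):
    """Contrastive loss on a pair at distance d (assumed loss, hypothetical margin).
    Matching pairs are penalized for being far apart; non-matching pairs are
    penalized only while they are closer than the margin."""
    if same_product:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

def retrieve(weights, query_features, catalog):
    """Return the catalog product whose embedding is nearest to the query's."""
    q = embed(weights, query_features)
    return min(catalog, key=lambda pid: distance(q, embed(weights, catalog[pid])))

# Toy usage with made-up 2-D "image features" and identity weights.
weights = [[1.0, 0.0], [0.0, 1.0]]
catalog = {"dress": [1.0, 0.0], "shoe": [0.0, 1.0]}
best_match = retrieve(weights, [0.9, 0.2], catalog)
```

Because both kinds of image pass through the same weights, a consumer photo and a catalog photo of the same product can be compared directly by distance once training has pulled such pairs together.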

Arnau Ramisa received the MSc degree in computer science (computer vision) from the Autonomous University of Barcelona (UAB) in 2006, and in 2009 completed a PhD at the Artificial Intelligence Research Institute (IIIA-CSIC) and the UAB. Between 2009 and 2011, he was a postdoctoral fellow at the LEAR team in INRIA Grenoble / Rhone-Alpes, and between 2011 and 2015 a research fellow at the Institut de Robòtica i Informàtica Industrial in Barcelona (IRI). Since 2015 he has been working as a computer vision researcher at Wide Eyes Tech. His research interests include object classification and detection, image retrieval, robot vision and natural language processing.
