-
08:00
REGISTRATION
-
09:00
WELCOME
Ann Thyme-Gobbel - Director of UX/UI Design - Loose Cannon Systems
While doing a PhD in Cognitive Science and Linguistics at UCSD, Ann's interest in phonetics and NLP led to a dissertation using neural networks to model how speakers of a language use paradigms and analogy to create new words in that language. Since then, her professional life has spanned R&D, product development and services across organizations, including Nuance, Amazon Lab 126, and smaller startups. Always curious about how people talk to machines—and how machines respond—she's co-author of the upcoming book "Mastering Voice Interfaces".
-
NATURAL LANGUAGE PROCESSING
-
09:15
Advancing the State of the Art in Conversational AI for all Developers
Alex Weidauer - Co-Founder and CEO - Rasa
Advancing the State of the Art in Conversational AI for all Developers
Most people think of chatbots and voice assistants as either dumb or super smart (especially if they've never tried one). Based on its work with hundreds of developers building assistants, Rasa believes there are multiple levels of assistants – five levels of AI assistants, in fact. This talk discusses the challenges of advancing from a simple to an advanced assistant, covers recent advances in NLP and machine learning research, and provides hands-on advice on how to set up in-house teams.
Alex Weidauer is co-founder and CEO of Rasa, provider of the standard infrastructure layer for conversational AI, which supplies the tools necessary to build better, more resilient contextual assistants. With more than 1.5 million downloads since launch, Rasa Open Source is loved by developers worldwide, with a friendly, fast-growing community learning from each other and working together to make better text- and voice-based AI assistants. Alex studied Computer Science and Management and previously worked for McKinsey and various tech startups. He was recently named to Forbes' 30 Under 30 list.
-
09:35
On-Device Language Understanding for the Google Assistant
Rushin Shah - Senior Manager - Google
On-Device Language Understanding for the Google Assistant
Improvements in mobile compute power have enabled the computing behind commercial products to move from datacenters to users' devices. For the Google Assistant, this allows a faster and more reliable product, as well as new user experiences that take advantage of the reduced latency and increased responsiveness. A central problem in this new scenario is the on-device deployment of semantic parsing solutions that can identify the intent the user has expressed. This talk discusses the challenges inherent in building such an on-device solution, ranging from the technical (NN architectures, quantization, differential privacy) to the infrastructural (releasing, versioning) and the organizational (data collection and management, developer experience).
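To make the quantization challenge concrete, here is a minimal sketch of post-training quantization with TensorFlow Lite, one common route to shrinking a trained model for on-device deployment; the saved-model path is a placeholder, and this is not Google's production pipeline.

```python
# Illustrative sketch only: post-training quantization of a trained model with
# TensorFlow Lite. The saved-model path below is a hypothetical placeholder.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("semantic_parser/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable weight quantization
tflite_model = converter.convert()

with open("semantic_parser.tflite", "wb") as f:
    f.write(tflite_model)  # compact flatbuffer suitable for on-device inference
```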
Rushin Shah is a senior manager at Google. Prior to Google, Rushin was leading the natural language understanding group at Facebook Conversational AI. Previously, he was at Siri at Apple for 5 years, where he built and headed the natural language understanding group and he also worked at the query understanding group at Yahoo. He has worked on a broad range of problems in the NLP area including parsing, information extraction, dialog and question answering. He holds degrees in language technologies and computer science from Carnegie Mellon and IIT Kharagpur.
-
10:00
Delivering Magical Customer Experiences through Advances in Conversational AI
Nikko Ström - Senior Principal Scientist - Alexa AI
Delivering Magical Customer Experiences through Advances in Conversational AI
Voice services like Alexa have opened a new channel to deliver greater customer value every day. In this session, we'll review the recent advances in dialogue technologies that have made multi-turn conversations sound natural to customers, and how Alexa’s conversational AI will bring a wave of new voice experiences that weren’t possible before.
Nikko Ström is a technologist and scientist with a deep background in speech technologies. He joined Amazon in 2011 as a Senior Principal Scientist and was a founding member of the team that built Amazon Echo and Alexa. In this role, he leads deep learning efforts and machine learning projects across the Alexa organization.
Nikko has more than twenty years of experience in the field of Automatic Speech Recognition from some of the most prominent research laboratories and companies in the world; he has published extensively in international conference proceedings, journals, and books, and holds numerous patents. He worked as a research scientist at the MIT Laboratory for Computer Science, joined the start-up Tellme Networks as a speech scientist in 2000, and in 2007 transitioned to the Core Speech Recognition Team at Microsoft, pushing the limits of the state of the art in commercial speech recognition technology in collaboration with the Microsoft Research Speech group.
Nikko earned his PhD at the Speech Communication Lab at KTH in Stockholm. As part of his thesis work, Nikko developed the world's first continuous speech recognizer for the Swedish language and made significant contributions in speaker adaptation and artificial neural network technologies. He also published open-source artificial neural network software, which has been downloaded by thousands of researchers worldwide. He was also an invited guest researcher at the Advanced Telephony Research Lab in Kyoto, Japan, where he contributed to world-class research in speaker adaptation.
-
10:25
COFFEE
-
11:05
Emotionally and Semantically Conditioned Dialogue Response Generation
William Wang - Assistant Professor - University of California, Santa Barbara
Emotionally and Semantically Conditioned Dialogue Response Generation
A major challenge in advancing state-of-the-art dialogue systems is improving the controllability and robustness of the neural dialogue response generation module. In this talk, I will introduce MojiTalk, our recent work towards building a next-generation empathetic dialogue agent that can synthesize human emotion into dialogue generation, without the need for any human-annotated data. Furthermore, I will also discuss how, in a complex multi-domain task-oriented dialogue system, we can use graph representations to encode the dialogue act dependency graph and generate controllable and explainable responses with disentangled hierarchical self-attention models.
William Wang is the Director of UC Santa Barbara's Natural Language Processing group and Responsible Machine Learning Center. He is an Assistant Professor in the Department of Computer Science at the University of California, Santa Barbara. He received his PhD from the School of Computer Science, Carnegie Mellon University. He has broad interests in machine learning approaches to data science, including statistical relational learning, information extraction, computational social science, speech, and vision. He has published more than 80 papers at leading NLP/AI/ML conferences and journals, and received best paper awards (or nominations) at ASRU 2013, CIKM 2013, EMNLP 2015, and CVPR 2019, a DARPA Young Faculty Award (Class of 2018), a Google Faculty Research Award (2018), three IBM Faculty Awards (2017-2019), two Facebook Research Awards (2018, 2019), an Adobe Research Award in 2018, and the Richard King Mellon Presidential Fellowship in 2011. He frequently serves as an Area Chair for NAACL, ACL, EMNLP, and AAAI. He is an alumnus of Columbia University and a former research intern at Yahoo! Labs, Microsoft Research Redmond, and the University of Southern California. In addition to research, William enjoys writing scientific articles that impact the broader online community: his microblog @王威廉 has 100,000+ followers and more than 2,000,000 views each month. His work and opinions appear in major tech media outlets such as Wired, VICE, Scientific American, Fast Company, NASDAQ, The Next Web, Law.com, and Mental Floss.
-
11:25
Just Ask: An Interactive Learning Framework for Vision and Language Navigation
Seokhwan Kim - Senior Machine Learning Scientist - Amazon Alexa AI
Just Ask: An Interactive Learning Framework for Vision and Language Navigation
In the vision and language navigation task, the agent may encounter ambiguous situations that are hard to interpret by just relying on visual information and natural language instructions. We propose an interactive learning framework to endow the agent with the ability to ask for users' help in such situations. As part of this framework, we investigate multiple learning approaches for the agent with different levels of complexity. The simplest model-confusion-based method lets the agent ask questions based on its confusion, relying on the predefined confidence threshold of a next action prediction model. To build on this confusion-based method, the agent is expected to demonstrate more sophisticated reasoning such that it discovers the timing and locations to interact with a human. We achieve this goal using reinforcement learning (RL) with a proposed reward shaping term, which enables the agent to ask questions only when necessary. The success rate can be boosted by at least 15% with only one question asked on average during the navigation. Furthermore, we show that the RL agent is capable of adjusting dynamically to noisy human responses. Finally, we design a continual learning strategy, which can be viewed as a data augmentation method, for the agent to improve further utilizing its interaction history with a human. We demonstrate the proposed strategy is substantially more realistic and data-efficient compared to previously proposed pre-exploration techniques.
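A minimal sketch of the confusion-based baseline described above, assuming a generic next-action prediction model; the action names and threshold value are illustrative, not taken from the paper.

```python
# Sketch: ask the user for help whenever the next-action model's top
# probability falls below a confidence threshold (values are illustrative).
import numpy as np

CONFIDENCE_THRESHOLD = 0.6  # hypothetical threshold

def choose_action(action_probs: np.ndarray, actions: list) -> str:
    """Return the next navigation action, or a request for help if the model is 'confused'."""
    best = int(np.argmax(action_probs))
    if action_probs[best] < CONFIDENCE_THRESHOLD:
        return "ask_user_for_help"
    return actions[best]

print(choose_action(np.array([0.4, 0.35, 0.25]),
                    ["turn_left", "turn_right", "go_forward"]))  # -> ask_user_for_help
```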
Seokhwan Kim is currently a senior machine learning scientist at Amazon Alexa AI. He received his Ph.D. from Pohang University of Science and Technology. Prior to joining Amazon, he conducted work in natural language understanding and spoken dialog systems where he was a Research Scientist at Adobe Research and the Institute for Infocomm Research. He has authored more than 50 peer-reviewed publications in international journals and conferences in speech and language technology areas. In 2015, he joined the organizing team of Dialog System Technology Challenge (DSTC) and has contributed to the last five challenges. In addition, he has been acting as a program committee member of the major conferences in NLP, speech, dialog, and AI fields including ACL, NAACL-HLT, EMNLP, ICASSP, Interspeech, IWSDS, AAAI, IJCAI, and ICLR.
-
CONVERSATIONAL AI
-
11:45
Incorporating Common Sense and Semantic Understanding within the Assistants
Chandra Khatri - Senior AI Scientist - Uber AI
Incorporating Common Sense and Semantic Understanding within the Assistants
With advancements in deep learning and data collection techniques, we have built artificial agents that can perform significantly better than humans on well-defined and unambiguous tasks such as Atari games. However, they do poorly on tasks that are dynamic and seem straightforward to humans, such as embodied navigation and open-ended conversation, even after training on millions of samples. One key element that differentiates humans from artificial agents is that humans have access to common sense and semantic understanding learned from past experience. In this talk, I will present how incorporating common sense and semantic understanding significantly helps agents perform a complex task such as house navigation. I will also show that the semantic embeddings learned by the agent mimic the structural and positional patterns of the environment.
Chandra Khatri is a Senior AI Scientist at Uber AI driving Conversational AI efforts at Uber. Prior to Uber, he was the Lead AI Scientist at Alexa, driving the science behind the Alexa Prize Competition, a $3.5 million university competition for advancing the state of Conversational AI. Some of his recent work involves Open-domain Dialog Planning and Evaluation, Conversational Speech Recognition, Conversational Natural Language Understanding, and Sequential Modeling.
Prior to Alexa, Chandra was a Research Scientist at eBay, where he led various Deep Learning and NLP initiatives such as Automatic Text Summarization and Automatic Content Generation within the eCommerce domain, which led to significant gains for eBay. He holds degrees in Machine Learning and Computational Science & Engineering from Georgia Tech and BITS Pilani.
-
12:05
Conversational AI on the Edge
Sujith Ravi - Director - Amazon
Conversational AI on the Edge
Dr. Sujith Ravi is a Director at Amazon. Prior to that, he led and managed multiple ML and NLP teams and efforts at Google AI. He founded and headed Google's large-scale graph-based semi-supervised learning platform, its deep learning platform for structured and unstructured data, and on-device machine learning efforts for products used by billions of people in Search, Ads, Assistant, Gmail, Photos, Android, Cloud and YouTube. These technologies power conversational AI (e.g., Smart Reply), Web and Image Search, on-device predictions in Android and Assistant, and ML platforms such as Neural Structured Learning in TensorFlow, Learn2Compress as a Google Cloud service, and TensorFlow Lite for edge devices.
Dr. Ravi has authored over 90 scientific publications and patents in top-tier machine learning and natural language processing venues. His work has been featured in the press, including Wired, Forbes, Forrester, New York Times, TechCrunch, VentureBeat, Engadget, and New Scientist, and won the SIGDIAL Best Paper Award in 2019 and the ACM SIGKDD Best Research Paper Award in 2014. For multiple years, he was a mentor for Google Launchpad startups. Dr. Ravi was the Co-Chair (AI and deep learning) for the 2019 National Academy of Engineering (NAE) Frontiers of Engineering symposium. He was also a Co-Chair for ICML 2019, NAACL 2019, and NeurIPS 2018 ML workshops, and regularly serves as Senior/Area Chair and PC member of top-tier machine learning and natural language processing conferences such as NeurIPS, ICML, ACL, NAACL, EMNLP, COLING, KDD, and WSDM.
-
12:25
Language and Interaction in Minecraft
Kavya Srinet - Research Engineer - Facebook AI Research (FAIR)
Language and Interaction in Minecraft
I will discuss a research program aimed at building a Minecraft assistant, in order to facilitate the study of agents that can complete tasks specified by dialogue, and eventually, to learn from dialogue interactions. I will describe the tools and platform we have built allowing players to interact with the agents and to record those interactions, and the data we have collected. In addition, I will describe an initial agent from which we (and hopefully others in the community) can iterate.
I am a Research Engineer at Facebook AI Research. Prior to FAIR, I was a Machine Learning Engineer at the Silicon Valley AI Lab at Baidu Research, led by Adam Coates and Andrew Ng, where I worked on speech and NLP problems. Before that, I spent a summer at the Allen Institute for Artificial Intelligence working on learning to rank for Semantic Scholar. I did my graduate studies at the Language Technologies Institute at Carnegie Mellon University, where I worked on machine translation, question answering, and learning to rank for graphs and knowledge bases.
-
12:45
Interpreting Expression from Voice in Human-Computer Interactions
Vikramjit Mitra - Senior Research Scientist - Apple
Interpreting Expression from Voice in Human-Computer Interactions
Millions of people reach out to digital assistants such as Siri every day, asking for information, making phone calls, seeking assistance, and more. The expectation is that these assistants should understand the intent of the user's query. Detecting the intent of a query from a short, isolated utterance is a difficult task. Intent cannot always be obtained from speech-recognized transcriptions. A transcription-driven approach can interpret what has been said but fails to acknowledge how it has been said, and as a consequence, may ignore the expression present in the voice. In this talk, we will explore whether a machine-learned system can reliably detect vocal expression in queries using acoustic and primitive affective embeddings. We will further explore whether it is possible to improve affective state detection from speech using a time-convolutional long short-term memory (TC-LSTM) architecture. We will demonstrate that using intonation and affective state information can help attain a relative equal error rate (EER) decrease of 60% compared to a bag-of-words-based system, corroborating that expression is significantly represented by vocal attributes, rather than being purely lexical.
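For readers who want a feel for the architecture named above, here is a rough Keras sketch of a time-convolutional LSTM over frame-level acoustic features; the feature dimensions and layer sizes are assumptions for the example, not the configuration used in the talk.

```python
# Sketch of a time-convolutional LSTM (TC-LSTM) style model: 1-D convolutions
# over acoustic frames feed an LSTM, with a sigmoid output for an affective state.
import tensorflow as tf

NUM_FRAMES, NUM_FEATURES = 300, 40  # e.g. 3 s of 10 ms frames x 40 filterbank features (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FRAMES, NUM_FEATURES)),
    tf.keras.layers.Conv1D(64, kernel_size=5, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of the target expression
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
model.summary()
```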
Vikramjit Mitra is a Senior Research Scientist at Apple who is working on speech science and machine learning for human-machine interactions. Previously he worked as an Advanced Research Scientist at SRI International's Speech Technology and Research Laboratory from 2011 to 2017. He received his PhD in Electrical Engineering from University of Maryland, College Park in 2010. His research interests include speech for health applications, robust signal processing for noise/channel/reverberation, speech recognition, production/perception-motivated signal processing, information retrieval, machine learning, and speech analytics. One of his major research contributions is the estimation of speech articulatory information from the acoustic signal, and using such information for recognition of both natural and clinical speech, and the detection of depressive symptoms in adults. He led SRI’s STAR lab’s efforts on robust acoustic feature research and development, which led to state-of-the-art results in keyword spotting and speech activity detection in DARPA’s Robust Automatic Transcription of Speech Program. He has served as the PI/co-PI of several projects funded by NSF and has worked on research efforts funded by DARPA, IARPA, AFRL, NSF and Sandia National Laboratories. He is a senior member of the IEEE and an affiliate member of the Speech and Language processing technical committee (SLTC), and he has served on the scientific committees of several workshops and technical conferences.
-
13:10
LUNCH
-
CREATING PERSONALITY IN AI ASSISTANTS
-
14:15
Avenging Clippy
Ryan Germick - Principal Designer - Google
Avenging Clippy
In this talk Ryan will show how a decade of doodling on Google's homepage led him to found the team that expresses the Google Assistant's personality. Moreover, this talk will explore the opportunity of character, including anthropomorphic paper clips ahead of their time, as the mental model for conversational interfaces.
Ryan Germick is a Principal Designer at Google where he leads the Conversation Design, UX Writing, and Assistant Personality teams. Previously, Ryan was a founding member and long-time lead of the Google Doodle team.
-
14:35
Artificial Emotional Intelligence – A New Age of AI
Kai Rosenkranz - CTO - mindfulmachines A.I.
Artificial Emotional Intelligence – A New Age of AI
For a fulfilling conversation, two components are essential: the chance to make yourself properly understood and a conversation partner responding to you appropriately both on the intellectual as well as the emotional level.
Thus, to provide successful and satisfying communication, a conversational AI must have emotional and verbal intelligence, i.e. a combination of empathy, reasoning, and advanced conversation skills. The ideal conversational AI must have empathy and own emotions, a user-related emotional and contextual memory, psychological knowledge, and human-like communication soft skills. It must be able to listen, learn, and adapt, and - in order to make sense in a conversation - connect the dots and draw conclusions. This way, it can engage the user, make appropriate recommendations, or take immediate appropriate action.
In this talk, I will argue that the ideal conversational AI should be a social animal: it strives to learn all about the user's social surroundings and – if used within a community – can create significant additional value by proactively intercommunicating between users of all kinds. This way, communication obstacles between individuals and corporations, as well as between consumers and providers, are easily eliminated – to the point where the conversational AI may serve as the user's primary interface to his or her world.
A conversational AI like that is not a mere vision: In this session, I will - for the first time ever - introduce soulbot, the world’s first empathetic conversational AI with emotional intelligence. The unique features of this AEI are empathy, own emotions, and advanced, human-like communication skills and tactics. soulbot listens, adapts, and empathizes. soulbot continuously learns from and about the user, it connects the dots and draws conclusions to engage, support, make recommendations, or take immediate appropriate action.
An award-winning veteran of the games industry for more than 20 years, Kai Rosenkranz started his first job at game developer Piranha Bytes while still at school. As a software developer, composer, and partner in the studio, he worked on several AAA titles such as the Gothic series. Next, he established his own company, Nevigo/Articy Software, creating and selling architectural tools for software development and game design.
In 2016, Kai left Articy to take on the challenge as Chief Technology Officer for mindfulmachines A.I.. He poured his skill and experience into working with Dr. Klaus Lassert on the soulbot® engine, aiming to revolutionize the way we interact with voice assistants and other machines. As chief architect of the product, he has been a key player in creating soulbot®, the first conscious, mindful, and appreciative, artificial emotional intelligence.
On a mission to support other creative minds in the industry, and himself being a successful composer, Kai also founded the professional network “European Game Composers”. Kai is also an experienced speaker on games and technology, e.g. at GDC, devcom, Nordic Game, Develop.
-
14:50
The Power of Personalized Proactivity in AI Assistants
Ken Dodelin - VP of Conversational AI Products - Capital One
The Power of Personalized Proactivity in AI Assistants
Hear from Ken Dodelin, Vice President of Conversational AI, about the increasing role proactivity has in truly intelligent AI assistants. In this presentation, Ken will share insights and lessons learned through his team's multi-year product development journey with Eno, Capital One's AI assistant. Today, Eno does more than 15 things proactively for customers, allowing them to spend less time banking and more time living. Topics covered will include:
• the human-assistant proactive tasks that AI Assistants can be great at
• finding the balance between helpfully proactive and intrusively annoying
• the importance of character and defining the role of a proactive AI Assistant for consumers
Ken Dodelin is Vice President of Conversational AI Products at Capital One, where he leads teams building intelligent assistants. Previously, Ken served as VP of Product Development at Cricket Media and Director of Mobile Products at The Washington Post. Ken also founded and led Mobile Surroundings, which produced the "It Happened Here" mobile app that reached #1 in the iTunes Travel category. Ken is an Adjunct Faculty member of Georgetown University’s McDonough School of Business, where he teaches MBA courses in mobile and AI product development. Ken holds a JD and MBA from the University of North Carolina – Chapel Hill and a BS in Psychology from the College of William & Mary.
-
15:10
COFFEE
-
BUILDING BETTER AI ASSISTANTS
-
15:55
Dialogues with Plato: Concurrent Training of Conversational Agents
Alexandros Papangelis - Senior Research Scientist - Uber AI
Dialogues with Plato: Concurrent training of conversational agents
In this talk I will introduce our recently released Plato Research Dialogue System, a platform for developing conversational agents, and present a method for concurrently training two conversational agents, each with a different role, by letting them interact with each other via self-generated language. We achieve this by employing multi-agent reinforcement learning, using DSTC2 data (in the restaurant information domain) as a testbed, and we show the kinds of dialogues our system can generate.
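As a purely conceptual sketch of concurrent two-agent training (this is not the Plato API; the Agent class, reward, and update step are hypothetical stand-ins):

```python
# Two dialogue agents with different roles exchange self-generated utterances,
# and both update their policies from the same episode and reward signal.
import random

class Agent:
    def __init__(self, role: str):
        self.role = role

    def respond(self, utterance: str) -> str:
        return f"{self.role}-reply-to({utterance})"  # placeholder for learned generation

    def update(self, episode, reward):
        pass  # placeholder for a policy-gradient step in a real implementation

def run_episode(system: Agent, user: Agent, max_turns: int = 5):
    episode, utterance = [], "hello"
    for _ in range(max_turns):
        utterance = system.respond(utterance)
        episode.append(("system", utterance))
        utterance = user.respond(utterance)
        episode.append(("user", utterance))
    return episode

system, user = Agent("system"), Agent("user")
for _ in range(10):  # concurrent training loop
    episode = run_episode(system, user)
    reward = random.random()  # stand-in for task success on DSTC2-style goals
    system.update(episode, reward)
    user.update(episode, reward)
```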
Alex is currently with Uber AI, on the Conversational AI team; his interests include statistical dialogue management, natural language processing, and human-machine social interactions. Prior to Uber, he was with Toshiba Research Europe, leading the Cambridge Research Lab team on Statistical Spoken Dialogue. Before joining Toshiba, he was a post-doctoral fellow at CMU's Articulab, working with Justine Cassell on designing and developing the next generation of socially-skilled virtual agents. He received his PhD from the University of Texas at Arlington, MSc from University College London, and BSc from the University of Athens.
-
16:15
Building Language Technologies for Social Good
Diyi Yang - Assistant Professor - Georgia Tech
Building Language Technologies for Social Good
We live in an era where many aspects of our daily activities are recorded as textual and activity data, from social media posts, to medical and financial records, to work activities captured by Wikipedia and other online tools. My research combines techniques in natural language processing, machine learning, and theories in social science to study human behavior in online communities, with the goal of developing theories and systems to build better socio-technical systems. In this talk, I will explain my research from two specific studies. The first one focuses on modeling how people seek and offer support via language in online cancer support communities, and the second studies what makes language persuasive by introducing a semi-supervised neural network to recognize persuasion strategies in loan requests on crowdfunding platforms. Through these two examples, I show how we can accurately and efficiently model human communication to build better social systems.
Diyi Yang is an assistant professor in the School of Interactive Computing at Georgia Tech, also affiliated with the Machine Learning Center (ML@GT) at Georgia Tech. Diyi received her Ph.D. from the Language Technologies Institute at Carnegie Mellon University, and her bachelor's degree from Shanghai Jiao Tong University, China. She is interested in Computational Social Science and Natural Language Processing. She has published at leading NLP/HCI conferences and journals, and received a Notable Dataset Award at EMNLP 2015, a Best Paper Honorable Mention at ICWSM 2016, and two Best Paper Honorable Mentions at SIGCHI 2019. Diyi was awarded the Carnegie Mellon Presidential Fellowship and a Facebook Ph.D. Fellowship.
-
16:35
Towards Interactive Learning of Spoken Dialog Systems
Bing Liu - Research Scientist - Facebook
Towards Interactive Learning of Spoken Dialog Systems
Spoken dialog systems are a prominent component of today's virtual personal assistants, enabling people to perform everyday tasks by interacting with devices via voice interfaces. Recent advances in deep learning have opened new research directions for end-to-end neural-network-based dialog modeling. Such data-driven learning systems address many limitations of conventional dialog systems but also introduce new challenges. In this talk, we will discuss recent research on deep and reinforcement learning for neural dialog systems. We will further discuss how we can address the challenges of learning efficiency and scalability by combining offline training and online interactive learning with a human in the loop.
Bing is a research scientist at Facebook working on conversational AI. His area of work focuses on machine learning for spoken language processing, natural language understanding, and dialog systems. He develops conversational AI systems that learn continuously from user interactions with weak supervision via deep and reinforcement learning. Before joining Facebook, he interned at Google Research working on end-to-end learning of neural dialog systems. Bing received his Ph.D. from Carnegie Mellon University where he worked on deep learning and reinforcement learning for task-oriented dialog systems.
-
17:00
CONVERSATION & DRINKS
-
08:00
REGISTRATION
-
09:00
WELCOME
Ann Thyme-Gobbel - Director of UX/UI Design - Loose Cannon Systems
While doing a PhD in Cognitive Science and Linguistics at UCSD, Ann's interest in phonetics and NLP led to a dissertation using neural networks to model how speakers of a language use paradigms and analogy to create new words in that language. Since then, her professional life has spanned R&D, product development and services across organizations, including Nuance, Amazon Lab 126, and smaller startups. Always curious about how people talk to machines—and how machines respond—she's co-author of the upcoming book "Mastering Voice Interfaces".
-
DECISION-MAKING ASSISTANCE
-
09:10
Augmenting Human Capabilities with AI - Towards Digital Companions
Florian Michahelles - Head of Research Group, Artificial & Human Intelligence - Siemens Corporation
Augmenting Human Capabilities with AI - Towards Digital Companions
Florian Michahelles heads the Artificial & Human Intelligence research group focused on creating digital companions for industry with the aim of augmenting human capabilities. He will discuss the potential and limitations of AI and present examples of how to complement human capabilities rather than replace them. Additionally, he will introduce corporate cross-functional initiatives for upskilling workers in the digital age. Florian will present results and reflect on the lessons learned along the journey.
Florian Michahelles is the head of research group of Artificial & Human Intelligence at Siemens Corporation in Berkeley. Together with his team he focuses on the creation of Digital Companions for Industry. A Digital Companion is an entity that enhances human capabilities. It digests, integrates, and shares information with humans so they can focus on meaningful tasks. According to the user needs, a Digital Companion can act as a guardian, assistant, or partner. The Digital Companion embodiment adapts to what best suits the current user context and needs.
Prior to his engagement with Siemens, Florian worked as a director of the Auto-ID Lab and lecturer at ETH Zurich. Florian has published 100+ academic papers in international conferences and journals and actively supports the research community in voluntary roles as program chair, research proposal evaluator, and guest lecturer.
-
09:30
Democratizing Medical Imaging with AI Assisted Image Acquisition and Interpretation
Kilian Koepsell - Co-Founder & CTO - Caption Health
From Idea to Research to Implementation: Making a Real Difference
Kilian has spoken previously about Caption Health's approach to addressing echocardiogram variability with AI. The company has since earned FDA breakthrough device designation and 510(k) clearance for its AI-guided ultrasound system, Caption AI, and announced its first customer, Northwestern Medicine. Not every company with an AI and healthcare idea can say the same. Kilian will illustrate what's necessary to move beyond the theoretical and create useful technology for medical professionals, improving patient care. He will also discuss the medical settings that can benefit from AI (and the challenges), and his take on how AI can democratize care.
Caption Health CTO and Co-Founder Kilian Koepsell leads the company’s efforts to use the latest in artificial intelligence and deep learning to bring the diagnostic power of ultrasound to more healthcare providers, democratizing access to healthcare and improving patient outcomes.
Prior to co-founding Caption Health, he worked on developing computer vision algorithms matched to the human visual processing system at the Redwood Neuroscience Institute and UC Berkeley — research he brought to Caption Health’s ultrasound guidance software. He also co-founded White Matter Technologies and was a founding team member at IQ Engines, which was acquired by Yahoo! for its Flickr group. He holds a PhD in physics from the University of Hamburg, as well as two master's degrees in mathematics and physics from the same university.
-
09:50
Developing CrisisBot: A Tool to Help Train Suicide Prevention Helpline Counselors
Orianna DeMasi - Postdoctoral Scholar - University of California, Davis
Developing CrisisBot: A Tool to Help Train Suicide Prevention Helpline Counselors
Crisis helplines fill a critical need by giving distressed individuals access to counselors experienced in de-escalation strategies during times of crisis. However, training counselors can be challenging for helplines that often operate with limited time and human resources. In addition to resource limitations, a significant challenge is that novice counselors must build experience in realistic environments without putting distressed individuals in danger. In an effort to provide a safe, no-risk environment for novice counselors to learn how to generate appropriate responses and practice counseling, we are building an interactive, automated tutoring system that novice counselors can use to learn and, more importantly, practice de-escalation strategies. This system includes a chat interface to practice counseling, real-time feedback, and real-time suggestions on improving counseling techniques. With such a system, new counselors could feel more confident starting out and crisis helplines could expand their services to help more people in need.
Orianna DeMasi is a postdoctoral scholar in Computer Science at the University of California, Davis working with Zhou Yu on applied dialogue systems. She completed her PhD in Computer Science at the University of California, Berkeley where she spent time as a Data Science Fellow at the Berkeley Institute of Data Science. Her research focuses on using computational tools to improve mental health, wellbeing, and care delivery.
-
10:10
AI for Healthcare: Scaling Access and Quality of Care for Everyone
-
CO-PRESENTING
Anitha Kannan - Founding Member - Curai
AI for Healthcare: Scaling Access and Quality of Care for Everyone
Half of the world's population lacks access to healthcare services. In the US alone, 30% of the working-age adult population has inadequate health insurance coverage to get even basic access to services. Meanwhile, the healthcare system is known to have large inefficiencies that current technology hasn't been able to address. In this talk, we will describe how our work combining the latest AI advances with medical experts and online access has huge potential to change the landscape of healthcare access and provide 24/7 quality healthcare.
The talk will have two parts. The first part focuses on our research in areas such as NLP and medical diagnosis. Using our research in medical diagnosis as a running example, the talk will emphasize the properties necessary for machine-learned models to be effective in realistic settings when assisting doctors. The second part of the talk focuses on integrating research into the product and building a machine learning feedback loop. Here, we will describe the unique challenges in deploying doctor-facing AI/ML models and how we overcome them for successful adoption.
Anitha Kannan is a founding member at Curai where she works on AI-driven solutions to healthcare. Prior to Curai, she has held senior research positions at Facebook AI research and at Microsoft research. She holds a PhD in machine learning from University of Toronto and was a Darwin Fellow at the University of Cambridge, UK. Her research also impacted products at Microsoft for which she has received many technical/business awards. She has extensively published in top-tier conferences with 2500+ citations and holds 25+ patents.
-
CO-PRESENTING
Sindhu Raghavan - Engineering Manager, Machine Learning - Curai
AI for Healthcare: Scaling Access and Quality of Care for Everyone
Half of the world's population lacks access to healthcare services. In the US alone, 30% of the working-age adult population has inadequate health insurance coverage to get even basic access to services. Meanwhile, the healthcare system is known to have large inefficiencies that current technology hasn't been able to address. In this talk, we will describe how our work combining the latest AI advances with medical experts and online access has huge potential to change the landscape of healthcare access and provide 24/7 quality healthcare.
The talk will have two parts. The first part focuses on our research in areas such as NLP and medical diagnosis. Using our research in medical diagnosis as a running example, the talk will emphasize the properties necessary for machine-learned models to be effective in realistic settings when assisting doctors. The second part of the talk focuses on integrating research into the product and building a machine learning feedback loop. Here, we will describe the unique challenges in deploying doctor-facing AI/ML models and how we overcome them for successful adoption.
Sindhu Raghavan leads the machine learning team at Curai, a health-tech startup using AI to provide the world's best healthcare to everyone. Prior to joining Curai, Sindhu held research and engineering positions at Netflix and Samsung Research. She holds a PhD in machine learning and natural language processing from the University of Texas at Austin. Her interests and work span several areas of machine learning, including statistical relational learning, recommender systems, natural language processing, and deep learning.
-
10:30
COFFEE
-
APPLICATION OF AI ASSISTANTS IN INDUSTRY
-
11:00
Improving Voice Assistants’ Understanding Through Joint Contextual Learning
Sai Sumanth - Speech/ML Scientist - Uber AI
Improving Voice Assistants’ Understanding Through Joint Contextual Learning
The quality of automatic speech recognition (ASR) is critical to AI assistants, as ASR errors propagate to and directly impact downstream tasks such as language understanding (LU) and dialog management. In this talk, I will go over multi-task neural approaches that perform contextual language correction on ASR outputs jointly with LU, improving the performance of both tasks simultaneously. I will share results obtained using state-of-the-art Generative Pre-Training (GPT) language models for joint ASR correction and language understanding.
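A hedged sketch of the multi-task idea: a shared GPT-style encoder with one head that re-predicts (corrects) the ASR token sequence and one that classifies the utterance intent. Head sizes, pooling, and the intent count are illustrative assumptions, not Uber's actual model.

```python
import torch
import torch.nn as nn
from transformers import GPT2Model, GPT2Tokenizer

class JointCorrectionAndNLU(nn.Module):
    """Shared GPT-2 encoder with a token-level correction head and an intent head."""
    def __init__(self, num_intents: int):
        super().__init__()
        self.encoder = GPT2Model.from_pretrained("gpt2")
        hidden = self.encoder.config.hidden_size
        self.correction_head = nn.Linear(hidden, self.encoder.config.vocab_size)
        self.intent_head = nn.Linear(hidden, num_intents)

    def forward(self, input_ids, attention_mask):
        hidden_states = self.encoder(input_ids, attention_mask=attention_mask)[0]
        token_logits = self.correction_head(hidden_states)            # per-token corrected words
        intent_logits = self.intent_head(hidden_states.mean(dim=1))   # utterance-level intent
        return token_logits, intent_logits

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
inputs = tokenizer("book a ride to the airport", return_tensors="pt")
model = JointCorrectionAndNLU(num_intents=10)  # 10 intents is an arbitrary example
token_logits, intent_logits = model(inputs["input_ids"], inputs["attention_mask"])
print(token_logits.shape, intent_logits.shape)
```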
Sai is part of the conversational AI team at Uber, working on building conversational agents for Uber partners. In addition, he is also working on Ludwig, a code-free deep learning toolbox open-sourced by Uber AI. Prior to joining Uber AI, he led Uber's efforts in reducing payment losses and risk support costs using machine learning models.
Prior to Uber, Sai was a Master's student in the Language Technologies Institute at CMU, working with Prof. Alan Black on developing novel methods for speech translation in zero-resource languages. Before CMU, he worked at Microsoft Research on developing automatic transcription techniques for Indian classical music. He holds a degree in Electrical Engineering from IIT Kharagpur, where he worked on digital signal and speech processing problems.
-
11:20
Use Cases of Conversational Technology in Healthcare
Joseph Tyler - Director of Conversation Design - Sensely
Use Cases of Conversational Technology in Healthcare
This talk will discuss use cases of conversational technology in healthcare, showing how it can improve patient outcomes, increase access to care, and reduce costs. Drawing from his experience at the conversation platform company Sensely, Joseph will discuss symptom checkers, chronic care monitoring, and other projects. His discussion will include demos of the tools themselves, how they are constructed, and how they are used.
For more than three years, Joseph has been designing and delivering multi-modal conversational interactions for mobile and web interfaces at the avatar-driven healthtech company Sensely. His day-to-day tasks include dialogue design, VUI configuration, usability testing, localization, NLP/speech, analytics, and project and product management. Before moving into tech, Joseph completed his PhD in Linguistics (University of Michigan, 2012), followed by three years as a postdoc and professor. His research has focused on discourse structure, intonation, sociolinguistics, and psycholinguistics. He grew up in Michigan, went to college in DC (Georgetown University), and has lived abroad in Belgium, Germany, and Qatar.
-
11:40
Virtual Learning Assistants for Scalable Education
Dee Kanejiya - Founder & CEO - Cognii
Virtual Learning Assistants for Scalable Education
This session will provide an introduction to Virtual Learning Assistants for scaling the quality and affordability of education. Compared to the general purpose Virtual Assistants designed to answer users’ short questions, Virtual Learning Assistants focus on evaluating users’ long explanatory answers and engaging them in a conversational tutoring dialog. This requires different types of Natural Language Understanding and Natural Language Generation models optimized for processing longer linguistic inputs. This session will provide technical details and application use cases of Virtual Learning Assistants.
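One building block such evaluation can rest on (a simplified illustration, not Cognii's technology) is comparing a student's long answer to a reference answer with sentence embeddings; the model choice and texts below are assumptions for the example.

```python
# Score a free-text student answer against a reference answer via mean-pooled
# BERT embeddings and cosine similarity (illustrative only).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**inputs)[0]       # (1, seq_len, hidden_size)
    return hidden.mean(dim=1).squeeze(0)    # mean-pooled sentence vector

reference = "Photosynthesis converts light energy into chemical energy stored in glucose."
student = "Plants use sunlight to make glucose, storing the energy chemically."

score = torch.nn.functional.cosine_similarity(embed(reference), embed(student), dim=0)
print(f"similarity: {score.item():.2f}")    # higher scores suggest closer answers
```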
Dee Kanejiya is the founder and CEO of Cognii, a leading provider of conversational AI technology to the education industry. He has over two decades of experience in technology and business development in the areas of AI, speech recognition, natural language understanding, and machine learning. Prior to starting Cognii, he developed multilingual virtual assistant technologies for smartphones at Nuance Communications and Vlingo Corporation. He received his Master's and PhD in Electrical Engineering from the Indian Institute of Technology Delhi and conducted research at Carnegie Mellon University and the Karlsruhe Institute of Technology, Germany.
-
12:00
Building a Conversational AI Platform for Banking
Dominique Boucher - Chief Solutions Architect, Conversational AI Platform - National Bank of Canada
Building a Virtual Assistant Factory
As Conversational AI applications gain traction in an organization, it is of paramount importance to put in place the right tools and processes to develop, deploy, and evolve virtual assistants in an efficient way. In this talk, we will share our experience building a virtual assistant factory to deploy chatbots at scale at NBC, without compromising on performance and customer experience.
Key takeaways:
Deployment process needs to be aligned with DevOps best practices
Maintenance of conversational AI apps requires care and education
Efficiency comes from tight collaboration with multiple teams
Dominique Boucher is currently Chief Solutions Architect / Conversational AI at National Bank of Canada, where he is responsible for the technical development of NBC's dialogue system platform. His main interests revolve around applying AI/machine learning techniques to complex business problems, and in particular using conversational interfaces to help optimize business processes. Prior to that, he was the CTO of Nu Echo, where he led the Omnichannel Innovations Lab from both the business and R&D perspectives. He has been in the speech recognition and conversational AI industry for more than 20 years. He holds a PhD from the University of Montreal.
-
12:20
LUNCH
-
FUTURE OF AI ASSISTANTS
-
13:30
Towards Scalable Multi-domain Conversational Agents
Abhinav Rastogi - Senior Software Engineer - Google Research
Towards Scalable Multi-domain Conversational Agents
Large-scale virtual assistants such as Google Assistant, Amazon Alexa, and Apple Siri help users accomplish a wide variety of tasks. They need to integrate with a large and constantly increasing number of services or APIs over a wide variety of domains. Supporting new services with ease, without retraining the model, and reducing maintenance workload are necessary to accommodate future growth. To highlight these challenges, we recently released the Schema-Guided Dialogue dataset, the largest publicly available corpus of task-oriented dialogues. In this talk, I will describe the methodology used to create this dataset, which minimizes the need for complex manual annotation while considerably reducing the time and cost of data collection. As a solution to the above challenges, I will also introduce the schema-guided approach for building virtual assistants, which utilizes a single model across all services and domains, with no domain-specific parameters.
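The core of the schema-guided idea can be sketched in a few lines (a hedged illustration, not the released baseline): one shared encoder embeds both the user utterance and each service's natural-language intent descriptions, and the best-matching description wins, so new services only need new descriptions rather than new model parameters. The encoder choice and schema below are assumptions for the example.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return encoder(**inputs)[0].mean(dim=1).squeeze(0)  # mean-pooled embedding

schema_intents = {  # hypothetical schema for a restaurant service
    "ReserveRestaurant": "Reserve a table at a restaurant on a given date and time.",
    "FindRestaurants": "Search for restaurants by city and cuisine.",
}

utterance = "Can you book me a table for two tomorrow at 7 pm?"
utt_vec = embed(utterance)
scores = {name: torch.dot(utt_vec, embed(desc)).item()
          for name, desc in schema_intents.items()}
print(max(scores, key=scores.get))  # expected: ReserveRestaurant
```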
Abhinav Rastogi is a Senior Software Engineer at Google Research, working on dialogue systems. His research interests include natural language understanding, language generation and multimodal dialogue. Previously, Abhinav was at Stanford University, where he worked with Prof. Andrew Ng on video understanding and Prof. Christopher Manning on natural language inference. Abhinav holds degrees in Electrical Engineering from Stanford University and IIT Bombay.
-
13:55
Immersive Conversational Assistants are the Next Wave in AI
Shivani Poddar - Tech Lead (Machine Learning/AI) - Facebook
How to Transform your Company with Conversational AI - 3xing Product, People and Processes
Why is it that assistants have traditionally been most valuable in corporate and enterprise settings, yet the most pervasive assistants are personal ones? Why is the average Joe, who has never had a human assistant, expected to leverage a digital one? Answering these questions leads us to one of the most underutilized markets in the world of conversational AI: the enterprise. In this talk I explore the next leaps that conversational AI can make (and is making), from contextualized transcription, note taking, and meeting summaries to scheduling meetings and reminders, and how conversational AI is ready to disrupt the businesses of today and tomorrow. While exploring the possibilities of tomorrow, I will also chalk out some of the pitfalls of these systems and how to hedge against them, both as a developer and as a business employing these conversational AI systems.
Key Takeaways:
State of conversational Assistants for Enterprise Use cases
Key areas where conv AI is ready to disrupt the market
Pitfalls of these conv AI systems and how to avoid them
Shivani is a machine learning engineer on the Facebook Assistant team, working on both the product and research arms of machine learning reasoning for assistants and multi-modal assistants of the future. Before Facebook, she was at Carnegie Mellon University, where she helped build the CMU Magnus social chit-chat system from the ground up for the first wave of the Amazon Alexa Prize Challenge. She has also published work on modeling user psychology and building argumentation systems that help in negotiation. Her research background spans disciplines such as computer science, psychology, and machine learning.
-
14:20
PANEL: What Does the Democratization of AI Mean for the Future of AI Assistants?
-
PANELIST
Adam Cheyer - Co-Founder and VP Engineering/VP of R&D - Viv Labs/Samsung
Adam Cheyer is co-Founder and VP Engineering of Viv Labs, and after acquisition in 2016, a VP of R&D at Samsung. Previously, Mr. Cheyer was co-Founder and VP Engineering at Siri, Inc. In 2010, Siri was acquired by Apple, where he became a Director of Engineering in the iPhone/iOS group. Adam is also a Founding Member and Advisor to Change.org, the premier social network for positive social change, and a co-Founder of Sentient Technologies. Mr. Cheyer is an author of more than 60 publications and 27 issued patents.
-
PANELIST
Eugenia Kuyda - CEO & Co-Founder - Luka
Eugenia Kuyda is co-founder and CEO at Luka. Prior to starting Luka, she founded a branding agency whose clients included Formula One teams and large banks. She also led a banking project built on top of a telecommunications company, where she developed an SMS bank assistant. Prior to that, she began working as a journalist at age 12 for Novaya Gazeta (a nationwide newspaper), where she had her own column. She also wrote for Sport Express, GQ and Tatler, worked as a line producer for Bazelevs, published two books, and finally became editor-in-chief at Afisha, a popular lifestyle magazine. She holds an MBA from the London Business School.
-
PANELIST
Danielle Deibler - Co-Founder & CEO - Marvelous.ai
Danielle Deibler is the Co-Founder and CEO of Marvelous.ai, a new startup focused on building natural language technology to discover and expose propaganda, disinformation, and bias, and to enable advocates and policymakers to devise counter-measures and immunities. She has over 25 years of experience in Internet infrastructure, security, networking, interactive technology, machine learning, and AI. Her primary area of focus in the last 20 years has been building scalable real-time interactive platforms. Previously she was CEO and Co-Founder of the leading-edge RegTech startup Compliance.ai. She founded Apps54 and Ignited Artists. Prior to that she was an Entrepreneur in Residence at Trinity Ventures. Deibler has held senior leadership positions in software development, engineering, business development, and product management for KIXEYE, Adobe, DIGEX, and UltraDNS.
-
15:00
END OF SUMMIT
-
15:00
FAREWELL NETWORKING MIXER
Day 1
10:25
Introduction to Reinforcement Learning
Lex Fridman - AI Researcher - MIT
An Introduction to Reinforcement Learning
Lex Fridman is a researcher at MIT, working on deep learning approaches in the context of semi-autonomous vehicles, human sensing, personal robotics, and more generally human-centered artificial intelligence systems. He is particularly interested in understanding human behavior in the context of human-robot collaboration, and engineering learning-based methods that enrich that collaboration. Before joining MIT, Lex was at Google working on machine learning for large-scale behavior-based authentication.


Day 1
11:10
Muppets and Transformers: The New Stars of NLP
Joel Grus - Principal Engineer - Capital Group
Muppets and Transformers: The New Stars of NLP
The last few years have seen huge progress in NLP. Transformers have become a fundamental building block for impressive new NLP models. ELMo, BERT, and their descendants have achieved new state-of-the-art results on a wide variety of tasks. In this talk I'll give some history of these "new stars" of NLP, explain how they work, compare them to their predecessors, and discuss how you can apply them to your own problems.
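As a quick, hedged taste of putting these models to work (the models loaded here are library defaults assumed for the sketch, not anything from the talk):

```python
from transformers import pipeline

# Sentiment analysis with a pretrained Transformer, no task-specific training code.
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers have made transfer learning in NLP remarkably easy."))

# BERT's original masked-language-model objective, exposed directly.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for candidate in fill_mask("ELMo and BERT are named after [MASK] characters."):
    print(candidate["token_str"], round(candidate["score"], 3))
```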
Joel Grus is Principal Engineer at Capital Group, where he oversees the development and deployment of machine learning systems. Previously he was a research engineer at the Allen Institute for Artificial Intelligence, where he helped develop AllenNLP, a deep learning library for NLP researchers. Before that he worked as a software engineer at Google and a data scientist at a variety of startups. He is the author of the beloved book Data Science from Scratch: First Principles with Python, the beloved blog post "Fizz Buzz in Tensorflow", and the polarizing JupyterCon talk "I Don't Like Notebooks". You can find him on Twitter @joelgrus


Day 1
12:45
Lunch & Learn
Join the Speakers for Lunch - Roundtable Discussions during Lunch
Day 1
14:25
How Can AI Aid Digital Transformation – Mesh Twin Learning?
Maciej Mazur - Chief Data Scientist - PGS Software
Fraud Detection in 2020 - Bad Guys Perspective
AI is evolving rapidly these days, and together with it our fraud detection systems. I want to show you the current state-of-the-art approach to fraud detection, how such systems are implemented, and what key differentiators to look at when choosing a solution for your business (AML, credit card fraud, and insurance). Next we will focus on credit card fraud, but from the perspective of a criminal rather than that of a payment provider or a bank. Learn more about the newest technology trends in card fraud, how the bad guys build their infrastructure, and how they cheat and manipulate your million-dollar black boxes that are supposed to keep you safe.
As Chief Data Scientist at PGS Software, Maciej is the technical lead of the data team and implements ML-based solutions for clients around the globe. In his 10 years of IT experience, he has worked for major players like Nokia and HPE, developing complex optimisation algorithms even before the term Data Science was coined.


Day 1
16:00
Hands-on Workshop: BERT based Conversational Q&A Platform for Querying a complex RDBMS with No Code
Peter Relan - Chairman and CEO - Got-it.ai
Hands-on Workshop: BERT based Conversational Q&A Platform for Querying a complex RDBMS with No Code
Most business and operations people in organizations want to ask questions of databases regularly, but they are limited by minimal schema understanding and SQL skills. In the field of AI, conversational agents like Rasa, Dialogflow, Lex, Watson, and LUIS are emerging as NLU-based dialog agents that hook into actions or custom fulfillment logic. Got It is unveiling the first AI product that creates a conversational interface to any custom database schema on MySQL or Google BigQuery, using Rasa or Dialogflow. Got It's No Code approach automates the discovery and addition of new intents/slots and actions, based on incoming user questions and knowledge of the database schema. Thus, the end-to-end system adapts itself to an evolving schema and user questions until it can answer virtually any question. Got It supports full-sentence NLP for chat-based UIs and search-keyword NLP for analytics UIs to dynamically query a database, without custom fulfillment logic, by utilizing a proprietary DNN.
This workshop provides a hands-on session demonstrating how quickly the product can be set up to start retrieving data from a sophisticated retail industry database schema, for both business analytics and customer service use cases.
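A purely illustrative sketch of the general pattern the workshop walks through (not Got It's product): an NLU layer maps a question to an intent and slots, and a template turns those into a parameterized SQL query. The schema, intent name, and slot values are hypothetical.

```python
import sqlite3

def nlu(question: str) -> dict:
    # Stand-in for a Rasa/Dialogflow-style model; a real system would predict these.
    return {"intent": "top_products_by_revenue", "slots": {"region": "EMEA", "limit": 5}}

SQL_TEMPLATES = {
    "top_products_by_revenue": (
        "SELECT product, SUM(revenue) AS total FROM sales "
        "WHERE region = ? GROUP BY product ORDER BY total DESC LIMIT ?"
    ),
}

def answer(question: str, conn: sqlite3.Connection):
    parsed = nlu(question)
    sql = SQL_TEMPLATES[parsed["intent"]]
    params = (parsed["slots"]["region"], parsed["slots"]["limit"])
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [("widget", "EMEA", 120.0), ("gadget", "EMEA", 80.0), ("widget", "APAC", 50.0)])
print(answer("What are our best-selling products in EMEA?", conn))
```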
Peter Relan is the founding investor and chairman of breakthrough companies, including Discord (300M users), Epic! (used by 95% of US elementary schools), and Got-it.ai (AI + human intelligence for SaaS and PaaS products). Formerly a Hewlett Packard Resident Fellow at Stanford University and a senior Oracle executive, Peter is working with the Got It team on driving user and business productivity higher by 10X, applying Google's BERT and transfer learning to real business databases with minimal training data, allowing users to build queries and analytics tools with no technical skills.


Day 2
10:30
Panel & Networking
Investing in Startups: Hear from the Investors - Panel & Connect
Session takeaways: 1) What are the short, medium and long-term challenges in investing in AI to solve challenges in business & society? 2) What are the main success factors for AI startups? 3) What are the challenges from a VC perspective?
Day 2
11:20
Ludwig, a Code-Free Deep Learning Toolbox
Piero Molino - Senior Research Scientist & Co-Founder - Uber AI
Ludwig, a Code-Free Deep Learning Toolbox
The talk will introduce Ludwig, a deep learning toolbox that allows users to train models and use them for prediction without the need to write code. It is unique in its ability to make deep learning easier to understand for non-experts and to enable faster model improvement iteration cycles for experienced machine learning developers and researchers alike. By using Ludwig, experts and researchers can simplify the prototyping process and streamline data processing so that they can focus on developing deep learning architectures.
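To give a feel for the "no code" workflow, here is a hedged sketch of driving Ludwig from Python for a tiny intent-classification dataset; parameter and key names have shifted between Ludwig releases, so treat this as the shape of the workflow rather than a version-exact recipe.

```python
# Declarative model definition: features in, features out, no model code by hand.
import pandas as pd
from ludwig.api import LudwigModel

config = {
    "input_features": [{"name": "utterance", "type": "text", "encoder": "parallel_cnn"}],
    "output_features": [{"name": "intent", "type": "category"}],
}

train_df = pd.DataFrame({
    "utterance": ["book a table for two", "play some jazz", "what's the weather tomorrow"],
    "intent": ["restaurant", "music", "weather"],
})

model = LudwigModel(config)
model.train(dataset=train_df)                     # Ludwig builds and trains the model
predictions, _ = model.predict(dataset=train_df)  # exact return values vary by version
print(predictions.head())
```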
Piero Molino is a Senior Research Scientist at Uber AI with a focus on machine learning for language and dialogue. Piero completed a PhD on Question Answering at the University of Bari, Italy. He founded QuestionCube, a startup that built a framework for semantic search and QA, worked for Yahoo Labs in Barcelona on learning to rank and for IBM Watson in New York on natural language processing with deep learning, and then joined Geometric Intelligence, where he worked on grounded language understanding. After Uber acquired Geometric Intelligence, he became one of the founding members of Uber AI Labs. He currently leads the development of Ludwig, a code-free deep learning framework.


Day 2
11:50
Building a Conversational Experience in Minutes with Samsung’s Bixby
Adam Cheyer - Co-Founder and VP Engineering/VP of R&D - Viv Labs/Samsung
Building a Conversational Experience in Minutes with Samsung’s Bixby
For decades, the relationship between developer and computer was simple: the human told the machine what to do. Next came machine learning systems, where the machine was in charge of computing the functional logic behind developer-supplied examples, typically in a form that humans couldn't even understand. Now we are entering a new age of software development, where humans and machines work collaboratively together, each doing what they do best. The Developer describes the "what" -- objects, actions, goals -- and the machine produces the "how", writing the code that satisfies each user's request by interweaving developer-provided components. The result is a system that is easier to create and maintain, while providing an end-user experience that is more intelligent and adaptable to users' individual needs. In this talk, we will show concrete examples of this software trend using a next-generation conversational assistant named Bixby. We will supply you with a freely downloadable development environment so that you can give this a try yourself, and teach you how to build a conversational experience in minutes, to start monetizing your content and services through a new channel that will be backed by more than a billion devices in just a few years.
Adam Cheyer is co-Founder and VP Engineering of Viv Labs, and after acquisition in 2016, a VP of R&D at Samsung. Previously, Mr. Cheyer was co-Founder and VP Engineering at Siri, Inc. In 2010, Siri was acquired by Apple, where he became a Director of Engineering in the iPhone/iOS group. Adam is also a Founding Member and Advisor to Change.org, the premier social network for positive social change, and a co-Founder of Sentient Technologies. Mr. Cheyer is an author of more than 60 publications and 27 issued patents.

Day 2
14:00
Panel & Q&A
Ethics in AI: Panel, Q&A & Drop-In - Hear from Experts in Ethics and Ask your Questions