The promise of ASR: where we stand and what is still missing
- Speech Representation, Perception and Recognition
Abdelrahman Mohamed, Microsoft Research
Abstract: Over the past decade, ASR technology has made a huge leap forward in word recognition accuracy, leading to Microsoft's recent announcement of achieving human parity in conversational speech recognition. In this talk, I will reflect on recent advances in neural network acoustic models, with a special interest in understanding the relationships between different models. I will also discuss the many outstanding research directions that must be addressed to fulfill the promise of conversational systems.