Speech Representations and Speech Tasks
- Speech Representation, Perception and Recognition
Keith Johnson, UC Berkeley
Abstract: At one point in my research career I was interested in finding “the” representation of speech. But clearly, there is no single level of speech representation. Researchers find it useful to represent speech in a variety of ways. Linguists and phoneticians use phonemes, distinctive features, gestures, and locus equations. Engineers use MFCCs, triphones, and delta features. Psychoacousticians use STRFs and modulation spectra. This is all well and good for analysts: we have different speech tasks and need to answer different kinds of questions, so naturally we use different ways to represent speech. But do listeners also have different tasks to accomplish with speech, and are these tasks accomplished by reference to different representations? The answer is yes. This talk will discuss some of the speech tasks that listeners face and the evidence that listeners use different representations in support of these tasks.