Andrei Barbu: From Language to Vision and Back Again

Date Posted: June 9, 2014
Date Recorded: June 9, 2014
CBMM Speaker(s): Andrei Barbu
  • All Captioned Videos
  • Brains, Minds and Machines Summer Course 2014

Topics: Importance of bridging low-level perception with high-level cognition; a model system for a limited domain that can (1) recognize how well a sentence describes a video, (2) retrieve sample videos for which a sentence is true, (3) generate language descriptions and answer questions about videos, (4) acquire language concepts, (5) use video to resolve language ambiguity, (6) translate between languages, and (7) guide planning; determining whether a sentence describes a video involves recognizing participants, movements, directions, and relationships; overview of a system that starts with many unreliable detections, uses HMMs to track coherently moving objects and to recognize words from tracks, and gets information about participants and relations from a dependency parser (e.g., START) to encode sentence structure; a similar approach, combining trackers and words, is used to generate sentences and answer questions about videos; examples involving simple objects and agents performing actions such as approach, pick up, and put down; translation between languages via imagination of videos depicting sentences
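
The tracking step named in the topics (many unreliable detections, resolved into one coherently moving object) can be illustrated with a small Viterbi-style dynamic program. This is a minimal sketch under stated assumptions, not the system presented in the talk: the `(x, y, score)` detection format, the `track` function, and the `motion_weight` penalty on frame-to-frame distance are all hypothetical choices made for illustration.

```python
import math


def track(frames, motion_weight=1.0):
    """Pick one detection per frame so that the summed detection scores,
    minus a penalty on frame-to-frame jumps, is maximal.

    frames: list of frames, each a non-empty list of (x, y, score) tuples.
    Returns a list with the index of the chosen detection in each frame.
    All names and the scoring rule are illustrative assumptions.
    """
    if not frames or any(not f for f in frames):
        raise ValueError("every frame needs at least one detection")

    # best[j]: best total score of any path ending at detection j
    # of the frame processed so far.
    best = [score for (_, _, score) in frames[0]]
    back = []  # back[t-1][j]: predecessor (in frame t-1) of detection j in frame t

    for prev, cur in zip(frames, frames[1:]):
        new_best, pointers = [], []
        for (x, y, score) in cur:
            # Coherent motion: penalize large jumps between consecutive frames.
            cands = [
                best[i] - motion_weight * math.hypot(x - px, y - py)
                for i, (px, py, _) in enumerate(prev)
            ]
            i_star = max(range(len(cands)), key=cands.__getitem__)
            new_best.append(cands[i_star] + score)
            pointers.append(i_star)
        best = new_best
        back.append(pointers)

    # Trace the best path backwards from the best final detection.
    j = max(range(len(best)), key=best.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

With the motion penalty switched on, a high-scoring but distant detection in one frame is ignored in favor of a weaker detection that continues a smooth path; with `motion_weight=0.0` the same routine simply picks the highest-scoring detection in each frame. Per the topic list, the system in the talk also uses HMMs to recognize words from tracks; this sketch covers only the tracking half.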
