Vision and Language

Research Thrust: Vision and Language

Shimon Ullman

Vision can be combined with aspects of language and social cognition to obtain and communicate complex knowledge about the surrounding environment, for example, to answer a large and flexible set of queries about objects and agents in an image or video in a human-like manner, as captured in the CBMM Challenge. These lectures provide an overview of current approaches aimed at achieving this understanding from visual input, an overview of the START natural language system, and recent efforts to bridge these capabilities. The last lecture of this series addresses a cognitive ability that distinguishes human intelligence from that of other primates: the ability to tell, understand, and recombine stories.

Presentations