Do You See What I Mean? Visual Resolution of Linguistic Ambiguities

Title	Do You See What I Mean? Visual Resolution of Linguistic Ambiguities
Publication Type	CBMM Memos
Year of Publication	2016
Authors	Berzak, Y, Barbu, A, Harari, D, Katz, B, Ullman, S
Date Published	09/2016
Abstract	Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a sentence given a visual scene which depicts one of the possible interpretations of that sentence. To this end, we introduce a new multimodal corpus containing ambiguous sentences, representing a wide range of syntactic, semantic and discourse ambiguities, coupled with videos that visualize the different interpretations for each sentence. We address this task by extending a vision model which determines if a sentence is depicted by a video. We demonstrate how such a model can be adjusted to recognize different interpretations of the same underlying sentence, making it possible to disambiguate sentences in a unified fashion across the different ambiguity types.
arXiv	arXiv:1603.08079v1 [cs.CV]
DSpace@MIT	http://hdl.handle.net/1721.1/103400

CBMM Memo No: 051