%0 Conference Paper %B Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), %D 2018 %T Grounding language acquisition by training semantic parsers using captioned videos %A Candace Ross %A Andrei Barbu %A Yevgeni Berzak %A Battushig Myanganbayar %A Boris Katz %X

We develop a semantic parser that is trained in a grounded setting using pairs of videos captioned with sentences. This setting is both data-efficient, requiring little annotation, and similar to the experience of children where they observe their environment and listen to speakers. The semantic parser recovers the meaning of English sentences despite not having access to any annotated sentences. It does so despite the ambiguity inherent in vision where a sentence may refer to any combination of objects, object properties, relations or actions taken by any agent in a video. For this task, we collected a new dataset for grounded language acquisition. Learning a grounded semantic parser, turning sentences into logical forms using captioned videos, can significantly expand the range of data that parsers can be trained on, lower the effort of training a semantic parser, and ultimately lead to a better understanding of child language acquisition.

%C Brussels, Belgium %8 10/2018 %@ 978-1-948087-84-1 %G eng %U http://aclweb.org/anthology/D18-1285