Host: Prof. Boris Katz (CSAIL, MIT)
Abstract: Children acquire language from very little data by observing and interacting with other agents and their environment. We demonstrate how, by combining methods from robotics, vision, and NLP with a compositional approach, we can create a semantic parser that acquires language with no direct supervision: just captioned videos and access to a physical simulator. Language that describes social situations is often overlooked; to fill this gap, we develop a simulator that supports both physical and social interactions. Current models in NLP, despite seeing orders of magnitude more data than children, routinely make mistakes related to physical and social interactions; this approach may help fill in these gaps.
We will also discuss a new dataset and methodology for running large-scale experiments in the neuroscience of language, experiments on the scale of those performed with artificial language models in the NLP community. Being able to investigate the relationship between multiple linguistic concepts on the same neural data may reveal how parts of the language network relate to one another. We start by identifying parts of the language network that compute and predict the part of speech of an overheard word.