Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context

Title: Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Paul, R, Barbu, A, Felshin, S, Katz, B, Roy, N
Conference Name: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI 2017)
Date Published: 08/2017
Conference Location: Melbourne, Australia

A robot’s ability to understand or ground natural language instructions is fundamentally tied to its knowledge about the surrounding world. We present an approach to grounding natural language utterances in the context of factual information gathered through natural-language interactions and past visual observations. A probabilistic model estimates, from a natural language utterance, the objects, relations, and actions that the utterance refers to, the objectives for future robotic actions it implies, and generates a plan to execute those actions while updating a state representation to include newly acquired knowledge from the visual-linguistic context. Grounding a command necessitates a representation for past observations and interactions; however, maintaining the full context consisting of all possible observed objects, attributes, spatial relations, actions, etc., over time is intractable. Instead, our model, Temporal Grounding Graphs, maintains a learned state representation for a belief over factual groundings, those derived from natural-language interactions, and lazily infers new groundings from visual observations using the context implied by the utterance. This work significantly expands the range of language that a robot can understand by incorporating factual knowledge and observations of its workspace into its inference about the meaning and grounding of natural-language utterances.


CBMM Relationship: 

  • CBMM Funded