Publications
Abstracts of the 2014 Brains, Minds, and Machines Summer Course. (2014).
CBMM-Memo-024.pdf (2.86 MB)
The Aligned Multimodal Movie Treebank: An audio, video, dependency-parse treebank. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (2022).
Anchoring and Agreement in Syntactic Annotations. (2016).
CBMM-Memo-055.pdf (768.54 KB)
BrainBERT: Self-supervised representation learning for Intracranial Electrodes. International Conference on Learning Representations (2023). at <https://openreview.net/forum?id=xmcYx_reUn6>
985_brainbert_self_supervised_repr.pdf (9.71 MB)
A Compositional Framework for Grounding Language Inference, Generation, and Acquisition in Video. (2015). doi:10.1613/jair.4556
The Compositional Nature of Event Representations in the Human Brain. (2014).
CBMM Memo 011.pdf (3.95 MB)
Compositional Networks Enable Systematic Generalization for Grounded Language Understanding. (2021).
CBMM-Memo-129.pdf (1.2 MB)
Compositional RL Agents That Follow Language Commands in Temporal Logic. (2021).
CBMM-Memo-127.pdf (2.12 MB)
Compositional RL Agents That Follow Language Commands in Temporal Logic. Frontiers in Robotics and AI 8, (2021).
frobt-08-689550.pdf (1.57 MB)
Deep Compositional Robotic Planners that Follow Natural Language Commands. Workshop on Visually Grounded Interaction and Language (ViGIL) at the Thirty-third Annual Conference on Neural Information Processing Systems (NeurIPS), (2019). at <https://vigilworkshop.github.io/>
Deep compositional robotic planners that follow natural language commands. International Conference on Robotics and Automation (ICRA) (2020).
Deep compositional robotic planners that follow natural language commands. (2020).
CBMM-Memo-124.pdf (1.03 MB)
Deep sequential models for sampling-based planning. The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018) (2018). doi:10.1109/IROS.2018.8593947
kuo2018planning.pdf (637.67 KB)
Deep video-to-video transformations for accessibility with an application to photosensitivity. Pattern Recognition Letters (2019). doi:10.1016/j.patrec.2019.01.019
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal. (2015).
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. (2016).
memo-51.pdf (2.74 MB)
Encoding formulas as deep networks: Reinforcement learning for zero-shot execution of LTL formulas. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2020). doi:10.1109/IROS45743.2020.9341325
Grounding language acquisition by training semantic parsers using captioned videos. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), (2018). at <http://aclweb.org/anthology/D18-1285>
Ross-et-al_ACL2018_Grounding language acquisition by training semantic parsing using caption videos.pdf (3.5 MB)
Incorporating Rich Social Interactions Into MDPs. (2022).
CBMM-Memo-133.pdf (1.68 MB)
Language and Vision Ambiguities (LAVA) Corpus. (2016). at <http://web.mit.edu/lavacorpus/>
D15-1172.pdf (2.42 MB)
Learning a Natural-language to LTL Executable Semantic Parser for Grounded Robotics. Proceedings of the Conference on Robot Learning (CoRL 2020) (2020). at <https://corlconf.github.io/paper_385/>
Learning a natural-language to LTL executable semantic parser for grounded robotics. (2020). doi:10.48550/arXiv.2008.03277
CBMM-Memo-122.pdf (1.03 MB)
Learning Language from Vision. Workshop on Visually Grounded Interaction and Language (ViGIL) at the Thirty-third Annual Conference on Neural Information Processing Systems (NeurIPS) (2019).