Publication
 Learning Language from Vision. Workshop on Visually Grounded Interaction and Language (ViGIL) at the Thirty-third Annual Conference on Neural Information Processing Systems (NeurIPS) (2019).
 ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models. Neural Information Processing Systems (NeurIPS 2019) (2019).
 9142-objectnet-a-large-scale-bias-controlled-dataset-for-pushing-the-limits-of-object-recognition-models.pdf (16.31 MB)
 Assessing Language Proficiency from Eye Movements in Reading. 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2018). at <http://naacl2018.org/>
 1804.07329.pdf (350.43 KB)
 Deep sequential models for sampling-based planning. The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018) (2018). doi:10.1109/IROS.2018.8593947
 kuo2018planning.pdf (637.67 KB)
 Grounding language acquisition by training semantic parsers using captioned videos. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018) (2018). at <http://aclweb.org/anthology/D18-1285>
 Ross-et-al_ACL2018_Grounding language acquisition by training semantic parsing using caption videos.pdf (3.5 MB)
 The Wiley Handbook of Human Computer Interaction 2, 539-559 (John Wiley & Sons, 2018).
 Partially Occluded Hands: A challenging new dataset for single-image hand pose estimation. The 14th Asian Conference on Computer Vision (ACCV 2018) (2018). at <http://accv2018.net/>
 partially-occluded-hands-6.pdf (8.29 MB)
 Predicting Native Language from Gaze. Annual Meeting of the Association for Computational Linguistics (ACL 2017) (2017).
 Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI 2017) (2017).
 Anchoring and Agreement in Syntactic Annotations. (2016).
 CBMM-Memo-055.pdf (768.54 KB)
 Contrastive Analysis with Predictive Power: Typology Driven Estimation of Grammatical Error Distributions in ESL. (2016).
 memo-50.pdf (493.74 KB)
 Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. (2016).
 memo-51.pdf (2.74 MB)
 Language and Vision Ambiguities (LAVA) Corpus. (2016). at <http://web.mit.edu/lavacorpus/>
 D15-1172.pdf (2.42 MB)
 Learning to Answer Questions from Wikipedia Infoboxes. The 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016) (2016).
 Morales-EMNLP2016.pdf (197.28 KB)
 A look back at the June 2016 BMM Workshop in Sestri Levante, Italy. (2016).
 Sestri Levante Review (359.33 KB)
 Universal Dependencies for Learner English. (2016).
 memo-52_rev1.pdf (472.67 KB)
 Contrastive Analysis with Predictive Power: Typology Driven Estimation of Grammatical Error Distributions in ESL. Nineteenth Conference on Computational Natural Language Learning (CoNLL), Beijing, China (2015).
 Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal. (2015).
 Towards a Programmer's Apprentice (Again). (2015).
 CBMM-memo-030.pdf (294.27 KB)