Filters: Author is Andrei Barbu
Deep video-to-video transformations for accessibility with an application to photosensitivity. Pattern Recognition Letters (2019). doi:10.1016/j.patrec.2019.01.019
Learning Language from Vision. Workshop on Visually Grounded Interaction and Language (ViGIL) at the Thirty-third Annual Conference on Neural Information Processing Systems (NeurIPS) (2019).
ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models. Neural Information Processing Systems (NeurIPS 2019) (2019). 9142-objectnet-a-large-scale-bias-controlled-dataset-for-pushing-the-limits-of-object-recognition-models.pdf (16.31 MB)
Deep sequential models for sampling-based planning. The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018) (2018). doi:10.1109/IROS.2018.8593947 kuo2018planning.pdf (637.67 KB)
Grounding language acquisition by training semantic parsers using captioned videos. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018) (2018). at <http://aclweb.org/anthology/D18-1285> Ross-et-al_ACL2018_Grounding language acquisition by training semantic parsing using caption videos.pdf (3.5 MB)
Partially Occluded Hands: A challenging new dataset for single-image hand pose estimation. The 14th Asian Conference on Computer Vision (ACCV 2018) (2018). at <http://accv2018.net/> partially-occluded-hands-6.pdf (8.29 MB)
Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI 2017) (2017). at <c>
Anchoring and Agreement in Syntactic Annotations. (2016). CBMM-Memo-055.pdf (768.54 KB)
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. (2016). memo-51.pdf (2.74 MB)
Language and Vision Ambiguities (LAVA) Corpus. (2016). at <http://web.mit.edu/lavacorpus/> D15-1172.pdf (2.42 MB)
A look back at the June 2016 BMM Workshop in Sestri Levante, Italy. (2016). Sestri Levante Review (359.33 KB)
A Compositional Framework for Grounding Language Inference, Generation, and Acquisition in Video. (2015). doi:10.1613/jair.4556
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal. (2015).
Abstracts of the 2014 Brains, Minds, and Machines Summer Course. (2014). CBMM-Memo-024.pdf (2.86 MB)
The Compositional Nature of Event Representations in the Human Brain. (2014). CBMM Memo 011.pdf (3.95 MB)
Seeing is Worse than Believing: Reading People’s Minds Better than Computer-Vision Methods Recognize Actions. (2014). CBMM Memo 012.pdf (678.95 KB)
Computer Vision – ECCV 2014, Lecture Notes in Computer Science 8693, 612–627 (Springer International Publishing, 2014).
Seeing What You’re Told: Sentence-Guided Activity Recognition In Video. (2014). CBMM-Memo-006.pdf (1.2 MB).
Seeing What You’re Told: Sentence-Guided Activity Recognition In Video. CVPR (IEEE, 2014). Publication (453.54 KB).
Seeing what you're told, sentence guided activity recognition in video. Appeared at CVPR (2014). poster-1701.pdf (4.61 MB).