Publications
Filters: Author is Andrei Barbu
Encoding formulas as deep networks: Reinforcement learning for zero-shot execution of LTL formulas. (2020). CBMM-Memo-125.pdf (2.12 MB)
Encoding formulas as deep networks: Reinforcement learning for zero-shot execution of LTL formulas. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2020). doi:10.1109/IROS45743.2020.9341325
Learning a Natural-language to LTL Executable Semantic Parser for Grounded Robotics. (Proceedings of Conference on Robot Learning (CoRL-2020), 2020). at <https://corlconf.github.io/paper_385/>
Learning a natural-language to LTL executable semantic parser for grounded robotics. (2020). doi:10.48550/arXiv.2008.03277 CBMM-Memo-122.pdf (1.03 MB)
PHASE: PHysically-grounded Abstract Social Events for Machine Social Perception. Shared Visual Representations in Human and Machine Intelligence (SVRHM) workshop at NeurIPS 2020 (2020). at <https://openreview.net/forum?id=_bokm801zhx> phase_physically_grounded_abstract_social_events_for_machine_social_perception.pdf (2.49 MB)
Deep Compositional Robotic Planners that Follow Natural Language Commands. Workshop on Visually Grounded Interaction and Language (ViGIL) at the Thirty-third Annual Conference on Neural Information Processing Systems (NeurIPS), (2019). at <https://vigilworkshop.github.io/>
Deep video-to-video transformations for accessibility with an application to photosensitivity. Pattern Recognition Letters (2019). doi:10.1016/j.patrec.2019.01.019
Learning Language from Vision. Workshop on Visually Grounded Interaction and Language (ViGIL) at the Thirty-third Annual Conference on Neural Information Processing Systems (NeurIPS) (2019).
ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models. Neural Information Processing Systems (NeurIPS 2019) (2019). 9142-objectnet-a-large-scale-bias-controlled-dataset-for-pushing-the-limits-of-object-recognition-models.pdf (16.31 MB)
Deep sequential models for sampling-based planning. The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018) (2018). doi:10.1109/IROS.2018.8593947 kuo2018planning.pdf (637.67 KB)
Grounding language acquisition by training semantic parsers using captioned videos. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), (2018). at <http://aclweb.org/anthology/D18-1285> Ross-et-al_ACL2018_Grounding language acquisition by training semantic parsing using caption videos.pdf (3.5 MB)
Partially Occluded Hands: A challenging new dataset for single-image hand pose estimation. The 14th Asian Conference on Computer Vision (ACCV 2018) (2018). at <http://accv2018.net/> partially-occluded-hands-6.pdf (8.29 MB)
Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI 2017) (2017).
Anchoring and Agreement in Syntactic Annotations. (2016). CBMM-Memo-055.pdf (768.54 KB)
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. (2016). memo-51.pdf (2.74 MB)
Language and Vision Ambiguities (LAVA) Corpus. (2016). at <http://web.mit.edu/lavacorpus/> D15-1172.pdf (2.42 MB)
A look back at the June 2016 BMM Workshop in Sestri Levante, Italy. (2016). Sestri Levante Review (359.33 KB)
A Compositional Framework for Grounding Language Inference, Generation, and Acquisition in Video. (2015). doi:10.1613/jair.4556
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal. (2015).
Abstracts of the 2014 Brains, Minds, and Machines Summer Course. (2014). CBMM-Memo-024.pdf (2.86 MB)
The Compositional Nature of Event Representations in the Human Brain. (2014). CBMM Memo 011.pdf (3.95 MB)
Seeing is Worse than Believing: Reading People’s Minds Better than Computer-Vision Methods Recognize Actions. (2014). CBMM Memo 012.pdf (678.95 KB)
Computer Vision – ECCV 2014, Lecture Notes in Computer Science 8693, 612–627 (Springer International Publishing, 2014).
Seeing What You’re Told: Sentence-Guided Activity Recognition In Video. CVPR (IEEE, 2014). Publication (453.54 KB)