Export 23 results:
Filters: Author is Andrei Barbu [Clear All Filters]
Deep compositional robotic planners that follow natural language commands . International Conference on Robotics and Automation (ICRA) (2020).
PHASE: PHysically-grounded Abstract Social Eventsfor Machine Social Perception. Shared Visual Representations in Human and Machine Intelligence (SVRHM) workshop at NeurIPS 2020 (2020). at <https://openreview.net/forum?id=_bokm801zhx>
Deep Compositional Robotic Planners that Follow Natural Language Commands. Workshop on Visually Grounded Interaction and Language (ViGIL) at the Thirty-third Annual Conference on Neural Information Processing Systems (NeurIPS), (2019). at <https://vigilworkshop.github.io/>
Learning Language from Vision. Workshop on Visually Grounded Interaction and Language (ViGIL) at the Thirty-third Annual Conference on Neural Information Processing Systems (NeurIPS) (2019).
ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models. Neural Information Processing Systems (NeurIPS 2019) (2019).
Deep sequential models for sampling-based planning. The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018) (2018). doi:10.1109/IROS.2018.8593947
Grounding language acquisition by training semantic parsersusing captioned videos. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), (2018). at <http://aclweb.org/anthology/D18-1285>
Partially Occluded Hands: A challenging new dataset for single-image hand pose estimation. The 14th Asian Conference on Computer Vision (ACCV 2018) (2018). at <http://accv2018.net/>
Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI 2017) (2017). at <c>
A Compositional Framework for Grounding Language Inference, Generation, and Acquisition in Video. (2015). doi:doi:10.1613/jair.4556.
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal. (2015).
Seeing is Worse than Believing: Reading People’s Minds Better than Computer-Vision Methods Recognize Actions. (2014).
Computer Vision – ECCV 2014, Lecture Notes in Computer Science 8693, 612–627 (Springer International Publishing, 2014).
Seeing What You’re Told: Sentence-Guided Activity Recognition In Video. CVPR (IEEE, 2014)..
Seeing what you're told, sentence guided activity recognition in video. Appeared at CVPR (2014)..