In Module 4, we seek to understand how complex data and knowledge are represented, manipulated, and learned. We aim to understand how humans perform abstract tasks, some of which are quintessentially human, such as creating and using concepts, logical reasoning, mathematics, language, and long-range planning. Separate accounts of some of these abilities exist, but they are generally limited: they do not include perception and learning in real-world settings, and they do not share knowledge or representations between abilities. Module 4 integrates capabilities from all modules in order to develop models that carry out these larger tasks, such as models that acquire language and allow agents to follow commands. Within Module 4, our projects begin with low-level perception of the symbols needed for higher-level reasoning, move on to how image understanding capabilities should be structured, expand those capabilities to include social understanding, and culminate in understanding long plans and stories.
“Partially Occluded Hands: A challenging new dataset for single-image hand pose estimation”, in The 14th Asian Conference on Computer Vision (ACCV 2018), 2018.
“Deep sequential models for sampling-based planning”, in The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018), Madrid, Spain, 2018.
“Grounding language acquisition by training semantic parsers using captioned videos”, in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium, 2018.
“Direct Localization by Partly Calibrated Arrays: A Relaxed Maximum Likelihood Solution”, in 27th European Signal Processing Conference, EUSIPCO 2019, A Coruna, Spain, 2019.
“Synthesizing 3D Shapes via Modeling Multi-view Depth Maps and Silhouettes with Deep Generative Networks”, in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017.