Publication
Zoom better to see clearer: Human and object parsing with hierarchical auto-zoom net. ECCV (2016).
Zero-shot linear combinations of grounded social interactions with Linear Social MDPs. Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI) (2023).
Young children’s automatic and alternating use of scene and object information in spatial symbols. Budapest CEU Conference on Cognitive Development (2015).
Word-level Invariant Representations From Acoustic Waveforms. INTERSPEECH 2014 - 15th Annual Conf. of the International Speech Communication Association (International Speech Communication Association (ISCA), 2014). at <http://www.isca-speech.org/archive/interspeech_2014/i14_2385.html>
Why Are Face and Object Processing Segregated in the Human Brain? Testing Computational Hypotheses with Deep Convolutional Neural Networks. Conference on Cognitive Computational Neuroscience (2020).
What Matters In Branch Specialization? Using a Toy Task to Make Predictions. Shared Visual Representations in Human and Machine Intelligence (SVRHM) Workshop at NeurIPS (2021). at <https://openreview.net/forum?id=0kPS1i6wict>
What can human minimal videos tell us about dynamic recognition models? International Conference on Learning Representations (ICLR 2020) (2020). at <https://baicsworkshop.github.io/pdf/BAICS_1.pdf>
Visually indicated sounds. Conference on Computer Vision and Pattern Recognition (2016).
Visual Features for Invariant Coding by Face Selective Neurons. 2019 Conference on Cognitive Computational Neuroscience (CCN) (2019).
Visual Concept Recognition and Localization via Iterative Introspection. Asian Conference on Computer Vision (2016).
A Virtual Reality Experimental Approach for Studying How the Brain Implements Attentive Behaviors. Tri-Institute 2019 Gateways to the Laboratory Summer Program (2019).
Using task-optimized neural networks to understand why brains have specialized processing for faces. Computational and Systems Neurosciences (2020).
Using Multimodal DNNs to Study Vision-Language Integration in the Brain. ICLR 2023 (2023). at <https://openreview.net/pdf?id=OQQ1p0pFP4>
On the use of Cortical Magnification and Saccades as Biological Proxies for Data Augmentation. Shared Visual Representations in Human and Machine Intelligence (SVRHM) Workshop at NeurIPS (2021). at <https://openreview.net/forum?id=Rpazl253IHb>
Unsupervised Learning of Visual Structure using Predictive Generative Networks. International Conference on Learning Representations (ICLR) (2016). at <http://arxiv.org/pdf/1511.06380v2.pdf>
Unsupervised Discovery of 3D Physical Objects. International Conference on Learning Representations (2021). at <https://openreview.net/forum?id=lf7st0bJIA5>
Trajectory Prediction with Linguistic Representations. 2022 IEEE International Conference on Robotics and Automation (ICRA) (2022). doi:10.1109/ICRA46639.2022.9811928
Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images. NIPS 2016 (2016).
Trading robust representations for sample complexity through self-supervised visual experience. Advances in Neural Information Processing Systems 31, 9640–9650 (Curran Associates, Inc., 2018). at <http://papers.nips.cc/paper/8170-trading-robust-representations-for-sample-complexity-through-self-supervised-visual-experience.pdf>
Toward human-like object naming in artificial neural systems. International Conference on Learning Representations (ICLR 2020), Bridging AI and Cognitive Science workshop (2020).
To find better neural network models of human vision, find better neural network models of primate vision. bioRxiv (2019). at <https://www.biorxiv.org/content/10.1101/688390v1.full>
Temporal information for action recognition only needs to be integrated at a choice level in neural networks and primates. COSYNE (2020).
Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI 2017) (2017).