%0 Journal Article %J arXiv %D 2020 %T ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation %A Chuang Gan %A Jeremy Schwartz %A Seth Alter %A Martin Schrimpf %A James Traer %A Julian De Freitas %A Jonas Kubilius %A Abhishek Bhandwaldar %A Nick Haber %A Megumi Sano %A Kuno Kim %A Elias Wang %A Damian Mrowca %A Michael Lingelbach %A Aidan Curtis %A Kevin Feigelis %A Daniel Bear %A Dan Gutfreund %A David Cox %A James J. DiCarlo %A Josh H. McDermott %A Joshua B. Tenenbaum %A Daniel L K Yamins %X

We introduce ThreeDWorld (TDW), a platform for interactive multi-modal physical simulation. With TDW, users can simulate high-fidelity sensory data and physical interactions between mobile agents and objects in a wide variety of rich 3D environments. TDW has several unique properties: 1) real-time near-photo-realistic image rendering quality; 2) a library of objects and environments with materials for high-quality rendering, and routines enabling user customization of the asset library; 3) generative procedures for efficiently building classes of new environments; 4) high-fidelity audio rendering; 5) believable and realistic physical interactions for a wide variety of material types, including cloths, liquids, and deformable objects; 6) a range of "avatar" types that serve as embodiments of AI agents, with the option for user avatar customization; and 7) support for human interactions with VR devices. TDW also provides a rich API enabling multiple agents to interact within a simulation and return a range of sensor and physics data representing the state of the world. We present initial experiments enabled by the platform around emerging research directions in computer vision, machine learning, and cognitive science, including multi-modal physical scene understanding, multi-agent interactions, models that "learn like a child", and attention studies in humans and neural networks. The simulation platform will be made publicly available.

%B arXiv %8 07/2020 %G eng %U https://arxiv.org/abs/2007.04954 %9 Preprint %0 Generic %D 2020 %T ThreeDWorld (TDW): A High-Fidelity, Multi-Modal Platform for Interactive Physical Simulation %A Jeremy Schwartz %A Seth Alter %A James J. DiCarlo %A Josh H. McDermott %A Joshua B. Tenenbaum %A Daniel L K Yamins %A Dan Gutfreund %A Chuang Gan %A James Traer %A Jonas Kubilius %A Martin Schrimpf %A Abhishek Bhandwaldar %A Julian De Freitas %A Damian Mrowca %A Michael Lingelbach %A Megumi Sano %A Daniel Bear %A Kuno Kim %A Nick Haber %A Chaofei Fan %X

TDW is a 3D virtual world simulation platform that utilizes state-of-the-art video game engine technology.

A TDW simulation consists of two components: a) the Build, a compiled executable running on the Unity3D Engine, which is responsible for image rendering, audio synthesis, and physics simulation; and b) the Controller, an external Python interface used to communicate with the Build.

Researchers write Controllers that send commands to the Build, which executes those commands and returns a broad range of data types representing the state of the virtual world.
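As a concrete illustration, below is a minimal sketch of this command/data loop using the open-source tdw Python package (https://github.com/threedworld-mit/tdw). The commands and output-data types shown follow the public API, but the snippet is an illustrative sketch rather than a canonical example from the TDW documentation.

    # Minimal TDW controller: send commands to the Build, read back simulation data.
    from tdw.controller import Controller
    from tdw.output_data import OutputData, Transforms

    c = Controller()  # launches the Build and opens the socket connection
    object_id = c.get_unique_id()

    resp = c.communicate([
        {"$type": "create_empty_environment"},                 # bare scene
        c.get_add_object("iron_box", object_id=object_id,
                         position={"x": 0, "y": 1, "z": 0}),   # add a physics object
        {"$type": "send_transforms", "frequency": "once"},     # request object state
    ])

    # The Build replies with a list of byte arrays; the last element is the frame id.
    for r in resp[:-1]:
        if OutputData.get_data_type_id(r) == "tran":
            print("object position:", Transforms(r).get_position(0))

    c.communicate({"$type": "terminate"})  # shut down the Build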

TDW provides researchers with: real-time near-photo-realistic image rendering; high-fidelity audio rendering; believable physical interactions for a wide variety of material types; a customizable library of objects and environments; a range of avatar embodiments for AI agents; a unified Python API; and support for human interaction via VR devices.

TDW is used daily in multiple labs, supporting research at the nexus of neuroscience, cognitive science, and artificial intelligence.

Find out more about ThreeDWorld on the project website via the link below.

%8 07/2020 %U http://www.threedworld.org/ %1

ThreeDWorld on GitHub - https://github.com/threedworld-mit/tdw

%0 Conference Proceedings %B 33rd Conference on Neural Information Processing Systems (NeurIPS 2019) %D 2019 %T Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs %A Jonas Kubilius %A Martin Schrimpf %A Kohitij Kar %A Rishi Rajalingham %A Ha Hong %A Najib J. Majaj %A Elias B. Issa %A Pouya Bashivan %A Jonathan Prescott-Roy %A Kailyn Schmidt %A Aran Nayebi %A Daniel Bear %A Daniel L K Yamins %A James J. DiCarlo %X

Deep convolutional artificial neural networks (ANNs) are the leading class of candidate models of the mechanisms of visual processing in the primate ventral stream. While initially inspired by brain anatomy, over the past years, these ANNs have evolved from a simple eight-layer architecture in AlexNet to extremely deep and branching architectures, demonstrating increasingly better object categorization performance, yet bringing into question how brain-like they still are. In particular, typical deep models from the machine learning community are often hard to map onto the brain’s anatomy due to their vast number of layers and missing biologically-important connections, such as recurrence. Here we demonstrate that better anatomical alignment to the brain and high performance on machine learning as well as neuroscience measures do not have to be in contradiction. We developed CORnet-S, a shallow ANN with four anatomically mapped areas and recurrent connectivity, guided by Brain-Score, a new large-scale composite of neural and behavioral benchmarks for quantifying the functional fidelity of models of the primate ventral visual stream. Despite being significantly shallower than most models, CORnet-S is the top model on Brain-Score and outperforms similarly compact models on ImageNet. Moreover, our extensive analyses of CORnet-S circuitry variants reveal that recurrence is the main predictive factor of both Brain-Score and ImageNet top-1 performance. Finally, we report that the temporal evolution of the CORnet-S "IT" neural population resembles the actual monkey IT population dynamics. Taken together, these results establish CORnet-S, a compact, recurrent ANN, as the current best model of the primate ventral visual stream.
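The reference implementation of CORnet-S is publicly available (https://github.com/dicarlolab/CORnet). The PyTorch sketch below conveys only the core architectural idea, a few anatomically mapped convolutional "areas" with within-area recurrence; the layer sizes and unrolling counts are simplified stand-ins, not the released model.

    # Simplified sketch of the CORnet-S idea: four "areas" (V1, V2, V4, IT),
    # each a conv block unrolled over a few recurrent time steps.
    import torch
    import torch.nn as nn

    class RecurrentArea(nn.Module):
        """One cortical 'area': a conv block applied repeatedly to its own output."""
        def __init__(self, in_ch, out_ch, times):
            super().__init__()
            self.times = times
            self.proj = nn.Conv2d(in_ch, out_ch, 1)              # input projection
            self.conv = nn.Conv2d(out_ch, out_ch, 3, padding=1)  # recurrent conv
            self.norm = nn.BatchNorm2d(out_ch)
            self.pool = nn.MaxPool2d(2)

        def forward(self, x):
            x = self.proj(x)
            for _ in range(self.times):        # within-area recurrence
                x = torch.relu(self.norm(self.conv(x)))
            return self.pool(x)

    model = nn.Sequential(                     # V1 -> V2 -> V4 -> "IT" -> readout
        RecurrentArea(3, 64, times=1),
        RecurrentArea(64, 128, times=2),
        RecurrentArea(128, 256, times=4),
        RecurrentArea(256, 512, times=2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, 1000),
    )
    logits = model(torch.randn(1, 3, 224, 224))  # ImageNet-sized input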

%B 33rd Conference on Neural Information Processing Systems (NeurIPS 2019) %C Vancouver, Canada %8 10/2019 %G eng %0 Journal Article %J bioRxiv preprint %D 2018 %T Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like? %A Martin Schrimpf %A Jonas Kubilius %E Ha Hong %E Najib J. Majaj %E Rishi Rajalingham %E Elias B. Issa %E Kohitij Kar %E Pouya Bashivan %E Jonathan Prescott-Roy %E Kailyn Schmidt %E Daniel L K Yamins %E James J. DiCarlo %K computational neuroscience %K deep learning %K Neural Networks %K object recognition %K ventral stream %X

The internal representations of early deep artificial neural networks (ANNs) were found to be remarkably similar to the internal neural representations measured experimentally in the primate brain. Here we ask, as deep ANNs have continued to evolve, are they becoming more or less brain-like? ANNs that are most functionally similar to the brain will contain mechanisms that are most like those used by the brain. We therefore developed Brain-Score – a composite of multiple neural and behavioral benchmarks that score any ANN on how similar it is to the brain’s mechanisms for core object recognition – and we deployed it to evaluate a wide range of state-of-the-art deep ANNs. Using this scoring system, we here report that: (1) DenseNet-169, CORnet-S and ResNet-101 are the most brain-like ANNs. (2) There remains considerable variability in neural and behavioral responses that is not predicted by any ANN, suggesting that no ANN model has yet captured all the relevant mechanisms. (3) Extending prior work, we found that gains in ANN ImageNet performance led to gains on Brain-Score. However, the correlation weakened above 70% top-1 ImageNet performance, suggesting that additional guidance from neuroscience is needed to make further advances in capturing brain mechanisms. (4) We uncovered smaller (i.e. less complex) ANNs that are more brain-like than many of the best-performing ImageNet models, which suggests the opportunity to simplify ANNs to better understand the ventral stream. The scoring system used here is far from complete. However, we propose that evaluating and tracking model-benchmark correspondences through a Brain-Score that is regularly updated with new brain data is an exciting opportunity: experimental benchmarks can be used to guide machine network evolution, and machine networks are mechanistic hypotheses of the brain’s network and thus can drive the next experiments. To facilitate both of these, we release Brain-Score.org: a platform that hosts the neural and behavioral benchmarks, where ANNs for visual processing can be submitted to receive a Brain-Score and their rank relative to other models, and where new experimental data can be naturally incorporated.
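As a toy illustration of the composite idea (not the brain-score package's actual API), a Brain-Score-style aggregate can be thought of as the mean of per-benchmark scores after normalizing each by the reliability ceiling of its dataset; all benchmark names and numbers below are made up for illustration.

    # Toy composite score: average of ceiling-normalized benchmark scores.
    from statistics import mean

    def composite_score(raw, ceilings):
        """Mean of per-benchmark scores, each normalized by its data ceiling."""
        return mean(raw[b] / ceilings[b] for b in raw)

    raw = {"V4-neural": 0.55, "IT-neural": 0.52, "behavior": 0.40}   # model fits
    ceil = {"V4-neural": 0.90, "IT-neural": 0.82, "behavior": 0.48}  # data ceilings
    print(f"composite: {composite_score(raw, ceil):.3f}")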

%B bioRxiv preprint %G eng %U https://www.biorxiv.org/content/10.1101/407007v1 %R 10.1101/407007 %0 Journal Article %J Neuron %D 2018 %T A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy %A Alexander J. E. Kell %A Daniel L K Yamins %A Erica N Shook %A Sam V Norman-Haignere %A Josh H. McDermott %K auditory cortex %K convolutional neural network %K deep learning %K deep neural network %K encoding models %K fMRI %K Hierarchy %K human auditory cortex %K natural sounds %K word recognition %X

A core goal of auditory neuroscience is to build quantitative models that predict cortical responses to natural sounds. Reasoning that a complete model of auditory cortex must solve ecologically relevant tasks, we optimized hierarchical neural networks for speech and music recognition. The best-performing network contained separate music and speech pathways following early shared processing, potentially replicating human cortical organization. The network performed both tasks as well as humans and exhibited human-like errors despite not being optimized to do so, suggesting common constraints on network and human performance. The network predicted fMRI voxel responses substantially better than traditional spectrotemporal filter models throughout auditory cortex. It also provided a quantitative signature of cortical representational hierarchy—primary and non-primary responses were best predicted by intermediate and late network layers, respectively. The results suggest that task optimization provides a powerful set of tools for modeling sensory systems.
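The branched architecture described here, shared early processing that splits into separate speech and music pathways, can be sketched as follows; the layer sizes and the class counts for the word and genre heads are illustrative assumptions, not the trained network from the paper.

    # Sketch of a dual-task auditory CNN: shared early layers over a
    # cochleagram-like input, then separate word and genre branches.
    import torch
    import torch.nn as nn

    class BranchedAudioNet(nn.Module):
        def __init__(self, n_words=500, n_genres=40):  # illustrative class counts
            super().__init__()
            self.shared = nn.Sequential(               # early shared processing
                nn.Conv2d(1, 32, 5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            )
            def branch(n_out):                         # one task-specific pathway
                return nn.Sequential(
                    nn.Conv2d(64, 128, 3, stride=2), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, n_out),
                )
            self.speech = branch(n_words)              # word-recognition head
            self.music = branch(n_genres)              # genre-recognition head

        def forward(self, cochleagram):
            h = self.shared(cochleagram)               # shared representation
            return self.speech(h), self.music(h)

    net = BranchedAudioNet()
    word_logits, genre_logits = net(torch.randn(1, 1, 256, 256))  # dummy cochleagram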

%B Neuron %V 98 %8 04/2018 %G eng %U https://www.sciencedirect.com/science/article/pii/S0896627318302502 %) Available online 19 April 2018 %R 10.1016/j.neuron.2018.03.044 %0 Generic %D 2013 %T Neural Representation Benchmark [code] %A Charles F. Cadieu %A Ha Hong %A Daniel L K Yamins %A Nicolas Pinto %A Najib J. Majaj %A James J. DiCarlo %X

A key requirement for the development of effective learning representations is their evaluation and comparison to representations we know to be effective. In natural sensory domains, the community has viewed the brain as a source of inspiration and as an implicit benchmark for success. However, it has not been possible to test representational learning algorithms directly against the representations contained in neural systems. Here, we propose a new benchmark for visual representations on which we have directly tested the neural representation in multiple visual cortical areas in macaque (utilizing data from [Majaj et al., 2012]), and on which any computer vision algorithm that produces a feature space can be tested. The benchmark measures the effectiveness of the neural or machine representation by computing the classification loss on the ordered eigendecomposition of a kernel matrix [Montavon et al., 2011]. In our analysis we find that the neural representation in visual area IT is superior to that of visual area V4. In our analysis of representational learning algorithms, we find that three-layer models approach the representational performance of V4 and the algorithm in [Le et al., 2012] surpasses the performance of V4. Impressively, we find that a recent supervised algorithm [Krizhevsky et al., 2012] achieves performance comparable to that of IT for an intermediate level of image variation difficulty, and surpasses IT at a higher difficulty level. We believe this result represents a major milestone: it is the first learning algorithm we have found that exceeds our current estimate of IT representation performance. We hope that this benchmark will assist the community in matching the representational performance of visual cortex and will serve as an initial rallying point for further correspondence between representations derived in brains and machines.
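The kernel-analysis procedure cited above [Montavon et al., 2011] can be sketched in a few lines: build a kernel matrix over the representation, take its ordered eigendecomposition, and measure how well the leading d components linearly predict the class labels. The RBF kernel and least-squares readout below are illustrative choices, not the benchmark's exact protocol.

    # Toy kernel analysis: classification error from the d leading kernel components.
    import numpy as np

    def kernel_analysis_curve(features, labels_onehot, dims=(1, 2, 4, 8, 16, 32)):
        # RBF kernel matrix over the representation (one row per stimulus).
        sq = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
        K = np.exp(-sq / np.median(sq))

        # Ordered eigendecomposition, largest eigenvalues first.
        _, eigvecs = np.linalg.eigh(K)
        eigvecs = eigvecs[:, ::-1]

        errors = []
        for d in dims:                        # linear readout from d components
            U = eigvecs[:, :d]
            pred = U @ np.linalg.lstsq(U, labels_onehot, rcond=None)[0]
            errors.append(float(np.mean(pred.argmax(1) != labels_onehot.argmax(1))))
        return errors                         # low error at small d = strong representation

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))                   # stand-in feature matrix
    Y = np.eye(4)[rng.integers(0, 4, size=200)]      # stand-in one-hot labels
    print(kernel_analysis_curve(X, Y))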

For more information and to download the code, please visit the project website: http://dicarlolab.mit.edu/neuralbenchmark