Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs [video]
[MUSIC PLAYING]
MARTIN SCHRIMPF: Before we started this project, research in neuroscience was typically done at the level of individual experiments. You collected data from, for instance, V4 or IT, and then you tested one V4 model and one IT model, and those were usually separate.
So what we're trying here is to start an integrative approach that really combines experiments at multiple levels and puts more constraints on the models, to make them more and more brain-like.
For the first set of benchmarks, we combined two neural benchmarks and one behavioral benchmark. The two neural benchmarks were high-quality neural recordings from V4 and IT, high-level areas in visual processing. And the behavioral benchmark was from humans performing a match-to-sample task.
The set of these benchmarks together is what we call Brain-Score. On the model side, we also collected widely used models from machine learning, ranging from the early AlexNet all the way to the latest and greatest ResNets and PNASNet at the time.
And then we evaluated those models on how well they could predict the neural activity in V4 and IT, and on how well they could capture human behavior at a fine-grained, image-by-image level.
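The neural part of this evaluation can be sketched roughly as follows. This is a hypothetical simplification: Brain-Score's actual metric uses PLS regression and a more elaborate cross-validation scheme, and the `features` and `neural` arrays below are random placeholders rather than real recordings.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def neural_predictivity(features, neural, n_splits=5):
    """Cross-validated correlation between predicted and held-out responses.

    features: (n_images, n_units) model activations from one layer
    neural:   (n_images, n_sites) recorded responses (e.g., V4 or IT)
    """
    scores = []
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train, test in kf.split(features):
        # Fit a linear map from model activations to neural responses
        reg = Ridge(alpha=1.0).fit(features[train], neural[train])
        pred = reg.predict(features[test])
        # One correlation per recording site, summarized by the median
        site_r = [pearsonr(pred[:, i], neural[test][:, i])[0]
                  for i in range(neural.shape[1])]
        scores.append(np.median(site_r))
    return float(np.mean(scores))

# Placeholder data only, to show the call signature
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 512))
resp = feats[:, :50] + rng.normal(scale=0.5, size=(200, 50))
print(neural_predictivity(feats, resp))
```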
JONAS KUBILIUS: So when we benchmarked all of these models on Brain-Score, we found a very robust overall correlation: models that perform better on ImageNet are also more predictive of brain responses. However, the state-of-the-art model on ImageNet is not the best model for predicting brain responses.
So it seems that if you're only optimizing for ImageNet, that strategy may no longer be sufficient to get the best models of the brain.
So when you look at the best Brain-Score models, they are doing their job: they're predicting neural and behavioral responses as we want them to. However, they have many layers, and that is quite at odds with how we tend to think about the visual system, where there is just a handful of visual areas.
The mapping between the models and the visual system becomes pretty tricky. And there is another problem: all of these models are feed-forward, while the visual system is quite recurrent, and recurrence plays an important role in how we recognize objects.
So we decided to develop a model that would be shallow and recurrent, where the recurrence compensates for the lack of depth in the model.
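As a rough PyTorch sketch of that idea (a simplified illustration, not the published CORnet architecture): each "area" is a convolutional block whose weights are reused across several time steps, so that recurrence stands in for depth.

```python
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    """One 'visual area': the same conv weights applied for `times` steps."""
    def __init__(self, in_ch, out_ch, times=2):
        super().__init__()
        self.input_conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.recur_conv = nn.Conv2d(out_ch, out_ch, 3, padding=1)  # shared across steps
        self.nonlin = nn.ReLU(inplace=True)
        self.times = times

    def forward(self, x):
        x = self.nonlin(self.input_conv(x))
        for _ in range(self.times):                  # unrolled recurrence, same weights each step
            x = self.nonlin(self.recur_conv(x) + x)  # recurrent update plus skip
        return x

# Four shallow areas mirroring the ventral stream: V1 -> V2 -> V4 -> IT
model = nn.Sequential(
    RecurrentBlock(3, 64), nn.MaxPool2d(2),
    RecurrentBlock(64, 128), nn.MaxPool2d(2),
    RecurrentBlock(128, 256), nn.MaxPool2d(2),
    RecurrentBlock(256, 512),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, 1000),
)
```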
MARTIN SCHRIMPF: Now, testing CORnet on the ImageNet benchmark, we found that it was actually very competitive with other models, especially considering its shallowness.
JONAS KUBILIUS: And we also saw that it's actually doing really well on Brain-Score, which was our target goal. Now, on top of that, we thought, well, this is a recurrent model, so how about we try to predict neural responses over time? That is something these feed-forward models could not do.
Happily enough, we found a very good correlation between these measures.
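Under the assumptions of the RecurrentBlock sketch above, time-resolved predictions could be read out like this: each unrolled step yields an activation snapshot, which can then be regressed against neural responses from the matching time bin.

```python
import torch

def unrolled_states(block, x):
    """Collect the block's activation after each recurrent step."""
    x = block.nonlin(block.input_conv(x))
    states = []
    for _ in range(block.times):
        x = block.nonlin(block.recur_conv(x) + x)
        states.append(x.detach())  # one snapshot per unrolled time step
    return states

# Placeholder batch; RecurrentBlock is the class from the sketch above
images = torch.randn(8, 3, 64, 64)
it_area = RecurrentBlock(3, 64, times=4)  # stand-in for the model's IT stage
snapshots = unrolled_states(it_area, images)
# Each snapshot can now be scored against neural data from the matching time
# bin, e.g., with the neural_predictivity() sketch earlier on this page.
```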
MARTIN SCHRIMPF: In addition to that, we also tested how well this model could transfer to another dataset, and we found that it really outperformed comparable shallow models. Now, going forward, we're trying to expand our set of integrative benchmarks even more.
So we're going to put in V1 and V2 processing, more behaviors, and so forth. And our plan is to test CORnet on all of them, along with the other models. In addition, we're opening up the Brain-Score platform for new submissions. So if you think you have the best model of image processing in the brain, please send it our way.
[MUSIC PLAYING]
Lead authors Jonas Kubilius and Martin Schrimpf discuss the challenges of measuring how closely neural networks match the brain, and present Brain-Score, a new scoring method they have developed to evaluate models of the brain’s ventral stream at scale, together with CORnet, a novel shallow recurrent network.
