In Module 1, the stream of visual information from the ventral areas needs to be planned and interpreted in the context of prior knowledge, current tasks, and memory. We will improve, experimentally and theoretically, upon a class of successful hierarchical models of the ventral stream, from HMAX to the recent deep learning models. Our focus is on understanding the ventral stream of visual cortex through a two-pronged approach based on theory and experiments. We want to:
- characterize mathematically why deep learning architectures work as well as they do, and under which conditions. The reason is that our modeling of the ventral stream of primate visual cortex is based on deep learning architectures. If we use them to explain visual cortex, we need to understand them deeply, ideally by formally characterizing their advantages and pitfalls. This understanding is also critically important for engineering and for the development of better architectures.
- improve existing models of the ventral stream by joining forces between DiCarlo and Poggio, who have developed the last two generations of such models, and by combining modeling with experiments in humans and monkeys. Hierarchical models of the ventral stream, from HMAX to the very recent deep learning models used by Yamins and DiCarlo, have described with increasing accuracy the output of IT and the tuning properties of cells in V4 and IT during brief presentations of natural images. These models predict about 50% of the response variance for briefly presented, complex images in the central 10 degrees of the visual field. There is still much to do before we can say that we understand the computations performed by the ventral stream; our new efforts aim to produce the next generation of visually intelligent models.
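
To make the "about 50% of the response variance" figure concrete, here is a minimal sketch of one common way such a score can be computed. The text does not say which metric is used, so this assumes the plain coefficient of determination (R^2) with no noise correction or cross-validation. The function name `explained_variance` and the firing rates are hypothetical, made-up toy values chosen so that the score comes out to exactly one half; they are not measured data.

```python
from statistics import fmean


def explained_variance(observed, predicted):
    """Return the fraction of variance in `observed` accounted for by `predicted`.

    This is the plain coefficient of determination, R^2 = 1 - SS_resid / SS_total.
    It is an assumed metric; the text does not specify how the score is computed.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) < 2:
        raise ValueError("need at least two responses to estimate variance")

    mean_observed = fmean(observed)
    # Total variability of the responses around their mean.
    ss_total = sum((r - mean_observed) ** 2 for r in observed)
    if ss_total == 0:
        raise ValueError("observed responses have zero variance")
    # Variability left unexplained by the model predictions.
    ss_resid = sum((r - p) ** 2 for r, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


# Toy example: made-up responses (spikes/s) of one hypothetical recording site
# to six images, alongside made-up model predictions for the same images.
observed = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
predicted = [5.0, 2.0, 9.0, 6.0, 13.0, 12.0]

score = explained_variance(observed, predicted)
print(f"fraction of response variance explained: {score:.2f}")
```

In practice, a score of this kind is usually computed on held-out images and often corrected for trial-to-trial noise in the recordings; both steps are omitted here for brevity.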