Efficient inverse graphics in biological face processing
Year of Publication
Yildirim, I, Freiwald, WA, J., T
The visual system must not only recognize and localize objects, but perform much richer inferences about the underlying causes in the world that give rise to observed sense data. Analyzing scenes by inverting causal generative models, also known as "analysis-by-synthesis", has a long history in computational vision, and these models have some behavioral support, but they are typically too slow to support online perception and have no known mapping to actual neural circuits. Here we present a neurally plausible model for efficiently inverting generative models of images and test it as a precise account of one aspect of high-level vision, the perception of faces. The model is based on a deep neural network that learns to invert a three-dimensional (3D) face graphics program in a single fast feedforward pass. It successfully explains both human behavioral data and multiple levels of neural processing in non-human primates, as well as a classic illusion, the "hollow face" effect. The model also fits qualitatively better than state-of-the-art computer vision models, and suggests an interpretable reverse-engineering account of how images are transformed into scene percepts in the primate ventral stream.
- CBMM Funded