The goal of the visual recognition system is to extract abstract features from the retinal image that correlate with real-world objects. The visual system achieves this goal by tuning, layer by layer, the synaptic weights that define neurons’ receptive fields. One essential feature of the resulting network is that it must be selective but simultaneously invariant: the same feature can appear at different retinal positions, sizes, or rotation angles, or under partial occlusion. We know that neurons in the adult macaque cortex show selectivity and invariance, but little is known about how they acquire these properties during early development. This is a conceptual gap with important implications for models of visual recognition. In particular, a theory developed over the last two years at CBMM, called i-theory, makes predictions about the neural circuits underlying different types of invariance. In the specific case of face recognition, the theory is consistent with a number of properties of the face patches in the visual cortex of the macaque monkey, such as the mirror-symmetric tuning of neurons in AL. So far, however, it has been impossible to test the theoretical predictions about the development of invariance in visual neurons.
We propose to combine computational models and neurophysiological recordings to define how selectivity and invariance develop in infant and young adult macaques reared under either normal conditions or modified visual environments. We have a well-established research program that studies the macaque inferotemporal cortex using electrophysiology and functional imaging. Representing the category of faces is a prototypical function of the primate occipito-temporal cortex, which contains specific cortical regions (“domains”) selective to faces. We have hand-raised neonate macaques deprived of exposure to faces for their first year of life. After face-depriving one young macaque from birth to 1 year of age, we found that the animal’s expected “face patch” (based on anatomical location) showed stronger activation to hands than to faces, in contrast to normally reared monkeys. This imaging result raises multiple questions about the cellular response properties in this altered face-patch region.
Specific Aim 1: Development of neuronal selectivity in normal and deprived face patch locations. Normally reared monkeys have a middle face patch composed of >90% strongly face-tuned cells, with single-unit tuning measured by a face selectivity index (FSI, Fig. 1a). In contrast, face-deprived monkeys show stronger fMRI activation to hands than to faces. What is the electrophysiology behind this modified selectivity? It is possible that this patch contains a lower percentage of units with high FSI values (replaced by hand cells, with a correspondingly high hand selectivity index, or HSI, Fig. 1b, d). Alternatively, like most non-patch IT cortex, this patch may contain a mix of cells with low-to-mid-level FSI and HSI values (Fig. 1c). We will address this question by recording neuronal responses inside and outside face domain locations in normally reared and face-deprived animals. We will compare these neurophysiological recordings with state-of-the-art computational models of visual recognition. This comparison is important because it will tell us whether cortical patch locations allow the implementation of multiple “expert” subpopulations or whether patches instead become more like regular IT cortex.
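The selectivity indices referred to in Aim 1 can be illustrated with a minimal sketch. It assumes the common contrast form (preferred minus other, divided by their sum) applied to non-negative mean firing rates; the text above does not state the exact formula, so the function names, the example rates, and the threshold of 1/3 (a 2:1 response ratio) are illustrative assumptions rather than the analysis planned here.

```python
import numpy as np

def selectivity_index(pref_rates, other_rates):
    """Contrast index (pref - other) / (pref + other) on mean firing rates.

    Assumption: rates are non-negative (e.g. spikes/s), so the index lies
    in [-1, 1]. With faces as `pref_rates` this plays the role of an FSI;
    with hands as `pref_rates`, an HSI.
    """
    pref = float(np.mean(pref_rates))
    other = float(np.mean(other_rates))
    denom = pref + other
    if denom == 0.0:
        return 0.0  # unresponsive unit: treat as non-selective
    return (pref - other) / denom

def fraction_selective(indices, threshold=1.0 / 3.0):
    """Fraction of units whose index meets a (hypothetical) threshold."""
    indices = np.asarray(indices, dtype=float)
    return float(np.mean(indices >= threshold))

# Hypothetical mean rates (spikes/s) for three units: (faces, hands)
units = [([30.0, 34.0], [8.0, 8.0]),    # face-preferring
         ([6.0, 6.0], [24.0, 24.0]),    # hand-preferring
         ([12.0, 12.0], [10.0, 10.0])]  # weakly tuned

fsi = [selectivity_index(f, h) for f, h in units]
hsi = [selectivity_index(h, f) for f, h in units]
print("FSI:", np.round(fsi, 2), "fraction face-selective:", fraction_selective(fsi))
print("HSI:", np.round(hsi, 2), "fraction hand-selective:", fraction_selective(hsi))
```

Under these assumptions, the two hypotheses in Aim 1 differ in the joint distribution of the two indices: a patch of "expert" subpopulations would show units with high FSI or high HSI, whereas a patch resembling regular IT cortex would show mostly low-to-mid values of both.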
Figure 1. a) Modeled tuning responses in the normal middle face patch (after Tsao et al. 2006). b-d) Hypothetical cell response profiles in abnormal face patch: b) mix of strongly tuned face and hand cells; c) mix of weakly tuned face and
Specific Aim 2: Development of generic and category-specific invariance in normal and deprived face patch locations. In normal animals, face-selective IT neurons show invariance to different types of transformations, such as translation, size and viewpoint. Some of these transformations, regarded as image-plane transformations, are “generic”, that is, common to all objects (position and size changes), while other transformations (viewpoint changes) depend on the 3D structure of the class of objects. A prediction of i-theory is that units in abnormal face patches will still show invariance for generic transformations but disproportionately reduced invariance for 3D viewpoint changes. It has been conjectured (Leibo et al., “The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex”, 2015, PLOS Computational Biology) that modules in cortex with tuning for specific object classes (such as faces or bodies) exist because of the evolutionary need for invariance to object-specific transformations after seeing a single image of a new object, such as a new face. Our experiment will provide crucial evidence to falsify or confirm these theories, thereby providing constraints on computational models of visual recognition.
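The idea of invariance to a generic transformation can be illustrated with a toy sketch of pooling over stored, transformed templates, in the spirit of i-theory. Everything here is an assumption made for illustration: 1-D signals stand in for images, cyclic shifts stand in for translation, and the function names, random templates, and max/mean pooling are hypothetical choices, not the models to be compared with the recordings.

```python
import numpy as np

def orbit(template):
    """All cyclic shifts of a 1-D template (its orbit under translation)."""
    return np.stack([np.roll(template, s) for s in range(template.size)])

def signature(image, templates):
    """Pool dot products of the image with each template's shifted copies.

    Shifting the image only permutes the set of dot products for a given
    template, so pooled statistics (here max and mean) do not change.
    """
    feats = []
    for t in templates:
        responses = orbit(t) @ image  # one response per shifted template
        feats.extend([responses.max(), responses.mean()])
    return np.array(feats)

rng = np.random.default_rng(0)
templates = rng.standard_normal((3, 16))  # arbitrary stored templates
image = rng.standard_normal(16)
other_image = rng.standard_normal(16)

sig_original = signature(image, templates)
sig_shifted = signature(np.roll(image, 5), templates)
sig_other = signature(other_image, templates)

print("invariant to shift:", np.allclose(sig_original, sig_shifted))
print("still selective:   ", not np.allclose(sig_original, sig_other))
```

In this toy case the templates can be arbitrary signals because a shift acts the same way on every input, which mirrors the "generic" case described above. Viewpoint changes do not act on the image in this way, which is why the text above ties invariance to them to the 3D structure of the object class.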
In summary, by raising baby macaques without face exposure, we have created a unique model that will allow us to define the role of experience in developing selectivity and invariance and to link the neural circuit machinery to current theories of learning invariance in the domain of visual recognition.
Figure 2. Class-specific pose invariance for a single image of a new face presented at 0° (frontal). Model simulations (red and green curves expected for deprived monkeys, blue curve for monkey exposed to other faces during development).