%0 Journal Article %J Developmental Science %D 2023 %T Preliminary evidence for selective cortical responses to music in one‐month‐old infants %A Heather L Kosakowski %A Norman‐Haignere, Samuel %A Mynick, Anna %A Takahashi, Atsushi %A Saxe, Rebecca %A Nancy Kanwisher %K auditory cortex %K fMRI %K infants %K music %K speech %X

Prior studies have observed selective neural responses in the adult human auditory cortex to music and speech that cannot be explained by the differing lower-level acoustic properties of these stimuli. Does infant cortex exhibit similarly selective responses to music and speech shortly after birth? To answer this question, we attempted to collect functional magnetic resonance imaging (fMRI) data from 45 sleeping infants (2.0 to 11.9 weeks old) while they listened to monophonic instrumental lullabies and infant-directed speech produced by a mother. To match acoustic variation between music and speech sounds, we (1) recorded music from instruments that had a similar spectral range as female infant-directed speech, (2) used a novel excitation-matching algorithm to match the cochleagrams of music and speech stimuli, and (3) synthesized “model-matched” stimuli that were matched in spectrotemporal modulation statistics to (yet perceptually distinct from) music or speech. Of the 36 infants from whom we collected usable data, 19 showed significant activations to sounds overall compared with scanner noise. In these infants, we observed a set of voxels in non-primary auditory cortex (NPAC), but not in Heschl’s Gyrus, that responded significantly more to music than to each of the other three stimulus types (but not significantly more strongly than to the background scanner noise). In contrast, our planned analyses did not reveal voxels in NPAC that responded more to speech than to model-matched speech, although other unplanned analyses did. These preliminary findings suggest that music selectivity arises within the first month of life.
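
The excitation-matching step lends itself to a small worked example. Below is a minimal Python sketch of the general idea (matching the average per-channel energy of one stimulus set to another); the authors' actual algorithm is not reproduced here, and the cochleagrams are random stand-in arrays.

```python
# Minimal sketch of excitation matching on cochleagrams (NOT the authors'
# algorithm, whose details are not given here). Each cochleagram is a
# (channels x time) array; we scale each frequency channel of the music set
# so its average energy matches that of the speech set. Data are random toys.
import numpy as np

rng = np.random.default_rng(0)
music = [np.abs(rng.normal(size=(40, 200))) for _ in range(10)]
speech = [1.5 * np.abs(rng.normal(size=(40, 200))) for _ in range(10)]

def mean_excitation(cochleagrams):
    # Average energy per frequency channel across a stimulus set.
    return np.mean([c.mean(axis=1) for c in cochleagrams], axis=0)

gains = mean_excitation(speech) / np.maximum(mean_excitation(music), 1e-12)
music_matched = [c * gains[:, None] for c in music]
print(np.allclose(mean_excitation(music_matched), mean_excitation(speech)))  # True
```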

%B Developmental Science %8 03/2023 %G eng %U https://onlinelibrary.wiley.com/doi/10.1111/desc.13387 %! Developmental Science %R 10.1111/desc.13387 %0 Journal Article %J Science Advances %D 2022 %T Brain-like functional specialization emerges spontaneously in deep neural networks %A Dobs, Katharina %A Julio Martinez-Trujillo %A Kell, Alexander J. E. %A Nancy Kanwisher %X

The human brain contains multiple regions with distinct, often highly specialized functions, from recognizing faces to understanding language to thinking about what others are thinking. However, it remains unclear why the cortex exhibits this high degree of functional specialization in the first place. Here, we consider the case of face perception using artificial neural networks to test the hypothesis that functional segregation of face recognition in the brain reflects a computational optimization for the broader problem of visual recognition of faces and other visual categories. We find that networks trained on object recognition perform poorly on face recognition and vice versa and that networks optimized for both tasks spontaneously segregate themselves into separate systems for faces and objects. We then show functional segregation to varying degrees for other visual categories, revealing a widespread tendency for optimization (without built-in task-specific inductive biases) to lead to functional specialization in machines and, we conjecture, also brains.
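
The segregation analysis can be made concrete with a toy lesioning experiment: ablate each hidden unit and ask which task suffers. The sketch below uses a random two-readout network with synthetic labels, not the paper's trained convolutional networks.

```python
# Toy lesioning analysis for functional segregation: ablate each hidden unit
# and measure which task's accuracy drops. The random two-readout network and
# labels below are synthetic stand-ins for the paper's trained CNNs.
import numpy as np

rng = np.random.default_rng(1)
n_hidden, n_inputs = 64, 20
W1 = rng.normal(size=(n_hidden, n_inputs))
Wa = rng.normal(size=n_hidden) * (rng.random(n_hidden) < 0.5)  # "face" readout
Wb = rng.normal(size=n_hidden) * (Wa == 0)                     # "object" readout
X = rng.normal(size=(500, n_inputs))
H = np.maximum(X @ W1.T, 0)
ya, yb = H @ Wa > 0, H @ Wb > 0                                # toy task labels

def accuracies(mask):
    h = H * mask                          # lesion = zero out selected units
    return ((h @ Wa > 0) == ya).mean(), ((h @ Wb > 0) == yb).mean()

pref = []
for u in range(n_hidden):
    mask = np.ones(n_hidden)
    mask[u] = 0.0
    acc_a, acc_b = accuracies(mask)
    pref.append((1.0 - acc_a) - (1.0 - acc_b))  # >0: unit matters more for task A
pref = np.array(pref)
print(f"A-preferring units: {(pref > 0).sum()}, B-preferring: {(pref < 0).sum()}")
```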

%B Science Advances %V 8 %8 03/2022 %G eng %U https://www.science.org/doi/10.1126/sciadv.abl8913 %N 11 %! Sci. Adv. %R 10.1126/sciadv.abl8913 %0 Journal Article %J Current Biology %D 2022 %T A neural population selective for song in human auditory cortex %A Norman-Haignere, Sam V. %A Jenelle Feather %A Boebinger, Dana %A Brunner, Peter %A Ritaccio, Anthony %A Josh H. McDermott %A Schalk, Gerwin %A Nancy Kanwisher %X

How is music represented in the brain? While neuroimaging has revealed some spatial segregation between responses to music versus other sounds, little is known about the neural code for music itself. To address this question, we developed a method to infer canonical response components of human auditory cortex using intracranial responses to natural sounds, and further used the superior coverage of fMRI to map their spatial distribution. The inferred components replicated many prior findings, including distinct neural selectivity for speech and music, but also revealed a novel component that responded nearly exclusively to music with singing. Song selectivity was not explainable by standard acoustic features, was located near speech and music-selective responses, and was also evident in individual electrodes. These results suggest that representations of music are fractionated into subpopulations selective for different types of music, one of which is specialized for the analysis of song.
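
Component inference of this kind can be illustrated with a simple matrix factorization. The sketch below applies off-the-shelf NMF to a synthetic (electrodes × sounds) matrix; the paper used its own statistical decomposition method, so this conveys only the general flavor of the analysis.

```python
# Illustration of inferring shared response components from an
# (electrodes x sounds) matrix. Plain NMF on synthetic data stands in for the
# paper's own statistical decomposition; a "song-selective" component would
# appear as a profile with high values only for sung-music sounds.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(2)
n_electrodes, n_sounds, n_components = 80, 60, 4
true_profiles = rng.random((n_components, n_sounds))   # component tuning to sounds
true_weights = rng.random((n_electrodes, n_components))
R = true_weights @ true_profiles + 0.01 * rng.random((n_electrodes, n_sounds))

model = NMF(n_components=n_components, init="nndsvda", max_iter=1000, random_state=0)
weights = model.fit_transform(R)   # per-electrode component loadings
profiles = model.components_       # per-component response profile over sounds
print(weights.shape, profiles.shape)  # (80, 4) (4, 60)
```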

%B Current Biology %8 02/2022 %G eng %U https://linkinghub.elsevier.com/retrieve/pii/S0960982222001312 %! Current Biology %R 10.1016/j.cub.2022.01.069 %0 Journal Article %J Human Brain Mapping %D 2022 %T Using child‐friendly movie stimuli to study the development of face, place, and object regions from age 3 to 12 years %A Kamps, Frederik S. %A Richardson, Hilary %A N. Apurva Ratan Murty %A Nancy Kanwisher %A Rebecca Saxe %X

Scanning young children while they watch short, engaging, commercially produced movies has emerged as a promising approach for increasing data retention and quality. Movie stimuli also evoke a richer variety of cognitive processes than traditional experiments, allowing the study of multiple aspects of brain development simultaneously. However, because these stimuli are uncontrolled, it is unclear how effectively distinct profiles of brain activity can be distinguished from the resulting data. Here we develop an approach for identifying multiple distinct subject-specific Regions of Interest (ssROIs) using fMRI data collected during movie-viewing. We focused on the test case of higher-level visual regions selective for faces, scenes, and objects. Adults (N = 13) were scanned while viewing a 5.6-min child-friendly movie, as well as a traditional localizer experiment with blocks of faces, scenes, and objects. We found that just 2.7 min of movie data could identify subject-specific face, scene, and object regions. Although successful, movie-defined ssROIs still showed weaker domain selectivity than traditional ssROIs. Having validated our approach in adults, we then used the same methods on movie data collected from 3- to 12-year-old children (N = 122). Movie response timecourses in 3-year-old children's face, scene, and object regions were already significantly and specifically predicted by timecourses from the corresponding regions in adults. We also found evidence of continued developmental change, particularly in the face-selective posterior superior temporal sulcus. Taken together, our results reveal both early maturity and functional change in face, scene, and object regions, and more broadly highlight the promise of short, child-friendly movies for developmental cognitive neuroscience.
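
The core ssROI logic can be sketched in a few lines: within a search space, keep the voxels whose movie timecourses best match the group-average timecourse of the corresponding region in held-out subjects. Everything below is synthetic and omits the registration, parcellation, and thresholding details of the real pipeline.

```python
# Sketch of the movie-defined ssROI idea: within a search space, keep the
# voxels whose movie timecourses best match the group-average timecourse of
# the corresponding region in held-out subjects. Synthetic data only.
import numpy as np

rng = np.random.default_rng(3)
n_vox, n_tr = 1000, 168                     # ~5.6-min movie at TR = 2 s
signal = rng.normal(size=n_tr)              # shared "face-region" timecourse
voxels = 0.1 * rng.normal(size=(n_vox, n_tr))
face_vox = rng.choice(n_vox, 50, replace=False)
voxels[face_vox] += signal                  # embed the signal in 50 voxels
group_tc = signal + 0.5 * rng.normal(size=n_tr)   # noisy held-out group average

def zscore(x):
    return (x - x.mean(axis=-1, keepdims=True)) / x.std(axis=-1, keepdims=True)

r = (zscore(voxels) * zscore(group_tc)).mean(axis=1)   # Pearson r per voxel
ssroi = np.argsort(r)[-50:]                            # top-50 voxels = ssROI
print("recovered face voxels:", np.isin(ssroi, face_vox).mean())
```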

%B Human Brain Mapping %8 03/2022 %G eng %U https://onlinelibrary.wiley.com/doi/10.1002/hbm.25815 %! Human Brain Mapping %R 10.1002/hbm.25815 %0 Journal Article %J Nature Communications %D 2021 %T Computational models of category-selective brain regions enable high-throughput tests of selectivity %A N. Apurva Ratan Murty %A Pouya Bashivan %A Abate, Alex %A James J. DiCarlo %A Nancy Kanwisher %X

Cortical regions apparently selective for faces, places, and bodies have provided important evidence for domain-specific theories of human cognition, development, and evolution. But claims of category selectivity are not quantitatively precise and remain vulnerable to empirical refutation. Here we develop artificial neural network-based encoding models that accurately predict the response to novel images in the fusiform face area, parahippocampal place area, and extrastriate body area, outperforming descriptive models and experts. We use these models to subject claims of category selectivity to strong tests, by screening for and synthesizing images predicted to produce high responses. We find that these high-response-predicted images are all unambiguous members of the hypothesized preferred category for each region. These results provide accurate, image-computable encoding models of each category-selective region, strengthen evidence for domain specificity in the brain, and point the way for future research characterizing the functional organization of the brain with unprecedented computational precision.
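
A minimal version of such an encoding model is a regularized regression from network features to measured responses, followed by screening a large image pool for predicted high responses. The sketch below uses random features in place of real ANN activations and fMRI data.

```python
# Minimal image-computable encoding model: ridge-regress measured responses on
# deep-network features, then screen a large pool for predicted high responses.
# Random features stand in for real ANN activations and fMRI data.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
F = rng.normal(size=(300, 512))            # features for 300 "training" images
y = F @ rng.normal(size=512) + rng.normal(size=300)   # synthetic FFA-like response

F_tr, F_te, y_tr, y_te = train_test_split(F, y, random_state=0)
enc = Ridge(alpha=10.0).fit(F_tr, y_tr)
print("held-out r: %.2f" % np.corrcoef(enc.predict(F_te), y_te)[0, 1])

F_pool = rng.normal(size=(10000, 512))     # features for a screening pool
top10 = np.argsort(enc.predict(F_pool))[-10:]   # predicted maximal images
print("predicted top-10 image indices:", top10)
```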

%B Nature Communications %V 12 %8 12/2021 %G eng %U https://www.nature.com/articles/s41467-021-25409-6 %N 1 %! Nat Commun %R 10.1038/s41467-021-25409-6 %0 Journal Article %J Proceedings of the National Academy of Sciences %D 2021 %T The neural architecture of language: Integrative modeling converges on predictive processing %A Martin Schrimpf %A Blank, Idan Asher %A Tuckute, Greta %A Kauf, Carina %A Hosseini, Eghbal A. %A Nancy Kanwisher %A Joshua B. Tenenbaum %A Fedorenko, Evelina %X

Significance

Language is a quintessentially human ability. Research has long probed the functional architecture of language in the mind and brain using diverse neuroimaging, behavioral, and computational modeling approaches. However, adequate neurally-mechanistic accounts of how meaning might be extracted from language are sorely lacking. Here, we report a first step toward addressing this gap by connecting recent artificial neural networks from machine learning to human recordings during language processing. We find that the most powerful models predict neural and behavioral responses across different datasets up to noise levels. Models that perform better at predicting the next word in a sequence also better predict brain measurements—providing computationally explicit evidence that predictive processing fundamentally shapes the language comprehension mechanisms in the brain.

Abstract

The neuroscience of perception has recently been revolutionized with an integrative modeling approach in which computation, brain function, and behavior are linked across many datasets and many computational models. By revealing trends across models, this approach yields novel insights into cognitive and neural mechanisms in the target domain. We here present a systematic study taking this approach to higher-level cognition: human language processing, our species’ signature cognitive skill. We find that the most powerful “transformer” models predict nearly 100% of explainable variance in neural responses to sentences and generalize across different datasets and imaging modalities (functional MRI and electrocorticography). Models’ neural fits (“brain score”) and fits to behavioral responses are both strongly correlated with model accuracy on the next-word prediction task (but not other language tasks). Model architecture appears to substantially contribute to neural fit. These results provide computationally explicit evidence that predictive processing fundamentally shapes the language comprehension mechanisms in the human brain.
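
The headline correlation can be caricatured end to end: compute a cross-validated "brain score" for each candidate model, then correlate it with task performance across models. In the sketch below all models, features, and responses are synthetic, with a latent "quality" variable standing in for next-word-prediction skill.

```python
# End-to-end caricature of the brain-score analysis: for each candidate model,
# compute a cross-validated fit to "neural" data, then correlate that score
# with task performance across models. Everything here is synthetic.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
w = rng.normal(size=64)

def brain_score(feats, neural):
    # Mean cross-validated R^2 of a ridge regression from features to responses.
    est = RidgeCV(alphas=np.logspace(-3, 3, 7))
    return cross_val_score(est, feats, neural, cv=5, scoring="r2").mean()

scores, task_perf = [], []
for _ in range(10):                        # 10 hypothetical candidate models
    quality = rng.random()                 # latent model quality in [0, 1]
    feats = rng.normal(size=(200, 64))
    neural = quality * feats @ w + 4.0 * rng.normal(size=200)
    scores.append(brain_score(feats, neural))
    task_perf.append(quality + 0.05 * rng.normal())
print("r(brain score, task performance): %.2f" % np.corrcoef(scores, task_perf)[0, 1])
```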

%B Proceedings of the National Academy of Sciences %V 118 %P e2105646118 %8 11/2021 %G eng %U http://www.pnas.org/lookup/doi/10.1073/pnas.2105646118 %N 45 %! Proc Natl Acad Sci USA %R 10.1073/pnas.2105646118 %0 Journal Article %J Current Biology %D 2021 %T Selective responses to faces, scenes, and bodies in the ventral visual pathway of infants %A Heather L Kosakowski %A Cohen, Michael A. %A Takahashi, Atsushi %A Keil, Boris %A Nancy Kanwisher %A Rebecca Saxe %X

Three of the most robust functional landmarks in the human brain are the selective responses to faces in the fusiform face area (FFA), scenes in the parahippocampal place area (PPA), and bodies in the extrastriate body area (EBA). Are the selective responses of these regions present early in development or do they require many years to develop? Prior evidence leaves this question unresolved. We designed a new 32-channel infant magnetic resonance imaging (MRI) coil and collected high-quality functional MRI (fMRI) data from infants (2–9 months of age) while they viewed stimuli from four conditions—faces, bodies, objects, and scenes. We find that infants have face-, scene-, and body-selective responses in the location of the adult FFA, PPA, and EBA, respectively, powerfully constraining accounts of cortical development.

%B Current Biology %V 32 %8 11/2021 %G eng %U https://www.sciencedirect.com/science/article/pii/S0960982221015086 %& 1-20 %R 10.1016/j.cub.2021.10.064 %0 Journal Article %J Cortex %D 2020 %T Response patterns in the developing social brain are organized by social and emotion features and disrupted in children diagnosed with autism spectrum disorder %A Richardson, Hilary %A Hyowon Gweon %A Dodell-Feder, David %A Malloy, Caitlin %A Pelton, Hannah %A Keil, Boris %A Nancy Kanwisher %A Rebecca Saxe %B Cortex %V 125 %P 12 - 29 %8 04/2020 %G eng %U https://www.ncbi.nlm.nih.gov/pubmed/31958654 %! Cortex %R 10.1016/j.cortex.2019.11.021 %0 Journal Article %J NeuroImage %D 2020 %T The speed of human social interaction perception %A Leyla Isik %A Mynick, Anna %A Pantazis, Dimitrios %A Nancy Kanwisher %X

The ability to perceive others’ social interactions, here defined as the directed contingent actions between two or more people, is a fundamental part of human experience that develops early in infancy and is shared with other primates. However, the neural computations underlying this ability remain largely unknown. Is social interaction recognition a rapid feedforward process or a slower post-perceptual inference? Here we used magnetoencephalography (MEG) decoding to address this question. Subjects in the MEG viewed snapshots of visually matched real-world scenes containing a pair of people who were either engaged in a social interaction or acting independently. The presence versus absence of a social interaction could be read out from subjects’ MEG data spontaneously, even while subjects performed an orthogonal task. This readout generalized across different people and scenes, revealing abstract representations of social interactions in the human brain. These representations, however, did not come online until quite late, at 300 ms after image onset, well after feedforward visual processes. In a second experiment, we found that social interaction readout still occurred at this same late latency even when subjects performed an explicit task detecting social interactions. We further showed that MEG responses distinguished between different types of social interactions (mutual gaze vs joint attention) even later, around 500 ms after image onset. Taken together, these results suggest that the human brain spontaneously extracts information about others’ social interactions, but does so slowly, likely relying on iterative top-down computations.
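
Time-resolved decoding of this sort reduces to training a classifier independently at each time point. Below is a minimal sketch on synthetic sensor data, with the signal injected late to mimic the ~300 ms onset reported here.

```python
# Time-resolved MEG decoding sketch: cross-validate a linear classifier on
# sensor patterns at each time point and ask when condition information
# becomes readable. Synthetic data with a deliberately "late" signal.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(6)
n_trials, n_sensors, n_times = 200, 50, 60
y = np.repeat([0, 1], n_trials // 2)            # interaction absent / present
X = rng.normal(size=(n_trials, n_sensors, n_times))
X[y == 1, :5, 40:] += 0.8                       # signal appears at time bin 40

acc = np.array([
    cross_val_score(LinearSVC(dual=False), X[:, :, t], y, cv=5).mean()
    for t in range(n_times)
])
print("decoding first exceeds 60%% at bin %d (signal at 40)" % np.argmax(acc > 0.6))
```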

%B NeuroImage %P 116844 %8 04/2020 %G eng %U https://www.ncbi.nlm.nih.gov/pubmed/32302763 %! NeuroImage %R 10.1016/j.neuroimage.2020.116844 %0 Conference Paper %B Computational and Systems Neurosciences %D 2020 %T Using task-optimized neural networks to understand why brains have specialized processing for faces %A Dobs, Katharina %A Alexander J. E. Kell %A Julio Martinez-Trujillo %A Michael Cohen %A Nancy Kanwisher %B Computational and Systems Neurosciences %C Denver, CO, USA %8 02/2020 %G eng %0 Conference Paper %B Conference on Cognitive Computational Neuroscience %D 2020 %T Why Are Face and Object Processing Segregated in the Human Brain? Testing Computational Hypotheses with Deep Convolutional Neural Networks %A Dobs, Katharina %A Alexander J. E. Kell %A Julio Martinez-Trujillo %A Michael Cohen %A Nancy Kanwisher %B Conference on Cognitive Computational Neuroscience %C Berlin, Germany %8 09/2020 %G eng %0 Conference Paper %B Conference on Cognitive Computational Neuroscience %D 2019 %T Are topographic deep convolutional neural networks better models of the ventral visual stream? %A K.M. Jozwik %A Lee, H. %A Nancy Kanwisher %A James J. DiCarlo %B Conference on Cognitive Computational Neuroscience %G eng %0 Journal Article %J Nature Neuroscience %D 2019 %T Divergence in the functional organization of human and macaque auditory cortex revealed by fMRI responses to harmonic tones %A Sam V Norman-Haignere %A Nancy Kanwisher %A Josh H. McDermott %A B. R. Conway %X

We report a difference between humans and macaque monkeys in the functional organization of cortical regions implicated in pitch perception. Humans but not macaques showed regions with a strong preference for harmonic sounds compared to noise, measured with both synthetic tones and macaque vocalizations. In contrast, frequency-selective tonotopic maps were similar between the two species. This species difference may be driven by the unique demands of speech and music perception in humans.

%B Nature Neuroscience %8 06/2019 %G eng %U https://www.nature.com/articles/s41593-019-0410-7 %! Nat Neurosci %R 10.1038/s41593-019-0410-7 %0 Conference Paper %B European Conference on Visual Perception %D 2019 %T Effects of Face Familiarity in Humans and Deep Neural Networks %A Dobs, Katharina %A Ian A Palmer %A Joanne Yuan %A Yalda Mohsenzadeh %A Aude Oliva %A Nancy Kanwisher %B European Conference on Visual Perception %C Leuven, Belgium %8 09/2019 %G eng %0 Journal Article %J Journal of Vision %D 2019 %T Eye movements and retinotopic tuning in developmental prosopagnosia %A M.F. Peterson %A Ian Zaun %A Hoke, Harris %A Jiahui, Guo %A Duchaine, Brad %A Nancy Kanwisher %B Journal of Vision %V 19 %P 7 %8 08/2019 %G eng %U https://www.ncbi.nlm.nih.gov/pubmed/31426085 %N 9 %! Journal of Vision %R 10.1167/19.9.7 %0 Journal Article %J Nature Communications %D 2019 %T How face perception unfolds over time %A Dobs, Katharina %A Leyla Isik %A Pantazis, Dimitrios %A Nancy Kanwisher %X

Within a fraction of a second of viewing a face, we have already determined its gender, age and identity. A full understanding of this remarkable feat will require a characterization of the computational steps it entails, along with the representations extracted at each. Here, we used magnetoencephalography (MEG) to measure the time course of neural responses to faces, thereby addressing two fundamental questions about how face processing unfolds over time. First, using representational similarity analysis, we found that facial gender and age information emerged before identity information, suggesting a coarse-to-fine processing of face dimensions. Second, identity and gender representations of familiar faces were enhanced very early on, suggesting that the behavioral benefit for familiar faces results from tuning of early feed-forward processing mechanisms. These findings start to reveal the time course of face processing in humans, and provide powerful new constraints on computational theories of face perception.
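
The time-resolved RSA logic can be sketched compactly: at each time point, build a neural dissimilarity matrix across stimuli and correlate it with a model RDM coding one face dimension. The sketch below uses synthetic patterns and a toy binary "gender" model; the real analysis compared gender, age, and identity models.

```python
# Time-resolved RSA sketch: at each time point, correlate the neural
# dissimilarity matrix (RDM) across face stimuli with a model RDM coding one
# face dimension (here a toy binary "gender" model). Patterns are synthetic.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(7)
n_faces, n_sensors, n_times = 16, 50, 40
gender = np.repeat([0, 1], n_faces // 2)
model_rdm = pdist(gender[:, None].astype(float))  # 1 where gender differs, else 0

meg = 0.5 * rng.normal(size=(n_faces, n_sensors, n_times))
meg[gender == 1, :10, 15:] += 1.0                 # gender information from bin 15

rho = np.array([
    spearmanr(pdist(meg[:, :, t]), model_rdm)[0]  # neural-vs-model RDM agreement
    for t in range(n_times)
])
print("model correlation peaks at bin:", int(np.argmax(rho)))
```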

%B Nature Communications %V 10 %8 01/2019 %G eng %U http://www.nature.com/articles/s41467-019-09239-1 %N 1 %! Nat Commun %R 10.1038/s41467-019-09239-1 %0 Journal Article %J Current Opinion in Neurobiology %D 2019 %T An integrative computational architecture for object-driven cortex %A Ilker Yildirim %A Jiajun Wu %A Nancy Kanwisher %A Joshua B. Tenenbaum %X

Objects in motion activate multiple cortical regions in every lobe of the human brain. Do these regions represent a collection of independent systems, or is there an overarching functional architecture spanning all of object-driven cortex? Inspired by recent work in artificial intelligence (AI), machine learning, and cognitive science, we consider the hypothesis that these regions can be understood as a coherent network implementing an integrative computational system that unifies the functions needed to perceive, predict, reason about, and plan with physical objects—as in the paradigmatic case of using or making tools. Our proposal draws on a modeling framework that combines multiple AI methods, including causal generative models, hybrid symbolic-continuous planning algorithms, and neural recognition networks, with object-centric, physics-based representations. We review evidence relating specific components of our proposal to the specific regions that comprise object-driven cortex, and lay out future research directions with the goal of building a complete functional and mechanistic account of this system.

%B Current Opinion in Neurobiology %V 55 %P 73 - 81 %8 01/2019 %G eng %U https://linkinghub.elsevier.com/retrieve/pii/S0959438818301995 %! Current Opinion in Neurobiology %R 10.1016/j.conb.2019.01.010 %0 Journal Article %J eLife %D 2019 %T Invariant representations of mass in the human brain %A Schwettmann, Sarah %A Joshua B. Tenenbaum %A Nancy Kanwisher %B eLife %V 8 %8 12/2019 %G eng %U https://www.ncbi.nlm.nih.gov/pubmed/31845887 %R 10.7554/eLife.46619 %0 Journal Article %J NeuroImage %D 2019 %T Representational similarity precedes category selectivity in the developing ventral visual pathway %A Cohen, Michael A. %A Dilks, Daniel D. %A Kami Koldewyn %A Weigelt, Sarah %A Jenelle Feather %A Alexander J. E. Kell %A Keil, Boris %A Fischl, Bruce %A Zöllei, Lilla %A Lawrence Wald %A Rebecca Saxe %A Nancy Kanwisher %B NeuroImage %V 197 %P 565 - 574 %8 08/2019 %G eng %U https://www.ncbi.nlm.nih.gov/pubmed/31077844 %! NeuroImage %R 10.1016/j.neuroimage.2019.05.010 %0 Conference Paper %B BioRxiv %D 2019 %T To find better neural network models of human vision, find better neural network models of primate vision %A K.M. Jozwik %A Martin Schrimpf %A Nancy Kanwisher %A James J. DiCarlo %X

Specific deep artificial neural networks (ANNs) are the current best models of ventral visual processing and object recognition behavior in monkeys. We here explore whether models of non-human primate vision generalize to visual processing in the human primate brain. Specifically, we asked if model match to monkey IT is a predictor of model match to human IT, even when scoring those matches on different images. We found that the model match to monkey IT is a positive predictor of the model match to human IT (R = 0.36), and that this approach outperforms the current standard predictor of model accuracy on ImageNet. This suggests a more powerful approach for pre-selecting models as hypotheses of human brain processing.

%B BioRxiv %G eng %U https://www.biorxiv.org/content/10.1101/688390v1.full %0 Journal Article %J NeuroImage %D 2018 %T What is changing when: decoding visual information in movies from human intracranial recordings %A Leyla Isik %A Jedediah Singer %A Nancy Kanwisher %A Madsen JR %A Anderson WS %A Gabriel Kreiman %K Electrocorticography (ECoG) %K Movies %K Natural vision %K neural decoding %K object recognition %K Ventral pathway %X

The majority of visual recognition studies have focused on the neural responses to repeated presentations of static stimuli with abrupt and well-defined onset and offset times. In contrast, natural vision involves unique renderings of visual inputs that are continuously changing without explicitly defined temporal transitions. Here we considered commercial movies as a coarse proxy to natural vision. We recorded intracranial field potential signals from 1,284 electrodes implanted in 15 patients with epilepsy while the subjects passively viewed commercial movies. We could rapidly detect large changes in the visual inputs within approximately 100 ms of their occurrence, using exclusively field potential signals from ventral visual cortical areas including the inferior temporal gyrus and inferior occipital gyrus. Furthermore, we could decode the content of those visual changes even in a single movie presentation, generalizing across the wide range of transformations present in a movie. These results present a methodological framework for studying cognition during dynamic and natural vision.
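
The change-detection step can be caricatured as threshold-crossing on smoothed broadband power. The sketch below is purely illustrative, with synthetic signals and hand-placed "cuts"; the study decoded real recordings from 1,284 electrodes and additionally read out the content of each change.

```python
# Caricature of detecting visual changes from field potentials: smooth broadband
# power, find threshold crossings, and measure latency to hand-placed "cuts".
# Signals are synthetic; the study used real intracranial recordings.
import numpy as np

rng = np.random.default_rng(8)
fs = 1000                                       # 1 kHz sampling
lfp = rng.normal(size=fs * 60)                  # 60 s of "movie" data
cuts = np.arange(5, 60, 5) * fs                 # a scene cut every 5 s
for c in cuts:
    lfp[c + 80 : c + 180] += 1.5                # evoked response ~80-180 ms post-cut

win = 100                                       # 100-ms smoothing window
power = np.convolve(lfp ** 2, np.ones(win) / win, mode="same")
thresh = power.mean() + 3 * power.std()
onsets = np.flatnonzero((power[1:] > thresh) & (power[:-1] <= thresh))

latencies = [(onsets[onsets >= c][0] - c) / fs * 1000 for c in cuts]
print("median detection latency: %.0f ms" % np.median(latencies))
```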

%B NeuroImage %V 180, Part A %P 147-159 %8 10/2018 %G eng %U https://www.sciencedirect.com/science/article/pii/S1053811917306742 %) Available online 18 August 2017 %R 10.1016/j.neuroimage.2017.08.027 %0 Journal Article %J Nature Communications %D 2017 %T Organization of high-level visual cortex in human infants %A Ben Deen %A Richardson, Hilary %A Dilks, Daniel D. %A Takahashi, Atsushi %A Keil, Boris %A Lawrence Wald %A Nancy Kanwisher %A Rebecca Saxe %X

How much of the structure of the human mind and brain is already specified at birth, and how much arises from experience? In this article, we consider the test case of extrastriate visual cortex, where a highly systematic functional organization is present in virtually every normal adult, including regions preferring behaviourally significant stimulus categories, such as faces, bodies, and scenes. Novel methods were developed to scan awake infants with fMRI, while they viewed multiple categories of visual stimuli. Here we report that the visual cortex of 4–6-month-old infants contains regions that respond preferentially to abstract categories (faces and scenes), with a spatial organization similar to adults. However, precise response profiles and patterns of activity across multiple visual categories differ between infants and adults. These results demonstrate that the large-scale organization of category preferences in visual cortex is adult-like within a few months after birth, but is subsequently refined through development.

%B Nature Communications %8 01/2017 %G eng %U http://www.nature.com/doifinder/10.1038/ncomms13995 %! Nat Comms %R 10.1038/ncomms13995 %0 Journal Article %J Proceedings of the National Academy of Sciences %D 2017 %T Perceiving social interactions in the posterior superior temporal sulcus %A Leyla Isik %A Kami Koldewyn %A David Beeler %A Nancy Kanwisher %X

Primates are highly attuned not just to social characteristics of individual agents, but also to social interactions between multiple agents. Here we report a neural correlate of the representation of social interactions in the human brain. Specifically, we observe a strong univariate response in the posterior superior temporal sulcus (pSTS) to stimuli depicting social interactions between two agents, compared with (i) pairs of agents not interacting with each other, (ii) physical interactions between inanimate objects, and (iii) individual animate agents pursuing goals and interacting with inanimate objects. We further show that this region contains information about the nature of the social interaction—specifically, whether one agent is helping or hindering the other. This sensitivity to social interactions is strongest in a specific subregion of the pSTS but extends to a lesser extent into nearby regions previously implicated in theory of mind and dynamic face perception. This sensitivity to the presence and nature of social interactions is not easily explainable in terms of low-level visual features, attention, or the animacy, actions, or goals of individual agents. This region may underlie our ability to understand the structure of our social world and navigate within it.
%B Proceedings of the National Academy of Sciences %V 114 %8 10/2017 %G eng %U http://www.pnas.org/content/early/2017/10/06/1714471114.short %N 43 %! PNAS %( PNAS October 9, 2017. 201714471; published ahead of print October 9, 2017 %R 10.1073/pnas.1714471114 %0 Journal Article %J The Journal of Neuroscience %D 2017 %T The Quest for the FFA and Where It Led %A Nancy Kanwisher %X

This article tells the story behind our first paper on the fusiform face area (FFA): how we chose the question, developed the methods, and followed the data to find the FFA and subsequently many other functionally specialized cortical regions. The paper's impact had less to do with the particular findings in the paper itself and more to do with the method that it promoted and the picture of the human mind and brain that it led to. The use of a functional localizer to define a candidate region in each subject individually enabled us not just to make pictures of brain activation, but also to ask principled, hypothesis-driven questions about a thing in nature. This method enabled stronger and more extensive tests of the function of each cortical region than had been possible before in humans and, as a result, has produced a large body of evidence that the human cortex contains numerous regions that are specifically engaged in particular mental processes. The growing inventory of cortical regions with distinctive and often very specific functions can be seen as an initial sketch of the basic components of the human mind. This sketch also serves as a roadmap into the vast and exciting new landscape of questions about the computations, structural connections, time course, development, plasticity, and evolution of each of these regions, as well as the hardest question of all: how do these regions work together to produce human intelligence?

%B The Journal of Neuroscience %V 37 %P 1056 - 1061 %8 02/2017 %G eng %U http://www.jneurosci.org/lookup/doi/10.1523/JNEUROSCI.1706-16.2016 %N 5 %! J. Neurosci. %R 10.1523/JNEUROSCI.1706-16.2016 %0 Journal Article %J Journal of Neuroscience %D 2016 %T Color-Biased Regions of the Ventral Visual Pathway Lie between Face- and Place-Selective Regions in Humans, as in Macaques %A R. Lafer-Sousa %A B. R. Conway %A Nancy Kanwisher %X

The existence of color-processing regions in the human ventral visual pathway (VVP) has long been known from patient and imaging studies, but their location in the cortex relative to other regions, their selectivity for color compared with other properties (shape and object category), and their relationship to color-processing regions found in nonhuman primates remain unclear. We addressed these questions by scanning 13 subjects with fMRI while they viewed two versions of movie clips (colored, achromatic) of five different object classes (faces, scenes, bodies, objects, scrambled objects). We identified regions in each subject that were selective for color, faces, places, and object shape, and measured responses within these regions to the 10 conditions in independently acquired data. We report two key findings. First, the three previously reported color-biased regions (located within a band running posterior–anterior along the VVP, present in most of our subjects) were sandwiched between face-selective cortex and place-selective cortex, forming parallel bands of face, color, and place selectivity that tracked the fusiform gyrus/collateral sulcus. Second, the posterior color-biased regions showed little or no selectivity for object shape or for particular stimulus categories and showed no interaction of color preference with stimulus category, suggesting that they code color independently of shape or stimulus category; moreover, the shape-biased lateral occipital region showed no significant color bias. These observations mirror results in macaque inferior temporal cortex (Lafer-Sousa and Conway, 2013), and taken together, these results suggest a homology in which the entire tripartite face/color/place system of primates migrated onto the ventral surface in humans over the course of evolution.

SIGNIFICANCE STATEMENT Here we report that color-biased cortex is sandwiched between face-selective and place-selective cortex on the bottom surface of the brain in humans. This face/color/place organization mirrors that seen on the lateral surface of the temporal lobe in macaques, suggesting that the entire tripartite system is homologous between species. This result validates the use of macaques as a model for human vision, making possible more powerful investigations into the connectivity, precise neural codes, and development of this part of the brain. In addition, we find substantial segregation of color from shape selectivity in posterior regions, as observed in macaques, indicating a considerable dissociation of the processing of shape and color in both species.

%B Journal of Neuroscience %V 36 %P 1682 - 1697 %8 02/2016 %G eng %U http://www.jneurosci.org/cgi/doi/10.1523/JNEUROSCI.3164-15.2016 %N 5 %! Journal of Neuroscience %R 10.1523/JNEUROSCI.3164-15.2016 %0 Journal Article %J Proceedings of the National Academy of Sciences %D 2016 %T Functional neuroanatomy of intuitive physical inference %A Fischer, Jason %A Mikhael, John G. %A Joshua B. Tenenbaum %A Nancy Kanwisher %X

To engage with the world—to understand the scene in front of us, plan actions, and predict what will happen next—we must have an intuitive grasp of the world’s physical structure and dynamics. How do the objects in front of us rest on and support each other, how much force would be required to move them, and how will they behave when they fall, roll, or collide? Despite the centrality of physical inferences in daily life, little is known about the brain mechanisms recruited to interpret the physical structure of a scene and predict how physical events will unfold. Here, in a series of fMRI experiments, we identified a set of cortical regions that are selectively engaged when people watch and predict the unfolding of physical events—a “physics engine” in the brain. These brain regions are selective to physical inferences relative to nonphysical but otherwise highly similar scenes and tasks. However, these regions are not exclusively engaged in physical inferences per se or, indeed, even in scene understanding; they overlap with the domain-general “multiple demand” system, especially the parts of that system involved in action planning and tool use, pointing to a close relationship between the cognitive and neural mechanisms involved in parsing the physical content of a scene and preparing an appropriate action.

%B Proceedings of the National Academy of Sciences %V 113 %P E5072 - E5081 %8 06/2016 %G eng %U http://www.pnas.org/lookup/doi/10.1073/pnas.1610344113 %N 34 %! Proc Natl Acad Sci USA %R 10.1073/pnas.1610344113 %0 Journal Article %J Journal of Vision %D 2016 %T Individual Differences in Face Looking Behavior Generalize from the Lab to the World %A M.F. Peterson %A Jing Lin %A Ian Zaun %A Nancy Kanwisher %X

Recent laboratory studies have found large, stable individual differences in the location people first fixate when identifying faces, ranging from the brows to the mouth. Importantly, this variation is strongly associated with differences in fixation-specific identification performance such that an individual’s recognition ability is maximized when looking at their preferred location (Mehoudar, Arizpe, Baker, & Yovel, 2014; Peterson & Eckstein, 2013). This finding suggests that face representations are retinotopic and individuals enact gaze strategies that optimize identification, yet the extent to which this behavior reflects real-world gaze behavior is unknown. Here, we used mobile eye-trackers to test whether individual differences in face-gaze generalize from lab to real-world vision. In-lab fixations were measured with a speeded face identification task, while real-world behavior was measured as subjects freely walked around the MIT campus. We found a strong correlation between the patterns of individual differences in face-gaze in the laboratory and real-world settings. Our findings support the hypothesis that individuals optimize real-world face identification by consistently fixating the same location and thus strongly constraining the space of retinotopic input. The methods developed for this study entailed collecting a large set of high-definition, wide field-of-view natural videos from head-mounted cameras, along with the viewer’s fixation position, allowing us to characterize subjects’ actually experienced real-world retinotopic images. These images enable us to ask how vision is optimized not just for the statistics of the “natural images” found in web databases, but also for those of the truly natural, retinotopic images that have landed on actual human retinae during real-world experience.

%B Journal of Vision %V 16 %8 05/2016 %G eng %U http://jov.arvojournals.org/article.aspx?articleid=2524135&resultClick=1 %N 7 %& 12 %R 10.1167/16.7.12 %0 Generic %D 2016 %T Measuring and modeling the perception of natural and unconstrained gaze in humans and machines %A Daniel Harari %A Tao Gao %A Nancy Kanwisher %A Joshua B. Tenenbaum %A Shimon Ullman %K computational evaluation %K computational modeling %K Computer vision %K empirical evaluation %K estimation of gaze direction %K Gaze perception %K joint attention %K Machine Learning %X

Humans are remarkably adept at interpreting the gaze direction of other individuals in their surroundings. This skill is at the core of the ability to engage in joint visual attention, which is essential for establishing social interactions. How accurate are humans in determining the gaze direction of others in lifelike scenes, when they can move their heads and eyes freely, and what are the sources of information for the underlying perceptual processes? These questions pose a challenge from both empirical and computational perspectives, due to the complexity of the visual input in real-life situations. Here we measure empirically human accuracy in perceiving the gaze direction of others in lifelike scenes, and study computationally the sources of information and representations underlying this cognitive capacity. We show that humans perform better in face-to-face conditions compared with recorded conditions, and that this advantage is not due to the availability of input dynamics. We further show that humans still perform well when only the eye region is visible, rather than the whole face. We develop a computational model that replicates the pattern of human performance, including the finding that the eye region contains, on its own, the information required for estimating both head orientation and direction of gaze. Consistent with neurophysiological findings on task-specific face regions in the brain, the learned computational representations reproduce perceptual effects such as the Wollaston illusion when trained to estimate direction of gaze, but not when trained to recognize objects or faces.
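
The eyes-only result suggests a simple computational comparison: fit one gaze regressor on whole-face inputs and one on the eye region alone, and compare errors. The sketch below uses synthetic pixel features, not the paper's videos or trained model.

```python
# Toy version of the "eye region suffices" comparison: fit one gaze regressor
# on whole-face inputs and one on the eye region alone, and compare errors.
# Inputs are synthetic pixel features, not the paper's videos or model.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(9)
n = 400
gaze = rng.uniform(-30, 30, size=n)                 # gaze azimuth (degrees)
eyes = np.outer(gaze, rng.normal(size=100)) + rng.normal(size=(n, 100))
rest = rng.normal(size=(n, 300))                    # uninformative non-eye pixels
face = np.hstack([eyes, rest])

for name, X in [("whole face", face), ("eyes only", eyes)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, gaze, random_state=0)
    err = np.abs(Ridge(alpha=1.0).fit(X_tr, y_tr).predict(X_te) - y_te).mean()
    print("%s: mean absolute error %.1f deg" % (name, err))
```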

%8 11/2016 %1 arXiv:1611.09819 %2 http://hdl.handle.net/1721.1/105477 %0 Journal Article %J Current Biology %D 2016 %T Neural Representations Integrate the Current Field of View with the Remembered 360° Panorama %A Robertson, Caroline E. %A Katherine Hermann %A Mynick, Anna %A Kravitz, Dwight J. %A Nancy Kanwisher %X

We experience our visual environment as a seamless, immersive panorama. Yet, each view is discrete and fleeting, separated by expansive eye movements and discontinuous views of our spatial surroundings. How are discrete views of a panoramic environment knit together into a broad, unified memory representation? Regions of the brain’s “scene network” are well poised to integrate retinal input and memory [1]: they are visually driven [2, 3] but also densely interconnected with memory structures in the medial temporal lobe [4]. Further, these regions harbor memory signals relevant for navigation [5–8] and adapt across overlapping shifts in scene viewpoint [9, 10]. However, it is unknown whether regions of the scene network support visual memory for the panoramic environment outside of the current field of view and, further, how memory for the surrounding environment influences ongoing perception. Here, we demonstrate that specific regions of the scene network—the retrosplenial complex (RSC) and occipital place area (OPA)—unite discrete views of a 360° panoramic environment, both current and out of sight, in a common representational space. Further, individual scene views prime associated representations of the panoramic environment in behavior, facilitating subsequent perceptual judgments. We propose that this dynamic interplay between memory and perception plays an important role in weaving the fabric of continuous visual experience.

%B Current Biology %8 09/2016 %G eng %U http://www.cell.com/current-biology/abstract/S0960-9822(16)30753-9 %R 10.1016/j.cub.2016.07.002 %0 Journal Article %J NeuroImage %D 2016 %T The occipital place area represents the local elements of scenes %A Kamps, Frederik S. %A Julian, Joshua B. %A Jonas Kubilius %A Nancy Kanwisher %A Dilks, Daniel D. %X

Neuroimaging studies have identified three scene-selective regions in human cortex: parahippocampal place area (PPA), retrosplenial complex (RSC), and occipital place area (OPA). However, precisely what scene information each region represents is not clear, especially for the least studied, more posterior OPA. Here we hypothesized that OPA represents local elements of scenes within two independent, yet complementary scene descriptors: spatial boundary (i.e., the layout of external surfaces) and scene content (e.g., internal objects). If OPA processes the local elements of spatial boundary information, then it should respond to these local elements (e.g., walls) themselves, regardless of their spatial arrangement. Indeed, we found that OPA, but not PPA or RSC, responded similarly to images of intact rooms and these same rooms in which the surfaces were fractured and rearranged, disrupting the spatial boundary. Next, if OPA represents the local elements of scene content information, then it should respond more when more such local elements (e.g., furniture) are present. Indeed, we found that OPA, but not PPA or RSC, responded more to multiple than single pieces of furniture. Taken together, these findings reveal that OPA analyzes local scene elements, in both spatial boundary and scene content representation, while PPA and RSC represent global scene properties.

%B NeuroImage %V 132 %P 417 - 424 %8 02/2016 %G eng %U https://www.ncbi.nlm.nih.gov/pubmed/26931815 %! NeuroImage %R 10.1016/j.neuroimage.2016.02.062 %0 Journal Article %J Cerebral Cortex %D 2015 %T Functional organization of social perception and cognition in the superior temporal sulcus %A Ben Deen %A Kami Koldewyn %A Nancy Kanwisher %A Rebecca Saxe %X

The superior temporal sulcus (STS) is considered a hub for social perception and cognition, including the perception of faces and human motion, as well as understanding others’ actions, mental states, and language. However, the functional organization of the STS remains debated: Is this broad region composed of multiple functionally distinct modules, each specialized for a different process, or are STS subregions multifunctional, contributing to multiple processes? Is the STS spatially organized, and if so, what are the dominant features of this organization? We address these questions by measuring STS responses to a range of social and linguistic stimuli in the same set of human participants, using fMRI. We find a number of STS subregions that respond selectively to certain types of social input, organized along a posterior-to-anterior axis. We also identify regions of overlapping response to multiple contrasts, including regions responsive to both language and theory of mind, faces and voices, and faces and biological motion. Thus, the human STS contains both relatively domain-specific areas, and regions that respond to multiple types of social information.

%B Cerebral Cortex %V 25 %P 4596-4609 %8 11/2015 %G eng %U http://cercor.oxfordjournals.org/content/25/11/4596.full %N 11 %R 10.1093/cercor/bhv111 %0 Generic %D 2015 %T Functional organization of the human superior temporal sulcus %A Ben Deen %A Nancy Kanwisher %A Rebecca Saxe %X

The human superior temporal sulcus (STS) has been implicated in a broad range of social perceptual and cognitive processes, including the perception of faces, biological motion, and vocal sounds, and the understanding of language and mental states. However, little is known about the overall functional organization of these responses. Does the STS contain distinct, specialized regions for processing different types of social information? Or is cortex in the STS largely multifunctional, with each region engaged in multiple different computations (Hein, 2008)? Because prior work has largely studied these processes independently, this question remains unanswered. Here, we first identify distinct functional subregions of the STS, and then examine their response to a broad range of social stimuli.

%B Organization for Human Brain Mapping (OHBM 2015) %C Honolulu, HI %8 6/2015 %U https://ww4.aievolution.com/hbm1501/index.cfm?do=abs.viewAbs&abs=3635 %0 Generic %D 2014 %T Exploring the functional organization of the superior temporal sulcus with a broad set of naturalistic stimuli %A Ben Deen %A Nancy Kanwisher %A Rebecca Saxe