%0 Conference Proceedings %B Neural Information Processing Systems (NeurIPS 2019) %D 2019 %T Untangling in Invariant Speech Recognition %A Cory Stephenson %A Jenelle Feather %A Suchismita Padhy %A Oguz Elibol %A Hanlin Tang %A Josh H. McDermott %A SueYeon Chung %X

Encouraged by the success of deep convolutional neural networks on a variety of visual tasks, much theoretical and experimental work has been aimed at understanding and interpreting how vision networks operate. At the same time, deep neural networks have also achieved impressive performance in audio processing applications, both as sub-components of larger systems and as complete end-to-end systems by themselves. Despite their empirical successes, comparatively little is understood about how these audio models accomplish these tasks. In this work, we employ a recently developed statistical mechanical theory that connects geometric properties of network representations and the separability of classes to probe how information is untangled within neural networks trained to recognize speech. We observe that speaker-specific nuisance variations are discarded by the network's hierarchy, whereas task-relevant properties such as words and phonemes are untangled in later layers. Higher level concepts such as parts-of-speech and context dependence also emerge in the later layers of the network. Finally, we find that the deep representations carry out significant temporal untangling by efficiently extracting task-relevant features at each time step of the computation. Taken together, these findings shed light on how deep auditory models process their time-dependent input signals to carry out invariant speech recognition, and show how different concepts emerge through the layers of the network.

%B Neural Information Processing Systems (NeurIPS 2019) %C Vancouver, Canada %8 11/2019 %G eng %0 Journal Article %J Proceedings of the National Academy of Sciences %D 2018 %T Recurrent computations for visual pattern completion %A Hanlin Tang %A Martin Schrimpf %A William Lotter %A Charlotte Moerman %A Ana Paredes %A Josue Ortega Caro %A Walter Hardesty %A David Cox %A Gabriel Kreiman %K Artificial Intelligence %K computational neuroscience %K Machine Learning %K pattern completion %K Visual object recognition %X

Making inferences from partial information constitutes a critical aspect of cognition. During visual perception, pattern completion enables recognition of poorly visible or occluded objects. We combined psychophysics, physiology, and computational models to test the hypothesis that pattern completion is implemented by recurrent computations and present three pieces of evidence that are consistent with this hypothesis. First, subjects robustly recognized objects even when they were rendered <15% visible, but recognition was largely impaired when processing was interrupted by backward masking. Second, invasive physiological responses along the human ventral cortex exhibited visually selective responses to partially visible objects that were delayed compared with whole objects, suggesting the need for additional computations. These physiological delays were correlated with the effects of backward masking. Third, state-of-the-art feed-forward computational architectures were not robust to partial visibility. However, recognition performance was recovered when the model was augmented with attractor-based recurrent connectivity. The recurrent model was able to predict which images of heavily occluded objects were easier or harder for humans to recognize, could capture the effect of introducing a backward mask on recognition behavior, and was consistent with the physiological delays along the human ventral visual stream. These results provide a strong argument of plausibility for the role of recurrent computations in making visual inferences from partial information.

%B Proceedings of the National Academy of Sciences %8 08/2018 %G eng %U http://www.pnas.org/lookup/doi/10.1073/pnas.1719397115 %! Proc Natl Acad Sci USA %R 10.1073/pnas.1719397115 %0 Book Section %B Computational and Cognitive Neuroscience of Vision %D 2017 %T Recognition of occluded objects %A Hanlin Tang %A Gabriel Kreiman %A Qi Zhao %B Computational and Cognitive Neuroscience of Vision %I Springer Singapore %G eng %U http://www.springer.com/us/book/9789811002113 %0 Generic %D 2016 %T Cascade of neural processing orchestrates cognitive control in human frontal cortex [code] %A Hanlin Tang %A Hsiang-Yu Yu %A Chien-Chen Chou %A Crone, Nathan E. %A Joseph Madsen %A WS Anderson %A Gabriel Kreiman %X

Code and data used to create the figures of Tang et al. (2016). The results from this work show that a dynamic and hierarchical sequence of steps in human frontal cortex orchestrates cognitive control.

Used in conjunction with this mirrored CBMM Dataset entry

%I eLife %U http://klab.tch.harvard.edu/resources/tangetal_stroop_2016.html %0 Generic %D 2016 %T Cascade of neural processing orchestrates cognitive control in human frontal cortex [dataset] %A Hanlin Tang %A Hsiang-Yu Yu %A Chien-Chen Chou %A Crone, Nathan E. %A Joseph Madsen %A WS Anderson %A Gabriel Kreiman %X

Code and data used to create the figures of Tang et al. (2016). The results from this work show that a dynamic and hierarchical sequence of steps in human frontal cortex orchestrates cognitive control.

Used in conjunction with this mirrored CBMM Code entry

%I eLife %U http://klab.tch.harvard.edu/resources/tangetal_stroop_2016.html %0 Journal Article %J eLife %D 2016 %T Cascade of neural processing orchestrates cognitive control in human frontal cortex %A Hanlin Tang %A Hsiang-Yu Yu %A Chien-Chen Chou %A NE Crone %A Joseph Madsen %A WS Anderson %A Gabriel Kreiman %X
Rapid and flexible interpretation of conflicting sensory inputs in the context of current goals is a critical component of cognitive control that is orchestrated by frontal cortex. The relative roles of distinct subregions within frontal cortex are poorly understood. To examine the dynamics underlying cognitive control across frontal regions, we took advantage of the spatiotemporal resolution of intracranial recordings in epilepsy patients while subjects resolved color-word conflict. We observed differential activity preceding the behavioral responses to conflict trials throughout frontal cortex; this activity was correlated with behavioral reaction times. These signals emerged first in anterior cingulate cortex (ACC) before dorsolateral prefrontal cortex (dlPFC), followed by medial frontal cortex (mFC) and then by orbitofrontal cortex (OFC). These results dissociate the frontal subregions based on their dynamics, and suggest a temporal hierarchy for cognitive control in human cortex.
%B eLife %8 02/2016 %G eng %U http://dx.doi.org/10.7554/eLife.12352 %R 10.7554/eLife.12352 %0 Conference Proceedings %B 2016 Annual Conference on Information Science and Systems (CISS) %D 2016 %T A machine learning approach to predict episodic memory formation %A Hanlin Tang %A Jedediah Singer %A Matias J. Ison %A Gnel Pivazyan %A Melissa Romaine %A Elizabeth Meller %A Victoria Perron %A Marlise Arellano %A Gabriel Kreiman %A Adrianna Boulin %A Rosa Frias %A James Carroll %A Sarah Dowcett %B 2016 Annual Conference on Information Science and Systems (CISS) %C Princeton, NJ %P 539 - 544 %G eng %U http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7460560&newsearch=true&queryText=A%20machine%20learning%20approach%20to%20predict%20episodic%20memory%20formation %R 10.1109/CISS.2016.7460560 %0 Journal Article %J Scientific Reports %D 2016 %T Predicting episodic memory formation for movie events %A Hanlin Tang %A Jedediah Singer %A Matias J. Ison %A Gnel Pivazyan %A Melissa Romaine %A Rosa Frias %A Elizabeth Meller %A Adrianna Boulin %A James Carroll %A Victoria Perron %A Sarah Dowcett %A Marlise Arellano %A Gabriel Kreiman %X

Episodic memories are long lasting and full of detail, yet imperfect and malleable. We quantitatively evaluated recollection of short audiovisual segments from movies as a proxy to real-life memory formation in 161 subjects at 15 minutes up to a year after encoding. Memories were reproducible within and across individuals, showed the typical decay with time elapsed between encoding and testing, were fallible yet accurate, and were insensitive to low-level stimulus manipulations but sensitive to high-level stimulus properties. Remarkably, memorability was also high for single movie frames, even one year post-encoding. To evaluate what determines the efficacy of long-term memory formation, we developed an extensive set of content annotations that included actions, emotional valence, visual cues and auditory cues. These annotations enabled us to document the content properties that showed a stronger correlation with recognition memory and to build a machine-learning computational model that accounted for episodic memory formation in single events for group averages and individual subjects with an accuracy of up to 80%. These results provide initial steps towards the development of a quantitative computational theory capable of explaining the subjective filtering steps that lead to how humans learn and consolidate memories.

%B Scientific Reports %8 10/2016 %G eng %U http://www.nature.com/articles/srep30175 %N 1 %! Sci Rep %R 10.1038/srep30175 %0 Generic %D 2016 %T Predicting episodic memory formation for movie events [code] %A Hanlin Tang %A Jedediah Singer %A Matias J. Ison %A Gnel Pivazyan %A Melissa Romaine %A Rosa Frias %A Elizabeth Meller %A Adrianna Boulin %A James Carroll %A Victoria Perron %A Sarah Dowcett %A Marlise Arellano %A Gabriel Kreiman %X

Episodic memories are long lasting and full of detail, yet imperfect and malleable. We quantitatively evaluated recollection of short audiovisual segments from movies as a proxy to real-life memory formation in 161 subjects at 15 minutes up to a year after encoding. Memories were reproducible within and across individuals, showed the typical decay with time elapsed between encoding and testing, were fallible yet accurate, and were insensitive to low-level stimulus manipulations but sensitive to high-level stimulus properties. Remarkably, memorability was also high for single movie frames, even one year post-encoding. To evaluate what determines the efficacy of long-term memory formation, we developed an extensive set of content annotations that included actions, emotional valence, visual cues and auditory cues. These annotations enabled us to document the content properties that showed a stronger correlation with recognition memory and to build a machine-learning computational model that accounted for episodic memory formation in single events for group averages and individual subjects with an accuracy of up to 80%. These results provide initial steps towards the development of a quantitative computational theory capable of explaining the subjective filtering steps that lead to how humans learn and consolidate memories.


To view more information and download datasets, etc., please visit the project website - http://klab.tch.harvard.edu/resources/Tangetal_episodicmemory_2016.html


The corresponding publication can be found here.


The corresponding dataset entry can be found here.


%0 Generic %D 2016 %T Predicting episodic memory formation for movie events [dataset] %A Hanlin Tang %A Jedediah Singer %A Matias J. Ison %A Gnel Pivazyan %A Melissa Romaine %A Rosa Frias %A Elizabeth Meller %A Adrianna Boulin %A James Carroll %A Victoria Perron %A Sarah Dowcett %A Marlise Arellano %A Gabriel Kreiman %X

Episodic memories are long lasting and full of detail, yet imperfect and malleable. We quantitatively evaluated recollection of short audiovisual segments from movies as a proxy to real-life memory formation in 161 subjects at 15 minutes up to a year after encoding. Memories were reproducible within and across individuals, showed the typical decay with time elapsed between encoding and testing, were fallible yet accurate, and were insensitive to low-level stimulus manipulations but sensitive to high-level stimulus properties. Remarkably, memorability was also high for single movie frames, even one year post-encoding. To evaluate what determines the efficacy of long-term memory formation, we developed an extensive set of content annotations that included actions, emotional valence, visual cues and auditory cues. These annotations enabled us to document the content properties that showed a stronger correlation with recognition memory and to build a machine-learning computational model that accounted for episodic memory formation in single events for group averages and individual subjects with an accuracy of up to 80%. These results provide initial steps towards the development of a quantitative computational theory capable of explaining the subjective filtering steps that lead to how humans learn and consolidate memories.


To view more information and download datasets, etc., please visit the project website - http://klab.tch.harvard.edu/resources/Tangetal_episodicmemory_2016.html


The corresponding publication can be found here.


The corresponding code entry can be found here.


%0 Journal Article %J Frontiers in Systems Neuroscience %D 2015 %T Decrease in gamma-band activity tracks sequence learning %A Radhika Madhavan %A Daniel Millman %A Hanlin Tang %A NE Crone %A Fredrick A. Lenz %A Travis S Tierney %A Joseph Madsen %A Gabriel Kreiman %A WS Anderson %X

Learning novel sequences constitutes an example of declarative memory formation, involving conscious recall of temporal events. Performance in sequence learning tasks improves with repetition and involves forming temporal associations over scales of seconds to minutes. To further understand the neural circuits underlying declarative sequence learning over trials, we tracked changes in intracranial field potentials (IFPs) recorded from 1142 electrodes implanted throughout temporal and frontal cortical areas in 14 human subjects, while they learned the temporal order of multiple sequences of images over trials through repeated recall. We observed an increase in power in the gamma frequency band (30–100 Hz) in the recall phase, particularly in areas within the temporal lobe including the parahippocampal gyrus. The degree of this gamma power enhancement decreased over trials with improved sequence recall. Modulation of gamma power was directly correlated with the improvement in recall performance. When presenting new sequences, gamma power was reset to high values and decreased again after learning. These observations suggest that signals in the gamma frequency band may play a more prominent role during the early steps of the learning process rather than during the maintenance of memory traces.

%B Frontiers in Systems Neuroscience %V 8 %8 01/21/2015 %G eng %U http://journal.frontiersin.org/article/10.3389/fnsys.2014.00222/abstract %! Front. Syst. Neurosci. %R 10.3389/fnsys.2014.00222 %0 Generic %D 2014 %T A role for recurrent processing in object completion: neurophysiological, psychophysical and computational evidence. %A Hanlin Tang %A Calin Buia %A Joseph Madsen %A WS Anderson %A Gabriel Kreiman %X

Recognition of objects from partial information presents a significant challenge for theories of vision because it requires spatial integration and extrapolation from prior knowledge. We combined neurophysiological recordings in human cortex with psychophysical measurements and computational modeling to investigate the mechanisms involved in object completion. We recorded intracranial field potentials from 1,699 electrodes in 18 epilepsy patients to measure the timing and selectivity of responses along human visual cortex to whole and partial objects. Responses along the ventral visual stream remained selective despite showing only 9-25% of the object. However, these visually selective signals emerged ~100 ms later for partial versus whole objects. The processing delays were particularly pronounced in higher visual areas within the ventral stream, suggesting the involvement of additional recurrent processing. In separate psychophysics experiments, disrupting this recurrent computation with a backward mask at ~75 ms significantly impaired recognition of partial, but not whole, objects. Additionally, computational modeling shows that the performance of a purely bottom-up architecture is impaired by heavy occlusion and that this effect can be partially rescued via the incorporation of top-down connections. These results provide spatiotemporal constraints on theories of object recognition that involve recurrent processing to recognize objects from partial information.

%8 04/2014 %1 arXiv 1409.2942 %2 http://hdl.handle.net/1721.1/100173

%0 Journal Article %J Neuron %D 2014 %T Spatiotemporal Dynamics Underlying Object Completion in Human Ventral Visual Cortex %A Hanlin Tang %A Calin Buia %A Radhika Madhavan %A NE Crone %A Joseph Madsen %A WS Anderson %A Gabriel Kreiman %K Circuits for Intelligence %K vision %X

Natural vision often involves recognizing objects from partial information. Recognition of objects from parts presents a significant challenge for theories of vision because it requires spatial integration and extrapolation from prior knowledge. Here we recorded intracranial field potentials of 113 visually selective electrodes from epilepsy patients in response to whole and partial objects. Responses along the ventral visual stream, particularly the Inferior Occipital and Fusiform Gyri, remained selective despite showing only 9-25% of the object areas. However, these visually selective signals emerged ~100 ms later for partial versus whole objects. These processing delays were particularly pronounced in higher visual areas within the ventral stream. This latency difference persisted when controlling for changes in contrast, signal amplitude, and the strength of selectivity. These results argue against a purely feed-forward explanation of recognition from partial information, and provide spatiotemporal constraints on theories of object recognition that involve recurrent processing.

%B Neuron %V 83 %P 736 - 748 %8 08/06/2014 %G eng %U http://linkinghub.elsevier.com/retrieve/pii/S089662731400539X %N 3 %! Neuron %R 10.1016/j.neuron.2014.06.017