%0 Generic
%D 2014
%T The Compositional Nature of Event Representations in the Human Brain
%A Andrei Barbu
%A Daniel Barrett
%A Wei Chen
%A N. Siddharth
%A Caiming Xiong
%A Jason J. Corso
%A Christiane D. Fellbaum
%A Catherine Hanson
%A Stephen José Hanson
%A Sebastien Helie
%A Evguenia Malaia
%A Barak A. Pearlmutter
%A Jeffrey Mark Siskind
%A Thomas Michael Talavage
%A Ronnie B. Wilbur
%X How does the human brain represent simple compositions of constituents: actors, verbs, objects, directions, and locations? Subjects viewed videos during neuroimaging (fMRI) sessions from which sentential descriptions of those videos were identified by decoding the brain representations based only on their fMRI activation patterns. Constituents (e.g., fold and shirt) were independently decoded from a single presentation. Independent constituent classification was then compared to joint classification of aggregate concepts (e.g., fold-shirt); results were similar as measured by accuracy and correlation. The brain regions used for independent constituent classification are largely disjoint and largely cover those used for joint classification. This allows recovery of sentential descriptions of stimulus videos by composing the results of the independent constituent classifiers. Furthermore, classifiers trained on the words one set of subjects think of when watching a video can recognize sentences a different subject thinks of when watching a different video.
%8 07/14/2014
%1 arXiv:1505.06670v1
%2 http://hdl.handle.net/1721.1/100175

%0 Generic
%D 2014
%T Seeing is Worse than Believing: Reading People’s Minds Better than Computer-Vision Methods Recognize Actions
%A Andrei Barbu
%A Daniel Barrett
%A Wei Chen
%A N. Siddharth
%A Caiming Xiong
%A Jason J. Corso
%A Christiane D. Fellbaum
%A Catherine Hanson
%A Stephen José Hanson
%A Sebastien Helie
%A Evguenia Malaia
%A Barak A. Pearlmutter
%A Jeffrey Mark Siskind
%A Thomas Michael Talavage
%A Ronnie B. Wilbur
%X We had human subjects perform a one-out-of-six class action recognition task from video stimuli while undergoing functional magnetic resonance imaging (fMRI). Support-vector machines (SVMs) were trained on the recovered brain scans to classify actions observed during imaging, yielding average classification accuracy of 69.73% when tested on scans from the same subject and of 34.80% when tested on scans from different subjects. An apples-to-apples comparison was performed with all publicly available software that implements state-of-the-art action recognition on the same video corpus with the same cross-validation regimen and same partitioning into training and test sets, yielding classification accuracies between 31.25% and 52.34%. This indicates that one can read people’s minds better than state-of-the-art computer-vision methods can perform action recognition.
%8 09/2014
%2 http://hdl.handle.net/1721.1/100176