Title | Using Multimodal DNNs to Study Vision-Language Integration in the Brain |
Publication Type | Conference Paper |
Year of Publication | 2023 |
Authors | Subramaniam, V, Conwell, C, Wang, C, Kreiman, G, Katz, B, Cases, I, Barbu, A |
Conference Name | ICLR 2023 |
Date Published | 03/2023 |
Abstract | We leverage a large stereoelectroencephalography (SEEG) dataset consisting of neural recordings during movie viewing and a battery of unimodal and multimodal deep neural network models (SBERT, BEIT, SIMCLR, CLIP, SLIP) to identify candidate sites of multimodal integration in the human brain. Our data-driven method involves three steps: first, we parse the neural data into discrete, distinct event-structures, i.e., image-text pairs defined either by word onset times or visual scene cuts. We then use the activity generated by these event-structures in our candidate models to predict the activity generated in the brain. Finally, using contrasts between models with or without multimodal learning signals, we isolate those neural arrays driven more by multimodal representations than by unimodal representations. Using this method, we identify a sizable set of candidate neural sites that our model predictions suggest are shaped by multimodality (from 3%-29%, depending on increasingly conservative statistical inclusion criteria). We note a meaningful cluster of these multimodal electrodes in and around the temporoparietal junction, long theorized to be a hub of multimodal integration. |
URL | https://openreview.net/pdf?id=OQQ1p0pFP4 |
Associated Module:
CBMM Relationship:
- CBMM Funded