Ben Deen: Multivoxel Pattern Analysis for Understanding Representational Content
Date Posted:
June 5, 2014
Date Recorded:
June 5, 2014
CBMM Speaker(s):
Ben Deen
Description:
Topics: Motivation for multivoxel pattern analysis (MVPA); correlation based classification analysis; results of analysis of EBA and pSTS cortical regions: EBA patterns carry information about body pose that is invariant to body motion kinematics, pSTS patterns carry information about body motion kinematics that is invariant to body pose; examples of other MVPA results; application of different kinds of classifiers; pattern resolution; representational similarity analysis; RSA example: gaze direction codes
BEN DEEN: OK, does that sound OK? OK, good. So I'll be talking about multivoxel pattern analysis, which is a relatively new, or newer, method for fMRI data analysis that can tell us in a bit more detail about the representations that the [? different regions ?] that Nancy's been talking about employ. OK, so as Nancy described, we can learn a good amount about the functional organization of the human brain from fairly straightforward methods of fMRI data analysis-- so just looking at which brain regions respond more to certain categories of stimuli than others.
However, we might want to know a lot more than just what certain brain regions respond to. In particular, we might want to know what dimensions of the stimuli these regions are actually representing. So for instance, I have two images of [INAUDIBLE] on the right here. And if you measure responses in a number of subjects in face regions [INAUDIBLE], you'll probably get pretty similar responses.
So [INAUDIBLE] responds quite strongly to just about any face that you present, and won't differentiate strongly between different faces. However, if we think that this region is representing certain dimensions of these faces, such as specific facial features or facial identity, presumably there are some differences in the underlying neural responses of these regions to these two face images. And the thought behind MVPA is that if, on top of that, there's actually some spatial organization [INAUDIBLE] in the underlying neural responses, we might be able to pick up on that difference by looking at spatial patterns of fMRI responses across the region, rather than just averaging the response across the region as in a sort of standard [? analysis. ?]
OK, so just to give you a sense of the sorts of features you might think about in these sorts of [? analyses, ?] particularly for category-specific regions like the ones that Nancy has been talking about. So for face regions, you could think that they could represent specific facial features like eyes, noses, mouths, et cetera, facial identity, head orientation, gaze direction, expression, and so on. For scene-selective regions, so regions that respond specifically to scenes, some hypotheses are that they might be involved in representing the identity of a specific place. They might be involved in representing scene categories, so whether it's a beach or a forest or so on.
They might be involved in representing the geometric layout of the scene, your orientation within that scene, and so on. And for body regions, you might think that they're representing something like the specific body part that you're looking at, features of that body part like the shape or muscle tone, or the configuration of a full body-- so the structure of the body. OK, so I'll just dive into specifically how we're going to do this.
So I'll talk first about the most basic type of analysis that is used in MVPA. The basic idea is to see whether there is a consistent difference in the evoked pattern of response across two conditions within a certain region. So basically, you're going to scan a number of subjects while they view a certain number of stimulus categories. You're going to split your data in half and generate activation maps for each category in a given region-- so, for instance, let's say we have one category that's visual images of chairs and another that's images of cards.
And we're going to take the first half of the data and extract evoked patterns of response to the two categories. We're also going to do this in the second half of the data, the point of this being to get a sense of our noise limit and how consistent the patterns of response are just within a given category. And we're going to compute correlations between these patterns of response.
So we'll compute these within category-- again, just to get a sense of how reliable these patterns are and what our noise level is. We'll also compute correlations between categories, and then we're going to ask what the difference is between the within-category correlations and the between-category correlations, the idea being that if there's a reliable difference, that indicates that there's some information in the pattern of response in this region that differentiates these categories.
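To make that concrete, here is a minimal sketch of this split-half correlation analysis in Python. It is illustrative only, and assumes hypothetical arrays `patterns_half1` and `patterns_half2` of shape (condition x voxel), holding the evoked response pattern for each category within the region of interest for the two halves of the data.

```python
import numpy as np

def correlation_mvpa(patterns_half1, patterns_half2):
    """Return mean within- and between-condition pattern correlations."""
    n_conditions = patterns_half1.shape[0]
    within, between = [], []
    for i in range(n_conditions):
        for j in range(n_conditions):
            # Correlate condition i's pattern (half 1) with condition j's pattern (half 2)
            r = np.corrcoef(patterns_half1[i], patterns_half2[j])[0, 1]
            (within if i == j else between).append(r)
    return np.mean(within), np.mean(between)

# If the within-condition correlation reliably exceeds the between-condition
# correlation (e.g., tested across subjects), the region's patterns carry
# information that discriminates the conditions.
```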
So again, we'll do some sort of statistical comparison of the within- and between-category correlations. And if there's a significant difference between those, we'll claim that there's information in patterns in this region that can discriminate those categories. OK, so I'll walk you through one study that uses this method, which I think is a particularly nice example of getting at which features of a certain type of stimulus a region represents.
So this is a stimulus [? called a ?] point-light display. It was introduced by a psychologist named Johansson in the '70s. The kind of cool thing about this stimulus is that you can get pretty detailed information about the nature of the motion of this human agent.
You can see that he's doing a jumping jack and then also the form of his body. So you can see where his arms and legs are in spite of the total lack of static body information. So you're just looking at dots that are placed at the end points of the limbs of this agent.
So if you want to know something about where in the brain this type of stimulus is processed, you could do a sort of standard fMRI analysis. So you'd present animations depicting point-light displays of human motion, like the one I just showed you, and compare that to point-light displays depicting object motion-- so, for instance, just rigidly rotating and translating an object.
If you do that comparison, you get something that looks like this. So this is a sort of large meta-analysis looking at 121 subjects' responses. So you get activation in a pretty broad swath of [? lateral ?] temporal cortex-- and in particular, there are sort of two foci. So one turns out to correspond to the extrastriate body area, which, as Nancy mentioned, also responds to static images of bodies, and one that's a bit more superior, called the pSTS, or posterior superior temporal sulcus. So you might want to know, then, are these regions doing different things in processing these stimuli? And just from this sort of standard univariate analysis, we don't really know anything about that.
We just know that these two regions both respond more to human motion than to object motion. There's not an obvious next step in terms of standard fMRI analyses to get at what might be different about what they're doing. OK, so [INAUDIBLE] et al. in 2014 tried to use MVPA to get at this question.
So they presented subjects with four different types of point-light displays depicting humans walking. So the point-light displays are either going to be facing toward the left or toward the right. So those are the stimuli in the left-hand column and the right-hand column.
And they're also either going to be walking forwards or walking backwards. So the PLDs on top are walking forwards and the ones on the bottom are walking backwards. And they're basically going to try to see whether patterns in these two regions, EBA and pSTS, can discriminate across these two dimensions-- so the body form or body orientation dimension, so leftward- or rightward-facing, and the body motion dimension, so walking forwards versus walking backwards.
OK, and the way that the sort of split-half analysis I described before will play out in this specific case is that we're going to again split our data into two halves-- so the first and second half of the data. And to look at form information, we're going to collapse across the motion dimension. So we're going to collapse across forward- and backward-moving point-light displays and just separate the data based on whether the walker is facing to the left or facing to the right.
Then, as I described, we can compute correlations across the two halves of the data between the patterns of response [? in ?] these two regions, both within category and between category, and see if there's a significant difference between those. If there is, we're going to claim that the patterns in this region have some information about body form or body orientation. And likewise, for motion, we're going to collapse across the [? form ?] dimension-- so collapse across leftward- and rightward-facing.
Separate the data in terms of whether the point-light walker was going forward or backward, and then again do the same sort of comparison of within-condition correlations to between-condition correlations. And if there's a significant difference, we'll claim that there's information in patterns in this region about body motion. OK, so I'll just show you the results. So this study used a sort of separate scan to localize these two regions-- the EBA and pSTS-- and then looked at patterns of response to these four conditions within these regions.
And this is showing results from the EBA. So what I'm plotting is sort of a form index-- the difference between within-category and between-category correlations for the form split. And the motion index is the same thing, but for the two sets of motion conditions.
And basically, what you can see is that there's a significant discrimination for the form index but not for the motion index. So it seems that EBA patterns discriminate between walkers that face to the left and face to the right but not between walkers that are moving forwards and backwards. And in contrast, in pSTS, you get actually the opposite pattern of results.
So here, patterns seem to discriminate between walkers that are moving forwards and backwards but not those that are facing left and right. So the argument here is that EBA has some representation of the orientation of a body, and pSTS, in contrast, has some representation of the specific kinematics of the agent's body motion. [INAUDIBLE] Yeah?
AUDIENCE: [INAUDIBLE] V1 or [? MT ?] at all?
BEN DEEN: Yes, so they look in [? MT ?] and don't find either discrimination. They don't look at [INAUDIBLE]. And one thing I didn't mention is that they [INAUDIBLE], and they do vary the positions of the presentations a bit [INAUDIBLE]. All right, so one other important thing to ask-- so you can imagine-- say in this experiment you just had two conditions.
Say, just the leftward- and rightward-facing walkers, both moving forward. If you were able to decode orientation in that sort of experiment, you might want to claim that the discrimination was about the heading orientation of the actor. But in fact, it could also be about just little [INAUDIBLE] [? low-level features-- ?] there are similarities in just the individual motion trajectories and specific bounce in the animations.
And that could be driving this sort of discrimination. So an important thing for MVPA studies in general is to demonstrate some sort of invariance with respect to other stimulus properties. So this study nicely has this two-by-two design, so they can directly look at this-- so they can ask, for the form information:
does this generalize across the two different motion conditions, and vice versa for the motion information? So in this analysis, instead of just splitting the data in half in the sort of standard way, we're going to split it across the orthogonal dimension. So to look at form information, we'll split it by the forward- and backward-moving walkers and then again do the sort of standard comparison of within-category and between-category correlations.
And for the motion information, now we're going to split by form-- so split the leftward-facing and rightward-facing walkers and again look at within- versus between-category correlations. And indeed, they find that the EBA has form information. So those patterns can discriminate the rightward- and leftward-facing walkers even when you're generalizing across different motion conditions-- so generalizing from forward-moving to backward-moving walkers. And likewise, you still get this effect for the motion index in the pSTS.
There just seems to be pattern information that can discriminate forward- and backward-moving walkers even when you're generalizing across facing left and facing right. So there's a pretty strong demonstration that this effect is not driven by low-level stimulus features like [INAUDIBLE] position or the specific motion of individual dots, but is actually about these sort of higher-level properties, like the orientation of a body or the motion trajectory of a body. OK, so the conclusions from this study are that EBA patterns carry information about body pose-- in fact, information that's invariant to body motion kinematics. And in contrast, pSTS patterns carry information about body motion kinematics that's actually invariant to the orientation of the body.
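For illustration, here is a minimal sketch of that cross-generalization (invariance) test, under assumptions: a hypothetical dict `patterns` maps (form, motion) condition labels, e.g. ('left', 'forward'), to the (voxel,) response pattern for that condition in the region of interest, and the two data halves are handled implicitly by comparing distinct conditions.

```python
import numpy as np

def form_index_generalizing_across_motion(patterns):
    """Correlation-based form discrimination, generalizing across the motion dimension."""
    r = lambda a, b: np.corrcoef(a, b)[0, 1]
    # Same form, different motion (the "within" comparisons)
    within = [r(patterns[('left', 'forward')],  patterns[('left', 'backward')]),
              r(patterns[('right', 'forward')], patterns[('right', 'backward')])]
    # Different form, different motion (the "between" comparisons)
    between = [r(patterns[('left', 'forward')],  patterns[('right', 'backward')]),
               r(patterns[('right', 'forward')], patterns[('left', 'backward')])]
    # Positive values indicate form information that survives the change in motion,
    # and so can't be driven by the motion of individual dots.
    return np.mean(within) - np.mean(between)
```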
One important thing to mention about this study, and this applies basically to MVPA studies in general, is that there are some potentially interesting null results in this study. But we don't really want to strongly interpret these null results. The reason is basically that it's entirely possible that, for instance, the EBA does actually have a representation of body motion, but that there's just not a clear enough spatial organization to that representation for it to be read out at the level of a relatively low-resolution fMRI signal. So if you have strong positive results in MVPA studies, like this strong demonstration of invariance, you can make strong claims from that. But from negative results, it's hard to infer a lack of representation. OK--
AUDIENCE: I [INAUDIBLE] the objection. I don't really object, but it would be unparsimonious to think that this body form information exists at a scale you can see with MVPA in one region and exists but at a different scale in the [? next-door ?] region. [INAUDIBLE] possible but [INAUDIBLE]
BEN DEEN: Unparsimonious, [INAUDIBLE] possible. But there are certainly examples of representations that are not spatially organized, like [INAUDIBLE] in the hippocampus. So if you did this-- if you tried to decode spatial position from the hippocampus, [INAUDIBLE] you probably wouldn't be [INAUDIBLE], and you wouldn't want to conclude that there's not a [INAUDIBLE] there.
OK, so just a quick summary of a few other things we've learned about [INAUDIBLE] visual regions using MVPA. So there's been a fairly extensive set of studies on object shape representations in lateral [? occipital cortex, ?] which is the region that responds to objects over other kinds of non-object visual stimuli like [? scrambled ?] pictures. There's extensive evidence that these regions do in fact represent the shape of objects. Studies of face regions have demonstrated that there are view-invariant representations of face identity in a number of these regions.
So you can decode identity even across changes in viewpoint. The author of this work is Stefano Anzellotti, who will be here at some point-- you can talk to him if you want.
In scene-selective regions, there's some evidence of representations of scene category-- so whether a scene is a forest or a room or a beach or so on-- and also of the geometric layout of a scene. And then lastly, in the rTPJ, which is the region that seems to be involved in processing others' mental states-- as Nancy briefly alluded to, there's some evidence for representations of specific types of mental states. And I think Rebecca will talk more about that tomorrow.
OK, just a brief bit on different types of classifiers that can be used for MVPA. So the sort of correlation analysis that I described before is about the simplest way to analyze patterns of [INAUDIBLE] fMRI responses. You can think of the problem in MVPA more generally, in a machine learning context, as a classification problem.
So you have a number of individual data points corresponding to individual stimuli-- so, for instance, individual images that you present to the subject-- and the data points correspond to the patterns of response to each of these stimuli. So you have a space in which each dimension corresponds to the response of a given voxel within a region. And so points in that [? space ?] correspond to a pattern of response across voxels.
And the problem of asking whether patterns in that region discriminate two conditions corresponds to a classification problem. So you're asking whether there's some separation of the data points corresponding to those two categories. And it turns out that the correlation method that I described is basically a very simple one-nearest-neighbor classifier, where you take all your data points for the two categories, average them to get a mean response for each, and then just do one-nearest-neighbor classification based on those mean responses.
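A sketch of that equivalence, just for illustration: the correlation method amounts to a nearest-class-mean classifier using correlation as the similarity measure. It assumes hypothetical arrays `train_X` (trial x voxel patterns), `train_y` (condition labels), and a single `test_pattern` from a held-out half of the data.

```python
import numpy as np

def correlation_nearest_mean(train_X, train_y, test_pattern):
    labels = np.unique(train_y)
    # Average the training patterns within each category
    means = {lab: train_X[train_y == lab].mean(axis=0) for lab in labels}
    # Assign the test pattern to the category whose mean pattern it correlates with most
    corrs = {lab: np.corrcoef(test_pattern, m)[0, 1] for lab, m in means.items()}
    return max(corrs, key=corrs.get)
```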
But as many of you are aware, there are often better things that you can do than that. So some of the things that have been used in the literature are Gaussian classifiers such as LDA, or SVM classifiers. And in some cases these have been shown to do somewhat better in terms of classification.
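Here is a short sketch of what classifier-based MVPA looks like with scikit-learn, using synthetic stand-in data rather than any of the datasets discussed here; the array names, signal structure, and cross-validation scheme are all assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Synthetic stand-in data: 80 trials x 200 voxels, two conditions, 4 runs
rng = np.random.default_rng(0)
X = rng.standard_normal((80, 200))
y = np.repeat([0, 1], 40)
X[y == 1, :20] += 0.5                                  # weak multivoxel signal
runs = np.tile(np.repeat(np.arange(4), 10), 2)

# Leave-one-run-out cross-validation, a common scheme for fMRI trial data
cv = LeaveOneGroupOut()
for name, clf in [("linear SVM", LinearSVC(C=1.0)),
                  ("LDA", LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto"))]:
    acc = cross_val_score(clf, X, y, groups=runs, cv=cv)
    print(f"{name}: mean accuracy {acc.mean():.2f}")
```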
So if your goal is to kind of maximize classification based on patterns in the region, it's potentially better to use these slightly more sophisticated methods. And you might have noticed that I'm only mentioning linear classifiers in this discussion. There are sort of two reasons for that. So non-linear classifiers have typically not been used in the literature, in part because, just empirically, they don't seem to work very well.
So in cases where linear and non-linear classifiers have been explicitly compared, linear classifiers actually seem to do better for this sort of data. The other reason is sort of more conceptual. It's the idea that-- so you want to know what a region is representing.
A good thing to ask is what can be linearly read out from population codes in that region, rather than non-linearly read out. And a good way to illustrate this is in terms of individual neurons-- so, for instance, if you take responses of individual neurons in the retina to a range of different images, in principle there will be some non-linear classifier that can do all kinds of stimulus discrimination-- things as high-level as object recognition. And we know that because that's what the visual system is doing-- it's applying some [? non-linear transformation ?] and going from a retina-level set of responses to high-level things like object recognition.
But we wouldn't want to say that the retina is involved in object recognition, per se. So it's more appropriate to make conclusions about what a region represents by asking what you can [? linearly ?] decode from that region. That's sort of a higher-level motivation to use linear classifiers for this research.
Another question of relevance to this research is what sorts of patterns we can pick up on. So in particular, fMRI is sort of a low-resolution method. As Nancy mentioned, that's one of the primary limitations of fMRI. But is it possible to pick up on somewhat finer spatial scale patterns using MVPA, which takes advantage of this pattern information?
And this has been asked in a number of ways, but I think most importantly with orientation selectivity. So within the large-scale retinotopic organization of early visual cortex that Nancy mentioned, there's also a finer-scale organization in terms of orientation preference. There are different regions of visual cortex that have preferences for edges at certain orientations. And this organization is sort of embedded within the retinotopic organization of visual cortex and is at a relatively fine spatial scale-- so at a scale of about 300 to 500 microns.
So this image is showing sort of a schematic of [INAUDIBLE] visual cortex, where different colors correspond to regions with a certain orientation preference. And the black grid, which hopefully you can see, corresponds to a superimposition of a grid of fMRI voxels on this pattern-- so the lines here. And as you can see, the voxels are quite a bit bigger than the orientation columns.
However, you can imagine that if there are some biases in basically the number of orientation columns of a certain type within a given voxel, you might still be able to pick up on this-- so individual voxels might still have preferences, if weak ones, for certain orientations. This is shown here. So this voxel in the upper left responds a little bit more strongly to the orientation shown in green than the one shown in blue.
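A toy simulation of that biased-sampling idea is sketched below, with all numbers chosen arbitrarily for illustration: each voxel pools many fine-scale columns, random imbalances in how many columns of each preference fall inside a voxel give each voxel a weak orientation bias, and a simple correlation classifier can exploit those biases.

```python
import numpy as np

rng = np.random.default_rng(1)
n_voxels, cols_per_voxel, n_trials = 100, 200, 50
# Each column prefers one of two orientations; the assignment is random per voxel
pref = rng.integers(0, 2, size=(n_voxels, cols_per_voxel))

def voxel_response(orientation):
    # Columns matching the stimulus respond 1.0, others 0.5; each voxel pools its columns
    col_resp = np.where(pref == orientation, 1.0, 0.5)
    return col_resp.mean(axis=1) + 0.01 * rng.standard_normal(n_voxels)  # plus noise

# Build trial patterns for the two orientations; classify held-out trials by
# correlation with each class mean estimated from the first half of trials
A = np.array([voxel_response(0) for _ in range(n_trials)])
B = np.array([voxel_response(1) for _ in range(n_trials)])
mean_A, mean_B = A[: n_trials // 2].mean(0), B[: n_trials // 2].mean(0)
test = np.vstack([A[n_trials // 2:], B[n_trials // 2:]])
labels = np.array([0] * (n_trials // 2) + [1] * (n_trials // 2))
pred = np.array([0 if np.corrcoef(t, mean_A)[0, 1] > np.corrcoef(t, mean_B)[0, 1] else 1
                 for t in test])
print("decoding accuracy:", (pred == labels).mean())  # typically well above chance (0.5)
```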
And therefore, it still might in principle be possible to decode orientation from patterns of fMRI response, even though it's a [INAUDIBLE] measure. OK, so Kamitani and Tong in 2005 tried to do this. They measured responses in early visual cortex-- V1 and V2.
Subjects watched a number of different oriented gratings-- so gratings of 16 different orientations. And they tried to build an SVM classifier to predict the perceived orientation based on V1 and V2 patterns. And they were in fact able to do this quite well.
So in the images on the right, the upper left is the orientation that was shown to a given subject. And the radial [? plots ?] show the predicted orientation based on the SVM. And you can see that they're doing quite well-- they almost always get roughly the right orientation.
OK, so they conclude from this that orientation can in fact be decoded from patterns of fMRI response in early visual cortex. And furthermore, they sort of postulate that this is based on an ability of this relatively low-resolution method to pick up on this finer spatial scale information in the orientation [? columns. ?] So I should mention that since this paper, this theoretical claim about picking up on fine [INAUDIBLE] scale information has basically been debated for the past nine years, and it's actually still a matter of debate.
Tong and other groups have been going back and forth on this for quite a while. But I'll just give you a brief account of sort of where that debate has gone and its current status. So the major complaint about this was that it turns out there's actually also, in addition to the fine-scale orientation column organization, a sort of weak large-scale orientation preference across visual cortex.
So what I'm showing here is just the standard retinotopic organization of V1 and V2. So the different colors correspond to regions that have preferences for stimulation at a certain angular position within the visual field. And then it turns out that if you torture your subjects and scan them for almost two hours looking at oriented gratings at different orientations, you can actually find a similar map of orientation preference that's sort of overlaid on this angular position preference in early visual cortex. This was done by Jeremy Freeman in [? David Heeger's ?] lab.
So the claim of this paper is that, in fact, orientation decoding could just be based on this large-scale orientation map, and might not be picking up on the fine spatial scale information that's known to be present in the orientation columns. OK, and the most recent stuff on this is actually so recent that it's not published, so unfortunately I don't have figures to show you on this.
But the claim is basically that, in fact, you can model out this large-scale orientation preference and still do [INAUDIBLE] classification of orientation from V1 and V2 patterns. So that suggests that these large-[? scale ?] patterns might be playing a role in the original paper, but you can actually get rid of them and still have some information about orientation, suggesting that you do get some of this fine spatial scale structure in MVPA responses.
And also, of particular interest, you can find individual voxels that have stable preferences for multiple orientations, which suggests that these preferences can't be [? generated ?] by this large-scale [? map, ?] which would just [INAUDIBLE] a single preference for [? each particular ?] voxel. But that paper has yet to come out.
So we still have to evaluate that, see if we think it's true. But there's at least some hope that some of the patterns that we're reading out in MVPA could be driven by these very fine spatial scale patterns. OK, so I don't have too much time, but I'll talk a little bit about representational similarity analysis, which is a somewhat [? different ?] tool that provides a fuller or richer description of the representational space of a region.
The idea here is to not just look at patterns across two or a small number of conditions, but to look at patterns of response in a region to a large number of different categories of stimuli, or actually individual stimuli themselves, and look at the full matrix of correlations between patterns of response to the stimuli-- to sort of reconstruct the similarity space of that region and get at the features that this region might be representing. And actually, the standard way to visualize this in the literature is not a correlation matrix but a dissimilarity matrix, which is often just computed as 1 minus the correlation between patterns in a region. So an example of that is shown here.
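A minimal sketch of building a representational dissimilarity matrix (RDM), assuming a hypothetical array `patterns` of shape (n_stimuli, n_voxels) holding the response pattern in a region to each stimulus:

```python
import numpy as np

def compute_rdm(patterns):
    # 1 minus the Pearson correlation between every pair of stimulus patterns
    return 1.0 - np.corrcoef(patterns)

# rdm[i, j] is small when stimuli i and j evoke similar patterns (e.g., two houses)
# and large when they evoke dissimilar patterns (e.g., a face versus a house).
```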
So this is a region that responds similarly to different images of houses, and dissimilarly to images of faces versus houses. And the motivation for this sort of method is that it provides a fuller description of what a region is representing, and lets you try to do things like figure out which features of a larger stimulus space a region cares about and which features modulate patterns of response in that region. You can try to do things like dimensionality reduction and see if there's a low-dimensional space that can account for the similarity patterns in this region.
You can do things like structure discovery and try to find what structure the similarity patterns in that region reflect-- so is it something like a tree structure, something like a [? ring, ?] something like a grid, and so on. And the really nice thing about this method is that you can also use it for cross-modal comparison.
So this representational dissimilarity matrix, or RDM, abstracts away from the patterns themselves. So you can therefore use it not just with human fMRI data but also with other modalities. So you can do this with monkey electrophysiological data.
So you can take population recordings in monkeys to a large number of images, correlate responses across different neurons, and use that to build up an RDM for a region of monkey cortex. You can also get an RDM out of a computational model. So you can build a computational model that applies some processing to stimuli, extract responses-- for instance, in a neural network model, from some sort of high-level nodes in the model-- correlate across stimuli, and get an RDM that way. You can also do this with behavior.
So you can, for instance, ask people [? how similar ?] a set of stimuli are, or derive this from some other [INAUDIBLE] measure. So you can use this to compare similarity spaces across [? a range of ?] different modalities, across species, across [INAUDIBLE] models and behavior. So it's a potentially very powerful tool for integrating different modalities.
So this is sort of the first example of an implementation of RSA, from Niko Kriegeskorte, who came up with this term and has been popularizing this method over the past few years. So in this study, he's going to compare RDMs for a large number of visual objects in both human and monkey [? inferior ?] temporal cortex-- so basically, object [INAUDIBLE] in humans and monkeys. And with humans, he's doing this with fMRI. With monkeys, he's doing this with population electrophysiological recordings in IT.
And he uses a set of 100 images that are categorized along a lot of [INAUDIBLE] dimensions-- so animate versus inanimate, human versus non-human, body versus face, and so on. And this is the RDM for human IT, based on [INAUDIBLE] responses.
And you can see there's a very strong organizing feature in this response, which is the animate/inanimate distinction. So generally, if you take two images of animate things, they have a fairly similar pattern of response across IT. If you take an animate thing and an inanimate thing, they have a really different pattern of response.
You can also see some sort of meaningful variation within the animate square. So faces and bodies have somewhat different patterns. If we compare this to the RDM acquired in monkey IT using electrophysiology, you see a surprisingly similar-- or not surprisingly, but strikingly similar pattern of results. So again, you see this very strong modulation by [? animacy. ?] And in fact, if you correlate these, you get a correlation of about 0.5, which is highly significant in this data.
So I find it quite striking that you can use these totally different methods in different species and end up with a similarity space that seems quite similar. And one thing to mention is that the correlation is not just driven by the animate/inanimate divide. So you can look, for instance, just within the inanimate stimuli and [INAUDIBLE] correlation-- likewise, within the animate stimuli, or in the off-diagonal block of animate to inanimate.
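A sketch of how such cross-modality or cross-species RDM comparisons are typically done, under assumptions: two hypothetical RDMs over the same stimulus set, here called `rdm_human_fmri` and `rdm_monkey_ephys`; only the off-diagonal entries are compared, and a rank (Spearman) correlation is a common choice since RDM values are only expected to be monotonically related across measurement modalities.

```python
import numpy as np
from scipy.stats import spearmanr

def rdm_similarity(rdm_a, rdm_b):
    iu = np.triu_indices_from(rdm_a, k=1)   # upper triangle, excluding the diagonal
    rho, p = spearmanr(rdm_a[iu], rdm_b[iu])
    return rho, p

# Example usage (rdm_human_fmri and rdm_monkey_ephys are hypothetical inputs):
# rho, p = rdm_similarity(rdm_human_fmri, rdm_monkey_ephys)
```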
I think I'll actually have time for this. So this is one last example of a cool use of RSA for answering a scientific question. So the question here is: where in the brain can we find representations of gaze direction in perceived faces that are invariant to lower-level image properties like head orientation?
So humans are pretty good at discriminating the gaze of others. I think you'll probably hear from Danny [INAUDIBLE] at some point. And [INAUDIBLE] will tell you more precisely how good they are. But for instance, if you look at these middle images, you can roughly tell that all of these faces are looking directly at you, despite quite substantial variation in the specific image you're receiving, based on variations in head orientation.
So this study is going to present subjects with all these different images of faces with different [? gaze directions ?] and head orientations, and ask if there are regions whose RDMs reflect differences in gaze direction that are invariant to differences in head orientation. So we can build an RDM for gaze direction similarity. [INAUDIBLE] this block-diagonal structure just based on the way that these stimuli are arranged from left to right. You can also build RDMs for low-level image properties that we want to abstract out.
So we can build an RDM, on the top here, just based on pixel-wise image similarity. You can also build one based on [? head ?] pose. These RDMs are going to be correlated with the gaze direction RDM, but not perfectly correlated.
So they're picking up on slightly different information. So this example is going to use something called the searchlight method. So basically, instead of looking at whether patterns in a predefined [INAUDIBLE] can discriminate conditions, we're going to look across the brain for where [? local ?] patterns have RDMs that fit these theoretical RDMs corresponding to gaze direction differences.
So basically, for each voxel in the brain, we're going to put a little sphere around that voxel, consider the patterns of response to all the images within that sphere, build up an RDM for each of those spheres, and then compare those RDMs to these theoretical RDMs. So this is the result of that analysis. So here, we're basically, for each voxel, computing a local RDM and correlating that with the theoretical gaze direction RDM.
So voxels that show up in yellow here basically have significant correlations with the gaze direction RDM. You can see two regions of the superior temporal sulcus, one posterior and one anterior, where the RDM appears to reflect differences in gaze direction. Furthermore, we can look for regions in which the RDM relates to the gaze direction RDM even after we account for effects of [? low-level ?] image features and head orientation. So you're going to be looking for regions where the RDM from [? local response ?] patterns is correlated with the gaze direction RDM even after partialling out these low-level RDMs.
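A rough sketch of that searchlight RSA with partialling, under stated assumptions: `data` is a hypothetical (n_stimuli, x, y, z) array of response estimates, `centers` is a list of (x, y, z) voxel coordinates, `sphere_indices(center)` is a hypothetical helper returning an integer array of coordinates within a small sphere, and `gaze_rdm`, `pixel_rdm`, `head_rdm` are the model RDMs described above. This is not the published pipeline, just one way to implement the idea.

```python
import numpy as np
from scipy.stats import rankdata, pearsonr

def vectorize(rdm):
    # Keep only the upper triangle (each stimulus pair once)
    return rdm[np.triu_indices_from(rdm, k=1)]

def partial_spearman(x, y, confounds):
    # Rank-transform, regress the confound RDM vectors out of both, correlate residuals
    C = np.column_stack([np.ones(len(x))] + [rankdata(c) for c in confounds])
    resid = lambda v: rankdata(v) - C @ np.linalg.lstsq(C, rankdata(v), rcond=None)[0]
    return pearsonr(resid(x), resid(y))[0]

def searchlight_rsa(data, centers, sphere_indices, gaze_rdm, pixel_rdm, head_rdm):
    gaze_v = vectorize(gaze_rdm)
    confounds = [vectorize(pixel_rdm), vectorize(head_rdm)]
    scores = {}
    for center in centers:
        vox = sphere_indices(center)                    # (n_sphere_voxels, 3) coordinates
        patterns = np.array([img[tuple(vox.T)] for img in data])
        local_v = vectorize(1.0 - np.corrcoef(patterns))
        scores[center] = partial_spearman(local_v, gaze_v, confounds)
    return scores
```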
And basically, it appears that this anterior STS region still correlates with the gaze direction RDM even after we partial out these low-level features, whereas the posterior STS RDM essentially does not. So the conclusion here is that the anterior STS actually has a representation of gaze direction that's invariant to low-level [INAUDIBLE] features as well as head orientation, and that [INAUDIBLE] representation can actually be read out from [? patterns of ?] fMRI response. That's pretty much all I had to say.
[INAUDIBLE] time [INAUDIBLE]. Just to summarize what I've been going over: so MVPA can potentially [INAUDIBLE] what we can conclude from sort of standard fMRI analyses and tell us about what a region is actually representing. These patterns may reflect some sub-voxel-resolution structure, and it's really exciting that we can potentially pick up on that.
But in some cases, they also may not reflect that, so you have to be careful about making conclusions about this. Lastly, representational similarity analysis can provide a richer picture of the overall similarity [? space of a ?] region, and can be used very powerfully to combine data across different species, different modalities, and across [? neural ?] data as well as computational models [INAUDIBLE]. OK, so that's all I have. Any questions?
AUDIENCE: [INAUDIBLE] learned something. That's a really nice [INAUDIBLE] the method. I have no question on that.
You can see the STS is pretty much [INAUDIBLE] in both biological motion and social attention, like gaze direction. We also know that it's pretty much involved in language processing. So it seems like this [INAUDIBLE] is really uncommon. And that's probably not just [? in these three ?] regions.
It's probably-- the whole brain is working like a distributed network. So I don't know how informative the selectivity approach is for understanding the brain. It must be really hard.
BEN DEEN: That's a very good question. So actually, we've asked this question specifically about the STS. So we looked at responses to faces, biological [? motion, ?] language, mental states, and voices.
And it turns out that you actually can find highly selective regions within the STS for those things. So you can find a region that responds to biological motion over object motion but doesn't care at all about language, doesn't care about faces versus objects, doesn't care about voices, and doesn't care about mental states. You'll also find one for mental states.
So you can find one for voices. You can sort of find one for language, although there is some modulation by mental state content. And interestingly, at least in our data, you can't find one for faces. So you get an overlapping response to faces and voices. But there does seem to be substantial selectivity even within the STS.
AUDIENCE: Actually [INAUDIBLE] your particular example of overlap between language and [? some ?] social stuff, then the data actually does show some overlap in there between those two, which is something that's a matter of ongoing investigation. So I think the picture is that there really is quite a bit of segregation but there's also some patterns of overlap. And both could be important.
BEN DEEN: Yep. Anything else? Yep.
AUDIENCE: Do people look at structured models to look at representations in the brain?
BEN DEEN: Yeah. Yeah, that's a great question. Yes, so if you look at some of Kriegeskorte's papers, he does some explicit comparison of neural RDMs to a wide variety of models. So what exactly do you mean by structured models? Like--
AUDIENCE: I don't know exactly. But it seems like you're talking about [INAUDIBLE] representations in the brain. But [INAUDIBLE] dimensions over [INAUDIBLE] variable.
[? AUDIENCE: Alice ?] will talk a bunch about the [INAUDIBLE]
BEN DEEN: OK.
AUDIENCE: Something that you can definitely do-- you can generate RDMs from any kind of really richly structured model and make predictions of how similar different objects will be. And then you compare those with the neural data. So we'll be actually, like, directly using the model on the neural data. But you can read predictions about what--
AUDIENCE: Don't you think-- I mean, I thought-- [INAUDIBLE] with using fancy [INAUDIBLE] evasion, [INAUDIBLE] papers.
AUDIENCE: Well, this is the [INAUDIBLE] thing, which is much simpler, really.
AUDIENCE: Basically, it's EMM. [INAUDIBLE]
AUDIENCE: Yeah, and that's cool but different. I mean, you're asking if you have a particular computational model, can you use that structure to query where [INAUDIBLE].
And now [INAUDIBLE] popping. Some manage to do that. But also, as we just said, this RSA method is another way to do that. You could get a representational similarity state out of whatever model it is and then look for that representation in the similarity structure [INAUDIBLE].
BEN DEEN: Right, yeah, I don't think they're great examples of actually specifically doing a search for
AUDIENCE: [INAUDIBLE] not a good--
AUDIENCE: Not a [INAUDIBLE]
BEN DEEN: Sure, but things like [? applying a model for discovering structural forms ?] to RDMs-- [INAUDIBLE] like, things that people have done are-- they just sort of [? apply ?] MDS to an RDM and look at the structure that you get in sort of a [INAUDIBLE] [? subspace. ?] And I can also [INAUDIBLE].
If you look at regions that respond to color, in some of these regions, you'll see a structure that looks like a ring, suggesting that they're sort of representing [INAUDIBLE]. But [INAUDIBLE] methods for searching for the structural forms that underlie RDMs haven't been applied too extensively. That stuff--
AUDIENCE: [INAUDIBLE] question. One is [INAUDIBLE] discovered these structures in RDMs, in the neural RDMs. In other ones, having more structured models [INAUDIBLE].
BEN DEEN: Sure, yeah.
AUDIENCE: [INAUDIBLE]
BEN DEEN: Yeah, yeah.
AUDIENCE: You can do the--
BEN DEEN: Yeah, and that I guess [INAUDIBLE].
AUDIENCE: [INAUDIBLE] I just have a question. So [INAUDIBLE] beginning [INAUDIBLE] what you're going to be [INAUDIBLE] with such methods. And you [INAUDIBLE]. So my question is if you're the [INAUDIBLE] but in 10 years from now, how far do you think you can push this method?
AUDIENCE: Great question.
BEN DEEN: How far do I think I can push this method? So I feel like that's really an empirical question. It depends on to what extent these representations are spatially structured. Depends on--
AUDIENCE: What was the question?
BEN DEEN: How far we can push these methods in, say, 10 years. Ideal situation.
AUDIENCE: I guess he's saying how much do the limits of the [INAUDIBLE] signal limit how far we will ever go compared [INAUDIBLE]?
BEN DEEN: Yeah, it's a very good question. So yeah, I mean, there are fundamental limits on the resolution you can obtain in fMRI data. Basically, your signal just decreases as you decrease voxel size.
And you can get around that to some extent with things like higher field strength, but only to some extent. That sort of [INAUDIBLE] other problems. So let's say something like one millimeter or a bit below one millimeter is probably as far as we're realistically going to be able to get with fMRI.
So in that sense, that is going to be a fundamental limitation of fMRI. If there are representations that are not spatially organized at that sort of scale, at least with MVPA, we probably won't be able to get at them. There are potentially other methods like adaptation that we can--
AUDIENCE: [INAUDIBLE] taking it back to what you said-- what kind of questions are i
AUDIENCE: Yeah, can I just add to that? It's a very important question. I tear my hair out about this all the time.
I actually am personally worried about the conclusion. If you're seriously into the high level vision, it's [INAUDIBLE]. The last 50 years have been a hallmark.
We discovered some stuff and I think we're kind of hitting a wall. Not that there's nothing left to be done, but it's very much diminishing returns. And further, one of the things we have learned from [? functional MRI ?] in humans is that the ventral visual pathway in humans and monkeys is damn similar.
And that means why not use real neural recordings, where you can [INAUDIBLE] pathway, where I think the answers are going to be really similar in monkeys. For other things like studying language and theory of mind, monkeys can't do it. There is no alternative.
And so we just push it as far as we can. But where monkeys are similar to humans, we can exploit the parallels that we've seen with functional MRI in order to license importing the physiology data from monkeys and making a stronger argument that those findings probably apply in humans. So it's not time wasted. It's just, you have to be clever about the way you bring in other methods to talk about [INAUDIBLE].
BEN DEEN: Do you got a-- yep.
AUDIENCE: [INAUDIBLE]. For example, one of the sort of very obvious open questions that doesn't depend on the spatial resolution is, what's the right way of predicting [INAUDIBLE] response, right? If there's no [? computational ?] model [INAUDIBLE] response to a bunch of different stimuli.
And those types of questions-- just taking the overall response magnitude [INAUDIBLE] and trying to come up with some computational model that can predict the response of a [INAUDIBLE]-- are important [INAUDIBLE] people can make progress on. And that's really limited not by the spatial resolution but more by the sort of sophistication of the actual model that you're [INAUDIBLE].
AUDIENCE: But then how long do you think it's going to be before the models are going to [INAUDIBLE] adjudicate between models with a [INAUDIBLE] mean response [INAUDIBLE].
AUDIENCE: I [INAUDIBLE] models a lot. So for example, I don't think a V1 model, for example, will be able to predict the [INAUDIBLE] response to a bunch of [INAUDIBLE].
AUDIENCE: And you [INAUDIBLE] linearly, I think that the ability to detect faces is something that is probably not linearly decodable from many different types of model. But there are certain models where it will be linearly decodable. So that's the type of question you could ask that's entirely non-trivial, and it's a question that really depends on [INAUDIBLE] computational models and not on the sort of underlying limitations of the fMRI data.
AUDIENCE: I just want to second that. I think [INAUDIBLE] yesterday, and he was saying that their models mesh very well with IT data. But one thing he didn't talk about was that they also mesh very well with human fMRI data. So that's specifically face recognition, but it's sort of more general.
AUDIENCE: And all of [INAUDIBLE] patterns that he found were supposedly-- many of the qualitative patterns that he found by measuring [INAUDIBLE] also replicate qualitatively in the fMRI data so that the types of units in a neural network that can predict the response of single units in different parts of the brain are also the types of units that can predict the response of fMRI [INAUDIBLE].
AUDIENCE: That's nice. I mean, [INAUDIBLE] where I've gotten more depressed is, you see-- actually, I skipped over all this. But there's a bunch of different [INAUDIBLE] regions. And 10 years ago, I was like, great-- here's an opportunity to map out a series of computations, the whole hierarchy.
Let's characterize the representations in each one. And so we used the method that [? Ben ?] just described, MVPA, seven or eight years ago, with maximal resolution, hours of data on each subject. And we could not decode two different [INAUDIBLE] two hours of data. And at that point, I just got really depressed. And I went, we just can't see the representations that we want to see.
And since then, there are other labs where people have been able to decode some aspects of face information from [INAUDIBLE] patches. So it's not quite zero. It's just [INAUDIBLE].
AUDIENCE: That's because it happens to be into-- if you measure the responses to facial identity in [INAUDIBLE], you can see that they're just not spatially clustered. So [? a unit's ?] response to a particular face in a monkey doesn't predict that a nearby unit will also respond [INAUDIBLE]. But there are things that are spatially clustered. So, for example, there is spatial clustering for head orientation.
So if a neuron [INAUDIBLE] a particular, say, left-facing head orientation, it's very likely that a nearby unit a millimeter away will also respond to that head orientation. So that's something that will be spatially clustered, where you can potentially [INAUDIBLE] a response [INAUDIBLE].
AUDIENCE: Yeah, and it actually-- and it seems like [INAUDIBLE] much better. You can actually use MVPA to see all kinds of those [INAUDIBLE].
AUDIENCE: That does seem like the things you can also decode out of people.
AUDIENCE: Out of people, that's true. [INAUDIBLE]
AUDIENCE: And how much can you push with this adaptation of this?
AUDIENCE: Yeah, that sort of got cut because we ran out of time. Adaptation, just briefly for those of you who haven't heard about this, is like many different measures. So it's looking time [INAUDIBLE] infinite [INAUDIBLE] responses [INAUDIBLE] back to select [? physical ?] measures.
If you use functional MRI and present, for example, two identical faces back to back compared to two different faces, the response in that [INAUDIBLE] is higher to two different faces than to identical faces. And that's very powerful, because whenever you have any kind of measure that's sensitive to the sameness of two stimuli, you can use it to ask what that region treats as the [? same. ?] And that's the [INAUDIBLE] representation. So it's a very powerful lever, much like the MVPA method is.
But crucially, unlike MVPA, it doesn't require spatial segregation of the [? neurons ?] at all. So it's complementary to MVPA in the sense that it doesn't require that. And in fact, we have been able to see, for example, this thing I just described, that [INAUDIBLE]-- and most other people have seen that.
So you can show identity-specific adaptation for faces in the [INAUDIBLE]. And it has [INAUDIBLE] properties, like you can get it when the faces are [? upright but ?] not when they're [? inverted, ?] matching the behavioral advantage. So [INAUDIBLE] humans are very good at face discrimination when the faces are upright, not when they're inverted.
And so fMRI adaptation is quite [? powerful ?] in and of itself. It delivers sensible answers the way you hope in many cases, even when MVPA is [INAUDIBLE]. But every method has issues, and there are issues behind MVPA as well.
So the best you can do is triangulate between all of these and not make it too hard [INAUDIBLE].
AUDIENCE: I wonder if the whole enterprise is perhaps handicapping itself unnecessarily in that-- I think that the problem is, like, trying to decode face identity from FFA. We know that FFA is necessary for face processing. But there's no human walking around whose brain is just FFA who decodes faces with just using FFA.
We all have fully functioning brains in which the activity is happening all the time everywhere. And also, faces are always existing in contexts except in laboratory experiments. So I wonder if we're, by restricting ourselves to just these spatially localized regions, are we handicapping ourselves unnecessarily?
AUDIENCE: It's a good question because it's really important-- [INAUDIBLE] I'm sorry. [INAUDIBLE] video that-- and [INAUDIBLE] our YouTube video with the stimulation of FFA [INAUDIBLE].
[? AUDIENCE: Similar ?] [? to platinum. ?]
AUDIENCE: OK, and I think it speaks to this question. I mean, it's an important question [? behind ?] us. It's not just the methodological one, but the necessity and sufficiency of both codes for [INAUDIBLE] questions. So let me just back up for a second right to-- ran out of time.
I left out an important caveat. So I gave this kind of intentionally simple-minded picture of all these specific bits of the brain and all these general bits of the brain. The method that Ben talked about adds an important caveat to all of [INAUDIBLE], which is that if you use MVPA to look into each of those specialized bits, you can find information about other stuff beyond the [INAUDIBLE] specialization.
You can use MVPA to tell, for example, whether a person was looking at chairs versus [INAUDIBLE], just looking at the FFA. So that's an important challenge, because there is information in the FFA about other stuff beyond facial identity. If you take the most extreme simple-minded view that I showed in the [INAUDIBLE] here about [? strong ?] specificity in [INAUDIBLE] responses-- well, wait a minute, then why is there information about other stuff? I think that's actually a very important challenge. I think we're not really through [INAUDIBLE] the answer to that.
The rightful answer [INAUDIBLE]. Really, what we want to know is the [? causal role ?] of that region in behavior. [INAUDIBLE] to mention that is to say that MVPA is not just a cool method.
It's a very [INAUDIBLE] that's challenged the view that I put forth so far. It also challenges that in the broader sense of suggesting that [INAUDIBLE] the brain and focusing on regions might be the wrong unit of analysis. So when Jim [? Haxby ?] put forth that method over 10 years ago, he was very much potentially saying also that the FFA is just [INAUDIBLE] [? parkway. ?]
But who cares about that [INAUDIBLE]? It's part of a broader landscape of activation. And maybe that whole landscape is the representation. Why should you focus on the peak when there's information in the [INAUDIBLE] as well?
And I think that's actually a hugely important question-- like, what are the relevant units in the brain? I tried to argue that the units are the selective [INAUDIBLE], and maybe [INAUDIBLE] of this argues for that.
But it's important to keep in mind that that might not be right. Maybe the relevant representations span much bigger pieces of brain as you're suggesting. But this is where I think the real test will be-- not MVPA, not even neurophysiology. It will be [INAUDIBLE] go in and [INAUDIBLE].
And this is just beginning. It's hard and rare, and there's not very much of this kind of thing. But there's this cool [INAUDIBLE] demo.
[INAUDIBLE]. So this is a patient studied by [INAUDIBLE]. This guy here is shown in his hospital bed, where he just had electrodes placed on the surface of his brain.
And by chance, two of those electrodes landed right on top of his FFA. They know this because they scanned him with functional MRI and [INAUDIBLE] know exactly where his FFA is. The reason he's got his electrodes in there is he has intractable epilepsy that's not treatable.
The drugs [INAUDIBLE]. Surgeons are considering taking bits out, and they're [INAUDIBLE] in there with electrodes mapping out the source of the seizures. While he's in the hospital with electrodes against his brain, being mapped for the source of the seizures, they can go in and do various things, like measure responses at some of those electrodes while he looks at [INAUDIBLE] stimuli.
[INAUDIBLE] probably talked about that briefly. [INAUDIBLE] talked yesterday but [INAUDIBLE] tomorrow [INAUDIBLE]. Anyways, that will [INAUDIBLE] also doing similar stuff.
But these guys further had the amazing opportunity to micro-stimulate, put in a current, right in that region and ask what happened. This is important because it's a direct causal manipulation, not just recording. It actually [INAUDIBLE] the [? sufficiency ?] and [INAUDIBLE] of that region.
So let's just watch the videotape.
VIDEO: [INAUDIBLE] One, two--
AUDIENCE: This is [? sham ?] stimulation.
VIDEO: [INAUDIBLE] One, two--
AUDIENCE: Behavioral stimulation.
VIDEO: --three. [INAUDIBLE] It almost looked like somebody I've seen before [INAUDIBLE]
AUDIENCE: OK, so that's just one little [INAUDIBLE] there. But that's the kind of data you need to touch those questions of the direct causal role of different patches of the brain. And we can expect data over the next decade from precious little examples like this with a human patient.
But this is the sort of thing that, particularly with optogenetics [INAUDIBLE] combined with functional MRI, can be done on a large scale. So I think we'll really learn a lot in the next five years, even, about the causal role of different patches of the brain, which will really get at this question of whether it's right to think about representations [INAUDIBLE]. And we [INAUDIBLE] a little bit of cortex that we stimulated here, and you can see the effects on the patient.
AUDIENCE: [INAUDIBLE] there are methods to take the full brain and throw out-- kind of do a classification of [INAUDIBLE] look at [INAUDIBLE]. You don't need to [INAUDIBLE].
AUDIENCE: I'm worried about this marathon there. Maybe we should just let people go and just maybe those questions should come up here and we'll [INAUDIBLE] starting at 4:00.
AUDIENCE: We can start a little later. I happen to--
AUDIENCE: If anybody needs to leave, go right ahead. What's your question?
AUDIENCE: [INAUDIBLE] question. When that [INAUDIBLE] response [INAUDIBLE] purely visual or is it [INAUDIBLE] depends on what [INAUDIBLE] visual information are necessary?
AUDIENCE: Which response? The functional MRI stuff?
AUDIENCE: Yeah, in FFA, which [INAUDIBLE]. Does it matter how the visual [INAUDIBLE]
AUDIENCE: Yeah, no, great question. Does it matter what you're doing with the thing? Yeah, so we did all kinds of experiments-- identity information, [INAUDIBLE]. To a first approximation, it doesn't matter, if there's a face landing on your retina and you're not dead.
That's not quite true. You can modulate it by attention and [? imagination. ?] If there's a face in the same place and you're [INAUDIBLE] like crazy to another part of space, you will [? damp down ?] that response but not shut it off. But it doesn't depend on visual presentation strictly.
If I instruct you [INAUDIBLE] to imagine a face, you can selectively turn on your fusiform face area. Or you can imagine a scene and selectively turn on your [INAUDIBLE] area. But I would say in that case, you are top-down constructing a visual representation, and [INAUDIBLE] should listen [INAUDIBLE].
AUDIENCE: I saw a paper about using [INAUDIBLE] stimulation to activate part of the brain [INAUDIBLE] as well. Will it help establish [INAUDIBLE]?
AUDIENCE: Absolutely, yeah. I had that in my talk, and I took it out because there wasn't time. Yeah, I think your-- the question was whether [INAUDIBLE] transcranial magnetic stimulation is helpful [INAUDIBLE] role of different regions of the brain. The answer-- absolutely. It has-- does everybody know what TMS is?
Anybody [INAUDIBLE] three seconds on TMS? It's sticking a magnetic-- it's sticking a [INAUDIBLE] coil like this next to your head. It makes a brief magnetic transient, and it should disrupt neural activity underneath the coil.
You can now position that coil with respect to functional MRI data in the same [? individual. ?] So unfortunately, the fusiform face area is too medial-- the [INAUDIBLE] region there. I know because [INAUDIBLE] when I found this, yeah, that's the right [INAUDIBLE].
Stick it right there, crank it to the max-- [INAUDIBLE] it doesn't work. The FFA's too medial. Many people have tried that-- very disappointing. But David Pitcher, who did a [? PhD ?] in England some years ago, had the smart idea that, in addition to the FFA, there's another face region that's out on the lateral surface.
Just ask me for it. So you can position the [INAUDIBLE] right over that region and zap it. And [? Pitcher ?] shows this astonishing result where the three regions [INAUDIBLE]-- there's the occipital face area.
Nearby is the extrastriate body area that I talked about. Near that is another region that also [? happens ?] to be called the [? lateral ?] occipital complex, which seems to be involved in processing object shape. David shows a triple [? dissociation ?] with the [INAUDIBLE] of TMS, zapping each of these regions one at a time, showing that when you zap the occipital [? face area ?] here--