Functional imaging of the human brain: A window into the organization of the human mind - Part 1

Date Posted:  August 23, 2021
Date Recorded:  August 13, 2021
CBMM Speaker(s):  Nancy Kanwisher
  • Brains, Minds and Machines Summer Course 2021

NANCY KANWISHER: I want to start by showing you what my plan is for today. And I don't know if I planned this right, but the first lecture in the morning is going to be pretty basic stuff. So if you've heard my talks pretty much any time in the last five years or so or read my stuff, it might be repetitive. And if people feel like, OK, I've heard this stuff, I will not be insulted if you want to just skip that part and come back later or read your email or whatever.

But I figured there are people from different backgrounds. And I didn't want to either presume that everybody knows what I've been up to for the last bunch of time. Then after the break, I'll talk about some newer stuff from my lab that's not published yet in which we're using deep neural networks to understand functional specificity in mind and brain and what we think that means and what we can learn about it from deep networks.

And I encourage you guys to butt in at any time with questions. That will make me feel better, like I'm not doing a monologue. So good. All right. So let's go.

Not that long ago, we just didn't know a whole hell of a lot about the overall functional organization of the human brain. This is kind of a sketch of the state of knowledge circa 1990 from earlier work that had gone on for over a century studying patients with focal brain damage.

We knew the very approximate locations of brain regions involved in language, stuff in the lateral left hemisphere, regions involved in something like directing attention up here at the parietal lobe, and regions involved in face recognition somewhere in the back end of the right, at the bottom of the right hemisphere. But that was more or less it.

And then functional MRI came along. And here is roughly the state of knowledge today. And so the point of this very simple-minded sketch is that there are now dozens of regions of cortex for which we have a pretty good idea of what that little patch of cortex does.

And each of these regions shown here is present in approximately the same location in pretty much every normal person. And so this is kind of a diagram of just the organization of the generic human brain.

Of course, this diagram is highly schematic. What does it mean to make a little colored blob on the brain and stick a one-word label on it? I will talk more about that. But to start, let me just show you some of the kind of data that go into these claims just so it doesn't seem too abstract.

OK, so this is a bottom surface of the left and right hemisphere of one of my lab members a few years ago who generously participated in a whole bunch of functional MRI scans. And so we're looking up at the bottom of the brain with the back of the brain on the two sides, a view kind of like this.

And so this shows you in gray the cortical surface, mathematically inflated so you can see all the stuff that used to be inside the sulci. We inflate it because the cortex is a sheet. If you want to understand how it's organized, you want to see the whole sheet, not hide little bits of it up in the folds. So the dark bits are the bits that were inside the sulci before the surface was inflated.

OK. So what are the colored bits? Colored bits are bits that we have functionally scanned with functional MRI. And what they show is selective responses. So for example, this blue bit here, I've only listened to a few of the lectures, but Christof Koch mentioned face responses down here in the fusiform gyrus. And here it is in my former lab tech, showing you a significantly selective response to faces in that region.

What that means is just this: if you measure the fMRI response in that region-- which, as I'm sure somebody will cover in the basics or you already know, is very indirectly related to neural activity by way of blood flow-- you see a higher magnitude of response when people look at faces than when they look at other kinds of stimuli, like images of bodies and body parts in purple, scenes in green, objects in gray, words in yellow, and so forth.
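
To make the idea of a "selective response" concrete, here is a minimal sketch, using simulated numbers rather than real data, of how one might summarize an ROI's response by condition and compute a simple selectivity contrast. The condition names, effect sizes, and data shapes are illustrative assumptions, not the lab's actual pipeline.

```python
# A minimal sketch of quantifying selectivity: average the fMRI signal in a
# region of interest (ROI) for each stimulus condition, then compare the
# preferred condition to the rest. All data here are simulated.
import numpy as np

rng = np.random.default_rng(0)
conditions = ["faces", "bodies", "scenes", "objects", "words"]

# Simulated percent-signal-change values: n_trials x n_voxels per condition,
# with an artificially higher response to faces built into this toy ROI.
n_trials, n_voxels = 20, 50
roi_data = {c: rng.normal(0.3, 0.2, (n_trials, n_voxels)) for c in conditions}
roi_data["faces"] += 0.8  # built-in face preference for the toy example

# Mean response per condition (the bars in the bar chart).
means = {c: roi_data[c].mean() for c in conditions}
for c, m in means.items():
    print(f"{c:8s} {m:5.2f}% signal change")

# One common selectivity summary: preferred condition vs. mean of the others.
face_sel = means["faces"] - np.mean([means[c] for c in conditions if c != "faces"])
print(f"face selectivity (faces minus others): {face_sel:.2f}")
```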

OK, so the claim that a patch of cortex responds selectively to faces is just to claim that you can find data like this using functional MRI or multiple other methods. OK. What about some of these other regions while we're at it? What is that green blob?

Well, that's a blob that we named years ago, the parahippocampal place area. It responds much more when you look at images of scenes than when you look at all these other kinds of stimuli.

Over here in the left hemisphere is a particularly interesting region that's been called by others the visual word form area. It responds strongly and selectively when you look at visually presented words and letter strings compared to lots of other kinds of stimuli.

That's a particularly interesting region of the brain because it's one of the few where we know that the selectivity of that patch of brain is wired up by that individual's experience. And we know that, one, because you don't see that selective response until people learn to read and, two, once they have it, you see it only for orthographies that the person knows how to read.

So if you scan me reading English and Chinese and Arabic, you'll see a response only for English, not Chinese and Arabic, because I don't read those languages. But if you scan somebody who reads those languages, you'll find a high response in that region to those orthographies. We'll get back to it if I can talk fast enough. I'll talk about some of the, I think, extremely interesting questions about how this region gets wired up and where it lands in that particular place. OK.

OK, so moving out onto the lateral surface of the brain, there's a bunch more good stuff, like this little kind of archipelago of purple blobs there. It tends to be a bunch of different little subregions that respond selectively to images of bodies and body parts. And we showed, way back when we first reported this region, that-- in all of these cases, you need to worry about, oh, is it just some low-level confound of the stimulus, right? It's a classic question in neuroscience that applies to all of this, too.

With bodies, we showed, for example, that it responds to stick figures of bodies more than very similar images where you just move a few of the lines around. So the visual features are very similar. And yet it still responds selectively to bodies. OK.

And, oh, over here on the lateral-- actually, it's bilateral, but here, it's, for some reason, only showing up on the left. This little orange patch is a patch just next to primary auditory cortex that responds selectively to speech sounds. So right now, that patch of your brain is firing like mad even if you're not paying any attention to what I'm saying or thinking about the meaning of what I'm saying. That's fine. Happens to all of us.

Even if that's going on, that patch of your brain, the neurons are firing like mad. They like speech sounds. And they will not fire to a whole host of other kinds of sounds, including music and dogs barking and birds singing and toilets flushing and whatever. Yeah. And I'll get more to that in a bit.

OK, so you might think, as I thought for a while, that these kind of hyper-specialized little patches of cortex might be just a property of perceptual systems. Maybe it just makes sense that perception is inherently a kind of computationally compartmentalized process that had these different computations you need to do. And so maybe it makes sense to allocate different patches of cortex to each of those computations.

And that might be true of perception, where we have these kind of relatively concrete computation problems for the brain to solve. But it wouldn't happen for something really abstract and high level, like thinking. So is this only found for perception? Or is it found for higher level abstract cognition?

Well, a bunch of years ago, Ev Fedorenko, now a fellow faculty member in my department at MIT, then a postdoc working with me, decided to go after these left hemisphere regions implicated in language processing-- which, as I mentioned, have been known for centuries, actually-- and asked the basic question of, are those regions really specific for language per se? Or do they do lots of other cognitive things, as most of the literature at the time suggested?

And so first what she did is just develop simple methods to identify those regions in each subject. In this case, these red bits are the bits in that subject that respond more when you read sentences written out-- although if you listen to them instead, you get the same network-- than when you read nonsense control stimuli.

And importantly, you have to make sure people are not just spacing out or closing their eyes for the nonsense stimuli. So we add a probe task. You read some pronounceable string of non-words-- made-up words like "flig," blah, blah, whatever.

And then there's a probe at the end. And you have to say, was this non-word in that string? And with the sentences, you have to do the same thing. The point is that task is harder in the non-word case. So it's not just that people are zoning out for the non-words because they're boring. Even though the control condition is a harder task, you get a higher response to the sentences.

OK. So as I mentioned, these regions had been known for a long time. What Ev did was just develop robust ways to identify those regions in each subject individually.

And just to say a little bit about why that's important, before that, a lot of the work on language, almost all of it, had been done doing group analyses where you take a bunch of different subjects. You scan them all. You align their brains as best you can. And you do some kind of analysis across a given voxel in this standardized group brain and ask, is that voxel consistently activated in that contrast across subjects?

That's not a stupid thing to do. If you get something, hats off to you. It's great. You've shown that response generalizes. But it has a bunch of problems largely deriving from the fact that these regions are not in exactly the same place across subjects. And there are no perfect ways to register one brain to another functionally.

And so you really lose a lot of the resolution of your signal if you blur across brains. And that's basically why the whole literature before Ev came along said, yeah, language overlaps with music and arithmetic and cognitive control and working memory and you name it. Language overlaps with that. That's because they've been doing these group analyses that blur the hell out of the data.

What Ev did was say, OK, let's identify those regions in each subject individually. And then let's subject them to new tests, looking at that region we found in that subject. And then you can pool across subjects in the regions you found separately in each subject. Does that make sense? OK.
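
Here is a minimal sketch, on simulated data, of the subject-specific functional localizer logic just described: define each subject's language voxels from a sentences-versus-non-words contrast, then measure the response of just those voxels to new conditions in independent data, and only then pool across subjects. The thresholds, data shapes, and condition names are illustrative assumptions.

```python
# A minimal sketch of a subject-specific functional localizer with simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_subjects, n_voxels, n_trials = 8, 200, 30
test_conditions = ["sentences", "nonwords", "arithmetic", "music"]

pooled = {c: [] for c in test_conditions}
for s in range(n_subjects):
    # Localizer runs: each subject's language voxels sit in a slightly
    # different location (here, a different random subset of voxels).
    lang_voxels = rng.choice(n_voxels, 30, replace=False)
    sent = rng.normal(0.2, 0.3, (n_trials, n_voxels))
    nonw = rng.normal(0.2, 0.3, (n_trials, n_voxels))
    sent[:, lang_voxels] += 0.7  # simulated sentences > non-words effect

    t, _ = stats.ttest_ind(sent, nonw, axis=0)
    roi = t > 3.0  # subject-specific ROI from the localizer contrast

    # Independent test runs: measure the ROI's mean response to each condition.
    for c in test_conditions:
        test = rng.normal(0.2, 0.3, (n_trials, n_voxels))
        if c == "sentences":
            test[:, lang_voxels] += 0.7
        pooled[c].append(test[:, roi].mean())

# Pool across subjects only after defining the ROI separately in each one.
for c in test_conditions:
    print(f"{c:10s} {np.mean(pooled[c]):5.2f} +/- {np.std(pooled[c]):.2f}")
```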

Most of the field has finally caught onto this. It took a 20-year battle to get people to do this. I don't know why. I think it's something a neurophysiologist wouldn't think twice about. Like, you wouldn't just start recording from the back of the head and not bother to figure out whether you're in V1 or V2 and just say, oh, I found some neurons. Good for me.

But for a long time, functional MRI was like that. It's like, oh, we don't know where we are. We just say we're at coordinate X, Y, Z, which is not very meaningful.

AUDIENCE: [INAUDIBLE]

NANCY KANWISHER: You mean, if you put them in the best standardized space you can, what kind of overlap do you get? Is that what you're-- yeah, yeah, exactly. Not very good. The best overlap you get is about 50%-- the voxel with the most overlap will have a significant response for the language localizer in about 50% of subjects.

It's maybe slightly better for the FFA, but not much. And as a sidebar, it's not just that you kind of degrade your data by averaging across subjects. You often don't find a fusiform face area at all if you do a group analysis. And then you look, and it's there in every damn subject, just in slightly different positions, right?
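
The overlap statistic mentioned above can be made concrete with a toy probabilistic-overlap map: align everyone to a common space, mark each subject's significant voxels, and count at each voxel the fraction of subjects showing the effect. The jitter and region sizes below are made up purely to illustrate why anatomical variability caps the overlap.

```python
# A toy overlap map: fraction of subjects significant at each voxel after
# alignment, with the ROI position jittering from subject to subject.
import numpy as np

rng = np.random.default_rng(2)
n_subjects, grid = 20, 40
overlap = np.zeros((grid, grid))

for _ in range(n_subjects):
    # Each subject's "significant" region: a small square whose position
    # jitters from subject to subject, as real functional regions do.
    cx, cy = 20 + rng.integers(-4, 5), 20 + rng.integers(-4, 5)
    mask = np.zeros((grid, grid), dtype=bool)
    mask[cx - 3:cx + 4, cy - 3:cy + 4] = True
    overlap += mask

overlap /= n_subjects
print(f"best voxel overlaps in {overlap.max():.0%} of subjects")
```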

And so it's just a weird accident of history that it's a good thing I had no one to teach me anything when I started functional imaging because group analyses existed at the time. But I didn't know about them. So I knew how to do just your basic contrast in one subject.

And the way I found the fusiform face area was just I was working at Harvard at the time, and I just printed out a picture of each person's brain with activations. And I just taped them to the wall, and I walked back and forth and looked at them. And I used the group analysis in my own very smart visual system back here, which has very good statistical analysis methods. And it said, hey, most subjects have a blob approximately there. And a group analysis never would have found them.

Anyway, group analyses are fine when you do find something. It's just you often miss things. And when you find things, you underestimate the selectivity of the regions you find because you're averaging some of the thing with some of the non-thing. Yeah. OK. Was that use-- raise your hand if that was useful. I mean, you've all heard me say this for years. OK, a few people got something about-- OK, OK.

OK, so what Ev did-- people had been studying language with functional MRI for a long time before she came along. But what she did was just say, OK, let's get serious. Let's find that stuff in each subject. And then let's study it.

And in particular, let's go back to this kind of classic question that people have been fighting about since Broca and Wernicke and before then, whether the language regions in the brain are really specific for language per se. And so Ev did that by basically scanning subjects on a whole bunch of other tasks. The details don't matter.

They're basically all the other main tasks that people have argued overlapped with language previously, so mental arithmetic, listening to music, holding information in working memory, multiple different measures of cognitive control, whatever that is. I don't know what it is either. It's a phrase people throw around. Never mind. We can test it.

And so then we just measured the response of those regions to each of these different conditions. And then we averaged that across subjects. And then what you see is in what's basically Wernicke's area back here. This is a response when you read sentences. This is the response when you read non-word strings. And here are the responses in all those other conditions. Basically, big bunch of nothing. OK.

And it's not that those are crappy tasks because they activate other brain regions like mad, often regions right next door, right? Just not the language regions. Similarly, if you look in Broca's area, which is a deeply troubled phrase, but never mind-- it's widely used. It's some bit of language cortex in the left frontal lobe. And in our case, we meant the bit up there that we find in each subject with this localizer. You get something similar, just really quite extreme selectivity for language and not showing activation in all these other tasks.

And so what I like about this is it's not just another selective blob we can stick on the brain, much as I do indeed love to do that. I think this is a particularly interesting case because I think it tells us something that people have wondered about for a long time.

Probably every one of you thought about this but way before you got into the field. It's like, OK, what is the relationship between thought and language? Are my thoughts in language? If I didn't have language, could I think? What is up with that? That is just a really fundamental, basic question.

And I think these humble data give a really big piece of an answer to that question. Namely, in the brain, thought and language are completely different things. That is, for all these other kinds of cognition that we tested here, they activate other parts of the brain separate from the language regions.

I'll do one more bit, and I'll take your question. Even stronger evidence comes from work I have nothing to do with. I'm just a big fan of this work. It's done by Rosemary Varley in England. And she has been, for several decades, testing a small number of people who have the extreme misfortune of having massive left hemisphere strokes that pretty much eliminate all their language abilities-- not just production, but understanding language.

Just the whole language system is just basically gone in these people. And you can see it in brain scans, and you can see it by testing them behaviorally. But then what Rosemary does is figure out smart ways to communicate with these people. And of course, we have loads of ways of communicating without language-- charades and pointing at diagrams and all kinds of stuff.

And she's shown that these people who have essentially no language are pretty much cognitively able to do essentially every task you might have thought would require language. That includes arithmetic, logic problems, thinking about other people's thoughts.

What else? Holding information in working memory. Basically, everything she's tested, these people can do. So that's, I think, even stronger evidence that thought and language are not the same thing and that much of thought can go on without language.

Doesn't mean that you don't need language to learn to think as a kid because those people all had language during development. And as I think Josh mentioned, whenever that was, it's a really important point that most of what we humans know, we didn't go do the experiments and find out ourselves. Somebody told us that stuff, right? And so having language is a very powerful way to get smart and know a lot of stuff.

Absolutely. And in the second half of the lecture, I'll talk about some of the things that I've done to address exactly that problem using deep neural networks. But yes, this is all very hypothesis-driven science. We sit around, and we say, uh, language. That's a pretty basic function. There's all this stuff about aphasia. Let's go test it.

What else might it do? Oh, people have said it does this. Let's test those things. And then you publish that. And then your colleagues say, oh, you didn't test X, Y, and Z. And then you go test X, Y, and Z.

But all of that is modulo what people can think of. Actually, I also talk in this lecture about another method to get outside of our heads to try to do more data-driven science to test a broader hypothesis space. Do you have another question?

OK. So what else? OK, but my favorite, or maybe the most surprising to me, is this patch of cortex. It's primarily the work of Rebecca Saxe. And that is this little pink blob here sometimes called the TPJ, for Temporoparietal Junction. And that region, very remarkably, is extremely selectively responsive when you think about what another person is thinking.

And if that strikes you as crazy, reflect on the fact that we humans do this all the time. It's pretty much the essence of being a human being is to spend a lot of your waking minutes thinking about what other people are thinking. It is the essence of literature. We don't have great novels about mountains or objects that can't think or where we can't infer their thoughts. Or if there is one-- there probably is a novel about a mountain somewhere-- you may be sure that novel has some way to make you think about what the mountain is thinking.

This is just what we do as human beings. When you debug your code, you're thinking about, what does the code think I meant here? And what is it trying to do here? That's just basically what we do.

So it turns out it's an incredibly abstract cognitive task. And yet there's a patch of brain that's remarkably selective for it. I'll just say one more bit, and I'll take your question.

I could spend hours telling you about all the control conditions. But I'll just say that Rebecca Saxe did this work first as a grad student in my lab. And I just kept saying, that's cool, but-- nah. Nah, nah. Go do more controls. And she did one after another control. I mean, dozens of controls.

And it really, really is very specific for thinking about the contents of another person's mind-- actually, specifically their thoughts and beliefs. In one of her remarkable studies, she showed it doesn't respond even when you think about their bodily sensations. If I think about whether you guys are cold-- you probably are; I was cold when I was sitting there. It's, I guess, more energetically intensive to speak than to sit there and listen.

But if I think about your physical bodily sensations-- thirst, hunger, temperature, whatever-- that does not activate my TPJ, only if I think about the contents of your thoughts. So it's really remarkably specific.

So I was going to say this later, but I'm glad you're pointing this out. So the claim here is that for each of these regions that I'm pointing out here, that region does a very functionally specific thing. The claim is absolutely not that it does it alone.

No brain region can act alone. Every brain region is massively interconnected to lots of other regions. It needs inputs from a certain set of regions. It needs to send outputs to other regions, or there'd be no damn point if it didn't tell the rest of the brain about the results of its fabulous computations, right?

So of course every region works with and interacts with lots of other brain regions. In no way is this supposed to go against that claim.

Yes, but not specifically, right? And so remember, the red bits here are the bits that respond more when you read a sentence compared to when you read a string of non-words, right? And then you'd have to respond to a probe.

So that's also a very cognitively demanding task. So the reason we use those control tasks is not to say these are the only brain regions that are involved in the task, but to ask which brain regions are specific for that task. OK.

So in fact, I will get to your point next, which is I think these regions are particularly interesting because I think they-- frankly, I think this is kind of like a sketch of the human mind. It's like we are not generically smart. We're specifically smart in very particular ways.

So I think that's fun. But it's not true of the whole cortex. In fact, all that white stuff up there is kind of like the opposite. There are many names for all that white stuff. It's sometimes called the frontoparietal attention network or the-- what is that ridiculous phrase, the task-positive network?

John Duncan calls it the multiple demand network. I like that phrase best because it says those regions respond to multiple different kinds of cognitive demand. Not just one, multiple kinds of cognitive demand. And so they're like the opposite of all the regions I've talked about. These are the kind of completely generic-- whenever you exert mental effort, these regions are engaged almost no matter what the task is.

It's a great question, a really fascinating question. To answer it, we need to figure out how to identify a homology between species. And that is not impossible, but it's a deeply theoretical enterprise. We'd have to agree on what counts as the same, right?

So for example, you can find face patches in monkey brains. And there's a bunch of face patches in monkey brains just like there's a bunch of face-selective patches in human brains. And if you look at them with whatever measures are available, they seem quite similar in many respects.

But then it would be really nice to know which of those monkey patches corresponds to which of those human patches because of course, with monkeys, you can get much better data. Like, [? Ko ?] can just go, as he just did, and collect responses from hundreds of neurons to thousands of faces. And he just does it, and two days later, he says, oh, yeah, I got the data.

Wow. You know, I've been averaging over all that stuff my whole kind of scientific life. And [? Ko can ?] look at individual neurons. It's mind blowing. So there are many, many reasons to do this stuff in monkeys.

But we'd like to know what the correspondence is of those patches. And I think we actually don't know. And there are ways we can do it. We can characterize each of them functionally and say, OK, this one cares about whether the faces are moving, and this one doesn't. And here are these ones in humans.

We can try to make guesses about where they are in the processing hierarchy, which we know better in monkeys, where we actually know the connections, than in humans, where we actually don't despite what everyone tells you. Oh, the Human Connectome Project. Don't believe it.

There are no good methods for doing this in humans. We have no idea what's connected to what in the human cortex. It's a very big drag, but it's true. I went on a whole-- I pushed my own button and went off on a tangent and forgot where we started.

Oh, homologies. Right. Homologies, right. So you could look at function. You can make a guess about connectivity. You might decide-- I think, actually, the way these things will ultimately be resolved is by looking at all of these methods but especially including the kind of cell-type-specific identification of regions that the Allen Institute is doing now on animal brains and is getting serious about doing in human brains.

And I'm talking to them about how to try to do that and look at these regions to ask whether there are cell-type-specific signatures of each of these regions. There might not be. Many of the current claims say, oh, you know, those things change very broadly across the cortex in the work that's been done so far on human brains.

They sample, I don't know, a hundred different locations. And they look at the cell types in those parts of the brain. But if you don't sample exactly the right spot, you won't see those signatures. So the bottom line is you need functional MRI and single-cell omics in the same human brain. And if you think about it, that's a challenge for a number of reasons, but it's not impossible.

Anyway, interspecies homology is an extremely interesting question. Very difficult, I think unresolved now. One can make guesses and then-- I mean, what I would love to know is not just the homologies between, say, face regions and scene regions and body regions between humans and monkeys. All those things exist in macaques and are probably pretty homologous and someday we'll figure out how to do that.

What I want to do is, once we do that, then use those methods to find a perhaps cytoarchitectonic/connectivity homolog of language regions and the TPJ in monkeys and then ask, what are those things doing in monkeys? Wouldn't that be cool? I know, dream on.

But in your scientific lifetime, that will be possible. I'm not sure it'll be possible in mine. But anyway, OK. So I was talking about these multiple demand regions. And there's loads and loads of evidence for the kind of generic functional signature of these regions.

But here is a nice one. This is some of Ev Fedorenko's data showing you some of those same contrasts she used to identify language regions. In each case, the darker bar is the more cognitively demanding version of the task. And it's the same story across all the tasks-- music and language and arithmetic and working memory, blah, blah, blah.

And in all of those, the more demanding one produces a higher response than the less demanding version of that same task. That's what we mean by multiple demands. OK. Make sense? OK.
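
A toy sketch of that multiple-demand signature: for each of several unrelated tasks, the hard version drives the region more than the easy version, so the hard-minus-easy difference is positive across the board. All numbers are simulated for illustration.

```python
# A toy "multiple demand" signature: hard > easy for every task, not just one.
import numpy as np

tasks = ["language", "music", "arithmetic", "working_memory", "cognitive_control"]
rng = np.random.default_rng(9)

for task in tasks:
    easy = rng.normal(0.4, 0.05)          # simulated ROI response, easy condition
    hard = easy + rng.normal(0.3, 0.05)   # harder version of the same task responds more
    print(f"{task:17s} hard - easy = {hard - easy:+.2f}")
```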

So that's the whirlwind overview. There's loads of other interesting work on other patches of brain. These are just some of my favorites.

So point of all this is I think this is real progress. I think we've now found at least a set of functionally very distinctive regions that are present in pretty much every normal person. And that includes regions that are specialized for very abstract, uniquely human functions, like language and thinking about each other's thoughts.

And I think this is just kind of an initial sketch of who we are. And I think that's fun and cool. But it is also the absolute barest beginning, right? And so you guys, I'm sure, are sitting there being polite, thinking a million questions about how can she say this, and what does that mean, and that's really an impoverished claim, and all of that.

And so I really think the way to look at this is just kind of a roadmap for future research. So what I'm going to do in the rest of this talk is just give you a sketch of some of the other lines of work that are trying to build on this picture and say a little more on it. And then I'll actually do more of that using [INAUDIBLE] in the second half.

OK. So the first and perhaps-- this is just a version of the same question. But a very sensible response is, like, really? Really? Those things just do one mental function? Nobody else thinks that. How come Kanwisher still says this? All the textbooks say we moved beyond that way back in the '90s; people call that phrenology. Everybody's moved on.

Actually, many of these things really, really are functionally specific. And I'll show you some of the reasons why I think that. So there are many critiques of this kind of somewhat extreme position on functional specificity that I've been serving up.

Some of them are OK but have been refuted widely by evidence. Others are just not very clear. There's one that I think is really smart and that I've been paying a lot of attention to. And that concerns Jim Haxby's work, starting a long time ago, where he pointed out something in a paper that later led to multiple-voxel pattern analysis as a method, with the work of others.

He said, look, if you look in the ventral visual pathway of a single subject, if you look at the response in-- these are two successive slices of the brain like this. You look at the response of this person when they look at images of chairs, I guess, compared to a fixation baseline. I'm not sure.

You look at that pattern. Then you look at the pattern of response when they look at, say, shoes. A neuroscientist can take these data and, given another set of data of that person looking at chairs and shoes, tell which thing they were looking at-- kind of, sort of, a little bit.

OK. That's the basis of multiple-voxel pattern analysis. We look at the whole pattern of response over an entire big chunk of brain or within a functionally defined region, either way, and ask what we can read out from that region. OK.

And so what Haxby pointed out is if you identify just a face-selective region, you can still read out information about things that aren't faces. You can tell a little bit-- not very well, but significantly above chance-- if the person was looking at cars or shoes or trees or whatever.

Therefore, Haxby says, why should people claim this region is more involved in processing faces than other things when there's information in there about other stuff? And we should care about information more than the activation. He's absolutely right about that. If we think that the brain is doing computations, we should care about computations. So all of that, absolutely good point. OK, everybody get the critique? OK.
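
For readers who want the critique in concrete terms, here is a minimal sketch of the MVPA logic with simulated data: treat each stimulus presentation as a vector of voxel responses and ask whether a cross-validated classifier can tell two non-preferred categories apart. The weak, distributed category signal is built in as an assumption.

```python
# A minimal MVPA sketch: above-chance cross-validated decoding of two
# non-preferred categories from simulated voxel patterns in a "face" ROI.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_trials_per_cat, n_voxels = 40, 100

# Chairs and shoes differ only by a subtle, distributed pattern difference.
chairs = rng.normal(0.0, 1.0, (n_trials_per_cat, n_voxels))
shoes = rng.normal(0.0, 1.0, (n_trials_per_cat, n_voxels))
shoes[:, :20] += 0.25  # subtle category information carried by some voxels

X = np.vstack([chairs, shoes])
y = np.array([0] * n_trials_per_cat + [1] * n_trials_per_cat)

# Above-chance accuracy = "information about non-preferred categories is
# present in the pattern", which is Haxby's point.
acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
print(f"chairs-vs-shoes decoding accuracy: {acc:.2f} (chance = 0.50)")
```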

So I think this is a very serious critique. And so I think it's worth taking some time to talk about it. Well, one response is, OK, you, the neuroscientist, can decode information about shoes out of the fusiform face area. But is the rest of the brain using that information, right? Is it epiphenomenal, right?

And as a sidebar, because I didn't manage to bring this in later, this is one of the many uses of deep nets to understand neural data. If you train a deep net just on face recognition and you look at its activation for chairs and shoes, you can also decode chairs versus shoes. And to me, that says the fact that a system has information about some things outside of its domain doesn't mean that that's what it's fundamentally designed to do, right? OK.
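
That deep-net point can be illustrated without the actual face-trained network: decodable category information survives in feature spaces that were never optimized for those categories. In the sketch below, a fixed random nonlinear projection stands in for the penultimate layer of a face-trained CNN-- an assumption made purely so the example runs; the decoding step is the same either way.

```python
# Decoding non-trained categories from features of a system never optimized
# for them; a random relu projection is a stand-in for a face-trained CNN layer.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n_per_cat, n_pixels, n_features = 60, 256, 128

# Toy "images": chairs and shoes differ in a handful of pixel statistics.
chairs = rng.normal(0.0, 1.0, (n_per_cat, n_pixels))
shoes = rng.normal(0.0, 1.0, (n_per_cat, n_pixels))
shoes[:, :32] += 0.5

W = rng.normal(0.0, 1.0 / np.sqrt(n_pixels), (n_pixels, n_features))
features = np.maximum(np.vstack([chairs, shoes]) @ W, 0)  # relu(xW) "activations"
y = np.array([0] * n_per_cat + [1] * n_per_cat)

acc = cross_val_score(LogisticRegression(max_iter=1000), features, y, cv=5).mean()
print(f"decoding from features never trained on chairs/shoes: {acc:.2f}")
```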

That's just kind of in principle. But empirically, what about actual brains? Are they doing other stuff?

So this is why I was very psyched when I got a phone call a few years ago telling me that this guy right here was awaiting neurosurgery in a hospital in Japan. And the neurosurgeons had implanted electrodes all over the bottom of his brain here, both to identify functions and plan their surgical route and also to identify the focus of his seizures so they knew what bits to take out.

And so I was very excited about that because here's a picture of the bottom of my brain. And your brain would look very similar. These red bits are my face areas. The green bits are my place-selective regions. The purple bits are bits that respond not quite as selectively but more to colored stimuli than grayscale.

And so it looks like those electrodes are over some pretty good stuff. And so when they said, do you want to collaborate, it's like, damn straight I want to collaborate. Yes, indeed, I will stay up all night and have a few students stay up all night making stimuli and ship them to you immediately so you can test them on this guy tomorrow. Absolutely.

So we did that and sent them some stimuli. And the first thing they did is record responses, and just-- probably other people have talked about this. These are grids set on the surface of the brain as contrasted with depth electrodes, which are now more commonly used with neurosurgery.

The grids that sit on the surface, the contacts are maybe a couple millimeters across. So they're not vastly smaller than a typical functional MRI voxel. But because functional MRI is based on blood flow, they pool over far fewer neurons. And they give you time information. OK.

So what we see here is these are now responses of each of these electrodes in this strip. And this cluster of electrodes here, the response is shown in each one. So this is time on the x-axis and response to different stimuli. And so you see this is a response when he looked at faces. And this is a response when he looked at everything else.

So if you saw those functional MRI bar charts I started with and you were kind of underwhelmed-- it's like, it's not that selective-- I think the deal is that when you have the real neuroscience data, like these rare opportunities to record from human brains, you get really strong selectivity that is often more impressive than functional MRI data, where you're just pooling over bigger chunks of brain.
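
For a concrete sense of how such electrode responses get summarized, here is a small sketch that averages simulated trials for an electrode and computes a face-versus-everything-else selectivity index in a post-stimulus window. Sampling rate, window, and effect sizes are illustrative assumptions.

```python
# A sketch of per-electrode selectivity from trial time courses (simulated).
import numpy as np

rng = np.random.default_rng(5)
n_trials, n_timepoints = 30, 200  # e.g., 200 samples spanning ~1 s post-stimulus

def simulated_trials(peak):
    """Trials with a response bump 100-150 samples after stimulus onset."""
    x = rng.normal(0.0, 1.0, (n_trials, n_timepoints))
    x[:, 100:150] += peak
    return x

face_trials = simulated_trials(peak=3.0)    # strong response to faces
other_trials = simulated_trials(peak=0.3)   # weak response to everything else

face_mean = face_trials.mean(axis=0)
other_mean = other_trials.mean(axis=0)

# Selectivity index in the response window: (face - other) / (face + other).
win = slice(100, 150)
f, o = face_mean[win].mean(), other_mean[win].mean()
print(f"face selectivity index for this electrode: {(f - o) / (f + o):.2f}")
```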

Anyway, that's just to say we can replicate-- and many people have data like this going way back, actually, to Greg McCarthy before we ever did any functional MRI stuff. He had a few subjects with intracranial electrodes.

OK, so that just shows we can replicate the functional MRI stuff. The real question is, what happens when you do a causal test? What happens when this region of the brain is stimulated?

And neurosurgeons do that sometimes to try to identify seizure foci or to identify functions. So I'll show you a videotape of what happens when this guy was stimulated in his fusiform face area.

And first, I want to remind you that he doesn't know there's a fusiform face area. He doesn't know where in the brain he's being stimulated. He's just asked to look at different things and report whether anything changes when they stimulate his brain. Oh, and I don't know if the sound's going to work, but it's in Japanese anyway.

[VIDEO PLAYBACK]

So you can read subtitles.

- [SPEAKING JAPANESE]

NANCY KANWISHER: Perfect subject, this guy. As you can see, he's [INAUDIBLE] here.

- [SPEAKING JAPANESE]

NANCY KANWISHER: OK, so that shows that region's causally involved in face perception. Is it causally involved in the perception of other things, as Haxby might want to argue?

- [SPEAKING JAPANESE]

NANCY KANWISHER: This is a cartoon character on a card here [INAUDIBLE].

- [SPEAKING JAPANESE]

[END PLAYBACK]

NANCY KANWISHER: OK, so we only had one hour of data from this patient. They repeated that stimulation across all these electrodes in the fusiform gyri on both sides and repeated it multiple times.

So it's one patient. It's a small amount of data. But it's very powerful in telling us that that region's, one, causally involved in face perception and, two, as far as we can tell, very specifically causally involved in face perception. So Haxby is right that you can decode information there about other things, but apparently, that information is not playing a causal role.

OK. So you might wonder, OK, we are social primates. We're very interested in faces. We see faces in clouds and coffee and everything else. And maybe you see faces whenever you stimulate anywhere in the brain. So what happens when you stimulate right next door?

Yeah, what I would say is, if those patterns of response in that region, which Haxby and lots of other people, including me, can detect in there about things that aren't faces-- if that information was being read out by other parts of the brain, then you would think that there would be other perceptual consequences besides seeing a face when you stimulate that region.

The shape of the box would change, the shape of the ball, the nature of the kanji. Now, as I say, it's only a small amount of data. So maybe there wasn't the right kind of stimulation or the right kind of stimulus to be able to change. But at least with that initial screen, it's the absence of effects on other aspects of the percept that, to me, is the evidence for the specificity of [INAUDIBLE].

It's a really good question. And if we did it again, we would put up a blank screen in part because people always ask me that. I think that would be interesting.

I'm not sure that the conclusion I'm drawing from it depends on that. But let's do-- [COUGHS] excuse me. Let's suppose he's behind a blank screen, and he doesn't see anything. He doesn't see a face when stimulated there on a blank screen.

Would that undermine the argument? I'm just thinking with you here. Would that undermine the argument? I'm not sure.

It may be that the visual system needs some feedforward stuff, some little bits of structure, for the top-down-- or it's not exactly top-down when it comes in from the side-- activation to produce some intelligible percept. I'm not sure.

Anyway, I don't have the answer, unfortunately. But I'm also not sure it matters for the argument I'm making. If you think it does, if anybody thinks it does, tell me about it because we may someday get another chance. These are very rare. It's been done in a couple other labs before, but it's very rare to get this opportunity.

OK. But I just was, for fun, going to show you that this is really very anatomically specific. If you look at the adjacent electrodes that responded more to color than grayscale images, similar to functional MRI data we have, here's what happens when you stimulate there.

[VIDEO PLAYBACK]

Again, he has no idea where he's being stimulated. He doesn't know there are color-preferring regions where he's getting stimulated while he's looking at the box.

- [SPEAKING JAPANESE]

[END PLAYBACK]

NANCY KANWISHER: So he doesn't know why it's the left side, but we do. He's being stimulated on the right. Stuff is contralateral even at those stages of visual processing. And it's pretty remarkable how his just reporting what he sees reveals apparently a pretty specific function of that region.

So I would say these data and lots of other data-- I'm just picking out my favorite-- suggest that, at least for some regions of the brain, yes, they are really remarkably functionally specific, despite the smart Haxby argument. OK.

But I do want to be clear about what the claim of functional specificity is. So first of all, I pointed out I don't think this is true of every little patch of cortex. I think it's the minority case. OK.

But also, when we see it, to say that a patch of cortex is functionally specific is not a claim that it's innate. You guys wouldn't make that mistake, but you'd be amazed how many people in the literature go straight from this claim to that point as if they're the same thing. They're very much not the same.

They raise the gripping question of, how does the cortex get wired up to have that thing? I would desperately like to know. But in no way does the specificity on its own say anything about whether it's innate. I forget your name over here, but you-- what's your name?

AUDIENCE: [INAUDIBLE]

NANCY KANWISHER: Sorry?

AUDIENCE: [INAUDIBLE]

NANCY KANWISHER: Hi. When you asked before, is it only that region, absolutely not. So the claim of functional specificity is not a claim that only that region is engaged in that process. It's just that that region is specific.

Or the claim that that region acts alone. It's not a claim that that region acts alone. No region of the brain acts alone. OK. OK, so that was all question one. Like, really? Just trying to give you a little more evidence that yes, yes, really.

I didn't hear this-- if I've looked at that with functional MRI. Yeah. So I have a little bit, and other people have more. And just as background, if you look behaviorally at people's face recognition ability, there's a huge range.

So people at the bottom end, the bottom 2% of the people-- just take a random 100 people. The bottom two routinely fail to recognize family members. The range is just enormous. The bottom end of the normal distribution is just really, really bad. These are people who are otherwise perfectly normal but just can't recognize faces pretty much at all.

The top end is so good that people who are that good-- super recognizers-- routinely hide their abilities because other people find it creepy. If somebody says to you-- you're standing in line at a coffee shop, and somebody says, oh, yeah, I saw you at a movie theater five years ago. You'd be like, get away from me, right?

But there are people who can do that. And so first of all, that range is ginormous. Second of all, it's heritable. People are more similar in their face recognition abilities if they're identical twins than if they're fraternal twins-- which is very complicated, what that means. There could be many causal routes between genes and those behavioral outcomes.

But all of this to underline Sammy's question of whether there are differences in the brain. And the disappointing story is not big ones. Originally, many people, including me, scanned people with developmental prosopagnosia, these people who were otherwise normal but just really can't recognize faces at all.

And they had perfectly nice looking fusiform face areas. And my first reaction is, oh, shit. But my second reaction is, OK, that's not actually evidence against any of the claims. It just says it's not sufficient to get a big univariate differential signal.

It doesn't mean that region's functioning the same. It doesn't mean that we can read out the same information. In fact, a bunch of work by Marlene Behrmann has argued that the big fiber bundles that go down the temporal lobe, and that presumably take information from the FFA to higher regions, are messed up a little bit in people with developmental prosopagnosia.

So there are lots of possible other sources of that variability. And all that said, more recently, there are a few papers saying, actually, if you look really carefully at a big enough n, you can find that the FFA is a little bit smaller or less selective-- probably both-- in people who are worse at face recognition. But what's remarkable to me is how small those effect sizes are.

And to me, that's just one of the many disappointments of the available methods in studying human brains. They're just crappy compared to what you can do in animals, unfortunately.

Yeah, good question. The trouble is you can only barely decode identity at all in that region. It sucks. When MVPA first came along, it's like, OK, great. We had one method before that, which was functional MRI adaptation, which was a way to ask what's coded in a region-- it's just like habituation of looking time in infants. You're just habituating neurons instead of infants.

And if they dishabituate to a new stimulus, that means that brain region can tell the difference between those things, right? So that method had been around for a long time and told us a bunch about what's coded there. And by that method, you can find evidence that that region responds differently to different individual faces-- interestingly, when they're upright, not when they're inverted. That's good. Check, check. Fits the behavioral data.
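
A minimal sketch of the adaptation logic, with simulated block responses: if the response recovers when the face identity changes but stays suppressed when the same face repeats, the region distinguishes those identities. The numbers are made up for illustration.

```python
# fMRI-adaptation logic: release from adaptation when identity changes.
import numpy as np

rng = np.random.default_rng(6)
n_blocks = 24

same_face = rng.normal(0.6, 0.15, n_blocks)        # repeated identity: adapted
different_faces = rng.normal(1.0, 0.15, n_blocks)  # changing identity: recovered

adaptation_effect = different_faces.mean() - same_face.mean()
print(f"release from adaptation (different - same): {adaptation_effect:.2f}")
# A reliably positive effect for upright (but not inverted) faces is the kind
# of result described above as evidence that the region codes identity.
```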

But with MVPA, it's really lousy in the fusiform face area. And a few papers have reported that you can read out face identity from the FFA. But the first one was just stupid. It was just two different photographs.

It's like, you can decode this photograph from that photograph. That's not face recognition. This could be brightness. It could be anything, right? That's not interesting. It's got to generalize to novel faces and pass invariance tests to be real decoding. A few papers have done that better, but the decoding is really weak-- not enough that you could really convince yourself that if you didn't see it in developmental prosopagnosics, that would be a real result.
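
Here is a sketch of the generalization test being argued for: identity decoding only counts if the classifier is trained on one set of images and tested on different images of the same people, so low-level picture differences can't do the work. The data and the size of the identity signal are simulated assumptions.

```python
# Cross-image identity decoding: train on image set 0, test on image set 1.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n_identities, n_images_per_set, n_voxels = 2, 30, 80

def patterns(identity, image_set):
    """Voxel patterns = identity signal + image-set-specific nuisance + noise."""
    base = np.zeros(n_voxels)
    base[identity * 10:(identity + 1) * 10] = 0.5             # identity information
    nuisance = np.zeros(n_voxels)
    nuisance[40 + image_set * 10:50 + image_set * 10] = 1.0   # photo-specific signal
    return base + nuisance + rng.normal(0, 1, (n_images_per_set, n_voxels))

X_train = np.vstack([patterns(i, image_set=0) for i in range(n_identities)])
X_test = np.vstack([patterns(i, image_set=1) for i in range(n_identities)])
y = np.repeat(np.arange(n_identities), n_images_per_set)

clf = LogisticRegression(max_iter=1000).fit(X_train, y)
print(f"identity decoding on held-out image set: {clf.score(X_test, y):.2f}")
```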

But I think the essence of your prediction is exactly right. I'd love to know. I guess I would bet that would be true. But I don't know how to see it well with our current methods. Adaptation might work better than MVPA. Trying to think if anybody's done that. Probably.

Are you saying maybe the fusiform face area is involved in detecting faces, not discriminating one from another? Is that what you're proposing? Totally. It's a totally sensible hypothesis based on what I've said.

And I think, shockingly, I don't think we know the answer to that, which is really lame, having worked on this region for 20 years. How could we not know this absolutely basic thing?

But I think the evidence suggests it's not that, one, because of adaptation studies where you can show sensitivity to individual faces. Two, because that region is right on top of, or in approximately the same location as, the lesion sites that give you prosopagnosia.

Now, that, too, could be interpreted other ways. Maybe you have a face detection system that's a necessary prior stage that goes into a face recognition system. So if you knock out the detection system, you also won't be able to identify. So there's wiggle room in all these hypotheses.

I guess maybe the stronger evidence is-- these are all pretty pathetic lines of evidence. But there are a few snippets. One other snippet is in the work [INAUDIBLE] [? Spector ?] did earlier, actually-- the face stimulation study. You guys have probably seen this video. It's fabulous.

This guy is looking at the neurosurgeon's face, and they stimulate his FFA. He says, wow, your face just metamorphosed. What he doesn't say is, your nose moved a little to the left. Your eyes changed. Then he says, you look like a different person. He doesn't say, your face got distorted and didn't look normal. He says, you look like a different person.

So that's not incredibly impressive. But it suggests that there's something to do with coding of identity. And the final snippet, I would say, is the very, very beautiful work from-- has [? been ?] [? traveled ?] already spoken in here?

He'll probably talk about newer stuff. But sometimes he talks about all of his early stuff on the face patches in monkeys with [INAUDIBLE]. And they have beautiful data-- when you have the resolution of individual neurons-- decoding face identity from patches which are our best guess of homologs to that. So I'm guessing that's not just-- it seems kind of weird to have a whole chunk of cortex just to detect this basic structure.

That's possible. I think the monkey data is stronger here. And the monkey data really nicely shows the kind of progression of different kinds of representations across the patches in monkeys, where you have really good evidence that there actually is a processing hierarchy. You can see different kinds of cell activities.

And I wish we could say the same for humans. There's some kind of, sort of-- I published some stuff trying to make that claim. But I would lean much harder on the monkey data than on my data in suggesting there really is a processing hierarchy there, apparently a lot like a CNN.

I was going to talk briefly about some work Leyla Isik has done recently looking at another high-level social perceptual function, perceiving social interactions. I guess I'll just keep going. I don't know. Do people want to redirect me or suggest what I should do here?

I was going to talk about a little bit on perceiving social interactions, a little bit on what we mean by the intuitive physics, what we know about intuitive physics in the brain. It's not much yet. A little bit on auditory cortex, which is cool because it uses data-driven methods.

So there's a whole suite of different kinds of things we do in social perception, not just face perception. We recognize voices. We think about the emotions in voices and faces. We do all of this stuff on individuals, all these different kinds of perceptual analysis of what we see-- emotions, all that stuff.

But we also are interested not just in single people, but in groups of people, right? So you could call that intuitive sociology, just as we have intuitive psychology, right? So if you look at a group of people, what can you see?

And in fact, if you ask, for example, here are two people. Are they interacting with each other? That's a pretty basic thing we're actually really good at determining, even if they're not actually speaking.

And I would argue this is a pretty basic aspect of social perception. It's present in lots of animals, bonobos. There's some really nice studies. Even fish can tell if two other fish are interacting with each other. Yes. There's an amazing behavioral study looking at that.

Six-month-old babies are very, very sensitive to which agents are interacting with each other. In fact, they can tell with simple cartoon or puppet stimuli doing simple actions. They can tell whether this agent is helping or hindering that agent. They will later accept a toy from the agent that helped the other agent more than from the agent that hindered the other agent, right?

So this is a very fundamental thing that's present in animals and develops early. And so are there specialized brain regions for that? So Leyla Isik, who was a TA of this course for many years and who's now a faculty member at Hopkins, working with Kami Koldewyn in my lab, did this initial set of studies where she identified this little patch of brain, this little red thing here, which is near lots of other good stuff.

Here's visual motion area MT right there in the subject. Here's the TPJ that I talked about for thinking about other people's thoughts. And this yellow bit is a bit that responds to dynamic faces and, it turns out, also voices in the superior temporal sulcus.

So near all this other good stuff but separate from it is a region that seems to be sensitive to interactions between pairs of people. So let me just show a few things she did. Here's one stimulus, really pared down, basic interaction with this guy, who gestures, stand up, and the other guy stands up. OK.

So here's a control condition where you have two people acting and doing interesting things but not interacting. Yeah. And so what we see is much higher responses in that little red blob. Here is when they're interacting with each other. And here's where they're not.

And if you look at those same two conditions in the STS face region, you get a weaker effect. It's not, like, only in that one little patch. It spills over a bit. But you don't see it in the TPJ. And you don't see it in the area MT. So it's probably not some massive [? ocean. ?] Yeah.

So this generalizes to animated stimuli. So here's our version of the famous Heider and Simmel kind of stimuli. So this guy is in the box. He's trying to get out. And just watch this.

And you see him struggling and trying to do stuff. And then the red guy comes along and very heart-warmingly helps him. And then the two of them celebrate, right? You didn't need me to narrate that. You would have understood that on your own.

Versus-- it's actually hard to have two moving shapes that aren't interacting. Oh, sorry, this is the hindering one. This guy can hinder. Oh, what a meanie, right? OK.

So we showed stimuli like that and other controlled stimuli where they're just kind of bouncing around like billiard balls, not behaving socially and not socially interacting. And we show both that you get a higher response to the social interaction than physical interaction cases and that you can decode from that region whether it's a helping versus a hindering interaction.

So you would just see one agent. But their action would imply that they're interacting. Absolutely. It would be a great control condition. I haven't done that control condition.

This was primarily the work of Leyla and Kami when they were in my lab. And so I try as much as possible to cede the territory to them when they leave the lab rather than compete with them after they leave. And so I can't always live up to that standard. But I try.

And I start running out of cortex. It's, like, a problem. So I have a statute of limitations. After a few years, I can go back.

But more seriously, so Leyla is pursuing that and lots of other related questions. For example, you get-- actually, we'll be able to look at this, too, in a study we're coding up right now.

But do you get that with verbal stimuli, actually? You can just read about interactions, right? So you want to get away from the surface form. If these things really are doing high-level representations of social interactions, it shouldn't depend a lot on surface form. We already have pretty-- these stimuli are pretty different from those. But it'd be cool to use verbal stimuli and implied social interactions and all of that.

And it's just very early days in this research enterprise. But it's pretty interesting. OK, that was just a little snippet.

OK, I think I've already talked about functional regions of interest before. OK, so we can add a candidate little blob here, kind of, sort of, maybe. More needs to be done, but it's suggestive.

But what about intuitive physics? OK, so let me tell you the basics of where we are. And this is all work with me and Josh Tenenbaum and various other people who have joined us on this. And it also is very early days. I don't think we have nailed a physics engine in the brain. But I think we have some suggestive evidence that there's-- at least we know where to look to study it further.

So just to remind you, when you look at an image like this, you don't just see a fork and a glass and a table and OJ and a placemat. You see a glass supported by the table that contains OJ that could be picked up but that might spill and stain the placemat, right? So I don't know how much of that you guys processed when you saw this.

But I think it's pretty-- it feels pretty automatic that you do all that stuff. And so the point is, all this stuff in italics-- these are all the other things we do besides just sticking category labels on the world, right? So as Josh would say, you don't just-- the human system doesn't just behave like a CNN and extract some categories. It builds a model [INAUDIBLE], right?

So vision is much more than determining what is where. It includes determining physical relationships, action affordances, possible future states, et cetera. OK. And all this seems to happen frequently, rapidly, apparently effortlessly, apparently automatically. And it's essential for all kinds of stuff-- understanding what we're looking at, for planning any action on what we're doing.

You can't perform a single action on the world without having some knowledge of the physics of the world you're acting on, even if just the weight of an object you're going to pick up or the solidity of the surface you're going to stand on. Or I can't walk through these chairs. But if they were made of something else, I might barge my way through them.

So every action you do has to be planned with some knowledge of the physics of the world you're acting on. OK. So how do we do it? I can't tell you the answer.

But one way through that problem is to try to at least figure out where it is in the brain. So this is work that first was done with me and Josh and Jason Fischer, also now at Johns Hopkins. And what we found is a set of regions-- I think Josh showed you those pictures-- that are basically at a high-level part of the parietal lobe and kind of premotor frontal regions that you find in most people-- here are four subjects here-- that show the following patterns.

One, they are more engaged when you do a physical prediction task than a control task that's of equal difficulty on the same stimuli. So the physical prediction task was to watch this rotating tower and predict whether it would fall more on the red side or the green side. The color control task was to say, are there more yellow or blue blocks in the tower?

And when you do that contrast, you find basically these regions in those subjects. But then we did a few other things. We showed subjects-- I think I have the movie in here. Yeah. We showed subjects movies in which these dots move around either in a physical fashion so they're bouncing off each other or in a social fashion, so they are chasing each other, flirting with each other, doing any of those social things you can depict with just the motion trajectories of two dots that are actually much more interesting than the physical case. Yet those same regions are more engaged in the physical than social interactions.

And in that task, we also had subjects watch the video for a few seconds. And then one dot kept moving, and you had to predict where the other dot would go. So there was also a predictive element in that task. OK. Activates more or less the same regions.

We also showed that if you just show people videos-- actually, we had this data and recycled it. So this is what the dataset looked like. It wasn't perfectly designed for this, but we had data from people watching videos of objects that were just acting physically. So it was objects rolling down ramps and colliding with each other and gears and pendulums and stuff like that, versus videos of faces doing things, bodies doing things.

And that region is also much more engaged when you look at physical objects and interactions than social ones, even though a lot of the ventral-- all the other regions I've been talking about so far are way more engaged with the social videos than the physical ones. These regions like the physical stuff. And a double dissociation is always a useful thing for ruling out all kinds of stupid confounds.

OK, so further, using multiple-voxel pattern analysis, Sarah Schwettmann showed in a paper we published a couple years ago that you can decode the mass of an object from the pattern of response across these regions, in a way that's abstract with respect to the scenario that reveals that mass. So we showed videos of objects being dropped into a bowl of water.

And a heavy object makes a big splash and lands on the bottom. And a light object just bounces on the top, and not much water splashes. And that reveals the mass.

In other scenarios, the objects dropped on a pillow and made different amounts of denting on the pillow or sat on a surface and got blown by a hair dryer, with the heavy object barely moving and the light object just skittering across the surface, that being a nice control for amount of motion because, in that case, the lighter object has more motion. And in the water case, the heavy object has more total motion.

Nonetheless, we can decode whether it's a heavy versus light object in those regions across those scenarios. So that's a baby step. The decoding is pretty lousy. I forget, like, 58% correct or 61% or something where chance is 50%. It's statistically significant, but I critique Haxby for not showing the behavioral relevance of the decoding. And we haven't yet shown the behavioral relevance of that decoding, either. So it's just a baby step.
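
A sketch of that cross-scenario decoding, with simulated data: train a heavy-versus-light classifier on voxel patterns from some scenarios and test on a held-out scenario, so the decoder has to rely on mass information that is abstract with respect to how the mass was revealed. Signal sizes, voxel counts, and the resulting accuracy are illustrative assumptions, not the published numbers.

```python
# Leave-one-scenario-out decoding of object mass from simulated voxel patterns.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
scenarios = ["splash", "pillow", "blow_dryer"]
n_trials, n_voxels = 40, 120

def scenario_data(scenario_idx):
    """Half the trials heavy, half light, with a small mass signal shared
    across scenarios plus a larger scenario-specific component."""
    X = rng.normal(0, 1, (n_trials, n_voxels))
    y = np.array([0] * (n_trials // 2) + [1] * (n_trials // 2))  # 0=light, 1=heavy
    X[y == 1, :15] += 0.25                                       # scenario-general mass code
    X[:, 60 + scenario_idx * 20:80 + scenario_idx * 20] += 1.0   # scenario signature
    return X, y

accs = []
for held_out in range(len(scenarios)):
    train = [scenario_data(i) for i in range(len(scenarios)) if i != held_out]
    X_tr = np.vstack([x for x, _ in train])
    y_tr = np.concatenate([y for _, y in train])
    X_te, y_te = scenario_data(held_out)
    accs.append(LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te))

print(f"leave-one-scenario-out mass decoding: {np.mean(accs):.2f} (chance = 0.50)")
```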

In more recent work by my postdoc Pramod, also a former student in this course, he's shown that you can decode with natural scenes whether that scene is showing an unstable or stable configuration of objects with all different kinds of objects. And he can decode across very different kinds of images of stable and unstable scenes.

But the same region does not decode whether you have an unstable situation with a person, actually, in which-- it's usually a person-- how does this go? A person being threatened by an animal or something like that versus a person peacefully side by side with a friendly animal.

Things where you predict motion and danger and impending action, but based on animate, agent-based things-- those you can't decode from that region. So again, early days.

So anyway, so the hypothesis, which I think is still very much just a hypothesis now but a fun one, is that those regions implement our understanding of the physical world possibly based on a generative model that can run forward simulations akin to a physics engine, as Josh argued. So much more needs to be done.

High on my prior-- well, lots of things are on my priority list. But one of the things I would really love to see is see if we can find evidence for actually forward simulation being done in those regions. That's very difficult because, to me, the essence of forward simulation is collecting a bunch of time points.

Maybe we can do that with MEG. God knows. I kind of doubt it. It's right near the edge, but we might try. I'm building higher hopes for being able to look at this with actual neural recordings from within the brain, either in human epilepsy patients or in monkeys who presumably have very similar intuitive physics abilities. They need them in some ways even more than we do for their lifestyles.