Winrich Freiwald: Taking Apart the Neural Circuits of Face Processing
June 7, 2014
All Captioned Videos Brains, Minds and Machines Summer Course 2014
Topics: Connection of face recognition to intelligence: social cognition and perception start with faces, e.g. facial expression and communication; requirements for face recognition: (1) detect face, (2) encode structural properties, (3) encode dynamic properties, (4) extract different kinds of information (e.g. species, ethnicity, gender, age, identity, mood, direction of attention, attractiveness, likeability, trustworthiness), and (5) activate other parts of the brain to generate emotional response, activate a memory, draw attention, or elicit a motor response; early IT studies; fusiform face area; fMRI reveals face patches in the temporal lobe; response properties of middle face patch neurons; combining stimulation and fMRI to discover connectivity of face patches; connections of face network to lateral amygdala and pulvinar; development of face selectivity along a hierarchy that evolves from responses to faces in general, to faces in a particular location, to individual faces; connection to models of the ventral stream
GABRIEL: Moving on [INAUDIBLE]. Does everyone [INAUDIBLE] group picture, at the end, outside, after English presentation. So just hang on for a few minutes outside after the talk.
So, someone asked me yesterday [INAUDIBLE] was, and I told them that it was distinguishing very easily, by his stature. Not only distinguished by his typical attributes and high build like his intellectual height. Really, you've got a very great glimpse of his work from Marge the other day when she introduced a lot of the beautiful work that she's been doing actually [INAUDIBLE] today. As Marge said, a lot of the hard questions that you [INAUDIBLE]. We'll go get all the others now. So, Winrich.
WINRICH FREIWALD: OK. Thanks so much, Gabriel, for that kind introduction. Yes, [INAUDIBLE], I know a little bit what you guys have been exposed to in the last couple days. And I'll try to make some connections but I'll probably skip over some parts if you've already heard about those.
So I came in yesterday and I was actually getting pretty sentimental. I would think, this is the course [INAUDIBLE] 22 years ago. For me it was--
AUDIENCE: We can't really hear you in the back. Sorry.
WINRICH FREIWALD: Can't hear me?
AUDIENCE: We can't really hear you, yeah.
WINRICH FREIWALD: I can hear myself through the speakers here, but you can't.
WINRICH FREIWALD: This is better? Better?
WINRICH FREIWALD: OK. So, yes, I took this course in-- [INAUDIBLE] it louder or-- is it just me that I hear it? OK. That good?
AUDIENCE: I don't know. Can we crank it a little bit?
AUDIENCE: Little bit more?
WINRICH FREIWALD: OK, so how's this?
WINRICH FREIWALD: Still not better? Maybe you can come closer then.
WINRICH FREIWALD: You're having fun? Oh, you're fine. OK.
WINRICH FREIWALD: All right. OK. So I took this course and met with some competition. There were [INAUDIBLE], really like life-changing experience. And one of the interesting things was, just yesterday I was talking to a friend of mine I actually met at this course for the first time. About doing [INAUDIBLE] in my lab. So there's lots of things that you can take away from this course. And I hope you're going to have a really good experience with the course that you're taking, aside from the one that I took.
An email came out from Tommy yesterday, saying we shouldn't be lecturing you guys so much but should give you problems to solve. I guess because you're taking a course on intelligence, you should prove your own. Or you want to prove your own. So I thought maybe I'll start out with a couple of problems.
So, first question that I would have for you, what, if anything, does face recognition have to do with intelligence? Did anyone tell you that? I am actually not going to tell you that, but maybe you know the answer to that question. No? No one knows what the relation might be? I don't know either. OK.
So, another problem. So, why are there face-selective cells? Did anyone tell you that? I mean, it's an observation that there are [INAUDIBLE] face-selective. But why would there be such a thing as a face-selective cell?
AUDIENCE: Well, evolutionarily, it's advantageous to be able to recognize people. To know if they're enemies or [INAUDIBLE].
WINRICH FREIWALD: OK. But maybe you could also do this with just a broad code of lower-level things.
AUDIENCE: It's a stimulus with a recurring statistic given how social we are. So if you have cells that are sensitive to recurring statistics, you might just get--
AUDIENCE: Can you speak louder?
AUDIENCE: Right. Oh, sorry. Yeah. So, faces are a stimulus that have common statistics that recur over time. Very consistently, given that we're such social animals. So, if you have cells that are sensitive to repeated statistics then you might get things that are just specialized for--
WINRICH FREIWALD: OK.
AUDIENCE: Solving face-recognition should be a different problem than solving generic-object recognition. You can use coded machinery for that. So it's like not face-selectiveness [INAUDIBLE] scope of the cell [INAUDIBLE]. Like calling it just face-selective may not give you that much of a story [INAUDIBLE].
WINRICH FREIWALD: OK. Anyone? All right, so we have three possible answers to the problem. So why might there be face-selective areas? Which Nancy just covered and maybe she told you why she thinks that there are such things as [? stationary? ?] It's a weird thing, right? OK.
AUDIENCE: Well, we heard something about how you need anatomical closeness to get expert performance. So, we were told that.
WINRICH FREIWALD: OK.
AUDIENCE: Face-selective cells need to be connected to each other to some extent, or connected by other cells that connect them. And it's possible that having them close together can optimize wiring length.
WINRICH FREIWALD: Right. OK. Good. And another problem, so how can we figure out how face-cells work? And this may be something that-- OK, maybe you have some idea so you can tell me. So, this is maybe like a question you can think about as we go through my talk. But I think it's a very fundamental question and we have to answer it. It's a phenomenon that these cells exist, but how [INAUDIBLE] to gain the properties that they have?
And this question I brought up because Jim DiCarlo was speaking before and Marge was speaking before. And, I don't know if you guys noticed that there are different attitudes and different approaches to studying object recognition. Was that highlighted to you or-- No. OK so maybe I'll--
OK, so [INAUDIBLE] what they told you, but my hunch is going to be that-- So Jim's approach is to characterize what a cell is doing, and then look at lots of cells and see what the population code might be. And the approach that Marge has taken, but I don't know if she talked to you about this, and it's also the approach that we've been taking mostly, in addition to some population decoding, would be to analyze why the element is responding to a stimulus the way it does. And these are two fundamentally different ways of looking at the problem. So, I just wanted to highlight that from the outset.
OK. So here's-- OK, this used to work. So this is the problem that we are interested in in the lab, or it's one of the problems of interest in the lab. So these are Tonkean macaque monkeys, which are very similar to the macaque monkeys that we study, the rhesus monkeys. You can see that they're very social in different ways that you can immediately recognize. Even if you're not familiar with these monkeys you can understand what they're doing. Kind of like a story. A big male monkey probably chasing a younger monkey [INAUDIBLE] sitting there.
And so basically we try to understand what's going on in their brains as they are processing the social [INAUDIBLE], how they can understand them. The second part shows a good example of tool use.
So there are two basic theories about the evolution of primate intelligence. One has to do with the social pressures of living in social groups. The other one has to do with tool use. And if you figure out how you can use this pick to clean your nose, then you're probably a pretty intelligent animal.
So the cool thing about primates is that they're not only social, but they know a lot about the social environment. And that sets them apart from other social animals. It's one thing to be social, like being cuddly with others, but the other thing is to know exactly what's going on there.
And there's one tale to illustrate this. This is the story of Ahla. Ahla was a female baboon and in Southwest Africa some farmers had the habit of using baboons as herding animals. So they would basically replace the dogs with baboons.
And so Ahla was captured at age two. And she would basically herd goats. You can see her here. She would adopt some of the behaviors that the goats had. It's a little hard to see, but there are a couple of goats here, there's a salt lick here and she starts licking this salt stone. Which is something that baboons would naturally not do, but the goats would do.
But she would also retain some of the very typical behaviors that baboons are engaged in. So she would basically groom these goats. She might be here and here.
But the most amazing thing that she did, and it's a little hard-- very hard to see here, is that at the end of the day, when she was done herding the animals outside and they were brought back to the stables, the farmers would separate the adults from the young kids in the stables. And Ahla would basically, manically, try to pair up each mother animal with her offspring again.
And she would even do this [INAUDIBLE] farmers to a mother-- like three offspring, they would like to separate them to distribute them more evenly across [INAUDIBLE] animals. And Ahla would not have this, so she would really put all three offspring back with the mother animal.
And so this is something that the farmers would not have been able to do themselves. They would not be able to tell which goat is with which. They would not know what the pairing was. But Ahla really knew this very precisely.
So one conclusion that we can draw from this is that primates not only behave socially, but they have social knowledge. And basically this knowledge can be grouped into three categories. So most basically they have knowledge of other individuals that they know.
And that's, again, something that's not clear if rodents have this, for example. So they know the age of their peers, like if it's a juvenile or [INAUDIBLE], they know the gender, female or male.
The second level of understanding is about the interactions. And that's also a focus in the setup [INAUDIBLE] understand how the brain can recognize different interactions that can take place. So it's like grooming behavior, [INAUDIBLE] and mothering behavior. So interactions that can take place between two animals. And if you can understand this concept that means somehow you have to be able to recognize it.
And they understand the relationships between individuals. So they understand things like friendship, kinship, or hierarchy. And to me, this is pretty amazing because if you can represent hierarchy it means that you have to have a very elaborate data structure in your brain somewhere that is not simply associative, but has a direction to it. If you know that A is the mother of E, it means you know that E is not the mother of A. So there's a relationship with a direction to it.
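A minimal sketch of the point about direction, assuming nothing about how the brain actually stores this. The individuals A and E come from the example above; the two sets and the function names are hypothetical and chosen only for illustration. A plain association is stored as an unordered pair, so it holds in both directions, while a kinship relation is stored as an ordered pair, so it holds in only one.

```python
# Hypothetical illustration: the stored pairs and names are made up.

# A symmetric association: an unordered pair, so the order of the two
# individuals does not matter.
associated = {frozenset(("A", "E"))}

# A directed relation: an ordered (mother, child) pair.
mother_of = {("A", "E")}


def is_associated(x, y):
    """True if x and y are associated, regardless of order."""
    return frozenset((x, y)) in associated


def is_mother_of(x, y):
    """True only if x is stored as the mother of y."""
    return (x, y) in mother_of


print(is_associated("A", "E"), is_associated("E", "A"))  # True True
print(is_mother_of("A", "E"), is_mother_of("E", "A"))    # True False
```

The associative store cannot tell A-to-E from E-to-A; the ordered pair can, which is the extra structure a representation of hierarchy or kinship would need.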
And all of this is rooted in the person concept. So primates have the concept of a person in which they group information they have about others which they know. So this would be one person, it's a juvenile female, the daughter of X and all this knowledge would be stored in one compact format. Currently, there is absolutely no idea how this is done in the brain.
But the nice thing about this person concept, when you get to it, to one stimulus that we can understand pretty well [INAUDIBLE] understand a bit about and that's the face. Mm-hm?
AUDIENCE: They know [INAUDIBLE]. It's not [INAUDIBLE], right? Do they know that people are different than other monkeys or whatever, right?
WINRICH FREIWALD: Say it [INAUDIBLE].
AUDIENCE: Do they interact with [INAUDIBLE] differently than they do--
WINRICH FREIWALD: OK, so these observations are from the wild. But there's some behavioral studies from [INAUDIBLE] individuals. From your interaction with the animals, you would assume that they would have a person concept of you as well.
But the example shows that if you have this cognitive structure, right? You're trying to impose it on whatever environment you're living in. So if you're living with goats, this is going to become your group and then you're going to apply it, the structures, to the goats.
OK, so the primate face is also very special. So primates are expressing their emotions, otherwise private internal states, through the face. Darwin famously noticed this in 1872. So you can see a facial expression again, of a Tonkean macaque. And you can elicit something like an expression electrically in the facial musculature.
But it can also be conveyed by body posture. And you saw a little bit of this in the movie that I showed initially. When this one monkey was chasing the other, there were these displays of teeth in between, so there were definitely emotions involved in that and they were being displayed on the face.
And it's something that not all animals can do. So if you're a fish or a frog, you can do lots of things, even very [INAUDIBLE] things like sitting on a porch and enjoying life. But there are certain things that you cannot do and that is change your facial structure. So that's something that's very typical of mammals. You can see on video from [INAUDIBLE] group. From the top you might not see the whiskers. So these rats are encountering each other and they touch themselves by whiskering against each other. And the reason they can do this is there are very fine muscles in the face [INAUDIBLE] directly to the skin or, in this case, to the whiskers. And therefore the facial configuration can be changed. And so this allows primates in particular, with some 24 muscles that are very similar across different species, to signal their internal emotional states through the face.
OK so primates are also very interested in faces. I always like to show this movie here. It's a three-day-old macaque monkey and you can see it's already very interested in the faces. And after some time, there's going to be some kind of facial mimicking behavior. So the experimenter is going to move his mouth and then the little [INAUDIBLE] is going to move his mouth too.
The other thing I use this movie for is to illustrate how we are spending a lot of time during the day watching faces. What you're doing now is what you're doing a lot of the time during the day, watching [INAUDIBLE] and other faces.
OK, so here it goes. He's moving his mouth. And after some inspection, there's some facial movement here as well. And that ability to mimic the facial movement is something that's very specific and happens early in life. In the macaque monkey it's only there for two or three weeks, then it goes away. It's a very strong predictor of future cognitive ability. The more they engage in this mimicking behavior, the better their cognitive and motor behaviors later on.
The third reason I like to show this is that I see a lot of smiles on faces here in the audience. So as you're watching this little, very cute critter, I got into your emotional brain. You have a very strong emotional reaction, hopefully. [INAUDIBLE] couple hundred times this movie, and every time I find this very endearing. So faces also have a way to get-- activate emotions right.
So as a consequence of us looking at faces very frequently, we're also very good at looking at faces. This is a demonstration from [INAUDIBLE] from years ago. Even though these are blurred images, we very easily recognize famous individuals who we know. It's also an illustration that there's something about face recognition that is holistic. You don't need great detail to recognize the face; there's something about the gist of the face that allows you to recognize it very easily.
OK. So, social perception really can rely on many, many--
AUDIENCE: [INAUDIBLE], does that work? It seems like I'm using a lot of the edges of the faces, the hair, some of the shoulders-- like if Bill Clinton weren't wearing a jacket, I'm not sure I'd recognize Bill Clinton.
WINRICH FREIWALD: OK so I think the question--
AUDIENCE: Would it just work if you just had the face with no hair, with no external cues.
WINRICH FREIWALD: Yeah. OK. So the question is whether you can do the same thing if you're taking away all the external cues. Like even the glasses of Woody Allen. That's going to make it harder. I'm not sure that this-- and I don't know if Nancy knows-- I don't know if this has been formally done. Without hair it'd be harder. I think Nixon you could recognize by the facial outline still. Yeah, Bill Clinton I think we'd be doing well but-- hm?
AUDIENCE: It just doesn't feel like it's probably the faces, per se.
WINRICH FREIWALD: It doesn't feel like a face, per se?
AUDIENCE: It's going to be the [INAUDIBLE]. If you imagine plotting out all of the internal face features, you wouldn't be able to recognize it.
AUDIENCE: It's like we do that better than [INAUDIBLE].
AUDIENCE: We hardly even recognize full-resolution faces without hair though, right?
AUDIENCE: Tendency to think of hair as the other stuff we need to control for with faces. But I think it's just a key part of what we use in face recognition. It's not the only key, but it's an important key.
WINRICH FREIWALD: Yeah. [INAUDIBLE] Marge, I remember when I was in the lab, she changed her hairstyle and her color. And she reported afterwards that people got really mad at her for doing that because it made it harder for her to be recognized. So it's an important [INAUDIBLE] change.
AUDIENCE: In computer vision, the study of face identification started with work by Takeo Kanade. And the system there was to detect features like eyes and mouth and nose. Which he did by hand because computers did not manage to do it at the time. And then use the geometry, the relative distance of eyes and mouth and nose, to infer identity.
Now it turns out that if you do even gray-level correlation, after some normalization you get much better results than using features. Especially if you do correlation of parts of the face. Like the part that contains the eyes [INAUDIBLE], the part that contains the mouth--
AUDIENCE: You mean just pixel-based correlate, rather than--
AUDIENCE: [INAUDIBLE] of work, it's not as telling as just template matching, essentially.
AUDIENCE: That sounds right because one of the core of our face recognition ability is to use the [INAUDIBLE]. It's like the spacing between those features rather than the feature-- because the features are really, really similar across different people, even gender and race. I feel it's more like the spacing between the features that are more--
WINRICH FREIWALD: Yeah, so some of the information would be preserved.
AUDIENCE: Even if you extract those as features, it does not do that well. It's the low level information that carries a lot.
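A minimal sketch of the gray-level correlation idea raised in this exchange, assuming tiny one-dimensional "images" given as lists of gray levels. The function names, the two stored templates, and the probe values are hypothetical; this is not the system described by the speaker, only an illustration of template matching with normalized correlation.

```python
import math


def normalized_correlation(a, b):
    """Pearson correlation between two equal-length lists of gray levels.

    Subtracting the mean and dividing by the norms is the normalization
    step: it removes overall brightness and contrast differences.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den else 0.0


def best_match(probe, templates):
    """Return the name of the stored template most correlated with the probe."""
    return max(templates, key=lambda name: normalized_correlation(probe, templates[name]))


# Hypothetical stored templates (made-up gray levels).
templates = {
    "person_1": [10, 200, 10, 200, 10, 200],
    "person_2": [10, 10, 10, 200, 200, 200],
}

# Same pattern as person_1, but brighter overall and with lower contrast.
probe = [60, 150, 60, 150, 60, 150]

print(best_match(probe, templates))  # person_1
```

The same function could be applied separately to sub-regions (an eye strip, a mouth strip) and the scores combined, which is the "correlation of parts of the face" variant mentioned above.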
AUDIENCE: It's also probably [INAUDIBLE] the fact that, given this presentation, that you're [INAUDIBLE] people, that there are probably famous people on this slide here, right? I mean, how well will this work if you just took some random person? Look at their Facebook, grabbed some pictures of their friends, and did this same test here, right? It'd be much harder. You think about this with faces-- I mean, so you have no hair at all. Right? And you just put up a pair of glasses and a big cigar. You might say, oh, [INAUDIBLE] Groucho Marx. You don't need [INAUDIBLE] objects. For Woody Allen, yeah the glasses really mean a lot. [INAUDIBLE] the moustaches themselves are more informative than a lot of things. That is [INAUDIBLE], by the way?
WINRICH FREIWALD: Yeah.
AUDIENCE: OK. Good.
WINRICH FREIWALD: Yeah, most of these people are already deceased and [INAUDIBLE].
AUDIENCE: Hey, Woody Allen did that.
WINRICH FREIWALD: All right, so the face is the most important source of social information. You get things like gender, age, personal identity, and things like trustworthiness and attractiveness. Just from some very basic exposures. And you can get changeable features like the mood of the person from the facial expression. And you get direction of attention through where the person is looking. It's a very rich source of information and that's in part why maybe faces are treated separately from other objects. Because they're so important. And [INAUDIBLE] another high level [INAUDIBLE] neurons.
OK. So for you to use this social information, there's a couple of computational problems to solve first. The first thing, you have to know where the faces are. Once you know where the faces are, then you can apply your high level recognition algorithms and you try to extract the mood of the person or identity of the face. And it's not a trivial problem at all.
So if you think about it, if you see all these images here, right? We can see the same identity in these images here even though at low level these images are much more similar to each other. So one of the questions is, why does the brain do this automatically and how does it do that?
OK, so something can go wrong. So there's about 1% of the population that suffers from a condition known as prosopagnosia, or face blindness. And to prosopagnosics the social world might look something like this. So that virtually all the faces look identical to each other. And when this happens, you can imagine that now your enthusiasm about processing the social environment is very much curbed.
So the question is, what's going wrong? And I don't think we have a very good explanation right now. Still like, what's going wrong in prosopagnosia?
So now there's a post-doc in Nancy's lab. They have a very sweet prosopagnosic there. They're looking for any differences in face recognition, and any markers that we had didn't turn up anything drastic. So there was an M1 [INAUDIBLE], component. And everything physiologically seems pretty normal. And all the differences that we're talking about between prosopagnosics and neurotypicals are very, very small differences.
AUDIENCE: [INAUDIBLE] English [INAUDIBLE]
WINRICH FREIWALD: Yeah, so most prosopagnosics can recognize that there's a face, so they can detect faces. But they can't tell one face from another. So they would know that there are a bunch of faces here, but they wouldn't really know-- well, in this case, they might know. If there were different people there then they would [INAUDIBLE].
So there are other cases where, yeah. So it could be select [INAUDIBLE] expressions impaired. It could be even the ability to detect a face could be impaired. The typical case is that you can't tell one person from another.
OK, so these are all the different clues from a face. And the point I want to make here is that, I think beyond face recognition, we have to look for the network that face recognition is then feeding into. Because you have these emotional responses to faces. You activate your memory, see this picture here, know it's Charles Darwin. It's not just a face, it's not just a specific face, but a face you know and you have knowledge about. And you're activating your knowledge about this person automatically.
Faces just draw attention. When a face pops up in an environment, your attention is drawn to it immediately. Or directed in the direction of the gaze of the face, as shown here. And here, eliciting some motor responses as I showed to you with this baby [INAUDIBLE].
OK. So there's some early models about how this might work in the human brain. I don't know. Nancy might've talked about this. There was this idea early on brought forward by Bruce and Young, some 30 years ago, that there's a distribution of functions. So that there might be different parts of the brain. This is based mostly on lesion studies in humans. They would process different kinds of faces, so some partly before the [INAUDIBLE] information, others for identity information. And that then, these would feed into other modules that would do more high level cognitive processing.
OK, the neural basis of face recognition. Did Marge talk about this already, or Nancy? So this is Charles Gross. He was actually the first person to record from face-selective cells. And this was in the late '60s, early '70s.
This is a slide of a macaque brain that I'm going to be using for most of my talk. The temporal lobe is known to be important for object recognition, mostly the ventral portion of your temporal cortex.
And so he would record from cells in this region and then present stimuli, like a monkey face or a biological control object like a hand or a scrambled version of a face. And he would find cells like this one here, which selectively increase their response when he shows a picture of a face. You can see the action potentials here increase when he shows a face, and don't increase when he shows a hand.
And he did some controls on this particular cell. It responds to the monkey face and the human face, and if you take the eyes off the response is a little bit weaker. If you show a pumpkin face there's still a response but it's weaker. If you just show some bars there, there's no response. Again, no response to the hand. Also not to the [INAUDIBLE]. And so on and so forth.
The problem was-- actually I should mention this. So Charles Gross was very much influenced by the work of Hubel and Wiesel, who studied early visual processing. I don't know if anyone has covered this hearing approach. And also by this guy here at the bottom, Lettvin. And he actually coined this term, grandmother neuron. I don't know if Gabriel mentioned this, or if anyone has mentioned this before.
So the idea that he put forward is that maybe in your brain you have a neuron that's firing exactly when you see your grandmother and only when you see your grandmother. And so this idea of grandmother neuron, that there's a very, very high level, a very sparse representation that is selective for a concept.
And then he basically went into this whole story about a person who has several grandmother cells and then some neurosurgeon takes off these particular neurons and then afterwards the person doesn't recognize his grandmother anymore. And I mention this to illustrate that the study of face cells-- ever since the discovery and even before the discovery of face-selective neurons-- has really been very important in theories about how the brain works.
So there have been concepts, like Konorski's for example, that did advocate the idea that there are representations that are very, very sparse and very high level, like the grandmother neuron.
And Horace Barlow talked about pontifical cells. He proposed that there was a hierarchy of processing, at the top of which you would have very few cells that are highly, highly selective to one particular object concept. And that the firing of these cells would be in one-to-one correspondence with your subjective experience of seeing a particular object.
And then there are other theories, like Lashley, who talked about mass action and maybe something like a hologram. And Donald Hebb, who talked about cell assemblies. And so these were vastly different concepts about how the brain might be working: very sparse, selective units versus a very coarse, very broad distribution.
And I think that's still an open question in the field even though we know that face cells exist. Which of these schemes is true, or is it really a dichotomy, or are both schemes going on?
So the first problem I gave to you, why face cells exist, actually is a pointer to this debate. Because you could imagine, as Karl Lashley could have imagined, that you can perfectly well recognize a face without having any face-selective neurons at all. You would just do it by sampling over hundreds of thousands or millions of cells with some selectivity.
By the joint activity, just like we learned before in the hippocampus, you could-- in this case not decode where you are, but you could decode what you're seeing. And you might not need these very high-level concepts. So if these high-level cells exist, the challenge is not only to understand how you build up these kinds of cells, how you build up [INAUDIBLE], but also to think teleologically, what are they good for? Why do they exist in the first place?
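A minimal sketch of decoding "what you're seeing" from joint activity, assuming a toy population of five broadly tuned units. The firing rates, the three categories, and the nearest-mean decoding rule are all hypothetical choices for illustration; they are not taken from the talk or from any recorded data.

```python
# Hypothetical mean firing rates (spikes/s) of five broadly tuned units.
# No single unit responds to only one category.
mean_response = {
    "face":  [12.0, 9.0, 11.0, 8.0, 10.0],
    "hand":  [10.0, 11.0, 9.0, 10.0, 8.0],
    "fruit": [9.0, 10.0, 10.0, 11.0, 11.0],
}


def decode(population_response):
    """Return the category whose mean population pattern is closest
    (squared Euclidean distance) to the observed single-trial pattern."""
    def squared_distance(category):
        return sum(
            (r - m) ** 2
            for r, m in zip(population_response, mean_response[category])
        )
    return min(mean_response, key=squared_distance)


# A made-up noisy single-trial response.
trial = [11.5, 9.5, 10.5, 8.5, 9.5]
print(decode(trial))  # face
```

In this toy case the category is recovered from the pattern across units even though no unit is strongly selective, which is the distributed alternative described above. It does not by itself say which scheme the brain uses.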
OK. So, subsequent to Charles Gross, many people have found face-selective cells. David Perrett was one of the main people to lead this investigation after Charles Gross. And he summarized 20 years ago, with these symbols, every location where people had found a face-selective cell. So many of these locations are not very precise because people didn't register exactly where they recorded from. But you get the idea that within [INAUDIBLE] temporal cortex, basically everywhere people have found face-selective neurons. And so the idea emerged that these cells are very much distributed amongst other high-level selective cells. And that basically face [INAUDIBLE] recognition was the same process.
And [INAUDIBLE] when I was an undergraduate, [INAUDIBLE] and David Perrett gave a talk. And so my impression at the time really was that it's going to be impossible to understand how the face-selective cells work. Because they're very, very difficult to even record from. So you just caught a very small fraction when you recorded [INAUDIBLE] cells. You'd find papers with thousands of recordings and there's 20 or 30 recordings of face-selective cells.
Another thing we didn't know was how many steps of processing there really are in inferior temporal cortex. You just find these cells located here. We know that early vision is here, we know the early vision [INAUDIBLE] V1, and this is V2, V3, V4, so they [INAUDIBLE] give a name to.
But once you're in inferior temporal cortex, it seems to be a big swath of cortex. It seemed like an association cortex with not much structure to it. And so we didn't know-- maybe there's a hierarchy but we didn't even know how many steps this hierarchy would have for processing.
OK and then we got the main clue from Nancy's work when she discovered face-selective areas. Which I'm sure that you're all familiar with right now. That maybe there is an organization to the system that we just missed in the macaque monkey.
And this could be a reflection, maybe a point that [INAUDIBLE] before about the method that he used for the investigation. If you're doing [INAUDIBLE] recordings that are not targeted, you might be missing the overall organization. Just by being focused on very minute detail in single cells, you don't see the overall organization.
If you use a technique like fMRI, you can't know the very minute organization because the resolution is not high enough. But you do get a pretty much accurate picture of the overall organization. You could find things like face-selective areas. The drawback of the method is you don't really know how face-selective the areas are.
These actually are significance maps which are thresholded. So you see that there are regions that are more face-selective than others. And they're significantly face-selective, so responding significantly more to faces than to non-face objects. But you don't know how many cells in these regions really are face-selective. It could just be a slightly higher percentage than [INAUDIBLE]
And so the question that's been asked is, are these face cells really domain-specific modules? My information about this course tells me that this was already a subject of discussion. This is why I also raise this discussion. So are these modules, or is it just like an iceberg effect? We have slightly more face-selective cells here.
And so the other question is, if you have this data from single cell recordings, do monkeys also have this organization into face-selective areas? Or are they just distributed cells? And then once you know these areas, can you associate different functions with different areas?
And that's also the question that Doris Tsao and I asked some 10, 12 years ago. And the way we addressed this is shown here. We used fMRI just as in humans. We had a horizontal chair that fits exactly into the bore. The monkey would be sitting in a crouched position.
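A minimal sketch of why a thresholded contrast map leaves the fraction of face-selective cells open, assuming made-up firing rates. The selectivity index (face minus object response, divided by their sum), the cutoff of 1/3 (a 2:1 response ratio), and the use of a mean rate difference as a crude stand-in for a regional fMRI contrast are all assumptions introduced here for illustration; none of them is stated in the talk.

```python
def face_selectivity_index(face_rate, object_rate):
    """(face - object) / (face + object); ranges from -1 to 1 for non-negative rates."""
    total = face_rate + object_rate
    return (face_rate - object_rate) / total if total else 0.0


def region_summary(cells, cutoff=1 / 3):
    """Return (mean face-minus-object difference, fraction of cells whose
    selectivity index is at or above the cutoff)."""
    n = len(cells)
    mean_face = sum(face for face, _ in cells) / n
    mean_object = sum(obj for _, obj in cells) / n
    n_selective = sum(
        1 for face, obj in cells if face_selectivity_index(face, obj) >= cutoff
    )
    return mean_face - mean_object, n_selective / n


# Hypothetical regions; each cell is (mean rate to faces, mean rate to objects).
region_a = [(30.0, 10.0)] * 2 + [(10.0, 10.0)] * 8  # a few strongly selective cells
region_b = [(14.0, 10.0)] * 10                      # every cell weakly biased

print(region_summary(region_a))  # (4.0, 0.2)
print(region_summary(region_b))  # (4.0, 0.0)
```

Both toy regions give the same average face-versus-object difference, yet one contains strongly face-selective cells and the other contains none above the cutoff. Only recording from the cells themselves distinguishes the two cases.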
And we would show pictures of faces, human faces, monkey faces, pictures of bodies, other things. Hands as biological controls. And then one of the key tests over the entire thing, to look for areas that respond more to faces than to non-face objects. And what we found was, very consistently across monkeys, that there are six regions in the temporal lobe that respond more to human or monkey faces than to non-face objects.
We can actually give names to them, which are not shown here, but anatomical locations [INAUDIBLE]. We can give them names based on their anatomical location.
So, and then I want to highlight-- so [INAUDIBLE] very nice paper last year coming out. They looked for color selectivity in the inferior temporal cortex. And actually found that wherever there's a face area, there's a nearby color-selective module as well. And so, if you put this information together and information that Marge had that she might have discussed when she trains animals on numerosity tasks early enough in life, they will develop also small areas in the brain that are selective for this [INAUDIBLE] and they also seem to organize very closely to the face areas, overlap at the very nearby face areas.
AUDIENCE: So, just in [INAUDIBLE] stuff, I see three regions. And you said six. Is that the-- and two posterior middle, anterior is a different [INAUDIBLE] levels?
WINRICH FREIWALD: Mm-hm. So, OK, so we have actually four levels of processing. And I often talk about this, this might be an [? eye-selective ?] region. So we would then have three regions, but there's a color region here as well.
AUDIENCE: Those are the--
WINRICH FREIWALD: --and then we have--
AUDIENCE: [INAUDIBLE] together those two middle ones--
WINRICH FREIWALD: Yeah, so you would have a color area here, here, and this one here you can just see. And then these areas, there's something special for faces, most likely. The two levels of processing, you have a counterpart that's deeper inside the STS inside the [INAUDIBLE].
But in terms of the [INAUDIBLE] posterior [INAUDIBLE] we had these different levels. Actually we also have [INAUDIBLE] this one here, close to the PL, then this one here, this one here. And then the one close to M would be this one, which you can't see in the side view here.
OK, so there's likely an overall organization to the inferior temporal cortex with areas. And it might be that the full view of the object vision is faces because this is what you [INAUDIBLE] and that everything else is organized around. OK.
AUDIENCE: Can you mention Marge's work, where she gets the [INAUDIBLE] train on arbitrary symbols and [INAUDIBLE]. She only used one area, right? She doesn't get like three?
WINRICH FREIWALD: Yeah, so it's variable across animals, which I see, looking this up yesterday again. So I think she had-- in the original paper she had three different monkeys, maybe she has more now. And if you look where they are, there's typically this one area that I think is closest to these middle areas here. But oftentimes there were two or three areas, which would also be more accurate.
AUDIENCE: Do they have a line-up of the face areas?
WINRICH FREIWALD: So they're very close to the face areas. So there's a big debate about whether it's face selectivity, genuine face selectivity, or just expertise with high-level visual content. And then for the differentiations there. And so Marge's work, as we discussed, would indicate that maybe the overall areas are particularly good at learning things, but the face areas are separate from other high-level content.
AUDIENCE: Yeah, but does there seem to be a yoking? Like what was cool about [? Gross's ?] paper was that you had-- it seems like things are right next to each other. There might be like three chunks of [? IT ?] [INAUDIBLE]. Did Marge's areas land consistent with that story, or not really?
WINRICH FREIWALD: Yes, so the way I see this, in the three animals that she had, that there's typically there's one area here. And then sometimes I see also something here. And, again, this area is hard to image. It's like [INAUDIBLE] so I don't think she's seen anything there. Which is actually interesting, that she does not.
OK. So if [INAUDIBLE] so we have [INAUDIBLE] face areas. Oh, sorry.
AUDIENCE: Oh, sorry. In the stimuli, I noticed that you used some kind of round shapes, like a clock or an orange. Because Marge told us something about [INAUDIBLE] also responds to curvature pretty robustly. So I wondered, is that the reason that you used round-shaped objects, to get rid of the low-level image statistics?
WINRICH FREIWALD: Yeah, so we used [INAUDIBLE] show you some data that actually speaks to that directly. But yes, we wanted control stimuli that mimic faces in some properties like the roundness. And so technological objects don't do it, and biological controls like hands and body shapes are not very good for that. And so, we had roundish fruits and vegetables for that control.
OK. So another question is, are these domain-specific modules? Like, how face-selective are these regions, really? And to test this, we then introduced electrodes into these areas. And initially targeted the middle face patches for recordings. And we showed the same stimuli that we also used in fMRI.
And I'm going to show you one recording, which is actually the first cell we recorded in this area. You're going to see a video that we took from the control monitor which shows the same thing the monkey saw at the time of the recording. In addition, you're going to see this black square which is indicating where he's looking, and he did not see that. And the clicks that I hope you're going to hear, they're going to indicate when the cell fired an action potential.
OK. Well, this doesn't work. So I hope we can [INAUDIBLE] that every time there's a face shown, the cell responds. This is the response vector of the cell that we can extract from these responses. With 96 different stimuli, the 16 faces on the left-hand side, the other objects on the right-hand side. We normalize the response magnitude, and you can see the cell's responding mostly to the 16 faces.
But there's some intermediate responses to some of the non-face stimuli. We can color-code these. Red indicating response enhancement. Blue indicating response suppression. What this color coding allows us to do now is to [? stack ?] the responses of all the cells that we record from these areas on top of each other and get a population response matrix.
So here's [INAUDIBLE] from top to bottom. Stimulus from left to right. All these lines here are single-cell response vectors. And there are several things that I think you can see at a glance. So one thing is-- well, for the 16 faces here there's a lot of redness over here which indicates that most of the cells really are enhanced and they respond selectively to faces.
There's a smaller population of cells that is actively suppressed. Then there's the interim population here, for which it's not clear what they prefer. But for some 85% of the cells upwards, they like to respond to faces. But you can also see if you average the responses across all these different cells, it's again dominance of faces.
But you can also see that there are some stimuli here, these orange stripes that are running down, some non-face stimuli that are eliciting intermediate responses. And if you look what stimuli these are, these are clock faces, apples, sliced tomatoes. So things that have things in common with faces. So if they're roundish, if there's some internal structure in there, maybe if there's some metric, these are all good cues already for the cells to respond. Not quite as good as the faces, but they have a partial response.
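The normalize-and-stack step described here can be sketched in a few lines of Python. This is only an illustration, not the analysis code behind the slides: the 96 stimuli and 16 faces come from the talk, but the firing rates are synthetic, and the function names (`normalize`, `simulate_cell`) and the choice to scale each cell by its peak absolute response are assumptions.

```python
import random

N_STIMULI = 96  # stimulus count mentioned in the talk
N_FACES = 16    # the first 16 stimuli are faces

def normalize(responses):
    """Scale one cell's baseline-subtracted responses so the peak magnitude is 1.
    (Assumed normalization; the talk does not specify the exact scheme.)"""
    peak = max(abs(r) for r in responses)
    return [r / peak for r in responses] if peak > 0 else list(responses)

def simulate_cell(rng, enhanced=True):
    """Hypothetical cell: strong enhancement (or suppression) for faces,
    weak responses to everything else. Purely synthetic numbers."""
    sign = 1.0 if enhanced else -1.0
    faces = [sign * rng.uniform(20, 40) for _ in range(N_FACES)]
    others = [rng.uniform(-3, 5) for _ in range(N_STIMULI - N_FACES)]
    return faces + others

rng = random.Random(0)
# Eight enhanced cells and two suppressed cells, as a toy population.
cells = [simulate_cell(rng, enhanced=(i < 8)) for i in range(10)]

# Population response matrix: one row per cell, one column per stimulus.
matrix = [normalize(cell) for cell in cells]

# Averaging down the columns gives the population response to each stimulus.
population_mean = [sum(row[j] for row in matrix) / len(matrix)
                   for j in range(N_STIMULI)]
mean_face = sum(population_mean[:N_FACES]) / N_FACES
mean_nonface = sum(population_mean[N_FACES:]) / (N_STIMULI - N_FACES)
print(f"mean face response {mean_face:.2f}, mean non-face response {mean_nonface:.2f}")
```

Positive entries in `matrix` correspond to the red (enhanced) cells in the color code and negative entries to the blue (suppressed) ones.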
So if we think this area that we're recording from might be analogous to the FFA, then maybe that's part of the reason why we might get a response in the FFA as well.
OK, so virtually all the [INAUDIBLE] are roughly--
AUDIENCE: A question.
WINRICH FREIWALD: I'm sorry. I should, yeah.
AUDIENCE: [INAUDIBLE] I'm sorry. [INAUDIBLE]
WINRICH FREIWALD: [INAUDIBLE]
AUDIENCE: In the more anterior regions, is there less responding to these round non-face objects?
WINRICH FREIWALD: Yeah, so we did a decoding, so you can let her know, she can tell me if that was decoding some data for us. And as a [INAUDIBLE] area, yes. So there's less information about this in the middle area, there's more entered to this one here. If that's OK. So there might not be less. In the more [INAUDIBLE], it's less.
WINRICH FREIWALD: --to the non-face stimuli. Even those [INAUDIBLE] properties like [INAUDIBLE].
AUDIENCE: The question I was wondering about was, like they showed [INAUDIBLE] how many face cells you have inside the face patches. Do we have a sense of how many face cells are outside of the face patches versus that are [INAUDIBLE]?
WINRICH FREIWALD: Yeah, so we have not measured this extensively. There's work from [INAUDIBLE], which was actually Jim [? DiCarlo ?] studied this [INAUDIBLE] to study this extensively. And they basically, just by recordings, [INAUDIBLE] face areas are there. And you've gone very systematically through the entire brain, you'll find them [INAUDIBLE] like to physiology.
So the fractions that were reported prior to the face areas were up to 20%, 30%. Typically on the order of 5%. And the question is, OK, what do you call a face cell? And so I'm circumventing this problem right now by just telling you loosely that these cells are responding more to the 16 faces than to non-face objects. We can quantify this by some kind of index and say that they are face-selective. So we can formalize that.
But what about a cell that only likes one out of a thousand faces and doesn't like anything else? It's going to be harder to define what makes a cell face-selective. And also, if you have a cell that's responding somewhat more to faces than all other control objects, it might not be truly face-selective.
So one indicator that's coming up now, and this is work at [? Open Focus System, ?] is he was recording from cells inside and outside the face areas. And he was looking for cells that responded more to faces than to non-face objects, inside and outside the face area.
And now if you invert the face upside down, there's a marked difference between cells inside the face areas and the ones outside. The ones inside, they're really inhibited a lot in their responses. It's [INAUDIBLE] much by faces anymore once you invert the face upside down. But for the cells outside, they don't really care very much whether it's inverted. And face inversion, I mean, that's totally just like one of the hallmarks of human face recognition.
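To make "some kind of index" concrete, here is a minimal sketch of one commonly used contrast-style face selectivity index. The talk does not say which index was used, so the formula, the function name `face_selectivity_index`, the 1/3 cutoff (a 2:1 response ratio), and all firing rates below are assumptions for illustration. The second example cell shows the caveat just raised: a cell that fires to only one face gets a low index even though it responds to nothing but a face.

```python
def face_selectivity_index(face_rates, nonface_rates):
    """Contrast index (mean_face - mean_nonface) / (mean_face + mean_nonface).
    Assumes non-negative firing rates; returns 0.0 for a silent cell."""
    mean_face = sum(face_rates) / len(face_rates)
    mean_nonface = sum(nonface_rates) / len(nonface_rates)
    total = mean_face + mean_nonface
    return (mean_face - mean_nonface) / total if total > 0 else 0.0

# Hypothetical cell A: responds to every face (30 spikes/s) and weakly to objects.
cell_a = face_selectivity_index([30.0] * 16, [5.0] * 80)

# Hypothetical cell B: responds to a single face only, baseline rate otherwise.
cell_b = face_selectivity_index([20.0] + [2.0] * 15, [2.0] * 80)

THRESHOLD = 1 / 3  # assumed cutoff, equivalent to a 2:1 face/non-face ratio
for name, fsi in (("A", cell_a), ("B", cell_b)):
    verdict = "face-selective" if fsi >= THRESHOLD else "not face-selective"
    print(f"cell {name}: index {fsi:.2f} -> {verdict} by this criterion")
```

An index built on the mean response across all faces rewards broad tuning, which is why a highly identity-specific cell can fall below the cutoff.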
So even though something might look like it's face-selective in different regions, it might have quantitatively different properties if you look at [INAUDIBLE]. Mm-hm?
AUDIENCE: So how does the-- as you look at the face from the [INAUDIBLE], right? How much does the response fall off? Is it pretty fast? Or, I assume it's [INAUDIBLE] a little tiny bit. I can still tell it's a face, right? But if it's upside down, it's way different. So what does that response curve look like?
WINRICH FREIWALD: So we have not measured this [INAUDIBLE]. I'm going to show you one example cell. Well, I can't promise this. But I might show you one example cell from another face area-- going to answer that. But if I don't get to it, you can ask me later, and I can show it to you.
AUDIENCE: I just have a quick question. So when you say-- when you play the face upside down, I wondered if any people try the control conventional where they just like [INAUDIBLE] developmental study puts [INAUDIBLE]. Like the top-heavy configuration.
WINRICH FREIWALD: Yeah. This would be great. So we have not done that. And I don't know if anyone [INAUDIBLE]. I think it would be a very nice thing to do. Yeah, so developmentally you can find-- it can be disheartening to mothers that their children might be looking, smiling at objects that are just like three dots. Like two at the top and one at the bottom. And if you [INAUDIBLE] as well.
OK. So we find virtually all the cells to be face-selective. They respond to non-face objects which have the features of faces. So we think that face patches really are dedicated face-processing modules. And it would be like one case where modularity exists.
And we think what the cells in this area are doing is to detect faces. They respond very brilliantly to the presence of a face. And they do this by shape analysis. OK.
So Nancy, she warned me years ago when I gave the talk about this work, not to mention the word "modularity" because I was going to get a lot of heated responses to the term. And actually that really happened. But when I was about to give a talk at Vanderbilt University I thought, OK so now I really have to wonder about this because someone's really very much against the idea of modularity at Vanderbilt.
And so I looked up, what does the term "modularity" actually mean? So Merriam-Webster said it's a module, it's a noun, it's an independently operable unit that is part of the total structure of a space vehicle. OK, so this is good.
But what we are referring to is a model which is based on Jerry Fodor's seminal book, The Modularity of Mind. And as you can tell from the title, he had like a very radical idea about-- that this is like the way the mind--
And so there are a lot of properties, domain specificity for example, that come along as [INAUDIBLE] with this term modularity. So domain specificity means that the module only operates on certain kinds of inputs. So they're specialized. Information is encapsulated, so it's not accessible to the outside. The cells there, the modules, act obligatorily when a certain input stimulus happens. One of the advantages is you can have fast speed.
Maybe that's related to the question about what face recognition has to do with intelligence. It might be in-built, fast intelligence. The outputs are shallow. Accessibility is limited. There's characteristic ontogeny. And there's fixed neural architecture.
And I think that for a lot of these criteria we don't have evidence exactly for these regions. Like information encapsulation, I think it's something that is partly true. [INAUDIBLE] of the face patches, there's likely a gradient there. And I don't think that, if you talk about a module, you have to take literally all these properties here of the module, because we haven't proven all of them.
But on the other hand, I think there's a lot of virtue in using this word because you have these areas which in their core are highly, highly face-selective. And we find this in one animal after the other. And in recordings in the lab just like a week ago, we were starting recording from these areas. We find them face-selective and this is the prime property. So there are other properties to it, but the number one thing-- it's not a subtle phenomenon. This is just the way it is, that virtually all the cells in this region are face-selective.
OK. So one thing I wanted to briefly mention is now if you--
AUDIENCE: Are you moving on from modularity?
WINRICH FREIWALD: Yep. But I can go back.
AUDIENCE: It was really interesting to see the contrast with the [INAUDIBLE] design and Fodor. Because I think Fodor is very-- because he's very smart, he had a lot of intelligent things to say about models. And also his work on language and thought, also that varied selection of people [INAUDIBLE].
But it's also the case he's not an engineer. He didn't have the benefit of a lot of engineering background that we have now. And I really actually think a lot of controversy over modularity has been, I think, unfortunate because every engineer knows that some type of modularity is essential in any complex system. There's just no existing engineers who don't have some type of [INAUDIBLE] models.
And I think it's-- in general it would just be very interesting. [INAUDIBLE] Try to re-think what would be my modularity. Not so much along the lines of one very, very smart person who's-- if all of these are engineering-type considerations, but they're not necessarily really thinking about how you really build a system that actually works in engineering terms.
WINRICH FREIWALD: Right. OK, so we should then follow just your advice and so--
AUDIENCE: Well, [INAUDIBLE] the kind of thinking that you use to build a complex, robust, adaptive system. You would [INAUDIBLE] your engineering, but more like computer science.
WINRICH FREIWALD: Right, right. So let's put some modular [INAUDIBLE].
AUDIENCE: Can I just add to that? I think these are important insights, but I just wanted to note that it's still the case that in human cognitive neuroscience, this is considered fringy, radical, and [INAUDIBLE], right? Right?
AUDIENCE: The notion of modularity?
AUDIENCE: The notion of modularity applied to the human brain is considered fringy, unproven, radical--
AUDIENCE: Some of that's just philosophical baggage, right?
AUDIENCE: Because of the term it's associated with, with Fodor, the 30-year-old idea of Jerry Fodor that wasn't particularly-- you know what I mean?
NANCY: Just the Fodorian baggage created some interesting spaces of consideration but also some confusion in the neuroscience literature. The residual problem is that many people think that the very idea even that there is specialization for face processing in the human or macaque brain-- many people don't get that like that's just a fact. Or we can talk about what it means and how all these other issues play out. But the idea that that's just simply a fact at this point is not accepted. Astonishingly, I think.
AUDIENCE: Do you think if he hadn't used the word modularity, people would have accepted it more?
NANCY: That's why I suggested to Winrich that he not use it. And that's why I haven't used it since. If I'm trying to focus on what do we know, what do we not know. We know that there is a striking degree of specialization at the neural level. The rest of it is all up for discussion, but that fact is clear. To me and to anybody who actually looks at the data. I mean, am I being unreasonable?
AUDIENCE: Sorry, but can you explain what the concern about modularity is? Like, what's the argument against it?
NANCY: I don't want to hijack your--
AUDIENCE: Sorry, it's my fault for going into it.
WINRICH FREIWALD: No, I think it's good. I think it's what we need to discuss. So the reason I brought it up is that-- so if you look at the recordings, so there's going to be quantitative questions. Like how precisely these modules are organized. And I think we need two-photon imaging of an entire sheet of [INAUDIBLE].
So we had data when we're going into some of these face cells depending on how they're located. We're going through a body-selective area first. And then we were going from the body-selective area into the face area. And it just takes like one or two neurons, minus joint selectivity, to go into the face area. So based on recording data like this and others we are pretty sure that the borders are pretty sharply defined. And that within these borders, then virtually all the cells are face-selective. By the way, so the ones that don't replace like the [INAUDIBLE] because [INAUDIBLE]. They're going to be very selective too.
Now we don't know for sure because we're using extracellular recordings. So there could be lots of cells that are never going to respond, so the dark matter of the brain, that we don't see. But we would see them if you were patching cells or if you would do like intrinsic [INAUDIBLE] imaging of cells. Virtually all the cells [INAUDIBLE] all the cells really are face-selective. So right now I can only talk about the cells that fire at least a few spikes to any of our stimuli.
And so then it's an empirical question. Is there really going to be a gradient? We would have to know all the dimensions of face processing to know if there are gradients or if there's some kind of structure where there is a more binary transition from something that is totally face-selective to something that's not face-selective at all. So that's the empirical part. But then there's the whole philosophical part that I would actually very much like Nancy to comment on.
AUDIENCE: To me, it is less crucial whether the edges are a little ratty. A lot of biological systems have ratty edges. And that doesn't mean it isn't a thing. So I think that's worth elucidating. And it's important, but less the essence of the matter. See, the essence of the matter is what I started my lecture with, which is does it make any sense to talk about the human mind and human brain as composed in any substantive way of having components that do different things.
If there aren't components, then that's the wrong-headed lever into the system. So to me, the essence of this is does it really make sense to think that face processing is a different piece of the system from the rest of object recognition or the rest of the mind. So I think the sense in which there is an intelligent debate on this is that you can go into the face system when you have neuron-level resolution.
And you can find information about other things than faces. To me, that's the most intelligent counter to the idea of specialization for the face system. And you guys showed in your 2006 paper, I think, that when you do a support vector machine on the neural responses of the face-selective cells themselves, they contain some information about things that aren't faces.
It's much less than about faces. But there is some information there. So to me, the ultimate resolution of all this will be with causal tests of the role of each of those parts of the brain in actual behavioral tasks.
AUDIENCE: Yeah, and really emphasizing the function of the overall system. That's why the engineering definition is so interesting, right? It doesn't use the word, different. The interesting thing about a spaceship is not is this different part and that different part. In some sense, but in some sense it is operation-independent. And what its role is in the total structure and function relationships of the whole system, right?
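The decoding argument can be illustrated with a toy example. The analysis mentioned above used a support vector machine; the sketch below swaps in a simpler nearest-centroid classifier with leave-one-out cross-validation so that it runs on the Python standard library alone. Everything in it is an assumption for illustration: the two non-face categories ("fruit", "gadget"), the tuning values, the noise level, and the function names. It only shows the logic, namely that above-chance decoding means the population carries some information about a distinction, however weak.

```python
import random

def centroid(trials):
    """Mean population response vector across trials."""
    n = len(trials)
    return [sum(t[i] for t in trials) / n for i in range(len(trials[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def loo_accuracy(data):
    """Leave-one-out accuracy of a nearest-centroid decoder.
    data maps a category label to a list of population response vectors."""
    correct = total = 0
    for label, trials in data.items():
        for k, held_out in enumerate(trials):
            centroids = {}
            for other, other_trials in data.items():
                # Exclude the held-out trial from its own category's centroid.
                train = trials[:k] + trials[k + 1:] if other == label else other_trials
                centroids[other] = centroid(train)
            guess = min(centroids, key=lambda lab: sq_dist(held_out, centroids[lab]))
            correct += guess == label
            total += 1
    return correct / total

def simulate_trials(rng, tuning, n_trials, noise_sd=1.0):
    """Synthetic single-trial responses: mean tuning plus Gaussian noise."""
    return [[m + rng.gauss(0, noise_sd) for m in tuning] for _ in range(n_trials)]

rng = random.Random(1)
n_cells = 20
# Hypothetical weak tuning difference between two non-face categories.
tuning = {
    "fruit": [2.0 + (1.0 if i % 2 == 0 else 0.0) for i in range(n_cells)],
    "gadget": [2.0] * n_cells,
}
data = {label: simulate_trials(rng, t, 20) for label, t in tuning.items()}
accuracy = loo_accuracy(data)
print(f"non-face category decoding accuracy: {accuracy:.2f} (chance is 0.50)")
```

Each cell here is only slightly modulated by the non-face category, yet pooling across the population makes the distinction readable, which is the sense in which a face-selective population can still "contain some information about things that aren't faces".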
Yes. So it's a place where computational models will have this role also to elucidate what we mean by interesting notions of the functional component. I think if we used the word functional component or something maybe that would be--
AUDIENCE: Absolutely, of course, the classic field of neuropsychology and the study of patients with brain damage has been doing this for 200 years going at, I think, exactly that notion of independence. You lose one thing [INAUDIBLE]
AUDIENCE: Right, in a complex system, independence is complex, right? Anyways--
WINRICH FREIWALD: All right. [LAUGHTER]
AUDIENCE: Lead engineer.
WINRICH FREIWALD: So one quick thing on the side is if you record the face cells in the middle face patches, there actually are a few of them where you can understand them. They're not that super-complicated. So one thing we did was to Photoshop images as we were recording from a cell. We started with one stimulus that was effective in driving the cell. Then we modified it in different ways: to make it bigger, to hide it behind bars, to show just one quadrant of the face, or turn it upside down to [INAUDIBLE] of them, and so on and so forth.
And what you can see here is in all the cases when there's at least partial face information, you get a response in the cell. You don't get it in the two images where there's no face in there. There's no response. But you can also see a lot of the properties of the response of the cell change. The response latency changes. The intensity changes. The duration changes.
The thing that was very consistent was that the response seemed to get longer and longer the less information about a face that you had. And every time that you're showing the same stimulus, you're getting pretty much the same response. So it seems almost like you're giving something to a machine. And then it's cranking along and trying to figure out if there is a face there. And if you give it less evidence, it takes longer to come up with a solution. But once it found the solution, it's going to respond. It's going to tell you there's a face there.
So the monkeys have localized face areas. The face areas might be domain-specific modules or functionally specialized units. So the next question we asked is whether the modules are connected to each other. So the reason we asked the question was the face areas were very far apart from each other and therefore, it might be difficult to wire them up.
You might have an organization where there are six face areas. But they're not connected to each other. They are mostly operating separately. Or you might have specific connections that would group these face areas into a face processing network. And the way we addressed this was with microstimulation inside the scanner. The logic here is just as you can activate cells in the brain by visual stimuli, you can also activate them by electrical stimulation.
So if you place an electrode into your face-selective area and you pass a current through, you activate the cells. Then in turn it's going to change blood flow and oxygenation signals you can pick up with your MR scanner. And you might see a swath of activity around the stimulation site. If these cells are now activating-- if they have projections that are strong and focal enough to drive [INAUDIBLE] neurons, then you might also see patches of activation at spatially disjunct locations.
And then you can ask if these locations overlap with the face areas. So the logic of this experiment is very simple. Periods with stimulation, periods without stimulation. There's no visual stimulus shown. It can happen in complete darkness or during sleep. And the [INAUDIBLE] should be [INAUDIBLE] at the time of these experiments. These are computer-flattened surfaces of the brain with the face areas, which I am now going to show by green outlines.
And we first started in the biggest face area-- the middle face patch. And these images here show the electrode go down-- the site here-- electrode on the face area-- the front the brain electrode going into the face area. And so the stimulation site is inside the face patch. And before we did the microstimulation, we recorded from these cells.
So this is the response relative to baseline. And you can see again these cells responding in a face-selective manner and confirming [INAUDIBLE] light orange inside the face patches or inside one face patch. And now this is the activity pattern that we're getting from microstimulation at the site, which is inside the face area.
So we get a swath of activity at the stimulation site. But we also get multiple spatially disjunct activations at other regions. And if we now overlay the face areas, you can see that they nicely coincide. So this experiment is just showing that it really is spatially disjunct. So from this, we concluded that yes, there are selective connections from this one face area to the other face areas.
And we can also look at the other hemisphere. If there is activation in the other hemisphere, it's also selective-- confined to the face areas. If we do control experiments, stimulation outside the face patch, we don't get face-selective cells. Again, with stimulation, we get a swath of activity around the stimulation site and multiple regions that are also activated by stimulation here. But now they're not inside the face areas. They're straddling the organization of the face cells.
So we did this experiment over and over again. I'm not going to label this [INAUDIBLE] whenever we do the experiment stimulating here, again, we find activations that are overlapping into the face areas. And so from all these experiments together, we concluded that, yes, these face areas are a part of the face processing network. In fact, we now have retrograde labeling. They show that 90% of all the cells can be labeled for the face area. If you look where the cell bodies are located, they are located inside the face processing system. So it's a very surprisingly closed system where the different face areas are connected to each other.
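The overlap logic of these experiments (stimulate, look for spatially disjunct activation, ask whether it falls inside the face areas) can be sketched as a simple set computation. This is an illustration only: the surface is reduced to hypothetical integer node indices, the masks are made up, and `overlap_fraction` is a name invented for this sketch rather than anything from the actual analysis.

```python
def overlap_fraction(activated, face_patches, stim_site=frozenset()):
    """Fraction of remotely activated surface nodes that lie inside face patches.
    Nodes around the stimulation site are excluded, since activity there is
    expected regardless of connectivity."""
    remote = set(activated) - set(stim_site)
    if not remote:
        return 0.0
    return len(remote & set(face_patches)) / len(remote)

# Hypothetical flattened-surface node indices for three face patches.
face_patches = set(range(0, 10)) | set(range(50, 60)) | set(range(100, 110))

# Stimulation inside a face patch: remote activity lands mostly in other patches.
inside_site = set(range(0, 10))
inside_activation = inside_site | set(range(50, 58)) | set(range(100, 106)) | {200, 201}
inside_overlap = overlap_fraction(inside_activation, face_patches, inside_site)

# Control stimulation outside the patches: remote activity lands mostly elsewhere.
outside_site = set(range(300, 310))
outside_activation = outside_site | set(range(320, 330)) | {58, 59}
outside_overlap = overlap_fraction(outside_activation, face_patches, outside_site)

print(f"stimulation inside a face patch:  {inside_overlap:.2f} of remote activity in face patches")
print(f"stimulation outside a face patch: {outside_overlap:.2f} of remote activity in face patches")
```

A high fraction for stimulation inside a patch together with a low fraction for the control site is the pattern that supports selective connections between face areas.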
AUDIENCE: Are they necessarily directly connected?
WINRICH FREIWALD: Yeah, from this experiment, we don't know for sure. Usually in [INAUDIBLE] projection, you also have feedback projection that tends to be a little less precise. But from this technique, you wouldn't necessarily know which direction to go.
AUDIENCE: What can you tell about the direction? How from the top one down to the bottom one can you tell that that's independent from going through [INAUDIBLE].
WINRICH FREIWALD: So strictly speaking, we don't know this either. So it would be possible if you stimulate activation here just passing through this area here, it's only because activating these cells to make it in the downstream areas. So we would argue by the strength of activation that you're seeing through this that it's not clearly getting weaker and weaker as you would see if you were to go to one station and then the other. But directly we only see this now with retrograde tracer studies that you can go to [INAUDIBLE] levels here.
AUDIENCE: Sorry, say that again. Can the retrograde ones go [INAUDIBLE]?
WINRICH FREIWALD: No, usually ones that just go to the soma. So they are picked up by the synapse and they go to the soma. Then you know it's a direct connection. So the big caudate structures-- if [INAUDIBLE] passing through here, it would be possible that you get activation here. And then just because these cells are active, then you also get these ones active. And so whenever we draw direct connections here, we might not really know. If you look at the data and look how strong activation is, it's unlikely it'll be [INAUDIBLE].
Then, I think because faces have this special social status, it might also not be surprising that this network has particular connections to other areas outside the face areas. So in particular, this most anterior area AM has a projection to the lateral nucleus of the amygdala, which is part of the emotional brain where there are also face-selective cells found.
And this is one of the outputs. This is going to be also the projection from one other area to the pulvinar. And from the pulvinar into the frontal eye fields. And so these are the connections that we currently know that are the strongest ones, as you know, a little bit more to the claustrum, for example.
But there's a very strong output to the amygdala and then indirectly to the frontal eye fields, which are controlling attention. And so you could imagine that maybe these are the links that are mediating the processes that faces exert on emotional responses, or part of these responses is orienting attention to a face or following gaze.
Maybe also briefly actually mention the three face-selective areas in the prefrontal cortex-- one in the orbitofrontal cortex-- ventrolateral prefrontal cortex-- arcuate face patch. They are all very face-selective. So the time courses here show in green [INAUDIBLE] responses to faces and in orange [INAUDIBLE] responses to non-faces. And you can see they are very face-selective too. And we also know that they are connected to these temporal locations. So these are all the connections that we know of right now.
There might be additional connections to these frontal areas. But this is only showing the connections that we know.
AUDIENCE: Are the frontal ones as reliable across monkeys as the temporal lobe ones?
WINRICH FREIWALD: Prefrontal ones?
AUDIENCE: Yeah, prefrontal.
WINRICH FREIWALD: The orbitofrontal one-- we also see in all the animals. The lateral ones-- they are a bit more finicky. So we don't always see them. And this PV, for example, seems to occur at two different locations. It might actually be two areas. And sometimes we see one and sometimes the other. We haven't done many recordings in there. So we also don't know what the functional properties are.
So once you know this, I think you'd be more sure what exactly the nature of it is. But we see them in half the animals. So if we don't see them, it may be for particular reasons that we don't see them in particular experiments. And that doesn't necessarily mean they don't exist.
AUDIENCE: I think that's true in humans too-- the frontal face stuff. We see nice, robust, face-selective responses in the frontal lobes in about half the subjects and not the other half [INAUDIBLE].
WINRICH FREIWALD: So in general, this is for experiments with static faces compared to static content. It's easier in human face areas to see them for dynamic faces, for example. And so I am quite sure-- like with the monkeys I think they are much more easily engaged by the stimuli we're showing to them.
And so this is where I think it might be a little easier for us to activate the prefrontal cortex because they didn't watch movies everyday-- it's happening there. And if we are making things more interesting like in the social scenes I was showing initially, then actually we see these areas activate much more easily.
So likely they are face selective. And likely they are doing something more than just faces.
AUDIENCE: Yeah, I was going to make the same point, that we see that it's very variable in humans as well. And then it might have been [INAUDIBLE]. I'm not sure. I had a paper saying it might be object-based attention. So when you're attending to a face, then it lights up more consistently. I am just curious.
WINRICH FREIWALD: Yeah, since you're going to talk this afternoon, you might talk about this some more. So there's a region now in human cortex. And there's also some work in the monkeys in the region that might be overlapping with PV. And it seems to be involved with nonspatial forms of attention that might include object and feature-based attention. And yeah, they could be linked to attention networks as well-- the faces may automatically engage this.
So I have a few more minutes. So I would like to make one more point. So I raised the question initially as a problem: can we figure out how face cells work-- how they're wired? And another way of looking at this is to ask, if we have these multiple face areas, what are the functional specializations in these regions? So I showed recordings from this area here. And I'm going to show recordings from these two areas here.
And first, we did the same experiment we did before. So we're showing these stimuli with faces in front view on the left and the monkey face stimuli here. This is again for the middle face patch-- very face selective. We find that 94% of the cells are face selective. And now in AL, the second area, again you can see a concentration of cells that respond to faces here. But the [INAUDIBLE] cells are actually inhibited by faces. And so in the overall population response, you see that actually this population is a little less face selective than the middle face patch.
So we were actually quite disappointed by this initially, and we were puzzled, because you ask: how is it conceivable that the inputs to ALIP, which are from the middle face patches predominantly, can be so face selective and then leave the output area, AL, ALIP, less face selective than the inputs?
And the answer to the puzzle became clear when we used somewhat different stimuli. So we showed a stimulus set with 25 different faces from left to right, and then eight different head orientations from top to bottom. And now the response matrix is shown here again. Cell number runs from top to bottom, stimulus number from left to right.
And these stimuli are organized coarsely by head orientation. So the first 25 images are left profiles, and then left half profiles, front views, right half profiles, right profiles. And within that, the fine-grained information is about the face identity. So stimulus numbers 1, 26, 51-- these are the same identity at different head orientations.
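The ordering of that stimulus axis (blocks by head orientation, identity within each block) can be written as a small index calculation. This is a minimal sketch, assuming 1-based stimulus numbers and 25 identities per orientation block as described in the talk; the function name and the 0-based block index are illustrative, and the actual order of the eight orientations is not assumed here.

```python
N_IDENTITIES = 25    # faces per head orientation (from the talk)
N_ORIENTATIONS = 8   # head orientations (from the talk)

def decode_stimulus(stimulus_number):
    """Map a 1-based stimulus number to (orientation_block, identity), both 0-based.

    Hypothetical helper: it only encodes the block layout, not which
    head orientation each block corresponds to.
    """
    if not 1 <= stimulus_number <= N_IDENTITIES * N_ORIENTATIONS:
        raise ValueError("stimulus number out of range")
    # Coarse index = orientation block, fine index = identity within the block.
    orientation_block, identity = divmod(stimulus_number - 1, N_IDENTITIES)
    return orientation_block, identity
```

Under this layout, stimulus numbers that differ by a multiple of 25 show the same individual at different head orientations.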
And this is, again, our screening set with the frontal faces shown on the left and objects, but sorted in the same way by cell number from top to bottom. And what you can see here is that there are some cells, which we put at the bottom, that respond to the front view of the face, and also to the face looking upwards and downwards. And these were predominantly face-selective.
But at the top, we have a very peculiar population of cells that are responding to profile views. And if these cells respond to one profile view-- the left profile-- they actually always responded to the right profile as well. And we only find these cells in this particular region. They can be suppressed by front views of faces.
And because in our screening set we only showed frontal faces, they didn't appear very face selective. So depending on the definition of what a face-selective cell is, I would say these are very, very face-selective cells. But they're not going to respond to front views of faces, but to profiles very selectively.
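One way to see why a frontal-only screening set can be misleading is to compute a contrast index between mean responses to faces and to objects. This is a minimal sketch: the index form (F - O) / (F + O) is an assumption chosen for illustration, not necessarily the measure used in these experiments, and all firing rates are made-up numbers for a hypothetical profile-tuned cell.

```python
def selectivity_index(face_rates, object_rates):
    """Contrast index (F - O) / (F + O) on mean rates; assumed form, for illustration."""
    mean_face = sum(face_rates) / len(face_rates)
    mean_object = sum(object_rates) / len(object_rates)
    denominator = mean_face + mean_object
    if denominator == 0:
        return 0.0
    return (mean_face - mean_object) / denominator

# Hypothetical firing rates (spikes/s) for a profile-tuned cell.
frontal_faces = [4.0, 5.0, 3.0]     # weak, slightly suppressed by front views
profile_faces = [40.0, 38.0, 42.0]  # strong response to profile views
objects = [5.0, 4.0, 6.0]

# Screening with frontal faces only makes the cell look non-selective.
screening_index = selectivity_index(frontal_faces, objects)
# The same cell tested with profile views looks strongly face selective.
profile_index = selectivity_index(profile_faces, objects)
```

With these invented numbers the screening index comes out slightly negative while the profile index is strongly positive, which mirrors the point that the apparent selectivity depends on which views the screening set contains.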
But the cool thing is this confusion of left and right, this mirror symmetry. I don't know if Tommy is going to talk about this later on in the course. So there are some, to me, very satisfying theoretical insights now following from a new theory that he has developed in object recognition that actually can explain why this mirror symmetry can occur at this level.
But I'm just going to state it as a fact for now. So I would like to show one example cell to you. I think you can hear it really likes the profile view. It doesn't care so much which person is shown. It doesn't care if it's a left view or a right view. But it really likes the profile view of a face.
AUDIENCE: Is this the actual presentation speed of a--
WINRICH FREIWALD: Yes. So they can be very fast. And sometimes so fast I don't even see the stimulus, but the cells will see the face stimuli. So because we find this in all the animals, always at this stage, it seems this is an essential property of this area, but we had no explanation. And now we have an explanation for it.
I'm going to jump over fine details. What about AM? It's the top level. It's the area that's projecting to the amygdala. Again, overall in the population, the cells appeared less face selective than at the input level. And again, with the stimulus set with the 25 individuals in different orientations, you can see that this looks very different now from the input level with its mirror symmetry.
So we have some cells we're putting at the bottom that are responding to virtually all the faces. So they don't care which individual is there. They don't care if it's a left or right view. They also don't care if it's a small face or a big face, at this position or that position. It's not shown here. And then there are some cells which are very, very selective, shown at the top. And I'm going to show one of these cells to you, and maybe you can figure out what the cell is interested in.
WINRICH FREIWALD: Yeah, this one person there-- it's a blond boy that the monkey has never seen in real life-- this one. And this is the stimulus the cell likes. So it has a graded response. [INAUDIBLE] contributed better than five views. But it's almost spiking if this one particular individual comes up.
So we have these three different levels of processing. And this is, I think, why it's now possible, because of the modular architecture-- because you can record from the same face patch day after day after day-- to really determine the computational processes. Because there are these different levels of processing, and they are known to be connected to each other, we can know something about the inputs of the face-selective cells and then can start to build models about how they [INAUDIBLE] properties and actually test incrementally how they compare in their properties.
And these properties are transformed and [INAUDIBLE] in different ways from one level of processing to the next. So you have a very picture-based representation at the first level-- the cells respond to a front view or a side view, but only one view. Then you have this combination of views in the next level across the symmetry axis. And then you have this almost view-invariant representation at the top level.
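The three levels just described can be caricatured with toy tuning curves over head orientation. This is a minimal sketch under stated assumptions: five head angles, binary responses, and max-pooling as the combination rule are all illustrative choices, not the mechanism measured in these recordings or the theory mentioned in the talk.

```python
# Head angle in degrees; negative = left, positive = right (assumed convention).
VIEWS = [-90, -45, 0, 45, 90]

def view_specific(preferred):
    """Level 1: a unit that responds to exactly one view."""
    return [1.0 if view == preferred else 0.0 for view in VIEWS]

def mirror_symmetric(preferred):
    """Level 2: pool a view-specific unit with its mirror image (max-pooling assumed)."""
    left = view_specific(preferred)
    right = view_specific(-preferred)
    return [max(a, b) for a, b in zip(left, right)]

def view_invariant():
    """Level 3: pool over all view-specific units, giving a response at every view."""
    units = [view_specific(view) for view in VIEWS]
    return [max(column) for column in zip(*units)]
```

For example, `mirror_symmetric(90)` responds to both full profiles and to nothing in between, while `view_invariant()` responds at every angle. Identity tuning, which the top level also shows, is deliberately left out of this toy.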
And since it's 12 o'clock, I'm going to be jumping over a lot of stuff here. But I'm going to be around. So if you have any questions, you can ask me. And I don't have control over it. Otherwise, [INAUDIBLE].
So what I was showing you is, if you look at the macaque brain, you get a much better view of it if you use an MR scanner and do a functional magnetic resonance imaging experiment, which can then locate face-selective areas. And you can do this with other functional properties as well.
So Pablo was in the course of [INAUDIBLE] using this for attention studies, see if he can localize [INAUDIBLE] attention networks. And we also are looking now at social processing networks-- networks that are for social communication and emotional responses. So it's a very, very useful tool to make a problem that seems intractable, tractable.
So again, David [INAUDIBLE] was very difficult to find face-selective cells. It seemed impossible to figure out how these cells are wired up to become face-selective. I think now the problem is tractable. And it is possible because of fMRI in combination with electrophysiology. You can then record from these areas and figure out that they are modules. You can microstimulate and figure out what they are connected to.
You can see these are networks. You can do this for any function. You can now figure out how many processing levels there really are. And you can figure out what the properties are of cells at the top level. So, in a Marr-ian way of analyzing this system, you can now ask: what are the computational goals of the system? What are the properties it wants to derive?
Then you can ask what the algorithms are that are implemented, and then, neurally, how they are coded into the hardware. And then you can ask questions deep into the social brain-- how can you use this information now to recognize the person that they know-- how they activate social knowledge-- how they activate emotions-- how they activate attention systems. And so does it have to do anything with intelligence? I don't know the answer. And one partial view on this is that you're seeing faces everywhere. It's like our cognitive or perceptual hardwired [INAUDIBLE] prior.
So, looking into the world, we want to see faces and want to recognize them. And so maybe in this way [INAUDIBLE] talked about recognition models or other cognitive architectures. It might be one snippet of how our minds are pre-wired with certain structures, then imposing them onto the incoming information [INAUDIBLE]. And that might be a building block of intelligence. Thanks for your attention.
Questions? Or a question? I'm not afraid of questions. You can ask questions.
GABRIEL: One or two quick questions. And then Winrich will be around for questions. And then we'll have a group picture. And we'll make sure [INAUDIBLE].
WINRICH FREIWALD: Yeah, OK, two quick questions and one--
AUDIENCE: One quick question. I wonder, from a developmental perspective, whether the patches are there from the beginning. Or how do they get connected?
WINRICH FREIWALD: Yeah, it's a super-important question. As you can imagine, it's going to be very difficult to study. There was a poster briefly four or five years ago in cognitive neuroscience from a Japanese group that studied the development of time course behavior very, very thoroughly. And they had shown a facial area in a very young animal.
And Marge has told you she is studying this question now. But I don't know the definitive answer. I think it's going to be super-important to know if they are there because I showed these three-day old macaque monkeys oriented to faces. There is something there about face recognition that would be so nice to know. But I don't think we can do it.
AUDIENCE: I have a question about decoders or algorithms that might use responses to faces. So it would be very different from-- I like the last slide about the face [INAUDIBLE]. Do you think those decoders that respond to some face [INAUDIBLE] would be very different from responses for natural scenes or [INAUDIBLE] scenes?
WINRICH FREIWALD: My hunch is yes. But I don't know because I'm so face centered. I can tell you some stories about what we know that the cells are doing. So it's like measuring facial features, but also being selective of course. Facial features will help you detect the face. And so we know more and more about these cells.
But essentially, because face recognition is more holistic and because, you know, they're going to have to process faces as opposed to other structures, there might be differences between those cells and others. And I think I can't really say this definitively right now because we haven't really studied it. But the problem with all of the cells that you can be studying is you don't really know what they are processing.
So if you find a cell that is responding well to a particular stimulus, you can say, OK, it's making some contribution to encoding that stimulus. But do you really know that this is the right kind of stimulus? Then if you start analyzing what the critical features are, you really [INAUDIBLE] respond to [INAUDIBLE] of the cell. And are you really at what the cell likes, or are you somewhere else, and you've seen a shadow of what the real cell likes? So this is going to be very difficult methodologically, to know exactly what they have in common and what they don't have.
AUDIENCE: So in the parts of the presentation that you got to show us, you focused on two very important dimensions of variability-- the face sensitivity and pose. What about emotional expression? Have you ever done that?
WINRICH FREIWALD: So we are only starting to look at this now. The reason we haven't done this so far is that we felt we have to have monkey facial expressions to get this right.
AUDIENCE: Are we sure that monkeys won't have the right responses?
WINRICH FREIWALD: Yeah, so if you smile at the monkey, they are going to do the aggressive gesture. So you can ask, well, do they understand human emotions if they get the meaning wrong? So for the face processing part, it might be OK. But we felt you have to get the monkey facial expressions. And so far we don't have-- so if we do find some cells that are selective for it, we don't know that it's going to be selective across individuals.
So there's some sensitivity to something like an open mouth-- this seems to make a very strong stimulus for face-selective cells. But we don't know whether it's a specific dimension that is independent of identity versus low-level features. So hopefully in a year or so we know the answer.