Normalization models of attention
Date Posted:
November 19, 2024
Date Recorded:
November 13, 2024
Speaker(s):
Rachel Denison, Boston University
Description:
Attention is a cognitive process that allows us to prioritize the sensory information that is most relevant for our behavioral goals. In a successful class of computational models of attention, attention biases neural responses through its interaction with normalization—a canonical neural computation that promotes efficient representations across sensory and cognitive systems. Normalization models of attention have provided quantitative explanations for a wide range of findings on how attention affects neural activity and perception, making them a powerful example of a computational framework for top-down modulation. In this tutorial, we will learn about normalization models of attention—including some of our recent work on dynamic attention—with hands-on MATLAB exercises.
Links to code:
RACHEL DENISON: I'm Rachel Denison, your neighbor at Boston University, just over the river. And my lab studies visual perception, attention, decision making. And we do kind of a combination of approaches, where we're trying to integrate behavior and neuroscience measurements and computational models to understand things like how we attend to things, why we can't see everything at the same time, and how that plays out in basic visual processing.
So today, for this computational tutorial, I wanted to talk about normalization models of attention. And just to start out with an example, why we might be interested in attention. So if we think about our lives, we're going around the world and a lot is going on. Let's say you're biking down this busy street in Midtown Manhattan. You're going to be faced with this sort of incoming stream of information.
There's cars, there's people, there's streets. There's all kinds of things. It's scary. And the task that our brain has is to process all this information that's coming in. And a challenge for us is that that's difficult. There's a lot of information. And only some of it is going to be relevant to our behavioral goals at any given time.
So attention is this cognitive process that lets us prioritize relevant visual information. And it's important to have this process, because we're limited in terms of the neural resources that are available to process sensory information at any given moment. So there's different types of attention we can think about. One very well studied type of attention is spatial attention.
So for example, even if you're keeping your eyes straight ahead on the road in front of you, you might attend to the right, because you want to, let's say, change lanes into the right lane, and you want to make sure that no cars are coming. And that's helpful. When you attend to the right, you're going to see better on the right. But it comes at the cost of seeing worse on the left-hand side, which you've been ignoring. So there's this kind of limited ability to process information, and attention can regulate how we select some things and filter out other things.
We can also consider, for example, feature-based attention. Let's say you know that taxi drivers are especially erratic. You might want to pay attention to them especially. And so you could perhaps use your feature-based attentional system to emphasize yellow things in the environment and filter out other colors.
The last type of attention I want to mention here is temporal attention. So we can also attend to specific moments in time when something important is likely to occur. As you may have noticed, you might want to attend right about this time to make sure you don't run into this blue van as you're passing by it.
So these are all different ways that we can use our cognitive systems, right, attention to prioritize different aspects of the visual world. And we know from many decades of research that these types of attention affect visual processing at really a very basic level. So they affect things like contrast sensitivity, acuity, spatial resolution, the list goes on.
And what I want to focus on today for this tutorial is this really nice body of work in the attention literature that says, how can we use some quantitative models to capture the way that attention affects this kind of basic sensory processing? And there's this class of models called normalization models of attention that have been very successful in capturing how attention changes sensory processing.
And they're a really nice example, I think, of a kind of model framework that has let us make connections between theory, math, brains, and behavior. So it's almost a case study in the kind of interpretable models that we can use to connect these different aspects of the system that we're trying to understand.
So that's what I want to focus on for this tutorial today. And the nice thing is that these models have made really quantitative predictions about how attention changes sensory processing that have been confirmed and tested and then iterated on to improve the models. So it's this sort of nice example of this like virtuous cycle of experiment and theory.
So today, what we're going to do is I'm going to take you through three sections. First, we're going to talk about normalization itself. What is it? Normalization has actually been called a canonical computation in the brain, because it's quite widespread, not only in sensory systems. People have thought about normalization operating on everything from very low-level sensory processing all the way through value judgments, reward, and emotional processing. So normalization seems to be a kind of computation that's very basic and widespread in neural systems, for good reasons, which we'll talk about.
So we're going to talk about normalization: what is it? Then I'm going to introduce attention into the mix, and show how normalization models with attention capture this interaction between attention and normalization. And then, if we have time, I want to tell you about some of the work that we've been doing more recently to bring dynamics into this picture. As usual, a lot of studies of vision are very static. It's like, how do we process an image? How do we recognize an object?
But as we saw in this example from the beginning, our world is dynamic. Our brains are dynamical systems. And we have to think about how this is happening in real time. So hopefully, we'll have time to talk about our more recent iterations of these models, which are now dynamic and deal with dynamic input and dynamic attention.
So along the way, a fair amount of this session is going to be devoted to you playing with some of these models, and I've made some code that you can access. So I'm glad to see everyone has their computers. I feel pretty flexible about this, in terms of how long we take to do different sections, and I'm very happy for people to ask questions.
We can go on detours or tangents as makes sense, and have some discussion along the way while people are playing with the models. So this can be pretty informal. Just feel free to jump in and ask questions, and let me know what you're more interested to focus on.
And we'll see how much we may get through these different exercises. And if we don't make it through all of them, that's completely, completely fine. OK, so let's start with normalization.
First of all, I just want to give you an intuition about what normalization is and what it's good for. And I want to do that using this example of boats. Yes, boats in the morning, boats in the evening.
So let's say you go to the harbor in the morning, and you look out and you want to see what's going on there. And you notice, oh, there are boats. How nice. So you can do that in this great kind of daytime view. But you know what? You can also do the same thing in the evening. You can go out to the harbor, look out at the scene, and be like, ah, boats.
Now, it's actually not a totally trivial thing that you're able to look at this scene in the morning versus the evening and pretty easily recognize that everything is boats. And why is that? Why would it not be totally trivial if you were going to build a computer vision system to do this job? What might be challenging about recognizing these as exactly the same scene?
STUDENT: Just some segmentation would be very difficult in the evening.
RACHEL DENISON: In the evening, segmentation might be more difficult. That's a good point, because the differences in luminance across the edges are actually smaller. Very good. Yeah, any other thoughts about why this isn't totally trivial? What is that?
STUDENT: [INAUDIBLE]
RACHEL DENISON: [INAUDIBLE] noise. Good, good. So we might be in a situation of [INAUDIBLE] in the morning, than in the evening. Good. Yeah.
STUDENT: Regarding the luminance values, would you suggest [INAUDIBLE]: if you train your object recognition system on luminance values, that might not generalize to the other case.
RACHEL DENISON: Great. Very nice. So pointing out that the luminance values are really different here. So let's say you're looking at this pixel in the morning, and you're like, oh, this is a really bright pixel. So maybe this is an object or something.
But in the evening, in terms of absolute luminance, this is actually rather a dark pixel. This pixel would probably be similar to something like back here in the morning. So if you're trying to get your computer vision system to see these as identical, you're going to have to somehow deal with the overall level of illumination.
And so that is basically the motivation for something like normalization. So the absolute luminance at a given location is actually not that informative for recognizing the boats. What's better is to represent the deviations from the mean luminance. And this is something that's so intuitive and natural to us, right?
We just immediately discount the overall level of illumination. When you're walking outside, you barely notice when you're crossing from a sunny patch to a cloudy patch. It's not like your whole visual system is suddenly confused, right? You can still do everything exactly the same.
And so the visual system does this. It accounts for this overall illumination. And in general, it's more useful and efficient to represent deviations from some mean background value than to represent the absolute values. And this applies to any kind of feature you might think of.
It applies to luminance. It applies to contrast. It applies to color. It applies to texture. It applies to any sort of summary statistic you might think to generate. It would be efficient to represent not the absolute value of those things at each location, but some measure of the difference from the mean or the background value. OK. Any questions about this basic idea? Yeah.
STUDENT: So this normalization-- I mean, if it is like a function-- excuse me-- is it also a function of luminance? Because at the point where the luminance goes to zero, you can't perceive that difference. It's not going to be constant. So is it more like a decaying function of luminance or something?
RACHEL DENISON: Decaying function of luminance over space?
STUDENT: Or just the luminance as a factor. For example, as you mentioned, in a bright morning situation, you could differentiate between all these objects. And as the luminance falls, like in the evening, you could still do that. But once it gets to night, it's going to be really hard to perceive. Like the normalization wouldn't even work at that point or something.
RACHEL DENISON: OK. Good, good. Yeah, right. So I think there's maybe two points. One is if you really degrade-- if you really-- if you're in the pitch dark, there's just no more information left. So that's too bad.
But if you start getting a little bit of level of illumination, if you're adjusting to the mean-- and we're going to look just in the next slide about how this happens-- you'll see that it's possible to calibrate your system in order to have sensitivity around those values that are more variable around the mean.
So let's actually look at that. This is something that already happens in the retina: normalization for luminance, for overall illumination. Even in the retina, your photoreceptors do this all the time. I mean, that's how we can recognize scenes under levels of illumination that differ by orders of magnitude. Bright sunlight versus dark shadow.
The way we do this is very simple. We just say, OK, there's some light intensity coming in at a given location, and what we want to do is divide by the mean light intensity over some surrounding, broader region. In the image before, we could be really simple about this and just divide by the mean over the whole image.
Or you could divide by the mean over some more local region. That's going to give you, instead of the absolute light intensity, some measure of local contrast: how much brighter or darker is that location compared to the mean. And this is it. This is divisive normalization. It's just: divide by the surrounding context. It's super simple.
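In MATLAB, this local-mean division is just a blur plus an elementwise divide. Here's a minimal sketch; the image filename and neighborhood size are illustrative, and it assumes a grayscale image:

```matlab
img = double(imread('boats.png'));       % hypothetical grayscale image
k = ones(41) / 41^2;                     % 41x41 box filter = local mean
localMean = conv2(img, k, 'same');       % mean intensity in each neighborhood
contrastImg = img ./ (localMean + eps);  % divide out the local mean
imagesc(contrastImg); colormap gray; axis image
```

Each pixel now encodes how much brighter or darker it is than its neighborhood, rather than its absolute intensity.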
And what that results in is a situation like this, where if this is, say, responses of retinal ganglion cells, or photoreceptors, we have light intensity on the x-axis here. The response of the neuron on the y-axis here. And what we're plotting here is these response functions of light intensity.
And you can see they're kind of sigmoidal. They start at 0, of course, when the light intensity is really low. As light intensity increases, they're going up. And these different lines with the different colored bars here are situations where the mean light intensity of the image is either very high, like in this morning boat scenario over here, or very low, like in this evening boat scenario over here.
And what you can see happens is that under high levels of overall illumination, a given neuron's response is going to shift its response curve so that the most sensitive region is actually lining up with the mean level of illumination. So it's shifting that curve to the right under higher levels of illumination, and it's shifting that curve to the left under lower levels of illumination. Yeah.
STUDENT: So would this be the visual cortex or the retina?
RACHEL DENISON: This is from the retina.
STUDENT: The retina.
RACHEL DENISON: Yeah. This is already in the retina.
STUDENT: Retinal ganglion cells, or--
RACHEL DENISON: I think this is a retinal ganglion cell. It might be a photoreceptor, but I think either one would show this pattern. Yes. So already in the retina, we're getting normalization that's related to illumination.
OK. So is everyone happy with these curves and how they can shift back and forth, left and right with overall illumination? So what it does, it places the most sensitive region of these curves around the mean.
So that if you're a pixel sitting on this boat right here, in the morning you're going to actually have sensitivity around these bright values. But in the evening, you're going to have sensitivity around these darker values. And that lets you do this computation.
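As a toy illustration of these shifting curves, here is a minimal MATLAB sketch in which the half-saturation point of a saturating response function is pinned to the mean luminance (the exponent and mean values are illustrative, not fit to the data on the slide):

```matlab
I = logspace(-2, 2, 200)';         % light intensity, log-spaced
means = [0.1 1 10];                % dim, medium, bright mean luminance
R = I.^2 ./ (I.^2 + means.^2);     % one column per adaptation state
semilogx(I, R)                     % curves shift right as the mean rises
xlabel('light intensity'); ylabel('response')
legend('dim', 'medium', 'bright')
```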
OK, let's move on down the visual pathway and actually do a kind of perceptual demo. So what do you think? If you look at these two images, the one on the left and the one on the right, which one looks stronger inside the ring?
Left looks stronger? Everyone, raise your hand for left. Got left. Yep, mostly. And anyone for right? No. And I've got folks on Zoom saying left. So yeah, the left one looks stronger. And so given what we just talked about with normalization, why do we think that could be?
STUDENT: [INAUDIBLE]
RACHEL DENISON: Nice. OK. And note that now we've moved beyond luminance, because the average level of luminance is the same: these images are all gray. But now we're in contrast. We're doing contrast. So this is the contrast of the gratings. That is, how bright are the whites and how dark are the darks.
And we see that when we put this surround around here, suddenly this ring is looking dimmer. And we can see this. We can see this happening with our own eyes, this kind of surround-type suppression, or normalization.
So let's look at the equation for making this happen. And it's very, very simple. We have the response of some neuron. Let's say it's our neuron of interest, neuron j. And that equals some scaling constant. You don't have to worry about that. The meat is here. So in the numerator, we have D_j. This is the driving input to the neuron, whatever input is coming in from the visual stimulus.
In the denominator, we have this sum, which is a sum of the driving inputs to all these k neurons. These k neurons are just a bunch of neurons, including the neuron of interest and also a bunch of other ones surrounding it. So this is what's giving us the surround, or the local context. And then we have this sigma, which is just a constant that determines at what point this function saturates. So let's, yes--
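In symbols, the equation on the slide is, schematically,

$$R_j = \gamma \, \frac{D_j}{\sigma + \sum_k D_k}$$

where $R_j$ is the response of neuron $j$, $\gamma$ is the scaling constant, $D_j$ is the driving input, the sum runs over the suppressive pool of $k$ neurons (including $j$ itself), and $\sigma$ determines where the function saturates.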
STUDENT: So the surrounding neurons-- is the equation ambivalent to whether they're inhibitory or excitatory? Or does the equation account for both [INAUDIBLE]?
RACHEL DENISON: Yeah. So in this equation, we can actually just think about all these neurons as being excitatory to whatever drives them. But what they do is then in this computation they will suppress that neuron j. So let's see how this looks in this same example.
So let's say we have our neuron j. Let's say it has a receptive field that's located on the ring. So we're going to call this excitatory field for this neuron j. It's whatever's driving this response in the numerator here.
So if we have contrast here, this neuron is going to-- the excitatory field is going to be more strongly driven. If there is nothing there, it's going to be not driven. Now, let's look at the k, these surround neurons. Let's say these are a bunch of neurons that are all around here kind of tiling this larger field. So we're going to call this the suppressive field for this neuron j, consisting of all these other k neurons that are surrounding it.
And so what this means is that the stimulus-driven activity from the excitatory field is going to drive this numerator, and everything else around it is going to drive the denominator. So if there's more contrast in the surround, this denominator is going to get bigger, and there's going to be more suppression.
So if we compare this situation, we're going to have a lot of suppression from these k neurons and what we can call this the suppressive pool. If we take away the surround contrast, now you see what happens. So what's happened here? What will happen to the response of this neuron j? It's going up. It's going up. OK, good. Because?
STUDENT: [INAUDIBLE]
RACHEL DENISON: Great. Yes. Everyone happy with this? Any questions about it? So, boom. And then it looks like more contrast. Incredible. OK. So this is the normalization equation. It's very simple, but it does many powerful things.
So I want to show you-- I just showed you with that perceptual example. This is actually data from primary visual cortex, where we're measuring contrast response functions of neurons. The configuration of the stimuli for this experiment was a little disk in the center, which went from low contrast to high contrast on the x-axis. And then we can see the response increases as the contrast goes up. That makes sense.
But then they also had this surround: a region surrounding the classical receptive field that varied from low contrast to high contrast. And as you can see, as we increase the contrast of that surrounding annulus, what happens to the contrast response functions? They shift over to the right.
So just like in the illumination example, when you have a higher-contrast surround, these curves shift more and more to the right, so that the same level of disk contrast leads to lower responses when you have a higher-contrast surround, looking across this row. And the level of contrast that you need to reach a certain spiking rate gets higher and higher. Yeah.
STUDENT: Corresponding to the magnitude of gamma for the segment.
RACHEL DENISON: OK, good. So in this case, the shift to the right of these functions is because there's more and more energy in this term. This term is stronger, because we are increasing the contrast of the surround.
STUDENT: Yeah. My question is, would that impact the gamma and sigma at the same time?
RACHEL DENISON: Oh, I see. So those are separate constants. But sigma also does play a role in controlling the position, the left-right position, of this contrast response function. So you can see how that works, right? Any term that's down here in the denominator, if it gets bigger, it's going to push the function to the right.
So if sigma gets bigger, it's going to push it to the right. And if the suppressive term gets bigger, it's going to push it to the right. Yeah.
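To make that concrete: if we write the response to a center contrast $c$ with pooled surround drive $s$ as, schematically,

$$R(c) = \gamma \, \frac{c}{\sigma + c + s},$$

then on a log-contrast axis, increasing either $\sigma$ or $s$ slides the whole curve to the right, which is exactly the pattern in the surround-contrast data above.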
STUDENT: So higher contrast would impact the sigma as well?
RACHEL DENISON: Higher contrast of the stimulus won't have any impact on the sigma. The sigma is just a constant that's specific to the neuron. Yes. And same for gamma. It's just a constant, just a scaling constant. Yeah.
STUDENT: [INAUDIBLE], what is the-- how big is the surrounding region where [INAUDIBLE]? Is it the same as the receptive field? And is it a function of just evolution, versus also a function of the stimuli? And how much is the excitation in the target neuron and the central neuron? How is that?
RACHEL DENISON: Yes, this is a very good question: how big is the surround? It varies depending on which neuron you're talking about, and which property of the visual world you're talking about. In general, the surround can be larger than the classical receptive field of the neuron.
So the classical receptive field is the area of the visual field in which stimuli actually drive a response in the neuron. You can have stimuli in an area beyond that region that will have no impact on the response of the neuron in isolation, but that will modulate the response of the neuron when there's something in the excitatory field-- something in the classical receptive field.
And this is one of the major points of this normalization computation: because it's divisive, it's not inhibitory in the subtractive way. It's not like you put something in the surround and it's always going to reduce activity in that neuron. It's only going to scale the activity that's already being driven by the classical, excitatory receptive field. OK. Yes.
STUDENT: So the neurons closer to neuron j-- should they have stronger contributions to the pool?
RACHEL DENISON: Yeah. Good question. Yes. This is a good question. So I've kind of drawn it here in this very uniform way, as though all neurons in that field have an equal suppressive effect. And that's how it is in this equation, it's like all neurons have an equal effect. But that's a simplified case.
In general, different neurons could have different weights in terms of how strongly they contribute to the suppressive field. So you could have a situation-- and this is often done in modeling. And actually, this is what we'll see in the example model that we use. The neurons that are closer to the neuron j may have a stronger suppressive effect. And neurons that are farther away from neuron j could have a weaker suppressive effect. Yes, good.
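A minimal MATLAB sketch of this weighted version, with all values illustrative, is to convolve the drive with a Gaussian weighting kernel before dividing, so nearby neurons suppress more than distant ones:

```matlab
x = linspace(-10, 10, 201);                % 1D space
d = exp(-(x+3).^2/2) + exp(-(x-3).^2/2);   % drive: two Gaussian stimuli
w = exp(-x.^2 / (2*4^2)); w = w / sum(w);  % Gaussian suppressive weights
s = conv(d, w, 'same');                    % distance-weighted suppressive pool
r = d ./ (0.1 + s);                        % divisive normalization, sigma = 0.1
plot(x, d, 'k', x, r, 'm'); legend('drive', 'response')
```

Move the two stimulus centers closer together and their suppressive pools start to overlap, which is exactly what the GUI exercise below demonstrates.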
So, time to play with code. When I was first learning about these models, I actually made this little toy for myself to play with, so I could better understand what these equations are doing and how these models work and stuff like that. And I have slightly modified it for you for the purposes of today. But I'm sorry, it is still not very beautiful. Hopefully it will be instructive and fun to play with.
So go ahead and grab this code. It's on GitHub. I put up a QR code if it's easier to take a picture and send it to yourself-- see if that works for people. Oh, this is in MATLAB. Sorry, I should say this is in MATLAB.
STUDENT: I think if you don't have MATLAB installed, you can go to [INAUDIBLE] Mathworks.com and you'll be able to open it in there.
STUDENT: It's a potential model that [INAUDIBLE].
RACHEL DENISON: Oh, let me show you. If you already got the code, do this. Sorry, this is a little annoying, but you do have to use these exact arguments, nm and opts, because it's expecting those things to be there in the workspace.
So once you open this thing, what you'll see is these little sliders, which control the positions of-- let me-- OK, let me stop sharing this, and I'm going to start sharing MATLAB, just so we can do this a little bit together. And I'll take you through what the components of this model are, and what we're looking at.
So here, we have the GUI. And what we're looking at here is, on the x-axis, we have position. And what we're showing is that there are actually two stimuli. So this black line is where the stimulus is. Everything is Gaussian; in this case, we have Gaussian stimuli.
So this is the stimulus number one, which is over on the left side of this one dimensional space. And this is stimulus number two, which is over on the right side of this one dimensional space. So this is a very simple world. We just have one dimension of space. And we have two stimuli that are sitting there.
We have attention, which I'm going to turn off for now. Let's all turn off attention. So there's just uniform attention across the whole thing. We're going to get to attention later. And then let's go through the other parts of this. What's being plotted here.
So you'll notice there's things that are E, which are excitatory. And there are things that are I, which are inhibitory, or suppressive. So the E raw is the excitatory drive to the neuron before any attentional modulation has been applied.
E is going to be the excitatory drive to the neuron after attentional modulation has been applied. So right now, because attention is off, there's no difference between those two things. I is the suppressive drive. And you'll notice that it's kind of linked to the excitatory drive, because it's related to whatever's surrounding it in that space. And then R, the purple curve, is the output response of the neuron.
And what we can see here is that if we move our stimuli around, you can do this with the sliders, you're going to get changes in how these guys are interacting. And once they start getting close together, interesting things start happening.
You can adjust the amplitude of these stimuli by using these bottom sliders here. And then like I said, we're going to get to attention later. But right now, just take a minute to familiarize yourself with this guy. Does anyone have-- how's everyone doing? Everyone pushing buttons and making things go? OK.
STUDENT: Sorry. So we're turning off attention.
RACHEL DENISON: Yeah. Just turn off attention and then just see if-- make sure you can move things around and make sure that everything's working for you in terms of how you can change the amplitude of the stimuli and you change the position of the stimuli. That's really all you need to do at this point. OK?
OK. So to run it, I'm going to put this in the chat. To run this, you're going to do this thing, this thing here, and that will give you the GUI. OK.
STUDENT: [INAUDIBLE]
RACHEL DENISON: We're good. OK. Incredible. First thing passed. All right, so let me now go back to the slides. Now, let's do the first simple exercise. We kind of already looked at this together, but it's just the idea of normalization across space.
So keep your attention off. And then just go ahead and use the GUI to show that nearby stimuli can suppress each other. And then let's report back: under what conditions does this happen? When do they suppress each other, and when do they not? So let's take just a couple of minutes to fiddle with that.
So yeah, does anyone have any findings about what's going on? When do these things start to suppress each other? What have you noticed? Yeah. Well, good. So when you move them closer, they both are decreasing.
And do you notice anything about at what point does this start happening in terms of when do they start decreasing? Is there any component of the model that you notice is important for that threshold from, like when they're not affecting each other to when they start decreasing? OK, yeah. Perfect.
So is everyone noticing that? Basically, when they enter each other's suppressive fields, they're going to start suppressing each other. It's pretty straightforward. OK. So I'm now going to show you how to be a little more interactive with this model, and also how you could start using the actual code, if you like.
So we already saw how to run this GUI. You can also put model options into the GUI call using this opts argument. opts is a structure that can take model options for the model parameters. I'll show you in a minute.
You can also, if you like, have this extra R output, which is the whole response of the whole population. And then, if you want to get a little more into the guts of this, you can go look at attention model 1D, which can also take this option structure.
And I should say, actually, that this code is modified from code that was released by Reynolds and Heeger for their normalization model of attention, which was attached to a paper, which I'll be talking about in a couple of minutes.
What I've done is-- they have kind of a 2D model with space and feature, and I'll show you that. But what I've done is just taken a 1D spatial slice through this model to make it easy to visualize and play with. So you'll notice that in this folder, there's a bunch of figure-generating functions and stuff like that. That's all from them, from their original normalization model of attention paper.
If you want to set model parameters with the code, you can make this options structure. So for example, if you want to set different stimulus amplitudes, you can put in a little vector with the stimulus one and stimulus two amplitudes. If you want to set the sigma, you can set the sigma like that. And then you just pass it as an argument, either into attention model 1D or into the GUI, and you'll see that it should change what's shown at the beginning.
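For example, with the caveat that the exact field names are whatever the toolbox uses (check the full parameter list she mentions next), it looks something like this:

```matlab
opts = struct;
opts.stimAmps = [0.2 0.8];  % amplitudes of stimuli 1 and 2 (illustrative field name)
opts.sigma = 0.3;           % normalization constant (illustrative field name)
```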
All the parameters that you might want to play with are listed in this attention model 1D function. So there's a whole long list of them at the beginning, and you can put any of those into the options structure. So now let's do something that's a little more involved. I'm trying to think in terms of time, whether we should do-- let's try it. Let's spend a few minutes and see how this goes.
So let's try to do this. Now, we've shown these different examples of contrast response functions. The fact that neurons actually have these saturating contrast response functions is a direct result of this normalization-type computation.
You can imagine a neuron having a different kind of contrast response function. Maybe it looks like a linear function of contrast, right, where maybe there's some rectified portion where there's no response, but then afterwards it's just linear. And that's like a lot of neural nets and stuff are built like that. It's just like boop.
But neurons are not like that. They have a nonlinear, saturating, sigmoidal contrast response function. And that is actually because of normalization. So what we can do is actually see that happening. For this exercise, the idea is: can we demonstrate this, just generate a contrast response function, from the GUI or probably from the code in this case?
Can you basically get a plot of a contrast response function by varying the contrast that you're inputting into the model? This would be in the stimulus amplitudes, stim amps. That's like the contrast of the stimulus. See if you can generate a contrast response function. And then, if you can do that, we want to ask: how does that sigma parameter actually control the response saturation?
So we think that sigma should help us shift this thing to the left and the right. That's the idea. This is the contrast response function exercise. And I'm giving you this tip, which is make sure to plot your responses on a log scale x-axis.
And if anyone needs help with that, we can go around and show how to do that. But OK, you want to give it a go, see if you can generate some contrast response functions. And I think I'll put this back up so you can see where to look.
Oh, one thing that might be helpful to say: for the contrast response function, we're interested in the peak response to each stimulus. So you can just look at this R max to get out that peak response.
So let me go back to these pictures of the contrast response functions. Let's think about what we actually want to plot. On the x-axis, we want some measure of the stimulus contrast. In the model, that's just going to be the stim amp parameter; that goes on the x-axis.
And on the y-axis is going to be the peak response of the model output, so you can use the R max for that. Or you can just click on the GUI, like, peak-- and I don't know, will that work? I'm not sure. Or you can just look at the max of the R output from the GUI and see what that is.
As you update the stim amplitude in the GUI, I think the R in the workspace should be updating, and you can look at the max value of that if you want to do it the GUI way. At this point, you can let the two stimuli be the same. We just care about the contrast response function for one of them.
I would make sure they're far enough apart-- by default, they should be-- so that they won't affect each other. You have to put in two, because it expects two stimuli. But then you can just look at the output of one, and see how it's changing with contrast.
STUDENT: [INAUDIBLE]
RACHEL DENISON: Oh yeah. So either you can use the GUI to just step through different levels of the contrast, record the maximum response, keep track of it, and then plot the contrast response function separately.
Or you can write a little loop using this attention model 1D, where you're changing the amplitudes and then getting out this R max at the different stimulus contrasts.
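A hedged sketch of that loop, using illustrative names for the function signature, options fields, and outputs (check the toolbox help for the real ones):

```matlab
contrasts = logspace(-3, 0, 20);      % from low enough to high enough
peak = zeros(size(contrasts));
for i = 1:numel(contrasts)
    opts.stimAmps = [contrasts(i) contrasts(i)];  % both stimuli, same contrast
    [~, Rmax] = attentionModel1D(opts);           % hypothetical signature
    peak(i) = Rmax(1);                            % peak response to stimulus 1
end
semilogx(contrasts, peak)             % log-scale x-axis, per the tip above
xlabel('stimulus contrast'); ylabel('peak response')
```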
STUDENT: [INAUDIBLE]
RACHEL DENISON: I didn't go over this well. So R is a vector, which is the whole population response. Yeah. And R max is the peak. Oh yeah, and when you're doing it, just make sure to use a big enough contrast range so that you really get the thing to saturate at the top.
Yeah. So someone on Zoom is asking about the contrast range. To get this full sigmoidal shape, you just want to make sure to have a contrast go from low enough levels to high enough levels. So if you're seeing that-- if you're seeing an increasing function, but it doesn't look like it's really fully sigmoidal, you might try increasing the range of the contrast that you're using.
I'm going to carry on so that we can get to the attention part. But hopefully this gave you a little chance to just play and see how you can start to change some of the parameters of the model. Did anyone get a sigmoidal function going?
STUDENT: Yes.
RACHEL DENISON: Super. All right. Excellent. And I saw the top half of one here. OK. So if anyone wants to, like I said, I did give an example of doing this contrast response function in that little rd demo function that's inside the toolbox as well, if you want to check that out at some point. It might be useful later.
So now let's go to attention. We talked about normalization: we know what it is, and we know how it works. How does attention interact with normalization? OK. Several models have been proposed in which attention interacts with normalization when modulating neural responses. There's one particular model, the Reynolds and Heeger normalization model of attention I mentioned, which has been a very successful quantitative model for explaining a variety of behavioral and neural effects. It's one of those rare models that made experimental predictions that were actually confirmed in the end. So this is actually a success story in neuroscience.
The key feature of this model is that attentional gain modulates the excitatory drive to a neuron before normalization. So if you're thinking like how can attention interact with normalization, it has to interact somewhere in this equation. Where should it be?
And this Reynolds and Heeger model basically looks like this. So again, we have some neuron's response. And here, instead of a spatial example, I'm giving you a feature-space example. So this is orientation. Imagine you have a population of neurons that are tuned to different orientations, and we're interested in the response of this yellow neuron here. That will be our neuron j from before.
So we're going to think about the response of this neuron as being, again, some ratio of an excitatory drive to some suppressive drive. And here the drive, again, is determined by the receptive field properties of this neuron. In this case, it's tuned to some location in space and it's tuned to some specific orientation. Here, it's a vertical orientation.
And then the new ingredient in this equation, compared to what we saw before, is attention. So attention is here. It's just a number, a gain that multiplies the excitatory drive. So that's what we mean by gain, when we say attentional gain. It's just like turning up the volume or turning down the volume on the excitatory response to the stimulus.
And then again, in the denominator, we have the suppressive drive, which is, just as before, the sum of the excitatory drives to this broader pool of neurons. In this case, the pool is neurons that are tuned to all the different orientations. So you can imagine neurons tuned across a pool of surrounding space, or you can imagine neurons tuned across a pool of surrounding feature space. In this case, the feature is orientation. So we have the neurons tuned to all the different orientations here.
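Schematically, the attended version of the equation is

$$R_j = \gamma \, \frac{A_j D_j}{\sigma + \sum_k A_k D_k},$$

the same ratio as before, except that each neuron's driving input $D$ is first multiplied by an attentional gain $A$. So attention modulates the excitatory drive before normalization, in both the numerator and the suppressive pool.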
And just to give you different examples of how this can play out in neural populations with different kinds of tuning: it maps onto exactly what we saw before. We have the stimulus and the excitatory drive, this D here in the numerator. We have the suppressive drive, which is the pooled excitatory response of the suppressive pool.
We have our sigma, our friend sigma, here and here. And the new ingredient up here is simply this attention factor, this attentional gain term, which is just multiplying the excitation to the neuron. And just to give you a sense, this type of model framework is pretty general. I've been showing you these 1D examples of just one dimension of space, or just one dimension of feature.
But you can imagine having, as they had in this paper, a spatial dimension and a feature dimension. Or even two spatial dimensions and a feature dimension, or multiple feature dimensions. So you can start making this a big, high-dimensional space. But the principle is always the same. You're going to have the stimulus drive in the space. You're going to multiply that excitatory stimulus drive by some attentional modulation, which can be specific to spatial location or feature or both, or whatever.
And then that determines the excitatory drive. You pool those excitatory drives across all the other neurons in the suppressive pool. Divide, and get out your population response at the end. OK? So this model is cashed out at a neural population level. We're thinking of the whole population of neurons. What's the activity of each of these neurons as a function of their receptive field center, their location, and their orientation preference?
So this is talking about the whole population of neurons. And in our GUI, our little toy, that's just a 1D population, so we have to think about neurons tiling that space. Also in the brain-- oh, and in the retina. Normalization really happens throughout the whole pathway, starting in the retina, the LGN, the visual cortex, up and up. It's going everywhere.
So you can think of this as a very general computation. But at this point, when we're talking about attentional modulation, this is going to be in the brain.
STUDENT: [INAUDIBLE] sound, like attending to a different tone?
RACHEL DENISON: Actually, it's a really good question. There's been less work on applying this model to the auditory domain. So I'm not sure if there's actually a lot of empirical data about this, but I think that would be the hypothesis, that there should be a similar kind of normalization, similar kinds of ways that attention could interact with that.
And so in the auditory domain, you might think of a frequency space, something like that, or some other spectrotemporal space. Yeah. Yeah.
STUDENT: [INAUDIBLE] question. When we're looking at a specific feature, the response of the neurons is represented in terms of that specific feature. Are there interactions when there are multiple features competing? Do you just extend the dimensions of those responses?
RACHEL DENISON: Right. So in the simplest case, you could just add dimensions, and then you would just need to specify the tuning of your neurons across those different dimensions. And then there are questions of how those receptive fields are organized. Are the receptive fields independent along these different dimensions, or are some dimensions interacting to determine the receptive field?
So those are choices that you would have to make about constructing the receptive fields. But once you've constructed the receptive fields, then you can really just run the model. You basically have to specify what contributes to the excitation of the neuron, and then what part of the feature space is going to contribute to the suppression of the neuron. So those are choices, and they're going to depend on the domain you're studying. Yeah. Yeah.
STUDENT: Did you rescale when you [INAUDIBLE]-- when you divide-- so first you do a convolution of the attention map with the raw input. Then you divide the convolution result by the result after. It looks like it is divided by itself from this part.
RACHEL DENISON: Yeah, that's right. I mean, that's right. So this is going to be part of the pool too. But this pool is like a broader-- I mean, so what you can see happening is this thing gets multiplied by attention. So this is going to increase the kind of brightness of this thing.
Then there's a pooling operation, and what the pooling operation basically does is smooth out this whole thing. The pooling here is equivalent to convolving this product with a filter that's integrating information from nearby neurons.
STUDENT: So it's a low pass filter?
RACHEL DENISON: Yeah. Yeah. You can think of it as just like applying like a Gaussian blur to this thing at the population level, kind of applying a Gaussian blur. That will effectively mean that the activity of that neuron over there now influences this location, because you've sort of smeared it, smeared it out.
STUDENT: So the image on the right would be the result of the image divided by itself after integration.
RACHEL DENISON: Yes, that's right. After this pooling, that's right. That's exactly what it is. These things multiply together to get the excitatory drive. Then you pool, or blur, smear, whatever you want to call it, to get the suppressive drive. And then you take that image, the excitatory image, divide it by the suppressive image, and you get out this final image, which is the response image.
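Extending the earlier 1D sketch with an attentional gain, the multiply-pool-divide sequence she's describing looks roughly like this in MATLAB (all values illustrative):

```matlab
x = linspace(-10, 10, 201);                % 1D space
D = exp(-(x+3).^2/2) + exp(-(x-3).^2/2);   % stimulus drive: two stimuli
A = 1 + 0.5 * exp(-(x-3).^2/2);            % attention field on the right stimulus
E = A .* D;                                % multiply: attention-modulated drive
w = exp(-x.^2 / (2*4^2)); w = w / sum(w);  % pooling kernel (the Gaussian blur)
S = conv(E, w, 'same');                    % pool: suppressive drive
R = E ./ (0.1 + S);                        % divide: normalized response
plot(x, D, 'k', x, R, 'm')                 % attended (right) peak comes out higher
```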
STUDENT: So the range is different. It is rescaled. But [INAUDIBLE] the same pattern is conserved.
RACHEL DENISON: That's right. And so what you see at the end is going to be rescaled because of this division. But in the end, what you see is two spots in, in this case, two different locations, with somewhat broad responses in orientation. But this spot on the right is brighter because it has been amplified by the attentional modulation.
And on this side, the attentional modulation is specific to the location and broad in orientation. It's like you're just attending to that location, to whatever orientation may be there.
STUDENT: So the luminance is the same as it is in the body. So how would you compare the cell before and after-- whether it is brighter or darker?
RACHEL DENISON: Oh, over here you mean?
STUDENT: Yeah.
RACHEL DENISON: Better to do that with the model. So let's actually do that. Let's do this little exercise about attention and normalization. Going back, we can just do this in the GUI. Let's go back to the GUI, and let's start playing with attention.
So again, it's that nm, opts, rd GUI command. I'll put that in the chat too, to get your GUI back. Turn attention on and off to see what it does.
And then we want to figure out: how does the effect of attention on each stimulus depend on the spatial proximity of the stimuli? And how does it depend on the size of the attention field? You can start playing with the attention size slider too, to make the attention field broader in space or narrower in space.
And you just want to play around with this: move around attention, move around the stimuli, see how they interact. And what I want you to try to get is an answer to the last question on this slide, which is: can turning on attention ever lead to suppression of a stimulus? And think about why or why not. You can play with this to see what happens.
STUDENT: Is it in the context of neighboring stimuli?
RACHEL DENISON: Yeah, yeah. Have a look, have a look. See what you can see. It's a little-- I think it's a little blurry with the Zoom share. But--
STUDENT: I also hate this figure. [INAUDIBLE] it's got like two dimensions on the far left. And then they collapse on those. It's a bad figure.
RACHEL DENISON: I know. It's kind of-- you kind of have to work through this figure.
STUDENT: Yeah. I think one dimensional is nice.
RACHEL DENISON: It's a little easier to--
STUDENT: Yeah.
RACHEL DENISON: Yeah.
STUDENT: But then you lose the feature amplification or whatever.
RACHEL DENISON: But that seems fine.
STUDENT: Yeah, yeah. [INAUDIBLE].
RACHEL DENISON: Yeah.
STUDENT: So, pooling customizability. Is the pooling customizable?
RACHEL DENISON: Is the pooling customizable? Oh yeah. I mean, you can't--
STUDENT: I want to use max pooling or mean pooling or average.
RACHEL DENISON: You can mess with the pooling. Yeah. The way the pooling is implemented in this model is very simple. Because we're working on this population image, basically, it's just: convolve it with some Gaussian, like I said, that effectively controls the size of the integration, controls the spread of the pooling. Yeah. But you can do anything. Yeah.
STUDENT: I think pooling is a way to actively lose information. So I wonder, does suppression also do the same thing in [INAUDIBLE]? Like, does suppression, or inhibiting a neuron, do the same thing: filter out information, in the sense of discarding some of the input while maintaining the relatively small differences?
RACHEL DENISON: Yeah. I mean, that's exactly the basic purpose of this normalization. It's like, let's not respond based on the absolute level of whatever's going on at this location. Let's respond based on the relative level of what's going on at this location compared to what's going on in the local neighborhood. Yeah.
STUDENT: Yeah. I'm not sure if it is really distorting the [INAUDIBLE], or giving it a lower rate.
RACHEL DENISON: Yeah.
STUDENT: [INAUDIBLE]
RACHEL DENISON: Yeah, so the division discards information in the sense that it removes some mean or local statistic. So that is gone after you do the division. But the relative variations around that mean will be preserved by the division. Any thoughts about this here? Yes.
STUDENT: I got it to work. I got it to suppress not the centered stimulus, but the adjacent stimulus.
RACHEL DENISON: All right, so you got it to suppress. And so you turned on attention, and the other stimulus was suppressed. Yeah?
STUDENT: Yeah.
RACHEL DENISON: And so what do you make of this?
STUDENT: I mean, it makes sense. Paying attention to something and something [INAUDIBLE] in another part of your visual field. Or we've all seen the dancing bear that moves through [INAUDIBLE]. It's like you do suppress other things. It comes at a cost.
RACHEL DENISON: It comes at a cost. Right. Yeah, so beautiful. So I think that's actually kind of a powerful result from this type of model: you can get attentional suppression without attention actually being inhibitory itself.
Attention in this model, as you can see, is only excitatory. It only amplifies the stimulus. The suppression, or the cost that you see at a nearby stimulus, is not because there's an attentional withdrawal. It's because there's an enhancement that then feeds into normalization, which suppresses the nearby stimulus more when the other stimulus is attended.
So this is an important idea that comes out of this type of model: once you put attention together with normalization, you can get a lot of these selective properties of attention without attention actually doing anything subtractive or inhibitory or withdrawing.
And I kind of just wanted to connect this, and maybe this would be a reasonable place to end, given the time. I have more stuff that we could talk about, but this idea is a really, really old idea in the attention field: the idea of attention as mediating biased competition.
So this is from the classic Desimone and Duncan review from '95, where they say that at some point, or several points, between input and response, objects in the visual input compete for representation, analysis, or control within neural circuits. But this competition is biased by attention-- that's what attention is-- toward information that's currently relevant to behavior.
So the idea is that, in the normalization model framework, normalization is a kind of computation that implements competition, and attention is a modulation that regulates the competition.
And so, I'll just let you know that this has been able to explain a lot of detailed findings in neurons and in behavior. This is just one example of showing how changes in the relative size of the attention field and the stimulus can lead to different changes in the contrast response function, either contrast gain or response gain under different regimes. So this is making good quantitative predictions about what neurons are doing and how attention is modulating these things.
There have been many more recent findings where neurons showing stronger normalization also show stronger effects of attention. The same goes for voxels: if we do this with MRI and pick the voxels that have strong normalization, they show stronger effects of attention.
So there seems to be a really tight link between these two processes, which is captured by this type of simple model. We don't have to go through all of it, but I do want to finish up by letting you know, just briefly, about some of the more recent work that we're doing on dynamics.
So I told you that we can also attend to points in time. And the world is dynamic. This model that we saw here is completely static. So that won't help. That's not good. I may skip some of this because I know we're running close to time.
But what we do is in the lab, we actually ask people to attend to different points in time. So now we have two stimuli that appear, one after the other. We say attend to the first one or attend to the second one. And we see behaviorally in many, many studies that when you attend to the first one, your perception gets better for the first one. But it gets worse for the second one.
And when you attend to the second one, your perception gets better for the second one, but it gets worse for the first one, compared to a situation where you're asked to sustain attention across both stimuli. So it's very interesting, because these are the same principles we see in spatial attention and feature attention, where you have stimuli at the same time, with benefits for attended stimuli and costs for unattended stimuli.
This also applies across time: you have benefits for attended times and costs for unattended times that are close enough in proximity. So these principles seem to extend not only across space, but across time. We've mapped this out psychophysically: you get benefits, the blue line, when you're attending, on valid trials, and costs when you're not attending, on invalid trials, in red. And the difference between valid and invalid trials has a kind of temporal kernel, where these trade-offs across time peak around 200 to 300 milliseconds and fall off by the time you get to about one second.
So there's some constraint on our ability to attend across time that operates on the order of a few hundred milliseconds. And we've been developing this normalization model of dynamic attention, as I said, to explain these types of temporal data. And actually, the fits to those data points are from the model output. So we can do that.
And this is important because there are not a lot of models that actually do dynamic stuff, but we have similar types of phenomena that we need to explain in the time domain. And the way we do this is very simple: we take the classic Reynolds and Heeger normalization model and just turn it into a dynamic model using a kind of differential equation, where we're updating the response of the model at every moment in time using the same normalization equation.
So now you get the input, you do the normalization computation, you update the response, and you move on to the next time step. And you just do that over and over. So now it's an online model. It's a continuous model that's continuously taking input, continuously doing these normalizations, and generating outputs all the time.
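A minimal sketch of one such update, using a simple Euler step (all names and parameter values here are illustrative, not the lab's actual code):

```matlab
x = linspace(-10, 10, 201);              % 1D space
a = 1 + 0.5 * exp(-(x-3).^2/2);          % attentional gain field (attend right)
w = exp(-x.^2 / (2*4^2)); w = w/sum(w);  % suppressive pooling kernel
sigma = 0.1; dt = 2; tau = 50;           % ms; illustrative constants
r = zeros(size(x));                      % population response, updated over time
for t = dt:dt:500
    stimOn = (t > 100 && t < 300);       % stimulus on from 100 to 300 ms
    d = stimOn * (exp(-(x+3).^2/2) + exp(-(x-3).^2/2));
    e = a .* d;                          % gain multiplies the drive
    s = conv(e, w, 'same');              % pooled suppressive drive
    rTarget = e ./ (sigma + s);          % normalization steady state
    r = r + (dt/tau) * (-r + rTarget);   % relax toward it each time step
end
plot(x, r)                               % response snapshot at 500 ms
```

Each pass through the loop is the same normalization computation, applied to the current input and nudging the response toward its normalized target with time constant tau.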
And with this type of model, we can actually run this same computation in all the different layers of the model: in sensory layers, in decision layers, because we need to relate this to behavioral data, and also in attention layers.
And you end up getting these time series that are ongoing over time. We can use those to read out from the model, make predictions for behavior, and also make predictions for neural activity. And so the last step, which we're doing now, is trying to build out this model so we can understand how this is actually happening dynamically, how attention is unfolding dynamically and interacting with these normalization computations.
So I want to just let you know that this model is also there for you if you want to play with it. And that's it. Thanks. Thanks, everyone. I'll put this back so you can-- [LAUGHS] Thanks, everyone. OK.