Large scale high temporal resolution effective connectivity analysis from MEG and the problem of inference
Date Posted:
May 8, 2019
Date Recorded:
May 6, 2019
Speaker(s):
David Gow
All Captioned Videos MEG Workshop
Description:
David Gow, Massachusetts General Hospital
slides (PPTX) and associated video file (MOV)
PRESENTER: So now it's time for the third talk about brain connectivity methods from David Gow. David Gow is the head of the Neurodynamics and Neural Decoding Group at the Massachusetts General Hospital and a professor at the Department of Psychology at Salem State University. He completed his PhD in the Department of Psychology at Harvard University and went on to complete postdoctoral fellowships at MIT's Research Laboratory of Electronics and the Department of Neurology at MGH.
His current work focuses on the development of tools for the analysis of MEG data and the use of these tools to understand the sources of phonological grammar. I'm sure it's going to be a very exciting talk. Phonological grammar escapes my strengths, but it's a very useful area, and I already had some interactions about David's research before. So I'm very excited to hear your presentation.
DAVID GOW: Hope it's good. So I'm not going to talk about my research today. If anybody is interested in grammar and phonology, I'll be giving a talk next Monday at 2:30 in Ted's lab. And you're all invited, probably to Ted's hours afterwards, and it'll be great. So thank you.
So I'm going to do a couple of things. I'm going to talk a little bit about effective connectivity. I am a psychologist. I'm not an engineer. I am not a developer in the traditional sense. So I'm going to tell you kind of why I do the research the way I do. And hopefully you will see some ways that this might connect to questions that you're interested in. And then I'm going to talk a little bit about the logic and assumptions of something called Granger causation, which you may have run into before.
And I'm going to kind of build it up from basic principles, and I'm going to try to describe a processing stream that we put together that's an attempt to implement Granger causation with some integrity. You may have heard a little bit about Granger causation. If you've heard about it in connection with BOLD imaging, you know that it has a reputation for being terribly unreliable. I'm going to try and build a good argument for why it's a reasonable application with MEG, and you can ask some good questions with it.
So as a psychologist, I was coming from this question of how different kinds of information are combined. And I study spoken language perception. And there's this sort of classic problem in spoken language perception. Give me a second. I'd like you to count the number of S's you hear. You're going to hear a bunch of words. Just count the number of S's at the beginning of them.
[AUDIO PLAYBACK]
- Sandal, sandal, sandal, sandal.
[END PLAYBACK]
DAVID GOW: Give that to you one more time.
[AUDIO PLAYBACK]
- Sandal, sandal, sandal, sandal, sandal.
[END PLAYBACK]
DAVID GOW: How many S's did you hear? OK. So let's listen to another series.
[AUDIO PLAYBACK]
- Shampoo, shampoo, shampoo, shampoo, shampoo.
[END PLAYBACK]
DAVID GOW: OK. And of course, it's all exactly the same sequence of fricatives on an S to "esh" continuum. But you hear it really differently, or at least you make really different judgments depending on the lexical context. And so the implication is that we use our knowledge of words to make sense of speech sounds. And that seems pretty obvious to us now, but this was a pretty contentious question, especially when this was first found at MIT.
[INAUDIBLE] was around when one of his students, [INAUDIBLE], discovered this. And [INAUDIBLE] said no, there can't be interaction of that form. So you can say that two kinds of information were combined, but you really probably shouldn't say that there's any kind of a top down lexical influence on speech perception.
So this idea that we should be careful how we interpret the combination of different kinds of information became fairly contentious, in my world at least. There was a paper by Dennis Norris and friends that basically reviewed a lot of different pieces of behavioral evidence that a lot of people thought seemed like fairly solid evidence that there were top down lexical influences on speech perception.
And what they did is they said, here's a simulation model that can either explain or explain away all the kinds of interpretations you'd had before, because you'd started out with this model. You'd said that there's some kind of a representation of words some place, some kind of representation of speech sounds. And in all of these tasks, we have you make a decision. And so you see arrows pointing up, arrows pointing down.
But what if the world works like this? What if you make sense of speech sounds, you make sense of words, and then when you're making a decision about what you heard, you're combining your information about words and sounds after the fact. So notice what's different here is there's no feedback to sounds. And so the percept of the sound, the basic underlying neural representation or underlying representation isn't changed.
Now, this is a tough inferential problem because of the tools that we had at the time. We had a bunch of behavioral tools, different tasks that you could do, different kinds of judgments about words or their meaning or their lexicality or grammaticality. And the problem was each one of these things examines a very tiny slice of what's going on. And to make matters worse, they all did it after the fact. So basically, I give you a problem. Somehow different kinds of information get matched together, and at the end you make a decision. And it is the decision that I'm measuring.
So it's kind of like a magic trick. The thing about magic tricks is generally the trick has happened before the person walked onto stage. They've sorted their cards. They've loaded their thumb tip. They've done whatever magicians do to fool you. Brains do the same thing. So if you were measuring how information combined after the fact, you can certainly-- you can show the combination of different kinds of information, but you can't show how.
One of the responses to this is people said, well, let's try BOLD imaging. So one of the early experiments here by Emily Myers looked at exactly the effect that I showed you kind of at the beginning. Here she's looking at a continuum between a gift and a kiss, between G and K. And what she found was on the trials where you show an apparent strong lexical influence on categorization versus ones where you don't, you get lots of activation in the superior temporal gyrus.
And people said, well, that's a pretty speechy area. There's not a lot of reason to think there's much going on lexically there. So this is consistent with our story because, well, what's happening is you're getting that this-- you're re-energizing whatever kind of representation you have in this superior temporal gyrus because of lexical information.
The problem with this is anytime that you're looking for some kind of top down effect, generally that's going to show up most strongly in cases where the stimulus is ambiguous. And in cases where the stimulus is ambiguous, you'd also expect superior temporal gyrus to work a little bit harder to make sense of the signal. So we had these results. We had this other way of approaching it. But the problem is, once again, there's more than one way to interpret it.
And this is a real problem. If there are lexical effects on speech perception, you should be able to get a lot of things out of this. This changes the way we think about the problem of speech perception, and it also changes some things that might not be as obvious, like where does phonology come from. But I'll get to that in a week.
So the question that we're faced with is, is there a better way of asking these questions? And what my group came up with, at least, is that we should be using effective connectivity analyses to discover functional architecture. So, not coming from a particularly brainy background but interested in the tools that might be used to do this, we had this fairly simple-minded approach, which was, well, here are the key components.
Here are the things that we think are part of processing for interactive versus feedforward kind of processing. And you'll notice it's exactly the same three components that I'm looking at in both of these. We can localize the components based on the empirical literature. So we would expect that speechy sort of thing would be associated with the posterior superior temporal gyrus, lexical information with the supramarginal gyrus, and posterior middle temporal gyrus, and decision-y stuff probably with cingulate and ventromedial prefrontal cortex.
So the difference between these two models now is they have exactly the same components. Which, by the way, if you're dealing with something which is temporally flat like fMRI, means they look the same. But if we can find the pattern of effective connectivity (and by effective connectivity, I mean directed information flow), it is a different pattern for these two models. So what we'd like to do is we'd like to see, well, which model holds?
So this led us to Granger causation. Now, there are a bunch of different models that are out there, different effective connectivity approaches that are out there. At the time that we got into this, the major competitor was dynamic causal modeling. In dynamic causal modeling, you start out with a model. Actually, you start out with a tremendous number of models, and you test patterns of correlation to see which would be kind of most consistent with whatever model you try to test.
The problem is this can only test very simple models, meaning very few parts. These become very computationally expensive very fast, because there's this hyper exponential explosion of model space as you introduce more and more elements. And so we wanted something that was data driven. I do not know the answers. If I knew the answers, I wouldn't be doing the experiment. So we wanted something data driven, and so we went to Granger.
Now, Granger actually has its roots at MIT. Norbert Wiener, who was kind of famous for haunting the Infinite Corridor for generations, cornering people and asking them questions, or actually not asking them questions, lecturing at them until they could escape, wrote about a couple of different intuitions about cause and effect. The first one is that causes precede their effects. So if you've got two events, say one that happens at time 1 and one that happens at time 2, and they're very highly correlated, you might make the assumption that whatever is happening at time 1 is causing whatever happens at time 2.
Similarly, if your child was vaccinated between birth and, say, 12 months and at 16 months they started showing signs of some kind of spectrum disorder, you might say, well, one went before the others, so clearly the vaccine must have caused the problem. This is not a really great logical tool.
And this is a problem because this is something that a fair amount of neuroscientists used to argue for one event causing another. Just simple temporal precedence. So we need to dig a little bit deeper. One of the problems here is what if we have multiple causation? So let's say the arrows are showing the actual underlying pattern of causation between a couple of different elements. And if we aren't looking at the green circle and we just know the blue thing happens before the red thing, even though there's no direct interaction between the blue and the red, we might assume that there was. So this is a problem.
And Wiener addressed this with a second intuition, which is causes carry unique information to predict effects. So I want you to imagine you're playing billiards. So here's the situation. You've got the cue ball you're about to hit. There are two balls here. There's a ball right next to a pocket here, the corner pocket. You smash into these two balls here. One of them peels off to the side. It just nips the thing in the corner and it goes right in the hole.
Now, if you wanted to predict if the ball was going to go in the hole or not, where should you be looking to make the best predictions? I would be looking at the ball that hits it. Now, you can make some prediction. Once any balls are moving, the chance of something going in the hole, pocket, whatever you call it, I don't actually play this game, is obviously increased. But the most information about what's going to happen next is going to come from the thing that causes it, the cause of the effect that you're interested in.
So let's imagine, then, that we've got the exact same situation that we talked about before. Now we're observing all three of these different interacting things. The arrows are showing the underlying model. And what I'm going to do is I'm just going to try and predict when the red event is going to happen, and I take out the blue. Have I lost any ability to predict the future of what's going to happen to the red? And the correct answer is, oh, hell no. We aren't going to lose anything.
But what if I took out the green? Assuming that the blue isn't exactly the same thing as the green, assuming that there is some probabilistic relationship between all of these things, if I take out the green, I've actually lost some predictive information. So what we're going to do, then, is we're going to try and use this to see cause and effect.
So the person who kind of put this all together was Clive Granger. He won a Nobel Prize for kind of related work. And he said, here's how you identify causality. First identify time series data from everything. I think he said everything in the universe. That may be a little bit of overreach. We'll get back to that. Then use all the time series data that you have to predict the future of one variable.
And as an economist, he knew one thing about models, which is predictions are always wrong. But the nice thing is they're quantifiably wrong. Go back now. Predict the future again, removing one variable from your model. You've measured everything in the universe. Now you measure everything in the universe except how much coffee is left in Dimitrios's mug at this point. This might be the causal factor.
PRESENTER: It's tea.
[LAUGHTER]
DAVID GOW: OK, this example no longer works. Tea? Jesus. OK. Making it work. OK. So in any event, if the tea was causing something, which is probably excessive sleepiness during an otherwise scintillating talk, and we took that out of it, then my prediction shouldn't be as good. That's the whole logic.
So turning this into some simple math. Let's say that we're doing this with just two factors, which by the way, is something that Granger would say from the very beginning do not do, because the first premise was we're going to measure everything. So you're losing an enormous amount of information if you just use a binary implementation of this.
We're making a model here. It's a restricted model because we've taken out one variable, and we're trying to predict the future of another variable. Now we make another prediction about the future state of one variable using all the data on the table. And I've got these two error terms at the end. And all I'm going to do is I'm going to compare these two. And this is going to give me a quantitative measurement of unique predictive information.
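To make that comparison concrete, here is a minimal sketch, not the lab's actual pipeline, of the two-variable case using ordinary least-squares autoregressive models. The function name, the model order, and the toy data are all illustrative.

```python
import numpy as np

def granger_bivariate(x, y, order=5):
    """Does the past of x carry unique information about the future of y?

    Fits two least-squares autoregressive models for y:
      restricted: y(t) predicted from past values of y only
      complete:   y(t) predicted from past values of y and x
    and returns the log ratio of their residual variances (larger means
    x carries more unique predictive information about y).
    """
    n = len(y)
    # Rows are time points, columns are lags 1..order
    Y_lags = np.array([y[t - order:t][::-1] for t in range(order, n)])
    X_lags = np.array([x[t - order:t][::-1] for t in range(order, n)])
    target = y[order:n]

    # Restricted model: past of y only
    beta_r, *_ = np.linalg.lstsq(Y_lags, target, rcond=None)
    err_r = target - Y_lags @ beta_r

    # Complete model: past of y and past of x
    design = np.hstack([Y_lags, X_lags])
    beta_c, *_ = np.linalg.lstsq(design, target, rcond=None)
    err_c = target - design @ beta_c

    return np.log(np.var(err_r) / np.var(err_c))

# Toy example: y is driven by the past of x, not the other way around
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
y = np.zeros(2000)
for t in range(1, 2000):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.3 * rng.standard_normal()

print(granger_bivariate(x, y))  # clearly positive
print(granger_bivariate(y, x))  # near zero
```

As the talk notes, a real implementation would condition on every measured source rather than just two; this is only the binary case Granger warned about, shown for clarity.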
So that's the logic of Granger and that's one of the reasons that I love this analysis is I think I understand it, which is not true of a lot of the things that seem to fall out of IEEE journals when I peruse them. The logic is relatively simple. The question, then, is how do you implement this with some kind of integrity?
So there are a bunch of different problems here that we have to deal with. And I'm just going to kind of go through these. The first one is having sufficient temporal resolution for time series analysis. And what I'd also like to do is I'd like to have sufficient spatial resolution for functional inference.
So you recall kind of our strategy from before: if I can associate this area with lexical representation, and it's causing changes in activity in another area that I'm associating with acoustic phonetic processing, now I've got a processing model. If I just have two sensors and I can't directly associate them with an underlying bit of cortex that I have some kind of independent functional characterization of, I can't do quite as much. So I need to pick my imaging modality carefully.
I also have to be careful about what I'm counting as my variables. So I need to measure everything. I'm going to scale back my expectations here. I'm just going to try and measure everything that appears to be doing something, so anything that is potentially causal. And I'm also going to eliminate all redundant signals. So if you've got two things that are measuring exactly the same thing and I remove one of them, I've lost no information. So I need to have some kind of principled way of getting rid of redundant information.
And there are some other assumptions. One is the signals have to be stationary. We'll talk about that in a moment. I need to keep measurement noise within manageable bounds. And this is a challenge for MEG, because we are dealing with an inherently noisy signal. And I need to use methods that give us strong predictions. So I'm going to solve these by having a specific way of imaging, being careful about our ROI selection, and using a Kalman filter technique, which I'll talk about in a moment, to make predictions.
And we put this all together in a processing stream. I'll get back to that in a moment. But this is something which I would love to share with you. So if you guys find this way of asking these questions useful or powerful, we would love to help you make it work for your research as well. My RA's kind of shaking a little bit now, but she's about to leave and go to graduate school, so it'll be someone else's problem to help solve the problem.
OK, so let's talk about the imaging considerations. So we've got a bunch of different ways that we can think about the relationship between-- measure what's going on in the brain. First thing we need to do is we need to measure everything. And unfortunately, one of the first things that eliminates is ECoG. The problem with ECoG is because we can only record from a relatively small chunk of brain, we're going to be losing a lot of information.
Now, it doesn't mean that you can't do Granger on ECoG data. We've done Granger on ECoG data in the past. But there's a little asterisk there, because it could be there are all sorts of dynamics that you aren't measuring that are affecting what you're doing, and that's a serious consideration. OK. You're also going to rule out lesion studies, because they're not going to cover much.
A more important, or another important, consideration is we need to have sufficient temporal resolution to be able to model and make good predictions. Predictions are inherently temporal. So we need to have some kind of a measurement technique that has fine temporal grain. And that means no PET, and that means no fMRI, and that kind of drives us towards MEG and EEG. And if we've got enough sensors and we're using powerful source localization methods, then what we'd like to be able to do is not just describe time series in some detail, but also tie them to functional units.
And the way that we talk about it, kind of our term of art (maybe different MEG labs have different terms of art for this question), is that MEG does not localize as well as fMRI. And it's not going to. But what it will do is it will get us more or less within a Brodmann area. So at least it gives us enough localization power for us to make some connection with the localization literature.
So here's kind of what this combination of spatial and temporal location looks like. So we're looking at very slow motion of, I don't know, 350 milliseconds of somebody hearing a syllable. And what I want you to notice is there's a lot going on. So that's good. We've got a signal that's got a lot that we can kind of grab onto with our analysis. So that's the reason that we use the approach that we do.
We use simultaneous MEG and EEG. The localization that we get by combining these two things is a little bit better than we would get if we were just using one modality. Part of this is the fact that this allows us just to cram in more sensors or make more measurements. And that's going to help us with the solution. There's also the fact that they have complementary sensitivities to things like the orientation of tissue.
So when we're identifying ROIs, here's the approach that we've come up with. What we'll do is we will come up with some measure of average activation. And then what we're going to do is we're going to just pick points that seem to show larger changes in activation than other places, relative to baseline. These are going to be our potential centroids.
And then what we're going to do is we're going to eliminate redundant ROIs based on time series comparisons. So if I've got two different centroids, I'm just going to look at the distance between them, the difference between them over time, and I'm going to set some kind of a standard for how much redundancy can I live with. And that's going to allow us to prune our centroids. And so we're moving towards kind of more manageable numbers.
Once we've got these centroids which are sufficiently different, now the question is how do we characterize the spatial extent? Because the centroids were single dipoles based on our MNE reconstructions. So what we're going to do now is we're going to grow out. So we're going to go to kind of adjacent vertices, and we're going to do comparisons, again, just looking at the difference between time series information from these adjacent dipoles, see how different they are, and now what we're looking for is we're looking for similarity.
So we'll basically grow them out as long as they're similar within some reasonable bounds. And we have not done any kind of formal optimization around this. We've been using this technique for, I don't know, about 10 years now. And basically we've got some settings which we don't tend to play with, because they tend to be pretty reliable. So that's how we get ROIs that meet our standard of being relatively exhaustive and non-redundant.
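As a rough sketch of those two steps, pruning redundant centroids and then growing ROIs out over similar neighboring dipoles, here is one way it could look in code. The correlation-based similarity measure and the thresholds are illustrative stand-ins, not the lab's actual settings.

```python
import numpy as np

def prune_centroids(timeseries, candidates, max_corr=0.9):
    """Keep only centroids whose time courses are sufficiently different.

    timeseries : dict mapping vertex id -> 1-D activation time course
    candidates : vertex ids, e.g. ordered by activation strength
    max_corr   : illustrative redundancy threshold
    """
    kept = []
    for v in candidates:
        redundant = any(
            abs(np.corrcoef(timeseries[v], timeseries[k])[0, 1]) >= max_corr
            for k in kept
        )
        if not redundant:
            kept.append(v)
    return kept

def grow_roi(seed, timeseries, neighbors, min_corr=0.8):
    """Grow an ROI outward from a seed vertex over similar adjacent vertices.

    neighbors : dict mapping vertex id -> iterable of adjacent vertex ids
    min_corr  : illustrative similarity threshold for inclusion
    """
    roi, frontier = {seed}, [seed]
    while frontier:
        v = frontier.pop()
        for nb in neighbors[v]:
            if nb in roi:
                continue
            if np.corrcoef(timeseries[seed], timeseries[nb])[0, 1] >= min_corr:
                roi.add(nb)
                frontier.append(nb)
    return roi
```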
So the next step is this question of stationarity. So stationarity means, are there changes from moment to moment in terms of central tendency and variance? So here's a very non-stationary signal. And what this is doing is this is plotting the number of people flying in airplanes. And if you look at this, you might see that there's kind of a seasonal variation that gives us the sawtooth structure, and it's also generally increasing.
So here's a really dumb prediction that you could have made at the beginning of this chart. More people will be flying in the future. Or similarly, if in 1929 you thought, OK, now is the time I'm going to buy stocks. I'm just going to hold on to them for, I don't know, 70 years. It's going to turn out pretty well for you.
So prediction under these circumstances isn't great. It isn't terribly meaningful. And so for these reasons, this kind of prediction or prediction based on the modeling of non-stationary signals, is a problem. Now, anytime that you are putting something in front of a person and having them react to it, they are going to respond in a non-stationary way.
There are a couple of different strategies for this. One is to transform the data. You could demean it. You could difference it. The problem with this is the more that you change it, the more that you mess with it to make it meet the standards of stationarity, the more you've weakened your ability to have strong inferences about what it actually means.
Another strategy is that you can use small moving windows. So I'm just going to model this chunk of information and then this one and then this one. And the problem with that is, although the signal may be close to stationary within each little window, if you're trying to model something using just this tiny window, you've got a small amount of data and a lot of parameters to play with. So you've got a significant risk of overfitting.
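For concreteness, here is what those two strategies look like in their simplest possible form; the window size and step are arbitrary illustrations.

```python
import numpy as np

def demean(x):
    """Remove the mean, one simple transform toward stationarity."""
    return x - x.mean()

def difference(x):
    """First-difference the series; differencing again removes slower trends."""
    return np.diff(x)

def sliding_windows(x, width=100, step=50):
    """Split a series into short windows for local modeling.

    Each window is closer to stationary, but it also offers few samples
    relative to the number of AR parameters, which is the overfitting
    risk described above.
    """
    return [x[i:i + width] for i in range(0, len(x) - width + 1, step)]
```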
So we were really worried about the stationarity problem. And so we're trying to look at different solutions that people had to it. And the one that kind of impressed us the most was the use of something called the Kalman filter, which, by the way, also has its seeds in the work of Norbert Wiener. He used kind of similar ideas to try and shoot down airplanes during one of the world wars. And the idea is that if you're trying to shoot an airplane down now, you need to predict where it's going to be in a moment. And so you could model kind of how to predict where it's going to be in a moment.
Kalman filters are used pretty widely in a lot of different kinds of engineering. If you're trying to make a shuttle or a rocket ship connect with the space station, you use Kalman filters to do this. And what they do, basically, is they model the way that the system works. And then you give a little input to it. So maybe you fire a rocket someplace and you predict what that should do for you and you're going to be wrong and then you adjust your model slightly.
So basically, what happens is you start out-- when you're building a Kalman filter model, you start out with basically a randomized model. So you've got kind of randomized coefficients for everything to make a prediction. And you make those predictions better. It's a least-squares solution. So you make them fit better and better and better. And then at each moment, you basically just nudge the model. You tweak it a little bit, which gets rid of our stationarity problem.
When you do this, part of the Kalman filter is the Kalman filter doesn't treat all measurements or observations equally. And this is a great thing for us in MEG, because what it does is, if you've been measuring things and the values are 1.5, 1.3, 75, 1.3, that 75 probably was not a great measurement. That probably was noise. And so what the Kalman filter is going to do is, when you've got a measurement that's kind of unusually large or kind of does not fit with the others, you give it less ability to move your model around.
So this is a way of addressing the inherent noise of the measurements. And so then you're tweaking it as you go. And again, it's making a model. Because it's making a model, there's going to be an error term. And the fact that it kicks out the error term gives us something that we can work with in our Granger model.
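The talk doesn't spell out the implementation, but one standard way to realize this idea is an adaptive autoregressive model whose coefficients are the Kalman filter's state, nudged a little at every sample. Here is a single-channel sketch; a real analysis would be multivariate over all ROIs, and the noise settings are purely illustrative.

```python
import numpy as np

def kalman_ar_errors(y, order=5, q=1e-4, r=1.0):
    """Adaptive AR model whose coefficients are tracked by a Kalman filter.

    The state is the vector of AR coefficients, allowed to drift slowly
    (process noise q), so the model is re-fit in a tiny way at every
    sample instead of assuming stationarity. Returns the one-step
    prediction errors, which are what a Granger comparison of restricted
    vs. complete models needs. q and r are illustrative settings.
    """
    w = np.zeros(order)             # start from an uninformative model
    P = np.eye(order)               # uncertainty about the coefficients
    errors = np.full(len(y), np.nan)
    for t in range(order, len(y)):
        h = y[t - order:t][::-1]    # regressors: the most recent samples
        P = P + q * np.eye(order)   # predict: let the coefficients drift
        e = y[t] - h @ w            # innovation (one-step prediction error)
        S = h @ P @ h + r           # innovation variance; larger r means
                                    # each observation nudges the model less
        K = P @ h / S               # Kalman gain
        w = w + K * e               # update: nudge the coefficients
        P = P - np.outer(K, h) @ P  # shrink uncertainty along h
        errors[t] = e
    return errors
```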
So then we're just going to turn this into a number. So we're going to calculate something called the Granger causation index. And this is just the log ratio of the standard error of the restricted model to the standard error of the complete model. And what I want you to notice is, because this updates at every millisecond, we now have an instantaneous measure.
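Written out, with sigma denoting the standard error of each model's prediction at time t as just described, the index from one area x to another area y is:

```latex
\mathrm{GCI}_{x \to y}(t) \;=\; \ln\!\left(\frac{\sigma_{\text{restricted}}(t)}{\sigma_{\text{complete}}(t)}\right)
```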
So in kind of earlier implementations of Granger, you were saying, OK, for this 200 milliseconds, what's going on? We're doing this millisecond by millisecond. And when we do this, we're finding an interesting kind of a spiky structure to this. We're getting patterns which are pretty complex.
Another great thing about this is this allows us to look at much larger networks than people had looked at previously. So if you were doing dynamic causal modeling, you have a practical limit on how many variables you can put into your model. So even if you don't look at kind of all the potential models in source space, or, I'm sorry, all the potential models of how different parts of the brain might interact, you're still going to be stuck with maybe five to seven elements as kind of the outside of what is a practical model that you can do here.
Using Kalman filters, we're able to look at systems with more than 50 different moving parts, which means that we're dealing with an enormous amount of complexity. So here's a visualization of kind of a Granger analysis. Again, the same few seconds of someone-- or the 350 milliseconds of somebody listening to a single syllable. But what I want you to notice is there's an enormous amount of structure.
If you're building a psychological model that's going to be interpretable, I've just ruined your day, because you've got so much to understand. I like to think of this as job security. You might think of it as a headache. But if you want to measure kind of human abilities, I think you want the models to be complicated. You want there to be a lot going on.
And so our strategy has been so far to basically look at tiny pieces of the model rather than trying to understand the whole thing at once. So you've got an enormous amount of signal to grab on to. Happily, by the way, these things seem to replicate pretty well.
One of our strategies, then, for reducing the amount of data that we have is we'd like to be testing hypotheses all the time. So oftentimes our hypotheses have to do with these top down lexical influences on speech perception. But if we're going into something cold, we'd like the data to tell us what's going on. So we can use some very simple tools from graph theory to do this.
So for instance, here in these bubble charts, the size of the circles is basically a cumulative measure of the strength of Granger causation over some period of time. So these on A, the size of the circles is showing us, basically, where the most information is being integrated over time. And what you can see is there are a lot of significant hubs here. Now, we can kind of scale this different ways to kind of pull apart just how big the differences are.
But one strategy you might have is we'll say, OK, where is the information pooling or what areas are influencing lots of other areas or exerting a lot of control on the system in general? Then we can go and we can do another kind of analysis. We'll say, OK, here's the ROI that we care about, the one shown in green. And now the size of the bubbles are showing us the effective influences on that area.
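In graph terms, that amounts to summing the Granger index over a time window and then over each node's afferent or efferent connections. A minimal sketch, with the array layout as an assumption:

```python
import numpy as np

def hub_strengths(gci, window):
    """Summarize a time-resolved Granger network as node strengths.

    gci    : array, shape (n_rois, n_rois, n_times), gci[i, j, t] being
             the Granger index from ROI i to ROI j at time t
    window : (start, stop) sample indices to accumulate over
    Returns (out_strength, in_strength): how much each ROI influences
    others, and how much it is influenced by others, over the window.
    """
    start, stop = window
    W = gci[:, :, start:stop].sum(axis=2)   # cumulative influence matrix
    np.fill_diagonal(W, 0.0)                # ignore self-connections
    out_strength = W.sum(axis=1)            # row sums: efferent influence
    in_strength = W.sum(axis=0)             # column sums: afferent influence
    return out_strength, in_strength
```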
We've come up with different ways of charting this also. We can do this. Here's a way of looking at the relationship between an afferent and an efferent. So we're looking at the relationship between SMG and STG. And we can basically see them at the same time. So basically, the measurements are reflected here. So just deviation from the center is telling us the strength of Granger causation over time. So we can actually see here an interaction between two areas over time.
Another thing that we're able to do is we're able to make comparisons between experimental conditions. We can do some basic inferential statistics, which is useful. So what we'll do is we will look at two different experimental conditions. And the technique that we use, which we stole from genomics, is we will bootstrap some p-values for every moment in time for the strength of the [INAUDIBLE] of the Granger causation measure. So basically, we'll generate some null distributions running thousands of trials.
And then we'll just basically assign a p-value by where the actual value sits relative to this distribution. And then we'll just count them in different conditions. So we might say, OK, here are 200 milliseconds that we care about, a particular window. In one condition versus another, we'll just count up the number of points that exceed whatever criterion. Usually it's a 0.05 p-value. And then we can do statistics that will allow us to say something about the difference between these two conditions. So this allows us to use kind of traditional experimental designs to ask very, very focused questions about this very large and complex network.
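A stripped-down sketch of that counting step might look like the following; the array shapes and the window are illustrative, and building the null distribution (trial shuffling, surrogate data) is the bootstrapping part described above.

```python
import numpy as np

def pointwise_pvalues(gci, null_gci):
    """Empirical p-value at each time point.

    gci      : observed Granger causation index, shape (n_times,)
    null_gci : bootstrapped null values, shape (n_resamples, n_times)
    """
    # Fraction of null values at least as large as the observed value,
    # with a +1 correction so p is never exactly zero
    return (np.sum(null_gci >= gci, axis=0) + 1.0) / (null_gci.shape[0] + 1.0)

def count_significant(pvals, window, alpha=0.05):
    """Count supra-threshold time points inside a window of interest."""
    start, stop = window
    return int(np.sum(pvals[start:stop] < alpha))

# The counts from two conditions over, say, a 200 ms window can then be
# compared with a conventional statistical test.
```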
So here's our network as it stands today, or our system as it stands today. One of the things we're trying to do is we're trying to steal everything that we possibly can from Dimitrios. So if you think about this model, once we've got the ROIs here, now our question is, OK, we see how they-- we can tell how they relate by playing through the rest of the Granger processing stream.
What's actually going on in these? So we're building off ramps so that we can take time series from bottom up to find ROIs and do neural decoding. So far we've been doing mostly SVM techniques to allow us to interrogate what's going on.
So for instance, if we're looking at lexical effects on speech perception, well, the question is, well, what is the lexical representation? We've got these chunks of brain, which for a variety of reasons we think are probably sensitive to lexical information. Do they tell us things like, do syllables repeat?
It turns out that this is a fairly critical question in some debate. So what we can do is we can actually do this. We can find a representation where we can see if we can pull these things apart in a lexical area, and it turns out we can. And so now we've got this basis for generalization based on this very abstract pattern of words.
Or another thing that we're playing with now is looking at words that are similar, kind of neighborhoods. So there are a number of different predictions from different simulation models that suggest that you get these gang effects where you've got a lot of words which are kind of slightly activated because they overlap with the input to some extent. And we think that they drive things like the processing of non-words, just because there's overlap in them.
So what we can do now is we can find features. An important part of neural decoding is always to kind of come up with a good set of features. We'll take an ROI that we think is important, that's doing some kind of work. And if we do a principal components analysis, we can just kind of list down which features, which are going to be these time-varying signals themselves, best allow us to characterize a particular representation.
And so now we're going to predict that these features are going to have stronger Granger effects on our speech area than features taken from exactly the same area that are kind of low value, low variance in the principal components analysis [INAUDIBLE]. So here's a way that we can actually circle this together and kind of put together these different kinds of processing.
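Here is a minimal sketch of that feature-ranking step using an off-the-shelf principal components analysis (scikit-learn here); the array orientation, the number of components kept, and the function name are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

def roi_features(roi_timeseries, n_keep=5):
    """Split an ROI's signals into high- and low-variance components.

    roi_timeseries : array, shape (n_signals, n_times), e.g. the dipole
                     time courses within one ROI
    Returns the component time courses that explain the most variance
    (candidate high-value features), the remaining low-variance ones,
    and the explained variance ratios.
    """
    pca = PCA()
    scores = pca.fit_transform(roi_timeseries.T)  # (n_times, n_components)
    top = scores[:, :n_keep].T                    # high-variance features
    rest = scores[:, n_keep:].T                   # low-variance comparison set
    return top, rest, pca.explained_variance_ratio_
```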
So a couple of things come out of this. One, there was some commentary written about one of our papers. And Mike Spivey came up with my favorite line. He said that our work showed that brain activity, or our data, were painfully realistic. So there's a lot of complexity there. And this is, I think, one of the next challenges that we have in general, which is that we are so good as scientists at carving the world up into tiny, tiny little pieces and understanding those little pieces, and this forces us to look at them in kind of a broader context. So there's at least a potential to embrace a kind of complexity which we haven't so far, I think.
There are some challenges about the use of MEG. Not so much here at MIT, but a lot of places. Well, actually, one kind of universal problem is coverage. So we talked about subcortical sources or deep sources. And so there's some question whether we're capturing those well. And so there's always a potential if we are not characterizing them well or we've mis-localized them that we might be missing some dependencies. So that is a challenge. The spatial resolution isn't as great as we'd like it to be. I'm going to leave it to the smart people to make that better.
And then there's the problem of cost and access. And when I submit grants to do this work, one of the things that people often say is, well, there aren't a lot of places that have MEGs. I think when last I checked, there were 103 different MEG centers in 24 different countries, and I think the number has gone up since then. So there are a lot of opportunities that are out there, and there are always opportunities to piggyback on institutions other than your own to do this. So I think it's more available than people think it is.
But the practical problem is when we're doing this elimination of redundant sources, one of the questions is do these sources actually do any work? Maybe we're eliminating the wrong one. So this is one of the problems that I lose sleep over. We don't have a really good solution to it yet. But it's an area that we're thinking about.
And kind of the big question in a lot of this is, how do you interpret those arrows? Is this the flow of information, which formally is what Granger causation should be capturing? Are we looking at control processes, attention, perceptual learning? All of these have been raised as potential counter-interpretations of some of our results. All that Granger tells us is that information in one area is changing information or changing representation in another. How you interpret that functionally is another big question, and one that I'm looking forward to working on.
So that's what we have. If you're interested in using our processing stream, well, you can get it just by pointing your camera at the-- assuming that you can run Linux on your phone and yeah. And these are some of the people who helped out. So thank you very much.
[APPLAUSE]