30 - Resting State & InstaCorr: Part 1 of 2
February 15, 2019
June 1, 2018
Rick Reynolds, NIMH
All Captioned Videos AFNI Training Bootcamp
For more information and course materials, please visit the workshop website:
We recommend viewing the videos at 1920 x 1080 (Full HD) resolution for the best experience. A lot of code and text is displayed that may be hard to read otherwise.
PRESENTER: But we'll do resting state analysis. And we'll bounce around a little during this. I think I'll just run through these slides very lightly. And then maybe we can actually just start a little resting state analysis on our laptops. And we're also going to talk about the InstaCorr stuff and the other Insta features. Those are in the AFNI GUI.
So resting state activity: BOLD signal fluctuations during undirected brain activity. The biggest thing to note about resting state with respect to a task, say, is that the resting state has no designed signal. There's no directed activity. There's no expected BOLD response, so we don't know what to predict.
And so the data in a resting state analysis basically all looks like noise, plus some artifacts. And we don't really know-- we don't have any model for what is appropriate and what is not.
So that means when you start running correlations, you can get correlations for anything in the data. So you try to clean it up to some degree and remove what you might consider likely noise sources. But you can only do so much there. So here's an example of doing resting state analysis on the surface. And note that both AFNI and SUMA can produce correlation images in the GUIs.
So it was originally noted by Bharat Biswal in '95, but it didn't take off until many years later. A lot of these slides are basically complaining about issues we have to worry about with resting state analysis. In particular, we get into global signal regression later. But I think for the most part you here have a decent idea of what we care about and what we don't, so I'm not going to spend too much time following all these.
Again, the basic key is that there's no model for the signal. People are doing different types of related analyses to alleviate that to some degree-- for example, showing movies instead of just having rest. People watch the same movie, so there is some temporal synchronization there across your subjects. But again, for here: motion can lead to a lot of variance in the data, and that can lead to correlations and group differences. You don't want that.
CO2 has a big effect as well. If the subject starts breathing deeply or something like that, that can produce large signal changes, and that can dominate the correlations too. So there are a handful of papers that have been written over the years about sources of bias and error. Head motion, for example-- well, people started talking a lot about head motion in 2012. Head motion was dealt with all the time before then, but suddenly it became a big deal with the paper written then telling people you should be censoring and things like that.
Though, really, we've had censoring in our linear regression since 1988. It's not really a new thing. The hard part, which people really hadn't been doing, is censoring in the context of band passing. Band passing was a typical pre-processing step. Censoring was a typical pre-processing step. But basically no one was doing the two at the same time. Now we can do that. We'll get to that in a bit.
So head motion, of course, can cause correlations. Again, one of the concerns that came out in 2012 was, say you're comparing a patient group to a group of controls in some sense, and for some reason the patient group is much more prone to motion. That's a bit of a concern. Even if you do censoring, you want to censor at comparable levels across your groups, but one group is moving much more.
It's easy to get motion correlations in the result of that. So you just have to be careful. And again, as always, look closely at your data and see if artifacts seem to be creeping in. Respiration and cardiac cycles-- the Glover RETROICOR work covers a lot of that. Non-stationarity of those is a concern overall. For example, you do band passing to remove sinusoids at some frequencies. But BOLD responses, even if they tend to be dominant in some frequency band, span all the frequencies. And they happen whenever-- they're irregular. So they're not stationary either.
So there are a lot of little concerns. But on top of all these concerns, it still seems to work fairly well. So that's nice. Hardware instability-- that's where Hang Joon Jo and the ANATicor pre-processing method came from. Anatomical bias. Pre-processing is mentioned in here mostly because back then-- and probably still, but particularly back then-- people made a lot of mistakes in the pre-processing, and those mistakes really had a much bigger effect on resting state analysis than they might on a task analysis.
So I think that's why it's included in this slide. So here's an example of how you might pre-process. I don't know if we'd call this recommended anymore-- I don't know that we have an exact recommended method for resting state-- but de-spiking is often done early on. That seems to help. For one, spikes are bad in the data because most of our computations are evaluated in terms of a sum of squared differences. So spikes make a big difference.
So de-spiking might be done first. That actually can help motion correction a little bit too. So if you have subjects that move a lot, de-spiking early on might even make the motion estimation, the volume registration, a little cleaner. Slice timing correction: again, even with a TR of two or three seconds, you've got adjacent slices that are half a TR apart, and across the volume they're almost a whole TR apart.
If you want correlations, a two or three second shift in the time series data makes a big difference. If you take two sinusoids that are perfectly correlated and you shift one by a little bit, you may make the correlation zero. That's why sine and cosine are uncorrelated. They look identical, but there's a little temporal shift, and now there's no correlation. So you do have to worry a little bit about that. So we typically do the slice timing correction. It involves a temporal interpolation, but at least you get the timing in some sense more synchronized across your volume.
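That sine/cosine point is easy to check numerically. A quick sketch, plain NumPy and purely for illustration:

```python
import numpy as np

# Two unit-frequency sinusoids sampled over a whole number of periods.
t = np.linspace(0, 2 * np.pi, 400, endpoint=False)
s = np.sin(t)
c = np.cos(t)  # identical shape, shifted by a quarter period

# Pearson correlation of sin and cos over full periods is zero.
r = np.corrcoef(s, c)[0, 1]
print(round(r, 6))  # 0.0

# A modest shift already erodes the correlation: corr(sin t, sin(t+d)) = cos(d)
d = np.pi / 3
r_shift = np.corrcoef(np.sin(t), np.sin(t + d))[0, 1]
print(round(r_shift, 3))  # 0.5 (= cos(pi/3))
```

So a temporal shift of a quarter period is enough to take a perfect correlation all the way to zero, which is why slice timing differences matter for correlation analyses.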
Then, of course, motion correction, alignment with the anatomy, and spatial normalization to the template. Again, we do those in one step so you don't keep blurring your data. And once you do that, you might do some spatial smoothing. We have slides that talk about the impact of that, and then the linear regression. In the linear regression, people often include tissue based regressors-- white matter regressors.
In our opinion, the no-no is global signal average regressors. We might also include, for example, if you make a ventricle CSF mask, the top three principal components as regressors of no interest, and that might help deal with some of the physiological noise.
Those regressors should be extracted after the volume registration step, and importantly, not after smoothing, right? Once you smooth the data, you're blurring the stuff all over-- you're basically doing global signal regression at that point. So we extract tissue based regressors before any blurring. And then in the regression model to clean up your data, you've got the nuisance regression-- maybe the motion parameters, maybe their first differences, maybe the tissue based regressors-- and then you have censoring.
If you have some time points where it looks like the subject was moving-- and you estimate those with the motion parameters or with the outliers-- you can censor, and you can do band passing. You can do this all in one step. Band passing is done in most packages as an atomic operation, and you can do it that way in AFNI with 3dBandpass, for example. But don't do that. It's nice, it's fast-- it's a fast Fourier transform.
We use the slow Fourier transform. We don't do it fast because if you do it fast, you can't censor, and you can't account for the degrees of freedom. So we actually create a list of sinusoids that are the band pass filter terms, and we put them in the regression model, and now it's all in one model. We're accounting for everything. We can censor. We can keep track of our degrees of freedom, and that's a nice way to do it. If you're stuck in some situation where you can't do it all at once, you can band pass separately, but then you should also band pass the regressors that you're projecting out.
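The "band pass as regression" idea can be sketched generically: build sine/cosine nuisance columns at every frequency outside the pass band and put them in the design matrix, so that censored time points can simply be dropped as rows. This is a plain NumPy illustration of the concept, not the AFNI implementation; the function name is made up here:

```python
import numpy as np

def bandstop_regressors(n_t, tr, f_low=0.01, f_high=0.1):
    """Sine/cosine nuisance regressors at every DFT frequency OUTSIDE
    the pass band [f_low, f_high].  Projecting these out of the data
    band passes it, but inside a regression model, so censoring and
    degrees-of-freedom bookkeeping still work."""
    freqs = np.fft.rfftfreq(n_t, d=tr)          # 0 .. Nyquist = 1/(2*TR)
    kill = (freqs < f_low) | (freqs > f_high)   # frequencies to remove
    t = np.arange(n_t) * tr
    cols = []
    for f in freqs[kill]:
        cols.append(np.cos(2 * np.pi * f * t))
        if 0 < f < freqs[-1]:                   # sine is degenerate at 0 and Nyquist
            cols.append(np.sin(2 * np.pi * f * t))
    return np.column_stack(cols)

X = bandstop_regressors(200, 2.0)  # 200 time points at TR = 2 s
print(X.shape)                     # roughly 60% of 200 columns, as in the lecture
```

Each column costs one degree of freedom, which is exactly the bookkeeping the one-button band pass hides from you.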
Many of these things were not done well by most of the papers published telling you how to do it well. And then once you've done that, basically you're projecting out the things you don't want in the data-- the things that you consider to be known sources of noise in some way-- and then you can do your correlations, if that's what you're doing. Note that if you're going to do some sort of ICA or something like that, then you probably won't do these steps the way we do them.
But if you're just going to do a correlation, like a seed-based or ROI based correlation, then this is a nice model.
AUDIENCE: [INAUDIBLE] because they did exactly these steps, and then, after that, [INAUDIBLE].
PRESENTER: The people that write the ICA software usually expect their software to handle all of the things that you should be removing. So they often don't want everything projected out. They think, we'll handle the motion. And particularly de-trending-- actually, de-trending can possibly throw it off if they don't expect de-meaning, but that depends on the ICA software. So you kind of have to discuss what steps are acceptable with the people writing the software, because that'll vary.
AUDIENCE: It has ICA?
PRESENTER: We have historically stayed away from that, but we do have the [INAUDIBLE] ME-ICA package-- that's multi-echo ICA. So if you're collecting multi-echo data, we do have an ICA technique for that, [INAUDIBLE]. The reason we're slightly more comfortable with that is it at least uses the multiple echoes to detect which time series seem to be more of a BOLD effect, in which case you'll have a decay of the magnitude across echoes. In the cases where it's not a BOLD effect, you won't have that exponential decay. So that gives you a little more information with the multiple echoes, and maybe we trust it a little bit more. But that's the only ICA we do.
And afni_proc.py incorporates ME-ICA, the multi-echo ICA, if you're interested in that. If you don't have multi-echo data, then of course not.
AUDIENCE: Can I ask something? [INAUDIBLE] smoothing or--
PRESENTER: I usually base it on the voxel size. More or less, if I didn't know anything else, I would say a full width at half maximum of twice your voxel size, and that's approximate. That would make your first neighbor get half the contribution of the central voxel, and then it would decay farther out-- something on that order of magnitude. We're getting better at alignment too, so the need to blur is reduced. But still, blurring makes a fair difference, especially in resting state analysis.
AUDIENCE: Can you say a bit more about [INAUDIBLE]?
PRESENTER: We'll hit a few things. But basically, the one thing we try to avoid is anything with BOLD in it. If you regress out BOLD signal, you might think it's no big deal-- you're just losing a little of what you might care about, and you just won't get that in your results. It's actually much uglier than that.
We have a bunch of slides later on that I may or may not show you. But effectively, suppose you've got a correlation matrix of all these regions in the brain correlated with each other-- or even all the voxels, 100,000 by 100,000, but just pretend it's smaller, say.
For one thing, without doing anything like global signal regression, the overall correlation is very strongly positive-- it's mostly positive correlations. But once you do global signal regression, the average correlation will drop to zero. Now if it simply lowered all the correlations, that might not be too bad, except if you go from 0 to negative, what is that supposed to mean? But it's a little uglier than that, in that it distorts the correlation matrix in an unpredictable way.
So when you do global signal regression, areas that were not correlated with each other suddenly are-- sometimes positively, sometimes negatively. Areas that should be correlated with each other maybe are no longer correlated. And the basic trouble is, imagine these BOLD-contaminated tissue based regressors: they have some good BOLD signal and some noise signal in them. Take a voxel that has a strong correlation with the noise that you're trying to remove. What happens when you regress out this global signal?
Well, you subtract out the noise, right? But at the same time, you're subtracting out that BOLD. So you might wipe out a BOLD effect, or you might introduce a negative BOLD effect. Similarly, suppose you have a voxel that has a nice BOLD effect that's in the global regressor, but it doesn't have the noise. What happens when you do the global signal regression? You remove the BOLD effect and you add the noise. You put the noise in where it didn't used to be, and that can correlate with other places where you're putting in the noise but removing the BOLD.
And then you've got 100,000 voxels, right? So the range of problems can be very widespread. And of course, things aren't perfectly correlated. You'll have partial correlations-- a voxel might be modestly correlated with the noise or the BOLD, and so you'll get a partial effect. You remove a little bit of the noise, but in some sense you're adding in some noise that you didn't have, to remove some of the noise that you do have. So you get these weird effects.
If you are just removing pure noise quantities, like the motion parameters, that's fine-- you do your best to remove the clear noise signal. But when there is BOLD involved, then you're adding or subtracting signals of interest all over the place. And so the results are harder to predict. It's less reproducible at that point, because it's more randomly data dependent.
So, for example, tissues that we do go after: we might use ANATicor to go after local white matter. I'll get to that-- is that soon? Let me just run through these, and then I'll see what's in the slides, because I don't quite remember everything in here, and then we'll mention some of the details as we go. So again, de-spiking, as I mentioned. This is before; this is after. So that's a big motion spike. Is it useful to have that in the data, or would you like to hammer it down?
It's hard to say. Some people think the spikes should be in there for doing registration, but rarely is that actually helpful, because we've got interpolation-- we're in the middle of voxels. Registration is not going to just clean those spikes out, as we've seen. So hammering down the spikes, just so they don't have such a big impact on the sum of squares, is often a little nicer.
PRESENTER: How is it what?
AUDIENCE: The de-spiking part?
PRESENTER: So remember, like with the outlier count, we get a median absolute deviation from the trend, say. So at each time point, you know how many of these [INAUDIBLE] you are from the trend. And the de-spiking operation, basically, if you're more than two MADs from the trend, maps the range from two to infinity down to two to four. So spikes that are bigger are still bigger. Spikes are still spikes, but they're much smaller spikes. It doesn't just truncate them-- it still drops them down so that one spike is still bigger than the other one.
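That squashing can be sketched with a smooth tanh-style mapping. This is an illustration of the idea using the 2-to-4 MAD cut-offs from the lecture; AFNI's 3dDespike has its own cut levels and fit details, so treat this as a conceptual sketch rather than the actual implementation:

```python
import numpy as np

def squash_spikes(s, c1=2.0, c2=4.0):
    """Map deviations s (in MAD units from the fitted trend) so that the
    range (c1, infinity) is compressed smoothly into (c1, c2).  Values
    within c1 MADs of the trend are untouched; bigger spikes stay
    bigger, but nothing ever exceeds c2 MADs."""
    s = np.asarray(s, dtype=float)
    big = np.abs(s) > c1
    out = s.copy()
    out[big] = np.sign(s[big]) * (
        c1 + (c2 - c1) * np.tanh((np.abs(s[big]) - c1) / (c2 - c1)))
    return out

out = squash_spikes(np.array([1.0, 3.0, 10.0, -10.0]))
print(out)  # 1.0 untouched; 3 and 10 squashed below 4; sign preserved
```

Note the mapping is monotonic: a 10-MAD spike still ends up larger than a 3-MAD spike, so relative spike sizes survive, just compressed.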
AUDIENCE: [INAUDIBLE] the volume, the volume is too big right, but then the intensity increase?
PRESENTER: It's closer to the trend, but it's still away from the trend, so you still see that there is a difference there. It's just not quite as impactful, and this isn't a huge thing. We've seen in our analyses that this actually helps a little bit. But this should not make or break your analysis. Almost no little step should make or break your analysis. Well, for one, you shouldn't have mistakes in your analysis, which many people do. But aside from that, whether you do any one step or not hopefully shouldn't hurt you too much.
Slice timing correction-- we talked about that. Being off by half or a whole TR in your time series will make a difference in correlations. Registration: doing it all in one step is nice. I don't know the status of other software packages with respect to combining registration steps, but anyway, it's good to do it at once so you don't keep blurring your data. And that also applies to-- a lot of people do their analysis in original space and then warp beta weights, or warp something, to standard space. That's obviously going to add another blur to the results in a similar fashion, and you may not really need or want to do that. So just a related thought.
So once you have all your data registered in your final space, that's a good time to extract your tissue based regressors-- again, before any blurring operation. If you want to extract local white matter, or average white matter, or ventricle principal components or something like that: if you drop a 6 millimeter blur on your data, now it's gray, white, ventricle, all mixed up. So it's not appropriate to do tissue based extraction after the blur. Those are some common choices mentioned in here-- average white matter.
When Hang Joon Jo was working on developing the ANATicor method-- uh-oh, where did I go?-- which is local white matter regression, he was doing lots of correlations, seeing how much areas were correlated with each other, obviously. And the little thing that was kind of interesting is that, basically, the white matter was not correlated with anything, including white matter.
So the only time white matter regression should really make much of a difference is if you've got some sort of artifact in there that is stronger than usual-- for example, scanner artifacts, which is what we had in the data that we were playing with, and that was the reason ANATicor was written. We've seen there is a little bit of BOLD signal in the white matter, but it's very small. And when you average some stuff together, basically there's very little left there. As voxels get smaller, that may be different.
There is blood in the white matter, and therefore there is a little BOLD in the white matter. If we start getting tiny voxels, then the methods may have to adapt. So people have tried taking principal components out of the white matter time series-- maybe that's CompCor; maybe a more common CompCor method is components out of the ventricles. And if you have ventricle masks or something like that, you can give them to afni_proc.py and tell it to take the principal components and regress them out.
So what is ANATicor? With ANATicor-- I don't know what is white and what is not white here, but let's assume that white is white. You've got an eroded white matter mask, basically because you just want to avoid gray matter. So you decide what white matter is, and then you erode it a little bit. Then at each voxel location, you might drop a 20 or 25 millimeter radius sphere and get an average white matter time series within that sphere. And for the most part, as I mentioned earlier, that shouldn't correlate with too much unless you have some artifacts in the data-- which, in fact, we did.
For example, little coil problems where the signal was changing a few percent asynchronously, a little randomly over time. This makes a difference for that. But basically, you're removing the average white matter signal that is close to each voxel. So that's computationally a little intensive. At each voxel, you drop a sphere, you get an average time series, and then you have a regressor of no interest at each voxel. So it's a four dimensional data set that is one degree of freedom that you want to project [AUDIO OUT].
I sped this up a little a few years ago. In some sense, you might want the closer white matter to be removed preferentially over far white matter. This mask-- if you picture it as a disk-- so you're going to take an average of the local white matter-- it's flat. It's unit height. So every white matter voxel contributes equally to this average time series that's projected out. But maybe you might want more of a Gaussian curve, where the closer the white matter is, the bigger the contribution to that average white matter signal. Then you project that out, giving the close white matter preference over the far white matter.
That seems like perhaps a nice way to go, but it seems more complicated. Now not only do we want an average, we want a Gaussian weighted average-- so much more work, right? No. Actually, it's much, much faster to do this if you do it right. You can take the white matter time series, remove all the rest, drop a Gaussian blur on the data set, and you're done. So you take your 4D time series, mask out everything but the eroded white matter voxels, and drop a Gaussian blur, and you're ready. So that's the fast ANATicor method. You can ask afni_proc.py to do that if you want.
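The mask-then-blur trick can be sketched in a few lines of NumPy/SciPy. One subtlety the transcript glosses over: to get a weighted *average* rather than a weighted sum, you also blur the mask itself and divide by it. This is an illustrative sketch under those assumptions, not the AFNI implementation; the function name and array shapes are made up here:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fast_anaticor(data, wm_mask, fwhm_vox=6.0):
    """Gaussian-weighted local white matter average at every voxel.
    data: 4D array (x, y, z, t); wm_mask: 3D boolean eroded WM mask.
    Returns a 4D regressor-of-no-interest dataset: at each voxel, the
    Gaussian-weighted average of nearby white matter time series."""
    sigma = fwhm_vox / 2.3548          # FWHM -> Gaussian sigma
    m = wm_mask.astype(float)
    wsum = gaussian_filter(m, sigma)   # blurred mask = sum of weights
    out = np.zeros_like(data)
    for t in range(data.shape[-1]):
        # blur (data restricted to WM), then normalize by the weight sum
        num = gaussian_filter(data[..., t] * m, sigma)
        out[..., t] = np.divide(num, wsum,
                                out=np.zeros_like(num), where=wsum > 1e-8)
    return out
```

The Gaussian blur is separable (one 1D pass per axis), which is why this is so much faster than dropping an explicit sphere at every voxel and averaging inside it.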
AUDIENCE: [INAUDIBLE] voxel which is inside the eroded mask, right?
PRESENTER: Every white matter voxel. Yeah. Yeah. That's right. That's right. That's right.
AUDIENCE: Ideally the radius would be smaller than the erosion, right? Otherwise, we would be picking up gray matter too, or--
PRESENTER: Well, you take out the gray matter. You've got a disk or sphere, say, and you're only looking at the white matter voxels within it. Anything else is zero. Doing this with a flat disk or ball, if you will, is a lot more work than doing it with the blur, because the Gaussian blur can be made fairly fast-- you can apply it in each direction independently.
So in the x direction, it's just a Gaussian blur, but only along the x-axis. Then you do it along the y-axis, then along the z-axis, and it's the same thing-- it's faster to do it that way. Then you blur your data as much as you want to. Here's an example of the effect of blurring: not blurring at all, a little bit, a little more. Here we start to get correlations that look like they follow anatomical contours nicely, and then eventually they just blow up and take over the whole area.
So you can look at this and ponder what seems like a reasonable amount of blur for yourself. And again, with the Insta stuff-- the InstaCorr correlations you can do in AFNI, which we will get to-- you can play around with this and see the effect, and that may give you a better idea of how much blur you want to apply. But again, we lean towards small. Then nuisance regression. So we talked about this stuff: motion parameters, their first differences, possibly white matter signal.
Measured respiration: if you're collecting physiological signals-- like you get the pulse ox, and you stick a respiration belt on the subject so you can measure their breathing-- if you get these at the scanner, you can convert them to signals that might be better captured in your data and regress them out, too. That would mostly be via a RETROICOR type of step. In afni_proc.py it's the ricor block. You can convert those two time series into up to 13 regressors, and then include those.
And those would actually be projected [INAUDIBLE] 13 regressors, because it's done slice by slice. We don't even want registration in there. We don't want blurring or anything like that. So the data is basically unmodified when you remove those slice-wise regressors, because they're trying to be very picky about the timing. The cardiac cycle is the finely tuned part. The breathing you wouldn't care about as much-- the breathing could be done at the very end. But if you want to be picky with the heart rate and stuff like that, then we try to do that early.
Band passing. It's very common in resting state to band pass, such as in the frequency range between 0.01 and 0.1 Hz. Just to keep pointing you at the help, and given my propensity for whining, I'll point you at the help page for afni_proc.py again so you can see my rant on band passing-- not that you necessarily care that much about the rant. So I'll go into the program help again, find afni_proc.py. And after the examples, I've got all these note sections. This is in the resting state note, unless I have a band passing note. Do I? I don't think so.
So, resting. OK-- see, comments on band passing. So here's where I babble a bit. But the basic thing is, if your TR is two seconds-- that's very common-- that means the Nyquist frequency is 1 over (2 TR). So the fastest cycle you can capture is one full up-and-down cycle every four seconds. Your TR is two; you can measure an up and a down, and that's the fastest full cycle you can capture. So you're looking at a Nyquist frequency of 0.25 Hertz.
If we band pass from 0.01 to 0.1-- and ignore the 0.01; the very low-order terms are not important-- we're removing things that are faster than 0.1 Hertz. That's considered the good range, from 0.01 to 0.1, so 0.1 and lower is what you keep. We're throwing away the fast stuff because we think it's not BOLD. So keep 0 to 0.1. Throw away 0.1 to 0.25. Well, you're throwing away 60% of your frequencies. That's the easiest way to put it.
When you do a 3dBandpass or some magical step in another software package-- oh, I band passed, that was one button press, it's done, magic is done, my data is basically untouched-- you are throwing away, even in this nice gentle case, 60% of your degrees of freedom. If you put this in a linear regression model and you have 200 time points, that's 120 regressors you've just added to your model. That's a big [INAUDIBLE], you know.
You may get rid of a lot of the respiration effects like that, but that's a very big cost. Plus, you can't even see the cardiac signal in your data, right? At a TR of two seconds, it's already aliased in there. Well, so what if my TR gets down to half a second? Then I can very clearly see the cardiac cycle-- now band passing is a great idea. Well, if your TR is half a second, what's Nyquist now? Nyquist is 1 Hertz-- a one-second cycle. So you're keeping things from 0 to 0.1, but now your data goes up to 1 Hertz.
Now how many regressors-- what fraction are you losing? 90%. If you have 1,000 [AUDIO OUT] at this fast TR of 0.5 seconds-- 1,000 time points-- you're going to use 900 regressors to clear out all those high frequency signals. So you'll probably get rid of that heart rate. Fantastic. 900-- 90% of your degrees of freedom are gone. Heaven forbid that you're censoring as well and 15% of the time points get censored.
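The degrees-of-freedom arithmetic above is easy to verify. A plain-Python sketch (the 60% and 90% figures and the time-point counts are the lecture's own examples; the function name is just for illustration, and this ignores the tiny low-frequency end, as the lecture does):

```python
def bandpass_cost(tr, n_time, f_keep_high=0.1):
    """Approximate fraction of temporal degrees of freedom consumed by
    band passing everything above f_keep_high out of the data, and the
    corresponding number of nuisance regressors for n_time points."""
    nyquist = 1.0 / (2.0 * tr)                   # highest measurable frequency
    frac_lost = round(1.0 - f_keep_high / nyquist, 4)
    return frac_lost, round(frac_lost * n_time)  # ~number of regressors

print(bandpass_cost(2.0, 200))   # (0.6, 120): TR=2s, 200 points -> ~120 regressors
print(bandpass_cost(0.5, 1000))  # (0.9, 900): TR=0.5s, 1000 points -> ~900 regressors
```

Add 15% censoring, motion parameters, and de-trending on top of the 90% case and the model is overdetermined-- the negative-degrees-of-freedom situation described next.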
Well, we're already in the negatives, and we haven't even de-trended the data. We haven't modeled motion. If you take out 90% for band passing and 15% for censoring, you're already dead. That [AUDIO OUT] is long gone. And that's what was happening-- in a lot of the earlier papers telling you how to analyze your data, that's what was happening: negative degrees of freedom, but still publishing the results. So that's why it's nicer, as a safeguard, to put this all in a regression model so you see the impact of it.
You know how many degrees of freedom you're using and how many are left. And it's fast enough-- we don't band pass and censor with 3dDeconvolve directly; it just makes the matrix for us. There's a 3dTproject program that we use, because you're just projecting out this stuff, and that's much faster. Anyway, that's my band passing rant. Any questions or comments about that? So you can decide what to do if your TR is on the order of two or three seconds. You can still accomplish it without wiping out your data, as long as there's not too much motion, but there's still a big cost in there.