14 - FMRI Analysis Start to End: Part 4 of 5
Date Posted:
August 8, 2018
Date Recorded:
May 29, 2018
Speaker(s):
Rick Reynolds, NIMH
All Captioned Videos AFNI Training Bootcamp
Description:
Rick Reynolds, NIMH
Related documents:
For more information and course materials, please visit the workshop website: http://cbmm.mit.edu/afni
We recommend viewing the videos at 1920 x 1080 (Full HD) resolution for the best experience. A lot of code and text is displayed that may be hard to read otherwise.
PRESENTER: We left off just creating this Enorm time series, this Euclidean norm time series that we might use for censoring, for deciding which time points to censor. And we saw the censor time series along with it. So hopefully that's reasonably clear. Any questions, again, on creating this?
So then in the actual regression command, we come down to 3dDeconvolve, and this should look almost the same as the command we used yesterday. But again, this is more of the adult version, where we're including parameters more like what they should be. So the input now is pb04, the scaled data set. As I said, we have multiple input data sets.
Now, it might not be clear that our star, again, is the wildcard. That's why we include the .HEAD at the end: it has to actually match files on disk. So the shell is going to expand all of this out as if we had typed pb04.FT.r01.scale+tlrc.HEAD, and then the same for run 2 and run 3. So it's as if we typed out all three data sets. I don't know if it's better or worse; it's just more economical writing it like this with the wildcard.
So those are the three runs. Now 3dDeconvolve knows there are three runs, it knows the TR, and it knows how long the runs are. Then we give this censor file: we're going to censor based on motion_FT_censor.1D. That's the file we looked at; it's mostly 1's with a handful of 0's in it.
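For reference, a minimal sketch of how this part of the 3dDeconvolve command appears in the proc script (the file names follow this class's FT example and are reconstructed, so verify against your own script):

    3dDeconvolve -input pb04.FT.r*.scale+tlrc.HEAD \
        -censor motion_FT_censor.1D                \
        -polort 3                                  \
        ...
    # the shell expands the wildcard to the three run files:
    #   pb04.FT.r01.scale+tlrc.HEAD pb04.FT.r02.scale+tlrc.HEAD pb04.FT.r03.scale+tlrc.HEAD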
And essentially, what this is going to do is: 3dDeconvolve is going to build the full regression model, an X matrix that is 450 time points long. Your data is 450 time points long, but we're censoring six time points. It will first create the full matrix and then essentially just erase those six rows.
Why do it like that? Well, for the baseline terms and your basis -- your BOLD response terms -- you need temporally continuous data to build them. You need the temporal continuity to know that the curves go like that. But once you make your curves, you can wipe out those time points afterwards, and then those time points will not have any effect on the beta weights that come out of this.
Let's review the full model that we're creating here, the one we will censor from. So we use polort 3. Again, the runs here are 450 seconds long -- that's right on the border of when it jumps up to 3. I'm sorry, not 450 seconds: it's 150 time points at a TR of 2, so 300 seconds. And that's right where it jumps up from 2 to 3. The rule of thumb, unless you tell it what polort to use, is a linear drift model plus one more degree for every 150 seconds.
So going from 149 seconds, once you hit 150, it goes up to a quadratic. When you go up to 300 seconds, it goes cubic; 450 would be fourth degree, et cetera. And again, you can tell it otherwise. If you want to use drift modeling more like FSL or SPM -- if you want to use sinusoidal modeling -- you can do that too. It's not quite as convenient, but you can just use polort 2 here, even if you use higher-order sinusoids, because the sinusoids can't match polort 1 or 2.
So use polort 2 here, but then you can use the sinusoid functions as if you were bandpassing. So you can use bandpassing as a high-pass filter, where you remove the low-frequency trends, and you can use the same cutoff that is applied in FSL or SPM -- I forget, 150 seconds is the cutoff. Anyway, you can ponder how you do that, but afni_proc.py would create the sinusoids for you and stick them in here. So you would do that with a bandpassing option if you wanted to.
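As a hedged sketch, these are the afni_proc.py options involved (the bandpass values here are illustrative, not the FSL or SPM defaults):

    # set the polynomial baseline degree explicitly
    -regress_polort 2
    # or add sinusoidal, bandpass-style regressors to the same model,
    # e.g. passing frequencies between 0.01 and 0.1 Hz (illustrative values)
    -regress_bandpass 0.01 0.1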
AUDIENCE: Can I ask you a question? For censoring, how many [INAUDIBLE] are too many to have the ones [INAUDIBLE].
PRESENTER: So per run, or per subject, it really depends on your subjects. Your subjects determine how many you have, in general, but your experiment design determines how many time points you need for each of your classes. In this case, we have two classes, we have big response curves, and we have a lot of data points. We could probably censor a fair amount of this and still have a robust model, get a robust result.
But if some of your stimulus classes of interest only have a handful of events, or especially in a faster event-related design, losing a handful of time points is more damaging than in this slow design. So it's going to be based on that, but you really have to think about it. If you have a lot of classes with not so many events, and you lose a big fraction of your classes of interest, then you have to worry about it.
In the end, when we run that @ss_review_basic script -- and we'll look at it again -- the output at the end tells you, out of the nonzero time series values of your regressors of interest, the fraction that was censored, per class or per regressor, actually. So it'll tell you that you lost 30% of this regressor of interest, and if that's too much, then you drop the subject.
But what is too much is hard to say; it depends on your design. If you want a rule of thumb with censoring: you'd like to censor as much as possible, really, as long as you have enough data left to get a robust result. Because at the time points where there is motion but you don't censor, you're essentially putting garbage into your model.
And I've seen a lot of cases where people analyze kids -- irritable children -- and they irritate them in the scanner. So there's motion, but they were censoring at 2 millimeters initially because they didn't want to throw away too many time points. Later on, after badgering them for years, they went down to 1 -- and 1 millimeter is big already; it was 2 before. At some point, they compared the 2-millimeter and 1-millimeter censoring at the group-result level, and the group result was better for the stricter censoring at 1 millimeter.
So you threw away more data, but the data you kept was more reliable. Everything is trade-offs.
AUDIENCE: And the 0.3, the most conservative?
PRESENTER: 0.3 is a semi-random number that I threw in here for the example, but it seems like a reasonable number for healthy adults. For resting-state analysis, the example uses 0.2, because motion is a bigger deal there. But what you should use depends on your subjects. If you have good subjects, you can use a stricter level. If you've got a group of good subjects and a group of bigger movers, you have to be consistent.
So then you might have to use a higher censoring level, and maybe most of your good subjects won't get censored at all. So that's a messy aspect. But you'd rather remove the bad data: if you know the subject moved, don't corrupt your results.
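In afni_proc.py terms, the censor limit being discussed is set with options like these (a sketch; 0.3 and 0.1 are the example values from this class, not recommendations):

    -regress_censor_motion 0.3      # censor TRs where the motion Enorm exceeds 0.3
    -regress_censor_outliers 0.1    # optionally also censor TRs with more than 10% outlier voxels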
So then our model is that polort 3, so we'll have four polynomial terms per run: a constant and then the linear, quadratic, and cubic terms. That's 4 terms times 3 runs, 12 parameters right there. And then the same eight stimulus time series as before. We have our two regressors of interest, created according to these options.
So here's the timing file for our visual reliable task. The first event was a minute into the first run. And we tell it to model our events by convolving -- I can't double-click on that -- a 20-second boxcar with the 15-second BLOCK curve. And again, each one of these will be about a 35-second response curve, and we'll assume all of the responses to the visual stimuli are the same; the model is based on that.
That's why the regressors are all the same -- all the shapes are the same. The question is, how much of it do we have? If it doesn't match the data, the beta weight will probably be small, and the fit will be like this. If it matches the data but the BOLD response isn't big, you could have a medium beta weight and still match the data very well. If the BOLD response is big, you'll get a larger beta weight, just from the magnitude of the BOLD response. And then the t-statistic will be based on the variance explained by that.
But still, the height of each response would be the same -- that's in this simple case. You can do other things: you can have things vary over events or durations or whatnot. You can get complicated. I think Don will talk about that tomorrow morning.
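A sketch of the corresponding 3dDeconvolve stimulus options, with file names and durations as used in this class's FT example (treat them as assumptions and check your own proc script):

    -stim_times 1 stimuli/AV1_vis.txt 'BLOCK(20,1)'  -stim_label 1 vis
    -stim_times 2 stimuli/AV2_aud.txt 'BLOCK(20,1)'  -stim_label 2 aud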
So those are the visual reliable events, and then the auditory reliable events, and then our six motion parameters. This isn't exactly, shall we say, the adult version I would recommend. At this point -- we've got three runs -- you're better off modeling the motion per run. And that means not having 6 parameters but 18: 6 per run. And that means the motion terms wouldn't span all the runs. They'd span the first run and then be 0; then 0, the second run, 0; and so on for however many runs there are.
So that would capture the motion better. If you have 10 runs of data and you throw 6 parameters across all 10 runs, they're not going to explain very much variance -- sometimes they'll match, sometimes they won't. They get less useful the more runs they have to cover. So it's better to do that per run, and the afni_proc.py examples, I think, mostly show per run. So that's our model.
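If afni_proc.py builds the script, the per-run motion regressors described here come from a single option (a minimal sketch):

    # use 6 motion regressors per run (18 for 3 runs) instead of 6 across all runs
    -regress_motion_per_run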
Let's look at the rest of the options just to see them; then we'll go back and talk about the model, and we'll look at the results. So, -jobs 2: my computer has multiple CPUs, and parallel processing happens in many different ways. In 3dDeconvolve and 3dREMLfit, it's done by forking child processes internally, running things in parallel that way, and collecting the results. 3dDeconvolve does all the work for that, but there are other programs that use OpenMP -- that's why you need the OpenMP libraries.
On the cluster, we just break things up into one subject per CPU node. So you can parallelize stuff however you want. But anyway, this is old-school parallel processing: one program using multiple CPUs by creating children of itself and doing the work that way. So that's -jobs 2.
Then we have our GLTs, your general linear tests, and these are symbolic GLTs. So we don't have to use 1's and 0's and -1's, or 0.3, 0.3, 0.3 if you're averaging, or whatever. We can just say vis minus aud, and then we're going to get this contrast and the corresponding statistic.
And then we've got a mean GLT here, just to have it. We get F-stats, we get t-stats, and we name our X matrix -- I'll come back to that. And finally, at the end, we get our errts data set. errts is the error time series: these are the residuals. In y = X beta + epsilon, these are the epsilons at every voxel.
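A sketch of those remaining options as they might appear in the command; the labels and prefixes follow this class's FT example and are assumptions, not copied from the script:

    -gltsym 'SYM: vis -aud'          -glt_label 1 V-A
    -gltsym 'SYM: 0.5*vis +0.5*aud'  -glt_label 2 mean.VA
    -fout -tout
    -x1D X.xmat.1D  -x1D_uncensored X.nocensor.xmat.1D
    -errts errts.FT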
We don't have the fitts data set. Normally, we'd have a -fitts option giving a time series of the model fit to the data -- you want to see that. We told afni_proc.py we didn't want to do it here, and that was just to save RAM, to save computer memory during the 3dDeconvolve step. We told it to do it later.
And where did we do it? Way down here. It's farther down the script than I expected it to be. But anyway, we just do it later on as a calculation with 3dcalc: we take the all_runs data set, which is just a concatenation of the scaled data -- so it's the input to 3dDeconvolve, but in one data set now -- and the residuals, and we subtract them. That subtraction is the fit time series, so we're just doing it as a separate step.
That's taking y minus epsilon and getting X beta. X beta is literally the fit: the X matrix times the beta weights is your fit, because each beta weight times its regressor, summed up, is the fit for that voxel. So normally you might do that within 3dDeconvolve; we just put it lower in the script to save RAM. Sometimes laptops sharing Linux and Windows, or older laptops, don't have enough memory, so that's why we stuck it down here.
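Here is roughly what that later step looks like, a sketch using this class's FT prefixes:

    # fitts = all_runs - errts, i.e. X*beta = y - residuals, computed voxel by voxel
    3dcalc -a all_runs.FT+tlrc -b errts.FT+tlrc -expr 'a-b' -prefix fitts.FT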
So that's the end of the 3dDeconvolve command. Let's talk more about the model now. Let's look at the model and make sure that the X matrix looks like we think it should. So I'll leave this here for now and leap over to the terminal window, and let's do 1dplot -sepscl X.xmat.1D. And remember that X.xmat.1D is where we asked 3dDeconvolve to save the X matrix, in this particular file.
Actually, even if we didn't tell it anything, it would use much the same name, but this just makes it more explicit. So I'll use 1dplot to show that. And there you go -- there's our full regression matrix. So we have three runs and four polynomial baseline terms: the constant term, a linear drift, a quadratic, and a cubic, three sets of them, one per run. That's 12 polort terms, then the same two regressors of interest, and then the six motion parameters.
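For reference, the plotting command being run here, plus the uncensored variant used in a moment; -sepscl gives each column its own vertical scale:

    1dplot -sepscl X.xmat.1D             # censored matrix (6 rows removed)
    1dplot -sepscl X.nocensor.xmat.1D    # uncensored matrix (all 450 time points)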
Let me direct you to this. Does this look a little odd right here? It's hard to see -- it's a little chopped off. This is the censored matrix; this is after censoring. It's missing six time points, including, I forget, three or four right up here. It should actually go all the way out to 450, and it does not.
So let me leave this on the screen and also bring up -- I don't know if you want to bother keeping up with this part; it's not too important -- the nocensor matrix. You can guess what "nocensor" might mean, so 1dplot on the nocensor matrix. That is your full regression matrix, and you notice this looks wider than before.
So this one contains everything -- you see it goes to the very end, all 450 time points. The normal X matrix does not. And we've asked for both to be saved to disk just so we can look at them and think about them. Any questions about this?
AUDIENCE: I see a first, second, and third order polynomial there, like the red, green, blue. You have all three of them when you had polort 3?
PRESENTER: We had polort 3.
AUDIENCE: So that includes 1 through 3, not just 3.
PRESENTER: Yeah, all of them. That's right. To model any cubic polynomial, you need the lower-order terms, too. So this will fit any cubic trend. It doesn't have to look like this; anything that has a cubic trend will be captured by these four terms.
AUDIENCE: And that's not the actual plot. That is the beginnings of the third-order polynomial?
PRESENTER: These are Legendre polynomials.
AUDIENCE: That's the same as the data doesn't actually look like that. It doesn't need to be detrended that way.
PRESENTER: Right. If you detrended the data to a cubic degree -- and you also detrended all of this stuff up here -- you'd get the exact same beta weights as you would get here, except you'd lose those degrees of freedom; but you'd get the same results. This would be the same as detrending with a cubic, if you also detrended these dudes. It has the same effect.
So any cubic trend in the data will be fit by these terms. Any other questions about that? So we try to put everything in one model for that reason: handling degrees of freedom. Even if you're doing bandpassing, we put it in this same regression model using sinusoids. It's the same as using a fast Fourier transform, except it's a slow Fourier transform: you put it in a linear regression model, and it takes much longer because it's just like having a lot of regressors, but it has the exact same effect, and you can count degrees of freedom.
A lot of people have done analyses like resting state where the instructions tell you how to process the data: you should remove time points with motion, include a lot of tissue regressors, and whatnot. If they had accounted for all their degrees of freedom, they would have been publishing in the negatives -- which is to say, having 150 time points but 160 degrees of freedom used up by regressors. The result should be garbage.
But if you do it in this magical preprocessing step, you don't notice those degrees of freedom disappearing, and you don't remove that from your other regressors, so you're not handling it properly -- and yet you still get a result. So it's better to put it in one model so you know exactly what you're accounting for. Let me just finish up this script here, and then we will do that review driver step.
So after the regression, we come down here and show cormat warnings. I'll come back to this, but it looks at all pairwise correlations in your regression matrix and whines about anything that's too high. We'll see that later.
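That check is a 1d_tool.py call along these lines (a sketch; the proc script may add more options):

    1d_tool.py -show_cormat_warnings -infile X.xmat.1D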
This is where the all_runs data set was made -- it just catenates the three runs together so that you have something to compare the fit time series against. We'll do that soon. Here's where we create a TSNR data set. Most of this isn't so important, so I'm not going to talk about it; there are more important things to talk about.
Here we do blur estimation with the 3dFWHMx program, and we're specifying the ACF method. This program also writes full width at half max (FWHM) measures, but Bob has changed it to output zeros for those -- he doesn't even want you to see them, because he doesn't want you to use those blur estimates. He wants you to use the ACF method. We'll babble more about that later, but here's where they're made.
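A rough sketch of that blur-estimation step; the exact ACF flag and arguments here are from memory, so treat them as assumptions and check 3dFWHMx -help and your own proc script:

    # estimate the smoothness of the residuals within the brain mask using the ACF model
    3dFWHMx -detrend -mask full_mask.FT+tlrc -acf NULL errts.FT+tlrc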
And then later on, we have this block where we generate review scripts. Here's the more important one: gen_ss_review_scripts.py. This looks at the data files in the results directory and makes the scripts that you should use for single-subject quality control. So again, for your first couple of subjects, I recommend doing everything we are doing here, and maybe going a little more in depth.
And then once you're happy with the processing -- once you think things are working the way you expect them to -- you can just do quick quality control. That's a couple of minutes per subject with this. We'll do that in a minute. And at the very end, the script actually runs the basic version of the QC. That was the last thing we saw at the end of the processing.
Any questions about the script at all? So let's do that then.
So I'm going to close all of these AFNI windows, and I'll do it the fast way: when I click on Done, I'm going to hold the Shift key down. Shift plus Done will just blow away everything. Typing ls here, here are the review scripts: @ss_review_basic, @ss_review_driver, and @ss_review_driver_commands. The commands version is the same as the driver script, just with no comments, no fluff -- it's very tight.
But running the driver is what you should do with every subject, so let's just do that now. You can type ./@ss_review_driver, or you can do tcsh @ss_review_driver -- tcsh is the shell the script is written in. And what this will do is open some windows, tell us what to look at, and wait for us to close those windows and say OK to move on.
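Concretely, from the subject's results directory (the directory name here assumes this class's FT example):

    cd FT.results
    tcsh @ss_review_driver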
So the first thing it does is run that basic script again. We've already looked at all this output; now we understand more. Yes, it censored six time points, a censor fraction of 1.33% of the time points. Here we had two stimulus classes, so we get the fraction of time points censored for each of our stimulus classes.
The visual class, I think, is listed first in our command. So 2.5% of those time points were censored, and 0% of the audio. And that's time points out of the ideal response curve, not out of the stimulation points. If you're pondering numbers to use for subject exclusion -- remember, you can trivially create a spreadsheet across your subjects from these files. Across the top it will have all these values, and then you go subject by subject.
Average censored motion is one of the most useful ones for subject dropping, along with the censor fraction. The censor fraction: if a subject has had too much of their data censored, you should drop them. Beyond that is average censored motion -- the average motion in the data, excluding the time points that were censored.
What you might notice is that a lot of subjects have these head jerks for whatever reason, and you get spikes in the data. There are these big motions; you censor them out; you're good. Some subjects aren't like that: they're almost vibrating in the scanner. Literally, you see almost continuous small motion, and that's horrible too -- it's just not going to get censored. That's going to stay in your data.
So you may want to drop subjects that have too much motion even after censoring. You can ponder these types of things.
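For the cross-subject spreadsheet mentioned above, gen_ss_review_table.py can collect the single-subject review output files into one table; a sketch, with option names and file paths as I recall them, so verify against gen_ss_review_table.py -help:

    # one out.ss_review.*.txt file per subject, written when @ss_review_basic is run
    gen_ss_review_table.py -tablefile review_table.xls \
                           -infiles  */out.ss_review.*.txt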
AUDIENCE: So I'm curious about spikes that are unrelated to head motion. So for example, [INAUDIBLE].
PRESENTER: If there's a gradient spike, the outlier censoring would catch that and censor it. The motion parameters might not catch it, and you might have it left in the data. So if gradient spikes are a concern, you may want to do that. We also have a program called @radial_correlate that you can ask afni_proc.py to run too, or you can run it yourself.
At every voxel, it takes a local average over about 30 millimeters and correlates the voxel's time series with that. Basically, the results should show correlations in small, anatomically relevant areas. If you get a big blob from that, it means you have a channel artifact -- a whole channel's area might be affecting the signal -- and that's a concern, but that program makes it more apparent.
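A minimal sketch of running it by hand on the scaled runs; I am not listing the radius or output-directory options here, so see @radial_correlate -help for those:

    # correlate each voxel with a large local average to expose scanner or channel artifacts
    @radial_correlate pb04.FT.r*.scale+tlrc.HEAD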
ANATICOR -- the local white matter average regression -- was done for that reason: to clean up the data so that you can still use it, assuming the channel artifacts aren't that horrible. Once we're happy with something, we can close the newly opened windows and move on. This step didn't open any windows; it just dumped the output in the terminal window, so we'll just say OK.
Now it's telling us to review the plots: outliers and motion. Three windows have opened up, and we've seen all this stuff before. This is the outlier count, and it shows, for example, a 0.1 red line here. We didn't ask for outlier censoring; if we had, this would actually apply, but here it's just for information. And here is the Enorm time series with our 0.3 limit, and the green lines that go up mark the time points that were censored. So you can see exactly the effect of these things on your censoring.
And here is a combined plot that has the motion parameters and these two time series all together. For this subject -- because this subject was irritatingly good and didn't move much -- there's basically nothing here except this one spike that we had to fake ourselves. So there's not too much to see here, but normally, with subjects who move more, you'll see these time series, and sometimes they correspond and sometimes they don't.
So you'll see that the outliers will sometimes capture motion that the motion parameters do not. And maybe the motion parameters will be high, but there weren't any outliers -- the outlier fraction wasn't high. So maybe it was a sudden, short, small movement, mostly between TRs, where the motion parameters were big, but volume registration corrected the problem well enough that it didn't leave a lot of garbage in the data. So anyway, you can see things in either plot.
So again, we'll close the plots and hit OK to move on. You don't have to close the plots, but it keeps opening things, and you don't want to get overwhelmed with screens. So now it's telling us what to do here, but I'm just going to do it quickly. This is going to look at the EPI-to-anatomical registration result, the way I prefer to do it -- though everyone prefers doing it differently.
So here it says: leave the underlay as the anat, and set the overlay to the volreg data set. I should probably change the default there to be the volreg base image, but it doesn't really matter, I guess. And then it says to turn off the overlay, and basically the point is to go to a couple of places in the brain; it tells you to toggle between the two data sets as the underlay.
So right now the anat is the underlay. If I hit the 'u' key in the image window, I toggle between the anat and the EPI as the underlay. And the point is to look at the contours. So keep your eye right on this part, right on this S pattern here: as I go back and forth, you can see they match really well.
And so you look at that in a few places. CSF is very bright in the EPI data. That's what Daniel was talking about with that CSF area that stuck off the top of the axial image -- it was a coronal image, actually -- when it looked like a bad registration. It looked like a bad registration, but it was good, because that was just CSF: it was bright and stuck out above the anatomy in that image.
So the CSF will be bright here but dark in the T1, and you can just look at that and evaluate the registration. We have caught bad public data this way many times. In some cases you can just see that it should be flipped -- you can see that if I flipped this, it would be better.
Now, we have an option we can give to align_epi_anat.py to test this by flipping the anat, doing another alignment, and comparing cost functions, and it will tell you if it seems to align better the other way. But by eye is how we've actually caught this. If you don't look at your data, you won't catch it -- and then maybe your big finding on the left side of the brain is really on the right side of the brain.
So it's fairly interesting -- educational if nothing else -- but it's also a very solid way to evaluate alignment. We use a contrast that is fairly well driven by the CSF to do the registration. A lot of times, registration is performed using mutual information or something like that. Mutual information will put this stuff on top of each other pretty well, and you won't get messed up very often, but if you look at it like this, it's actually not very good.
Mutual information is not going to end up following the contours; it's going to plop the things on top of each other, which may look appealing at first, but not under scrutiny. This method, when it fails, can fail drastically -- at a glance you'll see it's horrible, and you may have to help it out a little bit. But other than that, it does a much better job. You can request mutual information if you want it, but in general these LPC methods work pretty well.
So I'll close AFNI and then hit OK to move on. So AFNI is closed, and I'll hit OK over here. And now it's saying to look for regression warnings. There are three types of warnings it might show: anything from 3dDeconvolve, sometimes warnings about TENT functions -- I won't talk any more about that -- and then X matrix warnings.
So here is a warning from our X matrix. Here's that 1d_tool.py command that shows cormat warnings -- correlation matrix warnings. It looks at all correlations of regressor pairs, and anything high gets whined about. So in this case, it finds a correlation of -0.612 between our regressors of interest. Surely, it's time to panic.
We shouldn't see this high a correlation, this very strong negative correlation, between our regressors -- should we? What is correlation? To some degree, you can say it's predictability. Our regressors have this alternating, sinusoidal-looking pattern. When visual reliable is up on a bump, where is auditory reliable? At zero. When visual is flat at zero, because of the fixation period or because of the other response, the audio reliable curve is up. That's predictable, and it's a negative correlation. So this does make sense.
So the question is, is it high enough to panic over? No. Actually, it's the next step that addresses this, so we can decide whether to panic on the next step. We'll save it for that, but we do get a high correlation out of here.
Sometimes this will show baseline terms too. Sometimes motion will be correlated with the baseline terms, because if the person shifts up during the runs -- which happens -- that delta will be correlated with them. And if there are only two runs, the constant term for run 1 is all 1's and then all 0's, and the constant term for run 2 is all 0's and then all 1's. They correlate at -1: they are perfectly anti-correlated, totally predictable -- if one is 1, the other must be 0. But you don't care; it's OK. That's an understandable part of the model, and in that case it doesn't matter.
So let's hit OK to continue on. Now we look at some of the regressors of interest. It shows us one image with our two regressors of interest; if we had a lot of them, we would see many here. And remember, this is a nice block design. If you're doing a fast event-related design, what do your regressors of interest look like? Noise. But with this nice, slow, long block design, we get these big ones.
This is their sum -- this sum_ideal thing that adds them up. And if you see an anomaly here, then you might have messed up your timing. This looks good. Now, should we worry, again, about that -0.612 correlation? Well, there's a lot of variance in here: we're going up to 1, back to 0, up to 1, back to 0. There's a ton of variance in here. It looks good.
We would get worried if this flat-lined; then things would be starting to approach multicollinearity. In this case, we're good, but we still get the high correlation. So let's close those and hit OK. And then the last exciting finish here: it just opens the result data set and shows you something -- hopefully something useful.
So it looks at the full F-stat volume in the stats data set, thresholds it at the 90th percentile, and then jumps to the peak voxel. So this script is telling AFNI what to plot, what volume to look at, how to threshold, and where to jump. You can control AFNI through scripting very nicely; Daniel will talk about that.
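Driving AFNI from a script uses plugout-style commands like these (a hedged sketch, not the exact commands in the review driver; the data set names and coordinates are illustrative):

    afni -com 'OPEN_WINDOW A.axialimage'        \
         -com 'SET_UNDERLAY anat_final.FT'      \
         -com 'SET_FUNCTION stats.FT'           \
         -com 'SET_DICOM_XYZ A -2 -85 8' &
    # the review driver also selects the full F-stat sub-brick, sets the threshold
    # near the 90th percentile, and jumps to the peak voxel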
But anyway, hopefully, this is an area you care about. Should we see visual response, visual activation? Yeah. So this looks reasonable, and this is interactive. You can start goofing around if you want to. I'm going to show you one last thing before I let you run away.
So, leaving this up here, I'm going to set the underlay data set to be all_runs -- all three runs concatenated. That's the fourth one down. And open up a graph window. So we're at a voxel with a really big F-stat.
And remember, we can go to the Opt menu and do that double-plot thing to see our model fit on top of the data. So we hit Opt in the lower right corner, and then under Tran 1D, there's that Dataset#N plugin. So Opt, then Tran 1D, then change None to Dataset#N, and I'll click on that. And then we can toggle Input #1 and choose the fitts time series, the third one down.
So toggle Dataset #1, choose the fitts, and I'll change the color from red to blue because I like blue, then hit Set and Close. So there's our model fit to the data. And how is this created? Again, for run 1, we have some baseline beta weight, presumably around 100. The linear drift term clearly looks like a positive number -- we have a linear drift up there. The quadratic and cubic terms might be small; it doesn't look like there's a big cubic effect here.
And then the magnitude of the ups and downs gives you an idea of how big the beta weights should be for your regressors of interest. So each beta weight times its regressor, added up, gives you your fit to the data -- and that's the fit time series.