12 - FMRI Analysis Start to End: Part 2 of 5
August 8, 2018
May 29, 2018
Rick Reynolds, NIMH
All Captioned Videos AFNI Training Bootcamp
For more information and course materials, please visit the workshop website:
We recommend viewing the videos at 1920 x 1080 (Full HD) resolution for the best experience. A lot of code and type is displayed that may be hard to read otherwise.
RICK REYNOLDS: One thing you might note, at the very end of the processing, the proc script-- there are those quality control scripts that I babbled about briefly, ss_review-- there's ss_review_basic and ss_review_driver. This actually runs the basic version of the quality control script at the very end. So it gives you some information-- actually, you can see it running right there.
So it runs ss_review_basic. Basic shows the subject ID, how many time points were removed, the final anatomy, the grid voxel resolution, motion limit, how much motion was there, how many times did we exceed the limit, how many time points did we end up censoring, what's the fraction of time points that we censored. It even will show the fraction of time points censored per stimulus condition. And then, temporal signal to noise averaged over the brain, stuff like this-- the Dice coefficient between the anat and EPI. The GCOR value-- that's the global correlation-- that's the average pairwise correlation in the brain. So for every pair of voxels in the brain, n squared correlations, what's the average of those?
So if the average correlation is higher, there could be breathing artifacts in the data. The subject's taking deep breaths during the run, or higher motion-- anything. So just some random bits of information.
These ACF parameters-- did you talk about ACF at all, Daniel? So I guess I'll babble about these a little more later. But this is, instead of estimating the blur as a full-width at half max, we estimate the blur using ACF parameters-- the autocorrelation function. A Gaussian blur is an example of an autocorrelation function, but now we have a mixed-model autocorrelation function that has not only a Gaussian term. The Gaussian term decays with r squared-- as r squared gets big, it decays quickly.
There's also a second component in the mixed model-- a decay that is exponential in the radius itself. So this is a different estimation of the blur. You can use this in the cluster simulation later on. So what you would do is just take these parameters, average them across your subjects, and then you can use that for estimating the cluster size needed for a 0.05 correction.
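The averaging step can be sketched in plain shell. The file name and the parameter values here are made up for illustration; in practice you'd collect the three ACF numbers from each subject's review output, one subject per line:

```shell
# Hypothetical file: one line of three ACF parameters per subject.
cat > acf_params.demo.txt <<EOF
0.60 3.1 9.8
0.55 2.9 10.4
0.65 3.3 9.2
EOF

# Average each column across subjects with awk.
vals=$(awk '{s1+=$1; s2+=$2; s3+=$3}
            END {printf "%.4f %.4f %.4f", s1/NR, s2/NR, s3/NR}' acf_params.demo.txt)
echo "group ACF: $vals"
# These group-average values are what you'd hand to the cluster
# simulation, e.g.: 3dClustSim -acf $vals ...
```

The averaging itself is the only real work; the rest is bookkeeping.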
So all this stuff on the screen here is actually saved in a text file that you can feed to a gen group command-- gen_ss_review_table.py, I believe. You can run this one command and say, here are all my subjects-- subject star slash this text file-- and it'll just drop it in a spreadsheet. And then when you look at the spreadsheet, you can see which subjects have too much motion: I'm going to have to drop these subjects, because their average motion is too high, or too many time points have been censored. So I'll talk about this again, too, I guess. A little too many random things at once here.
So the directory where we ran this stuff now has FT.results. It has a few new files. I'll run ls -ltr so we see the new stuff at the bottom. Bless you. Actually, let's put the new stuff at the top.
So we ran that AFNI proc command. First of all, it generated the proc script. That's the analysis script that we ran. That is 467 lines long. So it's a fairly long script.
That script, when you run it, creates this FT.results directory. FT is the subject ID. So it creates a results directory and copies all of your data into it-- your anatomy, your EPI data, your timing files. It copies that all in there, so now you can remove these results trivially, without worrying about your original data, if you have to for some reason. It copies it all in there, goes in, and then does the analysis in that directory.
This output.proc.ft, all this garbage that we saw on the screen from the processing-- here's all the processing dumped to my terminal window-- that's all saved in that text file. So if something goes wrong in your processing and you want to, for example, ask me a question about what I think went wrong, you can send me this text file along with the proc script, and that makes it much easier to see exactly where something might have happened. But that's output.proc.ft.
So now, at this point, let's start looking at that processing script. And then we'll start following along with the data as it was manipulated in there. So I'm going to-- you have a little more screen space than I do here, but c'est la vie. I'm going to use less to view the proc.ft file. You can use whatever you feel like. This is just a text file. You can open it in an editor. I like having terminal control over the text size, so I'm going to do it like this.
So now I'm looking at the proc script. We can start looking at this a bit, and then we'll open up a new window and go after the data. But let's just get a feel for how this is organized. First of all, it's a tcsh script. Again, that's just done for readability.
There are all these comments in here. This pound symbol at the beginning of each line is a comment. So that's just some text in there for readability-- for someone to look at and be told what this is intending to do. So, for example, here it tells you how to execute this script. Execute it with this tcsh -xef. You could just run tcsh proc.ft, and it runs. That's enough.
However, in case there are problems, we want to save information. Most important is this -x option, which says: when you're running this shell script, echo each command to the terminal window before you run it. For example, using 3dvolreg-- it will echo the full 3dvolreg command to the terminal window, and then you'll see the output from the command. If you don't see the command, you just see the output, and you have no idea which program is spitting this stuff out. So it's hard to parse. So it's nice to have this -x option in there.
The -e just says terminate on error. I'll ignore the -f for now. And this stuff just says, I want to save everything that appears on the terminal window-- I want to save it in this text file. So that's the output.proc.ft file that we saw in this directory-- in the parent directory.
This syntax right here is tcsh syntax. So if you're a bash user, you can actually give a bash option. And all that means-- it's not going to write a bash script for you. It's still a tcsh script. But it'll give you the bash syntax for running this, which is like 2>&1 and then piped through tee. It's just slightly different syntax.
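The two redirection styles can be sketched like this, with a couple of generic echo commands standing in for the proc script (the demo file name is made up):

```shell
# bash spelling: redirect stderr into stdout (2>&1), then pipe the
# combined stream through tee, which both prints it and saves a copy.
{ echo "to stdout"; echo "to stderr" 1>&2; } 2>&1 | tee output.proc.demo.txt

# tcsh spells the same thing more tersely with |& :
#   tcsh -xef proc.ft |& tee output.proc.ft
```

Either way, everything that scrolls by on the terminal also lands in the text file.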
It spits out the AFNI version. So when you look at this later on, you can be reminded what version was used for the analysis. It checks the history. In some cases, people run AFNI proc on one computer, then later on do the analysis on a different one. Maybe the AFNI binaries on the other one aren't as current as the ones that were used to write the script. So it might whine at you-- if it looks like those binaries do not have all the abilities needed to run the script that you're trying to run, it might let you know.
Then we set a subject variable to be our subject ID, FT in this case, the output directory will be FT.results then. And if the output directory already exists, it just whines at you and quits. It's not going to trash an old results directory. You'll have to do that yourself. We try not to delete your data too often.
Then it makes this runs array that's just 01, 02, 03. So it runs this count command. That's just going to make a list 01, 02, 03. And then, later on in the script, we can say, for each run do something to that runs data set.
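The runs array can be sketched in plain shell; AFNI's count program (`count -digits 2 1 3`) produces the same zero-padded list, but printf works anywhere:

```shell
# Build the zero-padded run list 01 02 03, as the proc script's
# `count -digits 2 1 3` would.  printf stands in so no AFNI is needed.
runs=$(printf '%02d ' 1 2 3)

# Later, the script loops over the runs like this:
for run in $runs; do
    echo "processing run $run"   # e.g. act on pb00.FT.r${run}.tcat
done
```

The zero-padding matters only so that file names sort in run order.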
So then it finally makes the output directory, makes a stimuli directory, and then starts copying things in. The first things it copies are the stimulus timing files, AV1_vis.txt and AV2_aud.txt. Then it copies the anatomical data set into the results directory. Then we have the next block. Again, this is an automatic block.
It's not one that we requested. We didn't request tcat; we requested volreg, and blur, and stuff like that. But the tcat block is just used because it uses the 3dTcat program to copy your EPI data into the results directory. It could just copy it, but this is where we remove the first two time points from every run.
So just a little intro to AFNI-like syntax here. The syntax varies per program. Life is hard, right? But for most programs, again, you'll have the program name. -prefix is the name of the output file that you want to create. So the prefix is going to be in the FT.results directory: pb00, then our subject ID FT, then r01.tcat. And it will make HEAD and BRIK files corresponding to that.
And then the input, in this case, 3dTcat doesn't have a -input option or something like that. Some programs do. This one does not. Older programs more likely do not. They have inputs at the end of the line. I find that confusing-- unclear at least.
So the input is FT_epi_r1+orig. And then this syntax right here is specific to AFNI. 3dTcat will know this syntax. So this square-bracket 2-dot-dot-dollar says: read in volumes 2 through the end. So we're skipping time points 0 and 1. So this is where we throw away the first two time points, before we even make it to the results directory. They don't get there.
The quotes are around here because these are special characters to the shell. Do we know what the brackets are used for by the shell? Those are wildcards. So if you want to match anything that has ABC and then either a D, an E, an F, or a G, you can put brackets DEFG, and then whatever comes after it. And the shell will use that for a wildcard match, just like the asterisk or the question mark. The question mark matches exactly one character.
Anyway. So those are wildcards. We don't want the shell to try to match any files. We want the brackets to go to 3dTcat. So the quotes hide those characters from the shell. They tell it not to interpret them.
And then the dollar is used by the shell for process IDs. It's used for variable access and stuff like that. Again, the quotes are hiding the dollar from the shell. So this is a little ugly syntax, but that's how we go after specific volumes within a data set, and it's just a nice, convenient way to not need another program that says please remove the first two time points or something like that.
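The quoting behavior is easy to see with plain files; the abc* names below are throwaways just for the demo:

```shell
# Unquoted brackets are a wildcard character class: the shell expands
# them against filenames before the program ever sees them.
mkdir -p glob_demo && cd glob_demo
touch abcd abce abcg

matched=$(echo abc[de])      # glob: the shell expands this itself
literal=$(echo 'abc[de]')    # quoted: the brackets survive untouched
echo "$matched"
echo "$literal"
# The same quoting protects the $ in a selector like '[2..$]' from
# being treated as a shell variable reference, so 3dTcat receives it.
```

That's why the proc script wraps the sub-brick selector in quotes: the brackets and dollar go to 3dTcat, not to the shell.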
So now all the input data is in the results directory. We have the timing files, the anatomy, and the EPI data. So we're good to go. We can cd into the output directory and get to work. We also have this array of time points per run. When we actually do the analysis, we're down to 150 time points per run.
So the next processing block-- there's another automatic one. This is where we look at outliers, by running 3dToutcount. For each volume, we compute-- why isn't this listed as a fraction? I'll have to look at that. I thought we needed the -fraction option. Or maybe I made that the default. I probably made it the default.
Well, I've highlighted the fraction on it. You know, I slept OK last night, but it may not look like that. So within the automask, 3dToutcount makes a mask of where it thinks the brain is based on a contiguous blob of higher intensity voxels. So it looks around a bit. But then it makes this mask estimation of where the brain is, and then it computes the fraction of voxels within that mask that seemed to be outliers at each time point.
We talked about outliers a little bit yesterday, but I'll just quickly remind you of that. At one voxel you have a time series. That time series has some basic trend to it. And at each time point you can see what's the deviation from the trend-- the absolute deviation.
So across all time points, you compute the median deviation. Then you start over. At each time point, how many median deviations am I from the trend? 0.2, 0.8, 3.6? 3.6 is big. That's 3.6 times whatever is the median absolute deviation from the trend. So you might consider that an outlier.
So now, for one voxel, each voxel at a time, you have a time series of whether or not it's an outlier. And to flip that around, at each time point you know what fraction of the brain is considered to be an outlier. If a lot of the brain is an outlier, you probably have motion, or a scanner artifact, or something like that.
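The per-voxel logic above can be sketched with sort and awk. The deviation values are made up, and the median index is hand-picked for a 5-point series; 3dToutcount itself does this per voxel over the whole (detrended) time series:

```shell
# Hypothetical absolute deviations from the trend, one per time point.
cat > devs.demo.1D <<EOF
0.2
0.8
0.5
3.6
0.4
EOF

# Median absolute deviation: with 5 values, it's the 3rd after sorting.
mad=$(sort -n devs.demo.1D | awk 'NR==3')

# Flag time points more than 3 MADs from the trend as outliers.
nout=$(awk -v m="$mad" '$1 / m > 3 {n++} END {print n+0}' devs.demo.1D)
echo "MAD=$mad, outliers=$nout"
```

The 3.6 deviation is 7.2 MADs out, so it's the one flagged; doing this at every voxel and counting flags per time point gives the outlier fraction.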
In the golden or the tin days of years past, scanners had more problems. There were more artifacts that you had to worry about. This was looking for them. Now, scanners have fewer artifacts like that that you'll see. This is more attuned to finding motion. And it does a good job of finding motion. If there are a lot of outliers at a time point, the subject probably moved.
So we get the fraction of the brain that's an outlier at each time point, and then we make a time series of that across all the runs. And if we asked AFNI proc to do so, it can use this time series for censoring. But now that we have something to look at, let's go look at it.
So I'm going to open a separate terminal window, because I want to leave this open here. If you have an editor, you don't need a separate terminal window. So have a terminal window that you can now cd into the results directory with, please.
So I will just open a new one here, make it a little bigger. Yeah, that's OK. And I'll cd into AFNI_data6/FT_Analysis/FT.results.
What do you do after cd? You type ls. So let's see what we have here. And I have a whole bunch of stuff.
Maybe I should-- did I write a title? Option title. No, I didn't make a title. I should do that.
So I'm just going to hide that window for now, since my black background makes it a little confusing what's what. But we have a lot of files here. And the first thing I'll draw your attention to are these PB files. PB in the processing script is just Processing Block. So processing block 0 was our tcat block. Processing block 1 is Tshift. Processing block 2 is volreg. And, basically, anytime we alter the EPI data, we have a new set of PB files. So that's what all of this-- all the PB files are EPI data as it is being preprocessed.
But in the processing script itself, we are in the outcount section, and we wanted to look at the outliers across the runs. So, again, that's just a text file. I'll just type ls. And so we see, it's a file in this directory. I can run less on it to look at it or open it in an editor. It's just a series of numbers-- sequence of numbers.
At each time point, that's the fraction of the automasked brain that seems to have outliers. So at the first time point, 0.41% of the brain had outliers. If you were to censor based on this, maybe you'd consider using 0.05-- that is, 1 in 20 voxels having outliers. Looking at this with less is perhaps not the best way. Maybe we want to plot the thing.
So that's a nicer form. You see this is going between 0 and up to 45% of the brain. What is that huge spike at time point 42? That was a head motion. Remember, this troublesome subject didn't move enough for us, so we had to fake motion at this time point. So that was a two degree nod that we introduced into the data, and you see a big outlier spike there.
If we censor at 0.05, say, we might draw a red line across the screen at 0.05. And any time we touch that line, maybe those time points will get censored from the regression. And you see we have another spike out here, where the subject actually moved on their own. That would get censored like that. But we didn't ask for outlier censoring in this example.
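Censoring at a threshold is just flagging every time point whose fraction crosses the line, which a one-liner can sketch. The outcount values below are made up (the real file is the per-run outcount output):

```shell
# Hypothetical outlier-fraction file: one value per TR, like the
# outcount files the proc script writes.
cat > outcount.demo.1D <<EOF
0.0041
0.0100
0.4500
0.0080
0.0620
EOF

# Count the TRs whose outlier fraction exceeds the 0.05 threshold --
# those are the ones that would be censored from the regression.
ncensor=$(awk '$1 > 0.05 {n++} END {print n+0}' outcount.demo.1D)
echo "TRs to censor: $ncensor"
```

Here the big 0.45 spike and the smaller 0.062 one both cross the line, so two time points would be dropped.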
So this is just something to look at then. One thing I want to note about this file is look at how clean it is. It's right down by 0, and then the spikes are sharp and clear. It's, in my opinion, a nice file for detecting motion, a nice metric to use or measure. Questions?
AUDIENCE: This is post-registered data too?
RICK REYNOLDS: This is pre-registered.
RICK REYNOLDS: So you could run this again after registration, and in a different way it might show you motion artifacts, where if the subject sneezes, or hiccups, or something, and just jerks their head a little bit in the scanner, you've got good slice, good slice, good slice, bad slice, bad slice, very bad slice, bad slice, then they go back. And you get some good slices back when you start wrapping around. So a third of the slices are screwed up to various degrees or more, depending on their motion, of course.
So how does 3dvolreg-- how does the volume registration step manage to align this with a base? It can't. Many of the slices are screwed up. Some are good. Some are horrible. It can't fix that.
You don't have the data. That's garbage. That's why we'll censor it, hopefully.
So 3dvolreg's going to align that to the base volume. Are the motion parameters going to show a big motion there or not? I don't know. If most of the slices are OK, maybe it doesn't look like motion at all. If many of the slices are bad, maybe it will rotate the thing, and you'll see motion.
But you don't know. You can't necessarily predict that. But you will see outliers. So the outliers for something like that will perhaps be a more clear measure of the motion, even compared to our motion estimates from 3dvolreg, say.
AUDIENCE: You'll go over how to censor later on?
RICK REYNOLDS: Yeah, yeah. So this-- we've told it. We gave it that 0.3 censor limit-- a combination of millimeters and degrees-- so we'll talk about that when we get down there. So leaping back to the script.
So you notice we have a couple commands done here to set minindex, and then minoutrun and minouttr. Again, as you go through this script on your own, which you should do, don't worry about mastering the syntax. That's presumably not a goal of yours so much. But we're just storing-- now that we've gathered our outliers, remember, we're going to do volume registration to the time point with the minimum outlier fraction. Why? Because that time point probably has basically no motion in it, hopefully.
So here's where we're figuring that out. Now we know the time series of outlier fractions. So we get the minimum one and figure out which run and time point it came from, and then we save those in variables to use later on, when we get to the volume registration stuff.
So we know which run and time point it was. Can we tell from here? No. Just for entertainment-- you don't have to do this-- I'm going to leap back to-- that's the PDF window-- the other window and grep for minouttr. You don't have to do this. So grep for minouttr in ../output. This is the text output that went by the screen.
And we are running this script, saving all the output in this file. So there we go. That was the line. If you open that text file and search for this, there is where it set minouttr. And let's just search for minout and see what it finds. Now we get the run and the time point. So we searched for any minout pattern in that text file, and it found two of them, highlighted in red here: run 3, time point 24. So for this subject, that happens to be the time point we're going to align the EPI data to.
The next processing block is that Tshift. So this is where we're-- remember, when we acquire the data, we acquire it using that interleaved timing, where the scanner's got two seconds to acquire a volume. How does it spend the two seconds? 50 milliseconds, or probably 30-some milliseconds per slice-- 30, 30, 30. So it gets slices 0, 2, 4, 6, 8, and then comes back and gets slices 1, 3, 5, 7, 9. And that spans two seconds.
But now, adjacent slices are acquired half a tr apart in time-- one second. Maybe that matters, maybe not. Maybe you want to account for it. Maybe you want to adjust the time series so it's as if you acquired each volume at the very beginning of the tr-- boom, one volume-- rather than over a period of time.
So for each slice, you have a time series at every voxel. You might want to interpolate that backward in time to the beginning of the volume. That's what we're doing here. And again, it's up to you if you want to do this. If your tr is around two seconds and you have a fast event design, it probably makes the most sense in that case.
If your tr for some reason is 6 seconds, because you have these gaps in it for some auditory stimulation or something like that, then you can't interpolate well. Six seconds is too long. You shouldn't be doing this. Even though you'd really like to-- because if this volume is acquired over six seconds, the timing is really different over space-- you don't have the temporal resolution in your data to do an interpolation. So tough luck.
If your tr is short-- 1 second, 0.8 seconds-- maybe you think, well, do I really care? Because you've got to interpolate the noise along with the data. So nothing is easy. So always choices.
So you can ponder these things. In our case, maybe it is reasonable. Of course, we have a 20-second stimulus design with a 35-second response function. Is this going to make any difference? No, we don't really care. But that's an example with a two-second tr.
So let's see what this does to the data. There's the command. It's done for each run-- for runs 1, 2, and 3-- through the 3dTshift command to do the temporal interpolation. So let's go back to the results directory and see what this looks like in the data. So with your terminal, in the results directory, let's actually run AFNI here, finally.
So I'm going to just type Q over the image windows, or close a couple of the image windows, and I'm going to put this stuff on top. I'm going to do, like, a before picture on top and an after picture on the bottom. So let's set the underlay. Click on the black underlay button and set your underlay to the original EPI data-- so pb00.FT.r01.tcat. We'll just look at run 1 initially.
So set that as your underlay, and there we go. And then we'll open an axial graph window. Yeah, my horizontal screen space is a little tight. I might-- I don't know. I'll ponder. I don't know if I want to use a higher resolution for this. And now let's go find a pretty happy voxel that seems to be doing something that we're interested in-- something that seems to be responding to our stimulus.
We could actually leap to the statistical results, and find a good voxel, and jump there. But let's just poke around a little bit. I'll go to slice 11. That's interesting. This is perfect.
So there we go. There's our happy voxel. Let's all get there. So the coordinates here are 22, 83, 15. So how do we all get there? Well, we can right-click on the image window and get one of those jump-tos.
So let's jump to x, y, z. I'll choose jump to x, y, z, with a right-click over the image window. And then we can type in those coordinates-- 22, 83, 15. These voxels are about three millimeters on a side. So you don't have to be exact here, as long as you're approximately in the ballpark. That's why I just used integers. We could type in the actual decimal places, but those are going to land on the same voxel, because I'd have to be almost 1 and 1/2 millimeters off to jump to the neighbor. AFNI's going to find the closest voxel center to these coordinates. And now hit Set.
Lo and behold, nothing happened. For me, I was already there. But hopefully we're all looking at the same thing. Any questions getting to this point?
OK. So now some things to note. Since it's probably hard to see, I'm going to enlarge this a bit. These spikes down here are a little irritating, because they actually affect the scaling of the window. So such is life.
We've already looked at this. We know it's motion. You notice that this voxel, there's no spike there. And a lot of these voxels, there's no spike. Sometimes, the spike goes up, sometimes it goes down, sometimes it will go down a lot. What's up with that? Don't we have motion here? Why do we get a spike at all?
And sometimes we do, sometimes we don't. How do we know if we have motion? Why is there a huge spike there and nothing here?
Well, the scanner is looking at one spot in space-- a one-voxel location. And the subject's possibly moving around. So if the subject moves and moves back, where are they moving to? So if I'm looking at gray matter here, and the subject moves, and I'm still looking at gray matter-- it's a different location, but it's still gray matter-- how is that going to affect the signal? Probably not much at all.
But what if I moved from gray matter to CSF? Now, it suddenly gets bright. So if you're at a gray or white matter and CSF edge of the brain, there's going to be a huge spike if you go from gray to CSF and back. Or stay at CSF-- maybe it'll stay up. And sometimes you're going to go the other direction. Maybe CSF to gray or whatever. So at any location in the brain, you can't predict exactly how the motion is going to affect the time series.
So what does that say about motion regression? If I want to stick in some regressors to capture the motion effects-- sometimes the signal is going to go up, sometimes down, sometimes nothing. And the motion parameters may go up, may go down. Are you going to capture these effects well with the motion parameters? No. Hopefully, though, you'll mitigate them to some degree.
But how do you correct for the fact that the subject moved? How do you process the data so it's as if the subject did not move? You can't. Motion is a real problem, and we can't fix it. The best we can do, perhaps, is some scanners actually adjust in real time to follow the subject's head. That's probably the best we can do right now.
But if the motion gets into the data, you can't get it down. We do registration. That helps. We have motion regressors. That helps. But we can't get rid of it.
And remember, voxels are far apart. These are 3-millimeter voxels-- 2.75 by 2.75 by 3. That's pretty far apart in space. If the subject moves even a little bit, now they move from a voxel center here down to here. At one point in time you have data here, at one point in time here, and you want to resample it back to here on that moved grid. You don't have data there. It doesn't exist.
You have it here, and here, and here, and here, but not in the middle. So you have to interpolate. Now, you're blurring things together. If you have a motion spike here, a motion spike here, no motion here, no motion here, and you average them together? Now you have a different spike-- maybe smaller, maybe bigger. Who knows? So motion is a problem.
That's a lot of babbling. I haven't even gotten to the time shifting stuff yet. So here are our indices, and the graph is on slice 11-- that's zero-based. So since it's odd, we're more than halfway through the tr. So if you notice the value-- well, we're not at index 0. Let me go back to index 0.
I can either click on the left to drag my red dot back there, or I can even just type a 0 in the GUI. Now the red dot is all the way on the left here. So now that we're on the first volume, volume 0, we can see the time. The value at this voxel is 1,215 MRI units-- MR units, whatever that means. That can be scaled arbitrarily by the scanner. And time t equals 1.332-- so 1 and 1/3 seconds from the beginning of the run was when we acquired this voxel's value. And that actually applies to all of slice 11.
So, more than half a tr. If I hit my right arrow and scroll through time, it's 1.33, 3.33, 5.33, et cetera. So the tr is 2. So subsequent volumes differ by 2 seconds, but it's still 1.33 seconds into the tr.
So in the 3dTshift step, we're actually going to interpolate this backward in time-- in this case, by 1.33 seconds-- and then it will be as if we acquired the volume at the beginning of the tr, the whole volume as a snapshot. But we're doing an interpolation. That means these spikes will be interpolated, so they can spread to their neighbors-- that's a little irritation.
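That per-slice timing can be sketched with a little awk. The slice count of 33 is an assumption, chosen so that slice 11 lands near the 1.33 s shown in the graph window; the even-slices-first interleave is the ordering described above:

```shell
# Interleaved acquisition: even-numbered slices first, then odd,
# spread evenly across the tr.  nslices=33 is an assumed value.
tr=2.0; nslices=33
t11=$(awk -v tr="$tr" -v n="$nslices" -v s=11 'BEGIN {
    half = int((n + 1) / 2);                        # slices in the even pass
    order = (s % 2 == 0) ? s / 2 : half + (s - 1) / 2;
    printf "%.3f", order * tr / n }')
echo "slice 11 acquired at t = $t11 s"
```

With these numbers, slice 11 is the 23rd slice acquired, putting it about two-thirds of the way through the tr; 3dTshift interpolates each slice's time series back by exactly its own offset.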
Everything's got good and bad points to it. So the important thing is to bicker a lot until you're satisfied. But just feel comfortable with the choices you make, and know that every choice is going to have problems.
So now we see a before picture. Let's see an after picture. So I'll squeeze that back up there. So now let's open a second AFNI controller. So you've got these three windows on top. Let's click on the black New button in the lower-left corner of the main controller. And now we have a second controller.
Notice this A versus B in square brackets here? So all the windows on top, they're connected to Controller A. Now, all the windows on the bottom will be connected to controller B. If you're looking at five, six subjects at a time and you have a lot of windows, that helps you to keep organized.
So let's open the same axial image and graph window here. Again, the graph, in this case, just shows a number, because the anatomical data set that it's displaying only has one volume. So there's no time series to look at. Let's set the underlay now to be the results of Tshift-- of the Tshift operation.