Computational models of mouse visual cortex and tutorial on Allen Institute Data Observatory
August 11, 2019
Brains, Minds and Machines Summer Course 2019
Michael Buice, Allen Institute for Brain Science
MICHAEL BUICE: My original plan was to launch into a tutorial on the Allen SDK for use with the brain observatory. But if people have questions about the scientific analysis for the Allen Brain Observatory, at least the 2P passive viewing stuff, I'm happy to go into more detail on any of that for a few minutes if people would like to ask some specific questions.
But first there-- so we're going to walk through a tutorial. There are two ways you can do this. One is you can clone this GitHub Repo. And what we're going to do is go through the tutorial in BMM 2019. If you've got Python installed, I can show you-- I can explain how, or will explain how to install the Allen SDK. And so we're going to go through the tutorials in this particular repo.
Or you can go to the URL that's listed over there on the left hand board and you will be asked for a token at one point. That token is listed there. I'm not saying it because I don't want it to be on the video. That token will expire in 30 days or something like that. But what you're doing is accessing a set of Amazon instances, AWS instances that we set up for doing computational work and for accessing the data. And you'll have access to not only the data that's easily downloadable but all of the raw calcium movies if you want to analyze those things. That's something you can't easily do on your own machine. And I'm going to show you how to access all that for a moment.
So if you want to, go ahead and go to that URL and try to start up an instance. This is what it will look like after you've authenticated. It will ask you for a GitHub account. So if you have a GitHub account, it will authenticate with GitHub. If you don't have a GitHub account, you either need to make one or download the repo I described. So I'm going to go ahead and get started while people are setting those up because it'll take a few minutes to get those instances running.
Eventually you'll get to a point where you'll see a pop-up window with an action Start and the button Submit. As soon as you click that, it will spin up your very own Amazon AWS instance on which you can run Jupyter notebooks. Alternatively, you can look for AllenInstitute/brain_observatory_examples on GitHub and clone that repo.
Once you get to the point you see this screen, if you see Open Notebook, just click Open Notebook and you're all set.
AUDIENCE: What is it again?
MICHAEL BUICE: So if you get to this screen, I thought it would take a little bit longer but he's already there, just click Open Notebook and it will open a Jupyter Notebook for you. So if you've gotten here, click Open Notebook. If you've already clicked Open Notebook, great. That will take you to-- actually let me just do it again.
So here, you'll have your own user name there. Click that. Brain observatory examples. And BMM 2019. And we're going to start with guided tutorial. And for the moment, this is going to be a little bit boring because it's just me telling you functions that are available. And then after we go through that, we're going to do a little more interactive-- I'm going to give you some problems that you can work on and not just me rattling off functions to you.
OK. So here we go. So as I mentioned, so there's at least one of-- is anybody using the GitHub repo, not the Amazon? So if you are using that, make sure you do pip install allensdk. So you have the AllenSDK. Just for your edification, after the AWS instances are gone, this is the package that's available. So if you pip install that on whatever environment, you will have the SDK available to you.
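For anyone working from the cloned repo, a quick check that the package is importable can save a confusing error later. This is a minimal sketch using only the standard library; the helper name `sdk_available` is made up for illustration.

```python
import importlib.util


def sdk_available(package="allensdk"):
    """Return True if `package` can be imported in the current environment."""
    return importlib.util.find_spec(package) is not None


if not sdk_available():
    print("AllenSDK not found; run `pip install allensdk` in this environment.")
```

The check only confirms that the package can be found; it does not verify the version.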
A few quick words about this service. This service will be available to you throughout the course. The token expires August 30th, I think. So if you want to do analysis on this using the brain observatory data or whatever, these instances will be available to you for that time. OK. I've brazenly assumed everybody is comfortable with Python. If anybody is not comfortable with Python, hopefully most of what I say will at least be clear conceptually and you can follow along. Nothing I do is going to use any sophisticated features of Python or anything like that.
So first thing we are going to do is just import some useful scripts or useful packages, NumPy, Pandas, Matplotlib. That having been said, if you have some questions about Python or anything that confuses you, by all means, ask them and I will clarify. OK. So the way the AllenSDK works is that the model for all of the SDK data sets is a thing called a cache. In this case, we're using the brain observatory data. So we're going to get the brain observatory cache. So we have to grab this object. And we're going to instantiate that.
A word of warning. So what this does is, we'll pass it a manifest file. If this doesn't exist in the path you specify, it will make it for you. So you could just leave this blank and it will make a repository for the data in your current working directory. Don't do that because all the data is already available on the Amazon instance at this path.
So if you're working locally, you'll need to make your own data directory. And it can be wherever you want it as long as you have read access to it. But again, if you're on AWS, just leave it as is and it will grab the data. OK. So this makes this observatory cache object. And this is the object that has all of the methods that we're going to use to access the data, download the data, examine the metadata about the experiment.
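As a rough sketch of the step just described, the cache object can be created from a manifest path. `BrainObservatoryCache` and its `manifest_file` argument come from the AllenSDK; the helper names and the local directory below are hypothetical, and on the AWS instance the path already in the notebook should be kept. When working locally the directory also has to be writable, since the SDK writes the manifest and any downloaded files there.

```python
import os


def manifest_path(data_dir):
    """Location of the manifest file inside a data directory."""
    return os.path.join(data_dir, "manifest.json")


def open_cache(data_dir):
    """Create the cache object; the SDK creates the manifest if it is missing."""
    # Imported here so this sketch loads even where the SDK is not installed.
    from allensdk.core.brain_observatory_cache import BrainObservatoryCache
    return BrainObservatoryCache(manifest_file=manifest_path(data_dir))


# Hypothetical local directory; on the AWS instance keep the notebook's path.
local_dir = os.path.join(os.path.expanduser("~"), "brain_observatory_data")
# boc = open_cache(local_dir)
```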
So one of the first things you might ask, Christof went over it very, very quickly. So I'm going to dive into some of those details so that you can understand the data set and the format of everything. So we usually refer to this as a data cube internally. Hypercube I guess would be more appropriate. But there are multiple dimensions over which we took a lot of different-- we did a lot of different recordings.
One of those is the set of targeted structures or areas. I apologize for the somewhat bizarre nomenclature. This is our internal nomenclature for the different cortical areas. I can decode at least VISp for you, which is what is normally called V1. But all of these others are the different higher order areas. You might be asking if you're not mouse people, which I gather from the responses earlier, most people are not, where's V2, where's V4, where's MT? And I wish I could tell you. These are all, other than V1, these are all anatomically named. So we don't have any functional correspondence with something you might think of as the macaque ventral stream.
But OK. So there's this method called Get All Targeted Structures. It will just give you a nice list of the strings telling you all the targeted structures. Similarly, we imaged across a bunch of different depths. And so you can grab all the imaging depths. So these are just useful functions in case you forget, OK, what areas are available to me? What imaging depths again?
Also the cre lines. There's a set of 13 cre lines, excitatory and then at least two inhibitory lines there. I have, in fact, never used this particular function other than in this kind of demo, Get Reporter Lines, but it's there in case you need that information. And all the stimuli.
So to run through this, because this is one of the more important things. So some of these are self-explanatory. We have drifting gratings and static gratings, for example. Locally sparse noise is a stimulus that has black and white spots with an exclusion zone. We have three different versions of this.
There's the original locally sparse noise stimulus, which is shown in one of the sessions. And there are two different ones that we call four degree and eight degree. Sadly, this is one of those things where systemization can bite you. This is not a four degree stimulus. And this is not an eight degree stimulus. But the name stuck. It's actually closer to 4.65 and 9.3. Be aware of that.
But these are essentially two stimuli we introduced at a later date because the original locally sparse noise stimulus which has approximately four degree spots wasn't driving the higher visual areas the way we wanted to. So that's the reason there are three of these. Natural movie one, two, and three are the first 30 seconds, the second 30 seconds, and the following two minutes of the first three minutes of Touch of Evil. Natural scenes is the set of 118 natural images taken from three different natural image databases. Spontaneous just means the screen was gray for that time.
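The dimensions of the data cube described above can be collected in one place. This is a sketch, assuming `boc` is a cache object; the `get_all_*` methods are the AllenSDK's, while the wrapper `describe_data_cube` is only an illustration.

```python
def describe_data_cube(boc):
    """Collect the axes of the data set from a BrainObservatoryCache-like object."""
    return {
        "targeted_structures": boc.get_all_targeted_structures(),
        "imaging_depths": boc.get_all_imaging_depths(),
        "cre_lines": boc.get_all_cre_lines(),
        "reporter_lines": boc.get_all_reporter_lines(),
        "stimuli": boc.get_all_stimuli(),
    }


# Usage (needs a cache object):
# for axis, values in describe_data_cube(boc).items():
#     print(axis, values)
```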
OK. So the first thing we're going to do is run through and figure out how to get information about the data that you might want to collect or might want to look at rather than just go right to data itself. And so we're going to choose a particular visual area and cre line. If you want to fiddle with this and change these values, by all means, go ahead.
So we're going to take VISp and Cux2. So Cux2 is the layer two, three, and layer four line. VISp again is V1, the mouse version of V1. OK. So I should have mentioned, I blazed right through this and didn't mention something very important, or at least useful. Where is it?
So one thing you can get from this particular repo is this cheat sheet that we made for the brain observatories. It's a two-page cheat sheet that basically covers a bunch of commonly used things plus information about the data set. So I'll remind you what the experiment is.
You've got a mouse on a wheel that can run, passively viewing some stimuli, which stimuli are there, the specific values of the stimulus parameters that were run. And the reason I went to this is the way the recordings are organized. So since we showed all of these stimuli, you can't pack all of that imaging into one particular session with the mouse, because you can't leave the mice on the wheel for that long. They get stressed and the eyes foam over and so forth.
So what we do is we separate that over three imaging sessions. There's a session A, session B, and a session C. And we interweave the different stimulus blocks like so. So this structure is what we call an experiment container. And we have a function that will return all of the experiment containers. You could leave the optional arguments out and just get all of the possible containers. Or if you want a particular visual area and cre line or sets of cre lines and areas, you can use these optional arguments to add a list of those areas and cre lines and you will just get the containers associated with those areas and cre lines.
Now this doesn't actually give you any data in the sense of recordings, traces. This is just a set of metadata about these things. In fact, this is a list of dictionaries. There it is. Each element of the list is a set of data about those containers. In particular, what depth it was recorded at, what targeted structure, cre line, reporter name, et cetera, and the ID.
So every chunk of data or grouping of data, whether it's a specific experimental session or an experiment container, will have an ID that you can use to uniquely reference it. So rather than look at a list of dictionaries, let's make this a data frame. And you can see the information that we have about that container.
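The container query and the conversion to a data frame might look roughly like this. `get_experiment_containers` and its `targeted_structures` and `cre_lines` arguments are AllenSDK names; the wrapper function is hypothetical, and `"Cux2-CreERT2"` is assumed to be the full name of the line referred to as Cux2.

```python
import pandas as pd


def containers_frame(boc, structures=None, cre_lines=None):
    """Return experiment-container metadata as a DataFrame, one row per container."""
    containers = boc.get_experiment_containers(
        targeted_structures=structures, cre_lines=cre_lines)
    return pd.DataFrame(containers)


# Usage (needs a cache object):
# df = containers_frame(boc, ["VISp"], ["Cux2-CreERT2"])
# container_id = df["id"].iloc[0]
```

Leaving both arguments as `None` returns every container, which is convenient when the slicing is done afterwards in pandas.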
OK. So let's grab a particular experiment container. This should be the first one on that list. And now we can use another function, Get [INAUDIBLE] Experiments, that will just grab a list of actual sessions. So remember, a container is a session A, a session B, a session C. Individual experiments are one session.
So Get [INAUDIBLE] Experiments with no arguments will just return all of them. If you add a particular optional variable, in this case, the experiment container IDs, you can get just those container IDs. In this case, one container ID, which recall, I described a container as being a session A, a session B, and session C. So it shouldn't surprise you if the experiments that get returned, there are three of them. One has session A, one has session B, and one has session C.
And just in general, so you might want to slice this in a bunch of different ways. One thing that I wind up commonly doing depending on what's happening is I'll just grab this entire-- I'll leave out the optional arguments, grab the entire thing, and then I'll grab a data-- make it into a data frame and slice this in some way if I'm looking for sets of cre lines or sets of session types or whatever.
OK. So here's how the containers break down. Again, session A, which has these particular stimuli, session B, which has these particular stimuli, and then we have two different types of session C, based on which kind of locally sparse noise. It will either be the one normal locally sparse noise or a combination of the so-called four and eight degree locally sparse noise, which are really 4.65 and 9.3 degree.
If I'm going too fast, by all means, slow me down, ask questions. If I'm going too slow, by all means, tell me to speed up, as you wish. OK. So there's also an optional argument to [INAUDIBLE] experiments, which is stimuli, which will grab a particular stimulus. In this case, because I'm asking for natural scenes and this container ID, I only get one left, because there's one session that has natural stimuli.
And the important thing to know here, or at least one of the important things to know here, there's the container ID and there's the ID for the session. This is the number you need to have to grab that particular actual data set. So we're going to save that into a session ID. So now, actual data.
Get [INAUDIBLE] Experiment Data with the session ID will actually download the NWB file that contains that particular data. If everything is hooked up correctly for the AWS, this should just return instantly. If you're using this locally or if you've got your path set up wrong, you'll get a warning that you're downloading data. If you want to be downloading data, like if you're on your own local machine, that's fine. Everything's working as it should. If you're on AWS and you're downloading data, something's wrong. Go check your path.
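The lookup from container to session to data set can be sketched as one small function. It assumes the methods being described are the AllenSDK's `get_ophys_experiments` and `get_ophys_experiment_data`; the wrapper name and the error handling are illustrative only.

```python
def load_session(boc, container_id, stimulus="natural_scenes"):
    """Find the one session in a container that showed `stimulus` and open it."""
    experiments = boc.get_ophys_experiments(
        experiment_container_ids=[container_id], stimuli=[stimulus])
    if len(experiments) != 1:
        raise ValueError("expected one session, found %d" % len(experiments))
    session_id = experiments[0]["id"]
    # Downloads the NWB file if it is not already in the cache directory.
    return session_id, boc.get_ophys_experiment_data(session_id)


# Usage (needs a cache object):
# session_id, data_set = load_session(boc, container_id)
```

A stimulus that appears in more than one session of the container would trip the length check, so the check would need relaxing for that case.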
So from here, a lot of it's pretty self-explanatory and well-documented. You can just tab-complete get_ to see the methods in this data set object. Let me step back. This will return a data set object that points at and grabs all of the information in the NWB. So Christof mentioned NWB. We're actually not going to use them directly. These are NWB files in the background. But everything here just grabs them through this API. So you just need these function calls that are going to grab this information for you.
So in this case, this data set object has these get_ methods that will grab things from the NWB file. You've got specimen IDs, indices, and a bunch of different kinds of traces, demixed traces, DFF traces, et cetera. The reason there are all those kinds of traces is that these are different steps of the processing. The one you'll wind up wanting, and we'll go through this in a second, is get_dff_traces. That's the final stage of what we call our processing. But if you wanted things before neuropil subtraction or before demixing or you just want the actual raw fluorescence traces, which is what get_fluorescence_traces gives you, that's all available to you.
And we'll step through a bunch of these. But it is possibly worth your time just poking through these to see what's available and what's not. Yes?
AUDIENCE: So everything that we've talked about so far is sort of identifying what information is available and how to access it?
MICHAEL BUICE: Yep. That's right. Yep. That's what we're doing. Yes. OK. So let's actually access that information now. So for example, you've gotten your data set and you could've arrived at this in a bunch of different ways. The way I described it is you start looking through the experiments and you want to slice it in some way because you want such and such a stimulus or you want such and such an area or cre line or layer or something like that.
So you've made these selections and now you've got a set of experiments or one particular experiment. And so you can get a quick look at what cells were recorded by looking at the max projection. And so we can plot that. Of course, this is just an image. This is a max projection of the calcium movie, which is, all of these are roughly 70 minutes or so, depending on exactly which session we're talking about.
But of course, that doesn't tell you the actual cells. That's just a picture. So there's a thing called get_roi_mask_array. And I want to warn you, there's a thing called get_roi_mask that, in my opinion, is basically just useless. This is a historical thing. So this is the thing that I find more useful. It's actually just a NumPy array of the masks. And so that returns an array that is number of cells by image height by image width. And so if we just plot those on top of each other, bam. So these are all the cells that were actually segmented.
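Plotting the masks on top of each other amounts to collapsing the first axis of that array. Here is a sketch with a toy array standing in for the real masks; the function name is made up.

```python
import numpy as np


def overlay_masks(mask_array):
    """Collapse a (cells, height, width) mask stack into one 2-D image."""
    return np.asarray(mask_array).sum(axis=0)


# Toy stand-in for the real mask array: two 1-pixel "cells" on a 3x4 field.
toy = np.zeros((2, 3, 4), dtype=np.uint8)
toy[0, 0, 1] = 1
toy[1, 2, 3] = 1
footprint = overlay_masks(toy)
```

With a real data set the result can be passed straight to Matplotlib's `imshow`. Summing rather than taking the maximum makes overlapping masks show up as brighter pixels.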
So I should note, in terms of the content of the data and how we processed it, this is not every cell that any image segmentation algorithm could possibly find from the calcium movie. So when we process this, we have a set of processing steps that happen. One of them is a segmentation algorithm that grabs everything that kind of looks like a cell. Another is a filtering step that removes things that we think are not cells. It removes things that are probably dendrites or various processes. And so we try to get the things that are actually just soma.
So for example, if you have your own particular segmentation algorithm you want to try, by all means, do that. But know that the thing you want to compare it against is not this set of cells. It's something that's actually not-- it's available to you, but it's not in this file. So your segmentation algorithm will find things that aren't pictured here. And we found some, too. We just threw them away because we wanted this to be somas and just regular cells. So--
AUDIENCE: Another question.
MICHAEL BUICE: Yes.
AUDIENCE: So we're looking at a particular experiment to give us some background about what, like, in the scope of general imaging experiment [INAUDIBLE]?
MICHAEL BUICE: Yeah. So let me run over to my cheat sheet here. Right. So yes. Sorry. I guess I went-- this was too fast and too compressed. So all of the data we're talking about is a set of industrial scale experiments where we have the mouse head fixed in a calcium microscope on a wheel with a monitor over here. It's being imaged for roughly 70 minutes, depending on which of these sessions is being shown.
Every experiment is one of these things. And they're grouped in this container way that I've described. It's passive viewing the whole time. The mice are not trained to do anything. They're not required to do anything. They are not required to pay attention. They are not required to run or not. Some of the mice run all the time. Some of the mice run never. Most of them sort of half and half.
That's the general setup of the experiment. So for each one of these things, we do this imaging and then we send the calcium imaging, which gives us some 70 minute movie, we send that through a set of processing steps, motion correction, denoising, trace extraction. The traces go through demixing and neuropil subtraction, et cetera. And so at the end of the day, you end up with these DF over F traces. And all of that, the metadata for that, the intermediate processing steps, all of that goes into this NWB file, which can be accessed by this SDK. Did I leave anything out?
AUDIENCE: No. You're good.
MICHAEL BUICE: Good. Excellent. Sorry about that. OK. So speaking of said traces, here we go. So this will return two things. One is a set of timestamps. The other is an array of the DF over F traces, which is number of cells by acquisition frames. So something you should know is the time as far as the processed data are concerned. Not pre but post process data. Everything is mapped to acquisition frame. Not, for example, stimulus frame. So even the stimulus times, we're going to get to how to look at when the stimuli were shown and so forth. That's all in acquisition frames, the counts of the number of frames of acquisition of the 2P scope. And so that's what this 113,888 is. Yes?
AUDIENCE: So how do we get the actual time?
MICHAEL BUICE: That is what TS is. That's the timestamp. That's the clock timestamp. The indices for all the arrays will be the acquisition frames. Poof. So there you go. That's 50 cells from this particular experiment. You might already notice, for example, that you can maybe by eye bracket off some things as to when particular stimuli were shown, because you see cells that respond to those stimuli but not others.
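The relationship between acquisition frames and clock time can be made concrete with a small sketch. The synthetic arrays below stand in for the pair returned by the data set's `get_dff_traces()`; the helper names and the 30 Hz toy rate are assumptions for illustration.

```python
import numpy as np


def frame_rate(timestamps):
    """Estimate the acquisition rate (Hz) from the median spacing of timestamps."""
    return 1.0 / np.median(np.diff(timestamps))


def frame_to_seconds(timestamps, frame):
    """Clock time of an acquisition frame, relative to the first frame."""
    return timestamps[frame] - timestamps[0]


# Synthetic stand-in for `ts, dff = data_set.get_dff_traces()`:
ts = 5.0 + np.arange(300) / 30.0   # 300 frames at 30 Hz
dff = np.zeros((4, ts.size))       # (cells, acquisition frames)
```

Array indices stay in acquisition frames throughout; the timestamps are only needed when a result has to be reported in seconds.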
That's one of the weird things we noticed is that something that is very characteristic of the data set is that cells like one stimulus and not another stimulus. And in fact, you can't really-- if I grab a cell from the data set, I veered into scientific results here but this one's worth talking about. If you grab a random cell from the data set and I tell you it responds to drifting gratings, that gives you almost no information about whether it responds to any other stimulus. Not none, but almost none. And certainly, if I tell you it responds to movies, you know nothing about how it responds to any other stimuli.
MICHAEL BUICE: This is just cell number. Yeah. If you--
MICHAEL BUICE: Yes. This is acquisition frame. Yeah. OK. So now, Christof mentioned this L0 event extraction that we do. So this is something we adopted later after this NWB format and SDK was finalized. So there's a separate function that will grab the L0 events. So this is a deconvolution algorithm that was invented by Daniela Witten and Sean Jewell at the University of Washington.
This is essentially like, if you're familiar with some of these deconvolution methods, it's the same cost function as the Vogelstein extraction method with the prior replaced by an L0 penalty. So it just penalizes how many times something changes, as opposed to the count or something like that.
The reason we used it is not because any-- all of these algorithms from our testing are all essentially about the same. Looking at the [INAUDIBLE] data that-- the joint recording of electrical and optical signals. All of these things don't really-- none of them are perfect and they're all kind of in the same ballpark. The reason we use this one is because it's really, really, really fast and we have a lot of cells that need to be processed a lot. Nothing deeper to it than that.
But to answer a question that hasn't been asked but you're probably worrying about it, if we use some other version of this, some other deconvolution algorithm, you get roughly similar results in terms of the things that Christof was describing. OK. So the important thing to note here is this-- you won't get the [INAUDIBLE] experiment event. And we call them events because they're not spikes, obviously, per Christof's discussion.
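Once the events are in hand they can be summarized per cell. To my knowledge the cache method for fetching them is `get_ophys_experiment_events`, which returns a (cells, acquisition frames) array that is zero except where an event was detected; treat that as an assumption. The toy array and the helper below are made up.

```python
import numpy as np


def event_summary(events):
    """Per-cell event count and summed event magnitude from a (cells, frames) array."""
    events = np.asarray(events, dtype=float)
    counts = (events > 0).sum(axis=1)
    magnitude = events.sum(axis=1)
    return counts, magnitude


toy_events = np.array([[0.0, 0.5, 0.0, 0.0, 1.2],
                       [0.0, 0.0, 0.0, 0.0, 0.0]])
counts, magnitude = event_summary(toy_events)
```

The counts are counts of detected events, not spikes, in keeping with the caveat above.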
So yes. All right. So if we plot these instead, unsurprisingly it looks very similar. Perhaps it's easier to bracket off where stimuli were shown. So let's ask that question. What's going on with the obvious bracketing there? So there's a thing called a stimulus epoch. A stimulus epoch is when a particular drifting grating, natural scenes, whatever, is on the screen. And so there's a table called the stimulus epoch table that you can return. And it will tell you-- there we go.
This gives you a data frame that tells you what stimulus was on, in what order, and the start and end acquisition frames of that stimulus. So static gratings started at acquisition frame 747 and ended at 15196, et cetera.
MICHAEL BUICE: I'm sorry?
MICHAEL BUICE: 30 Hertz. Yeah, acquisition frames are acquired at 30 Hertz. And because this is a session B, we have natural scenes, spontaneous, static gratings, and natural movie one. I didn't mention, but I should mention, natural movie one was shown in each session. That's the one stimulus we show every time. So it's something you can use to look at the variability across sessions.
OK. So now we can use those epochs to color the background. And you can see when static gratings were on, when natural images were on, when the spontaneous activity was on, and when the movie was on. And you can see there are certain things that like the images, certain things that like the movies, certain things that like the, again, natural images. Where are we? OK. All right.
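Beyond coloring the background, the epoch boundaries can be used to slice the traces. A minimal sketch on toy data follows; the function name is hypothetical, and treating each epoch as the half-open frame range from start to end is an assumption.

```python
import numpy as np


def mean_dff_by_epoch(dff, epochs):
    """Average each cell's dF/F within each epoch.

    `epochs` is an iterable of (stimulus, start, end) in acquisition frames.
    Returns a list of (stimulus, per-cell mean) pairs in presentation order.
    """
    out = []
    for stimulus, start, end in epochs:
        # Assumption: treat the interval as half-open, [start, end).
        out.append((stimulus, dff[:, start:end].mean(axis=1)))
    return out


toy_dff = np.array([[0.0, 0.0, 1.0, 1.0, 0.0, 0.0],
                    [2.0, 2.0, 0.0, 0.0, 0.0, 0.0]])
toy_epochs = [("static_gratings", 0, 2), ("natural_scenes", 2, 4)]
result = mean_dff_by_epoch(toy_dff, toy_epochs)
```

With the real table, the triples can be produced with `epoch_table[["stimulus", "start", "end"]].itertuples(index=False)`, assuming those column names.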
So traces are easy. It's just a set of traces. You also probably want more detailed stimulus information. So there are two types of objects that have stimulus information. One is the stimulus table. The other is the stimulus template. The stimulus table is just simply a table. get_stimulus_table for a particular stimulus is just a table of what stimulus was on when. So for natural scenes, it's which frame. So every natural scene has a frame index, and the start and end acquisition frames of that particular scene. For drifting gratings or static gratings, this is just going to be a list of what orientation and so forth.
For some of the stimuli, namely the movie, the images, and the locally sparse noise, there's actually a template that tells you the exact stimulus. So instead of frame 34, you might want to know what picture was frame 34? That's what the stimulus template will tell you. So if you ask for the stimulus template for natural scenes, you get back something that's 118 by height by width.
The reason the height by width is kind of funny is because it was set up for the full size of the monitor but we do a warping. So the monitor is 10 centimeters or 12 centimeters from the mouse's face. And so that's enough to distort optically the image. And so we correct for that by warping the image so that the image will still look flat. If you will, it's like projecting it an infinite distance away and making it very large.
And so this is just the part of the image that would be on the monitor if it were not warped that stays on the monitor. We clipped it because we had a loading problem. So that's why the dimensions are funny. The 118 is the fact that there are 118 images. OK. So grab one of the frames. And there it is.
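The link between the table and the template is just an index into the first axis of the template. Here is a sketch on toy arrays; the function name is made up, and returning `None` for a negative frame index (a blank presentation) is an assumption about how blanks are marked.

```python
import numpy as np


def image_for_presentation(frames, template, presentation):
    """Return the template image shown on a given presentation (row) of the table."""
    frame = int(frames[presentation])
    if frame < 0:
        # Assumption: a negative frame index marks a blank presentation.
        return None
    return template[frame]


# Toy stand-ins: 3 "images" of 2x2 pixels, and the frame column of a stimulus table.
toy_template = np.arange(12).reshape(3, 2, 2)
toy_frames = [2, 0, -1]
first_image = image_for_presentation(toy_frames, toy_template, 0)
```

With real data, `frames` would be the `frame` column of the natural scenes stimulus table and `template` the array returned for the same stimulus.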
And similarly, we can use these start and end times to figure out when those frames were on screen and plot those against the traces. OK. Lots of people ask about running speed. So here's the running speed. There's a Get Running Speed, which returns the displacement of the wheel in that acquisition frame and the timestamp.
And as I mentioned, sometimes the mice will run sometimes and not others. So this was a mouse that was fairly stationary until for some reason he woke up at the last-- he or she woke up at the last minute, last chunk of the imaging session, and just started taking off. And so we can align all these data. OK.
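A simple way to quantify how much a given mouse ran is the fraction of frames above a speed threshold. This is a sketch only: the threshold value is hypothetical and is in whatever units the running trace uses, and the NaN handling assumes the trace can contain missing samples. In the AllenSDK the running trace comes back first and the timestamps second from `get_running_speed()`.

```python
import numpy as np


def fraction_running(speed, threshold=1.0):
    """Fraction of valid frames with running speed above `threshold`."""
    speed = np.asarray(speed, dtype=float)
    valid = ~np.isnan(speed)
    if not valid.any():
        return float("nan")
    return float((speed[valid] > threshold).mean())


toy_speed = np.array([0.0, 0.2, np.nan, 5.0, 12.0, 0.1])
running_fraction = fraction_running(toy_speed)
```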
Now I should mention, so we have this website, observatory.brain-map.org. This is sort of a portal to the brain observatory data. You can click on cells here and get a list of summary data. So depending on how you want to interact with the data, you could just say you're interested in some of the natural movie-- or the running responses, right? So here's speed tuning for session A, right? So you get these speed tuning curves. You might want to find some cell that has a particular-- say it's tuned for low speeds. And so I have this cell. So I click on it and I get a summary of this particular cell. Hopefully.
OK. And so this is a page that has a set of summary statistics for all of the cells' properties. One thing I should note. So note there's a bunch of NAs here. So we tried to image the same field of view across these sessions. We mostly succeed, but of course every cell can't be captured in every session, sometimes because it just doesn't fire and doesn't show up and other times because it actually isn't in the imaging plane because the imaging plane's slightly off that day.
And so if we don't segment that cell in that session, we won't actually get it recorded. So most of the 65,000 cells don't actually appear in every session. There's about a third of them that appear in all three sessions. So that's the reason you see the NAs here.
The reason for this detour, though, is if you have a particular cell you like, that you might have, for example, found through the website, there's this number here, which is the cell specimen ID. That's actually a unique number that every cell gets assigned. So we can grab what's called the cell specimen table. We get cell specimens. There are 63,251 cells. We have 60 different pieces of summary information. And here you go.
And so, for example, every one has a particular cell specimen ID. So you can either get a cell specimen ID by finding it in your experiment and deciding you want to look at that cell or you could find it on the website if you were looking for something like the running tuning example I just gave you. And this table also has other things like the direction selectivity index to drifting gratings or the orientation selectivity index to drifting gratings, et cetera. And the reason some of those are NAs are for the reason I just described. They don't appear in every session so we can't actually measure that particular feature.
OK. And if you see all these different values, you can just look at the keys of that dictionary and these are all the summary statistics we provide you. There's a white paper on the website that I just went to. Poof. That's not it. Too many here in documentation. There's a white paper here that will describe what all of those different parameters are in great detail.
OK. So here, another way of grabbing cells from an experiment container is we can take the cell specimen table and filter it by experiment container ID and we get a set of cells. So pop quiz. That number is 225. The number we saw before was, where is it? 174. Why are they different? These are supposedly all the cells from that experiment container.
So the cell specimen table is a complete list of all of the cells that we've imaged across all of the sessions, across all of the different experiments. So if I say, give me all of the cells with a certain experiment container ID, that's all of the cells that appeared across three different sessions. There's 225 of those for this particular container ID. Before, we just looked at session B, which told us only the cells that appeared in that session for that experiment container, which is a subset of the 225. That's the reason the numbers are not the same.
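The difference between the two counts is a set difference. The sketch below uses made-up records; the keys `cell_specimen_id` and `experiment_container_id` are assumed to match the cell specimen table, and the IDs are placeholders.

```python
def cells_in_container(cell_specimens, container_id):
    """IDs of all cells recorded in any session of one experiment container."""
    return {c["cell_specimen_id"] for c in cell_specimens
            if c["experiment_container_id"] == container_id}


# Made-up records standing in for boc.get_cell_specimens().
toy_cells = [
    {"cell_specimen_id": 101, "experiment_container_id": 1},
    {"cell_specimen_id": 102, "experiment_container_id": 1},
    {"cell_specimen_id": 103, "experiment_container_id": 1},
    {"cell_specimen_id": 201, "experiment_container_id": 2},
]
container_cells = cells_in_container(toy_cells, 1)
session_b_cells = {101, 103}   # stand-in for data_set.get_cell_specimen_ids()
missing_from_b = container_cells - session_b_cells
```

Here `missing_from_b` plays the role of the cells that make up the gap between 225 and 174.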
MICHAEL BUICE: Oh, gosh. Where did we say that?
MICHAEL BUICE: OK. Good. I was like, thought I had zoned out of my own presentation. OK. Excellent. Which has happened, but OK. Yes. So just as example, let's look at the cells in experiments that have image 22, which we've shown here as their preferred image. So one of the things we can do is there's-- one of the things in the cell table is a p-value for the significant response to natural scenes. That's what p_ns is. And the information about what their preferred image is, so that's these four cells.
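That selection can be sketched as a filter over the cell specimen records. `p_ns` is the column named above; `pref_image_ns` is assumed to be the name of the preferred-image column, and the 0.05 cutoff and the toy records are made up.

```python
def responsive_cells(cell_specimens, image, alpha=0.05):
    """IDs of cells that respond significantly to natural scenes and prefer `image`."""
    return [c["cell_specimen_id"] for c in cell_specimens
            if c.get("p_ns") is not None
            and c["p_ns"] < alpha
            and c.get("pref_image_ns") == image]


toy_records = [
    {"cell_specimen_id": 1, "p_ns": 0.001, "pref_image_ns": 22},
    {"cell_specimen_id": 2, "p_ns": 0.2, "pref_image_ns": 22},
    {"cell_specimen_id": 3, "p_ns": None, "pref_image_ns": None},
    {"cell_specimen_id": 4, "p_ns": 0.01, "pref_image_ns": 5},
]
selected = responsive_cells(toy_records, image=22)
```

Cells that were not recorded in the natural scenes session have no value for these fields, so they drop out of the selection.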
Something I should mention, because it will be very important and it will be a problem if you dig into the data, there are cell specimen indices and cell specimen IDs. The cell specimen ID is a unique number that's a nine-digit number assigned to every single cell. But when you look into a particular experiment, for example, we got the DFF traces. That was something that was a 100 and whatever it was by acquisition-- number of cells by acquisition frame. That number of cells, of course, is an integer from 0 to n minus 1.
The cell specimen index is something that's specific to a session that is that index in that array. And cell specimen ID is the unique thing across the entire data set. And so we have Get Cell Specimen Indices that will map from cell IDs to indexes. And there is the corresponding opposite function that will map from the index to the cell specimen ID. Sir?
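The distinction between IDs and indices is easy to get wrong, so here is a toy illustration of the two lookups. The SDK's own `get_cell_specimen_indices` and `get_cell_specimen_ids` do this against the real file; the helper and the nine-digit IDs below are made up.

```python
def build_index_maps(cell_ids):
    """Build ID->index and index->ID lookups for one session's cell ordering."""
    id_to_index = {cid: i for i, cid in enumerate(cell_ids)}
    index_to_id = list(cell_ids)
    return id_to_index, index_to_id


# Made-up IDs in the order they appear in one session's trace array.
session_ids = [517400001, 517400007, 517400003]
id_to_index, index_to_id = build_index_maps(session_ids)
row = id_to_index[517400007]   # row of this cell in the (cells, frames) array
```

The same cell will generally sit at a different row in a different session, which is why the ID is the only safe thing to carry across sessions.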
AUDIENCE: I have a question. Do you have any idea on average how many times these natural images were shown to the same [INAUDIBLE]?
MICHAEL BUICE: I can do you better than on average. They were shown exactly 50 times each.
AUDIENCE: 50 times.
MICHAEL BUICE: Unless the experiment was cut off for some reason. But the session definition was, every one of the 118 images was shown 50 times. That information is on the cheat sheet. The about 50 is because it could be cut off, but it should be 50. Some will be--
MICHAEL BUICE: Yes.
MICHAEL BUICE: That PDF is in the repo you just cloned. Yeah. Just to make sure that's clear. So at the top level of this brain observatory examples repo, there is this Visual Coding 2P cheat sheet. Poof. That's what that is. And so here we can plot when image 22 is shown.
And similarly, we can grab the individual trials using the stimulus table data frame that we downloaded earlier. And so this is 10 frames before the stimulus, so many frames after, with each individual row showing one display of that particular image. And so note, back to science for a second, this kind of weird thing that there's a bunch of trials where it just doesn't care about that image. It looks like it's just not responding. And then a few trials where it is responding normally. Well, normally. OK.
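A minimal sketch of that trial extraction, assuming a single cell's dF/F trace and a stimulus table reduced to two arrays (image id and start frame). All data here are synthetic, and the 20 frames after onset is an arbitrary placeholder, since the talk only specifies the 10 frames before.

```python
import numpy as np

rng = np.random.default_rng(0)

n_frames = 1000
dff_trace = rng.normal(0.0, 0.05, size=n_frames)  # synthetic dF/F for one cell

# Hypothetical stimulus table: image id and start frame of each presentation.
image_ids = np.array([22, 5, 22, 17, 22])
start_frames = np.array([100, 200, 300, 400, 500])

def trial_windows(trace, starts, pre=10, post=20):
    """Stack trace[start - pre : start + post] for each presentation."""
    return np.stack([trace[s - pre:s + post] for s in starts])

# Keep only the presentations of image 22.
starts_22 = start_frames[image_ids == 22]
trials = trial_windows(dff_trace, starts_22)  # shape: (n_trials, pre + post)

# Baseline-subtracted mean response on each trial. The spread of this
# number across trials is the trial-to-trial variability noted in the talk.
baseline = trials[:, :10].mean(axis=1)
response = trials[:, 10:].mean(axis=1) - baseline
```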
So something you will not be able to do if you just clone the repo, but you will be able to do on the AWS instance, is access the full calcium movie. In general, if you wanted the full calcium movie outside of this, you'd have to either-- these are hosted on Amazon, so you could pay for your own AWS instance, or we can mail you a hard drive. But they're too big for us to just say, here, download them.
But these are stored in this particular directory. There are H5 files with the title, the name [INAUDIBLE] Experiment Session ID dot H5. And this is just a proof of concept that, in fact, one can do such a thing. And voila. Movie. There's nothing special about the code that generated this. Just merely intended to be proof of concept. So that's the whirlwind tour of how to access basic data. Does anybody have any questions at the moment? Yes.
AUDIENCE: So mine's sort of more of an overarching question, because in looking at all this data, and I don't work with mouse data ever, but looking at this huge repository of so much information, it's mind blowing to me, especially as someone who does research in a small lab that, we get [INAUDIBLE]. It takes a long time. So from this formulation of so much data to be found, [INAUDIBLE] hypothesis that a [INAUDIBLE] research labs that seem to be supported by small research but benefit from [INAUDIBLE].
MICHAEL BUICE: Kind of, depending on your perspective. So let me switch to some slides here along those lines. So this is sort of what Christof was getting at with the discussion of-- please? Maybe? Hey, there we go. OK. So right, when we went into this, we had this perspective that we know how V1 works. We know what simple cells are, we know what complex cells are, this should be easy.
We thought we would just survey this thing and get this big picture and then maybe figure out what the higher visual areas are doing, et cetera. And all of that fell on its face. At least as far as I can-- my perspective is that's what happened. And this is one-- there are many different ways of seeing this, I think, in this data set.
This is one way of looking at it, where you take what I would like to call the standard model, right? It's got simple type things and it's got complex type things. So if you had traditional simple and complex cells, this model should take care of you. It's also got the running speed involved in it. And you get nothing out of this, right?
Now depending on your perspective you might look at this and say, well, we've done this. We know. One of our referees just, my God, said everybody knows this information already. But I think there's a stronger point to be made here, which is, let me actually-- I'll come back to this in a second.
So I'll explain these different things in a moment. But we have these different classes of cells. And I just zoomed through all the slides that explain where these classes come from. I'll go back in a second. Where the things that we identify as nones, meaning they don't reliably respond to any of the stimuli, that's a third of the data, right?
There are two statements here. One, a third of the data, the cells do not reliably respond to any of the stimuli we showed. And we showed a pretty broad range of stimuli. And in particular, we showed stimuli that, if you have the sort of standard picture of low level vision that says I've got some sort of local frequency decomposition, basis function, yada, yada, yada.
Then if you respond to a drifting grating, you should respond to a natural image. And what's wrong with the world, right? This should all hang together. Instead there are things that respond to only one of those sets of stimuli, in combinations that don't make any sense, meaning they respond to the drifting gratings but why not respond to that same frequency in the natural movies?
And unsurprisingly, if they don't respond, you would not expect the model to predict their performance. And by God, that's what you get. But then you get pictures like this, where we have these sort of "all" cells that respond reliably to every one of our stimuli. This is about 10 percent of the data. And these pictures look like what you'd see if you open one of Jack Gallant's papers or something like that and you see standard models; these are the spread of R values you get.
This looks like one of those things where it's 0.4 or something. That's like respectable neuroscience territory for a model. Whereas down here is like, oh my God, we're embarrassed, right? And so that's, to me this is the big thing. Where it's like, well, why the hell is this happening, right? It's who ordered that? Like, what's going on here?
And so, at least for me, it's like we have to step back and say maybe vision doesn't work the way we thought it did. At least it's certainly the naive picture that I learned growing up that, OK, we've got our simple cells, we've got our complex cells. V1's done. Let's do something interesting now. It's just not the case.
AUDIENCE: Well, do you have inclinations for why the model has predicted-- we don't know what's happened, but explanations for why the model is so predictive of certain cells? Like, some of the really high predictive [INAUDIBLE] pretty harmful. But the fact that it doesn't generalize is obviously a problem. But do you think you have a good model of some of the cells?
MICHAEL BUICE: Well, we definitely have a good model of some of the cells. So I think one shouldn't be too-- so Christof showed, right, this cherry-picked 0.7 and 0.7. So I'm not getting too excited about that because you've got some spread. After 65,000 cells, you're going to have some accidents where, hey, this model just happens to work really well. We've done no control over 65,000 cells to think about the accidental fits of the model. I shouldn't say no control, but you have to keep a sober mind.
Where did I put this? Yes. So this is my picture of what's going on. This is my hypothesis, if you will, as to how to think about this. So this is a graph of the different cell classes, which I haven't defined for you. But the black blob here is the none cell. Yes, sorry.
AUDIENCE: Can I ask you something about the previous slide? The function of the [INAUDIBLE]?
MICHAEL BUICE: Yes.
AUDIENCE: How did you get the [INAUDIBLE]?
MICHAEL BUICE: All right. Fine. I was trying to get to the punch line without getting into the details. But I shouldn't do that. All right. I was going to go back and do that. But OK. Functional classes. So what we do is we take the reliability. We define a reliability for every cell. That's the percentage of time we got a significant response to the cells' preferred stimulus condition for that stimulus. That's the reliability to that stimulus.
So for every cell and for every stimulus, we've got some number between 0 and 1, right? That's the percentage that represents the percentage of time that it responded. We do a clustering on that, specifically to a Gaussian mixture model on that. And we do a cross validation to find the number of different possible clusters. Here's one particular run of that where I get a set of clusters that map the reliability profile of cells that belong to that cluster.
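As a rough sketch of the reliability number described above, for one cell and one stimulus: pick the preferred condition, then take the fraction of its trials that were flagged as a significant response. How a single trial is judged significant is not specified in this part of the talk, so the boolean flags below are simply assumed as input, and all values are made up.

```python
import numpy as np

# Hypothetical per-trial significance flags for one cell and one stimulus:
# rows are stimulus conditions, columns are repeated trials.
significant = np.array([
    [0, 0, 1, 0, 0, 0, 0, 0],   # condition 0
    [1, 0, 1, 1, 0, 0, 1, 0],   # condition 1
    [0, 0, 0, 0, 0, 1, 0, 0],   # condition 2
], dtype=bool)

# Mean response per condition, used only to pick the preferred condition.
mean_response = np.array([0.02, 0.31, 0.04])

preferred = int(np.argmax(mean_response))
# Fraction of preferred-condition trials with a significant response.
reliability = significant[preferred].mean()
```

Repeating this for every stimulus gives each cell a short vector of values between 0 and 1, which is the input to the clustering step.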
We take the least reliable cluster and use that to develop a threshold for-- we say you respond essentially if you're above that particular reliability threshold. So that defines a particular none class. In fact, that's what we call the none class is everything that goes in that cluster. And this color bar is defined according to that. So white here means I'm right at the threshold defined by that none class.
And now we can actually group everything by whether you're above threshold. And so you've got cells that reliably respond to the drifting gratings and natural movies, you've got cells that reliably respond to all the natural stimuli, cells that reliably respond to drifting gratings, natural scenes, and natural movies, et cetera. And so these are the classes we see. And these are the distributions of those classes that we see.
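The grouping step can be sketched as thresholding each cell's reliability profile and naming the class after the stimuli that pass. This is an illustration only: the reliabilities are invented, the sketch uses just the three stimulus names mentioned here, and the single `threshold` value is a placeholder for the threshold the talk derives from the least reliable cluster.

```python
import numpy as np

# Stimulus names used for illustration; the real stimulus set may differ.
stimuli = ("drifting_gratings", "natural_scenes", "natural_movies")

# Hypothetical reliabilities: one row per cell, one column per stimulus.
reliability = np.array([
    [0.10, 0.05, 0.08],
    [0.60, 0.10, 0.45],
    [0.05, 0.50, 0.70],
    [0.55, 0.40, 0.65],
])

threshold = 0.25  # placeholder value, not taken from the clustering itself

above = reliability > threshold

def class_label(row):
    """Name a cell's class after the stimuli it reliably responds to."""
    names = [s for s, hit in zip(stimuli, row) if hit]
    if not names:
        return "none"
    if len(names) == len(stimuli):
        return "all"
    return "+".join(names)

labels = [class_label(row) for row in above]
```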
And so this is where a third of the data is, up there in the none class. The other interesting classes are, we seem to like natural and moving things, right? Drifting gratings, natural movies, or natural scenes, natural movies. About 10% of the data responds to everything. So out of 65,000 cells, we've got 6,000 of them or so that seem to actually reliably respond to every stimulus we showed. So these are what I'd call your standard model cells, perhaps. There's some caveats with that statement, but that's the picture. OK.
And this is now the distribution. Now I don't have to define each of those because I've just done it. Each of these is the different populations of the different classes across areas. And these are the standard literature names for the six areas that were measured before. VISp, VISl, VISal, VISpm, VISam, et cetera.
AUDIENCE: [INAUDIBLE] which kind of cell can be [INAUDIBLE]?
MICHAEL BUICE: Oh, so that's a slightly different question. So let's go back to what the model looks like. So the results we showed for the model before were just every cell. For every cell, we developed a model based on natural movies, or natural stimuli and a model based on artificial stimuli. And this was the scatter plot of everything, irrespective of what class it fell into.
So now what I'm showing you here is these are the none cells in that. These are the cells that are natural scenes, natural movies. I'm only showing you three classes. And these are the all cells.
MICHAEL BUICE: Sorry?
AUDIENCE: So this is decided by the [INAUDIBLE] by the traditional model?
MICHAEL BUICE: Yeah. Exactly. So here's the median. So this is basically zero for the nones. Great. Not surprising. And this is respectable neuroscience territory for the alls, right? Not great if you're Google or something, trying to predict people's email. But fine for neuroscience. Different standards.
So back to what I would suggest is possibly going on. I think this can actually-- this picture can explain things a bit. So what I've done here, this is VGG16, and these are 25 randomly chosen units in the first three pooling layers. And I've computed their optimal stimuli.
So a standard thing you'll see is people say, hey, I trained my model on blah, blah, blah, blah. And look, the lower levels look like simple cells. Great, right? The brain must work that way. So here's this version of that for VGG16. Note that when you're somewhere in the middle, you get a set of cells here, some of which look like Gabor cells, most of which do not.
If you do the same thing with, say, AlexNet, actually not the same thing. But if you take, say, AlexNet and you start doing physiology with it, like, let's measure receptive fields, and so forth, you get a lot of beautiful simple cells in early layers, as you'd expect, but the percentage of simple cells starts going down as you get more complex features, right? So you get some things that, just because of scale invariance and whatnot, or partial scale invariance, look sort of like Gabor-type receptive fields, and the rest of it looks like garbage, because it just doesn't respond to that stimulus in any useful way.
And so I would suggest to you that this is actually what's going on, is that per Christof's discussion of selectivity, that you send your electrode in, you wait for it to make a sound, you're going to find the cells that respond to the stimulus you're showing it. And so you're going to find the things that look like simple cells if you're using those stimuli.
And so owing to this, if you're somewhere in the middle of one of these hierarchies, those are the cells you're going to find. So if V1 is in fact much higher order than we thought and it operates kind of like this, you'd expect some small fraction of it to look like the standard model. Say, maybe, 10%. Picking the number out of nowhere. And the rest of it is going to look like something else that won't respond to your stimuli, right? Or it will respond very sparsely because, for example, in VGG16, if you look at the response sparsity to natural features, I'm drawing it as going down, it goes up, because the selectivity of features gets more precise and more specific.
AUDIENCE: [INAUDIBLE] respond to natural stimuli or natural [INAUDIBLE].
MICHAEL BUICE: Sorry. Say it again.
AUDIENCE: So for these none cells, for these none cells, is that a consistent no, these allow-- it doesn't respond to any pictures or any--
MICHAEL BUICE: The none cells are defined by the fact that they do not reliably respond to any of the stimuli.
AUDIENCE: And it's consistent across all the data cells?
MICHAEL BUICE: Uh-huh. That's right. One third of the data. So you may have seen these various tricks people like to play that we can argue about in terms of the utility of these things or the specific meaning of these statements. But people try to compare representations in neural networks with actual brain representations.
And so this is a version of doing this kind of thing with the brain observatory where you take what's called representational similarity analysis or representational dissimilarity analysis, depending on which paper you look at. And if you do this kind of thing, just for those of you who aren't familiar with this, basically what you're doing is you're comparing the entire representation by looking at how the similarity of different images matches, or the dissimilarity of different images matches, across the entire population representation.
And you compare those two things to each other between, say, some layer in a deep network and a brain representation. And so if you do this for the brain observatory with VGG16, these indices are pooling layers. And so you see all the different colored lines here, different Cre lines, in areas and layers, you wind up with the peaks of these representations being somewhere in the middle of VGG16, which is like layer 10, right? The third pooling layer is like layer 10 or layer 11. Not the first few that supposedly all look like simple cells.
AUDIENCE: Say it louder. I'm sorry. [INAUDIBLE]
MICHAEL BUICE: Yes. So I'll just restate the whole thing. So each of these lines here is the representational similarity correlation between the corresponding pooling layer of VGG16 and the corresponding Cre line, layer, and area of the brain observatory. And the peak tends to be somewhere in pooling layer three or four, which is something like 10 or 11 layers deep into VGG16. It's not the first few layers.
And this is a general-- there's nothing special about VGG16 here. Grab any pretrained network you want and you'll see a very similar result, even for several like Inception, where it's 100 layers or something like that. It's not the first few layers that look like simple cells. It's actually much, much deeper. And so I would suggest to you that mouse cortex looks higher order than we expected.
Now, whether that's a mouse-specific thing or just vision doesn't work the way we think it does is a different story, because I can't compare this. We don't have this experiment in macaques, for example. And as I've often said to people, we also don't, for example, put macaques on treadmills and things like that. Different class of experiments.
MICHAEL BUICE: So the reason I think it has this shape is because of this phenomenon here, where, right, you actually have a lot more complex features and it's matching the representation appropriate to those complex features. Right.
AUDIENCE: No, I meant why is there confusion [INAUDIBLE]?
MICHAEL BUICE: Yeah. Because it's most similar to those things in the middle.
MICHAEL BUICE: Right. Because it gets less similar to them. Right. So for example, right, none of these correlations are perfect, either, right? They max out at like 0.4. So I don't by any stretch of the imagination imagine VGG16 or any pre-trained network looks like the mouse. And so at some point you're just going to get features that are more specific to solving the ImageNet problem rather than doing anything the mouse is doing. And so it's going to get more and more and more similar and then drop off. So to me, the interesting part is where it gets most similar, not the fact that it drops off. Yes?
AUDIENCE: [INAUDIBLE] they all tend to show low complex functions. For example, [INAUDIBLE]. Do you think that has something to do-- that there might be just more sophisticated processing, potentially?
MICHAEL BUICE: It's possible. So the one more piece of data I can hand you is-- so one of the things we tried to do is take a version of VGG16 that we just sort of crush, because VGG16's too big to fit in mouse cortex. So if you take something that has the number of units that you'd fit in, say, layer four, just imagine there's a naive feed forward circuit which is just going through layer four of the first few areas. And say, that's all the units I get. I have to somehow make something that looks like VGG16 with only these units.
If you do that, you wind up with a network that tries to get as complex as it can as fast as possible. So even though it's very, very shallow, its sparsity goes up very, very quickly because it's still trying to solve the image-- when I say train, I mean train on ImageNet. So doing the exact same thing. And it actually does pretty well. It gets like 50, 60% on ImageNet, with this really just terribly anemic network. But you wind up getting more complex representations.
And so it's entirely possible that this is just simply the fact that mice don't have a whole lot of brain. And so they have to solve a complex problem with very few resources. And that's how you have to do it. So that's my perspective as to what's going on. Yes?
MICHAEL BUICE: Yes. Do I have a slide for that? I have a really high level slide meant for people who are not computational. Sorry. So what happens-- let me explain the computational part. So what you do is you take an image by image population correlation for a given layer. So you take your VGG layer five, for example. You show an image. You show another image. You compute the correlations of that. So you have this image by image correlation matrix. That's your representation for layer five, say.
You do the same thing for some other thing, some, V1 layer four, whatever. And you take those two things. You compute their correlation. And that's what the representational correlation is. That's what's plotted here. Sir?
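The two-step computation just described can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic responses, not the analysis behind the slide: the array shapes, the use of Pearson correlation throughout, and the choice to compare only the off-diagonal upper triangle of the two matrices are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images = 20

# Synthetic responses: rows are images, columns are units. Both populations
# are noisy mixtures of the same 8 latent features, so they share structure.
shared = rng.normal(size=(n_images, 8))
model_layer = shared @ rng.normal(size=(8, 50)) + 0.5 * rng.normal(size=(n_images, 50))
brain_area = shared @ rng.normal(size=(8, 30)) + 0.5 * rng.normal(size=(n_images, 30))

def similarity_matrix(responses):
    """Image-by-image correlation of population response vectors."""
    return np.corrcoef(responses)

def representational_correlation(a, b):
    """Correlate the off-diagonal upper triangles of two similarity matrices."""
    iu = np.triu_indices(a.shape[0], k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]

rsm_model = similarity_matrix(model_layer)   # (n_images, n_images)
rsm_brain = similarity_matrix(brain_area)    # (n_images, n_images)
score = representational_correlation(rsm_model, rsm_brain)
```

Note that the two populations never need the same number of units; only the set of images has to match, because the comparison happens between the two image-by-image matrices.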
MICHAEL BUICE: I think the answer is no, but give me an example of what you mean.
AUDIENCE: Just as we [INAUDIBLE]. Settings that you can compare the representation. Is the representation more towards the [INAUDIBLE] cell theory or the [INAUDIBLE]?
MICHAEL BUICE: I don't know that we can speak to that, actually. We can certainly talk about the sparsity of the representations, which one might speculate. But I think one can only speculate in that case. Nothing is definitive. Yes?
AUDIENCE: So what's weird to me about this is that it seems like all of them have the same shape across all of these. Like, if there was-- OK, maybe you think that some of these differences are significant.
MICHAEL BUICE: Didn't say that, but go ahead.
AUDIENCE: Like, if you were modeling, like visual cortex as a convolutional neural network, then you would expect there'd be some kind of structure, some kind of correlation with these things. And it doesn't seem like that is--
MICHAEL BUICE: Yeah. So this is kind of interesting in the sense that, if you really want to try and read tea leaves, there's some reasonableness to the tea leaves that I'm going to try and unpack. You might say that V1 sort of sits at the beginning and maybe LM, AL and PM are a little bit higher than that. And AM and RL are just not to be bothered, right? They're not part of the circuit, right? They're somewhere else. So you might have this weird hierarchy but you can't distinguish between LM, AL, and PM.
So one interpretation of that is that you either have a really, really shallow hierarchy where the similarities are so close that you can't distinguish them, or you have a parallel hierarchy, right, where you go V1 and everything else. The anatomy actually supports V1 then everything else, if you look at the actual connections.
And this is actually in a paper that's going to come out in Nature from some other people at the Allen Institute actually looking at the connectivity of mouse visual cortex. That's exactly what you see is V1 sits at the bottom and there's a broad parallel hierarchy which is at least consistent with this. Sir?
AUDIENCE: Maybe you already explained this, but these numbers that you're showing, are they [INAUDIBLE]? What is the maximum prediction that we expect?
MICHAEL BUICE: The maximum that we expect?
AUDIENCE: What is the ceiling [INAUDIBLE] given the noise in the data?
MICHAEL BUICE: Oh. That's a good question. I do not know the answer to that.
AUDIENCE: These numbers look low, but I was just wondering--
MICHAEL BUICE: Yeah. They should be, right?
AUDIENCE: -- the data, but I don't see data. Is it, like, [INAUDIBLE].
MICHAEL BUICE: So a lot of it is the noise in the data. I can guarantee you that because of, I mean, the variability that we've shown you. I blew by it, but the threshold for what we call responsive is 25%, meaning 25% of the time we show you your preferred stimulus, you respond, we call you responsive. You might already be depressed at that number, right? It's not like 90% or something. In the tutorial, remember, I showed you this set of trials to the same preferred stimulus. And it's like a handful of them. It's not most of the time.
AUDIENCE: Another question, [INAUDIBLE]. Those numbers zero to five, are they layer numbers or--
MICHAEL BUICE: These are the index of the pooling layers where zero means the input. So there are five pooling layers in VGG16.
AUDIENCE: So five is like the--
MICHAEL BUICE: Is the last pooling layer. And the reason I did by pooling layer is because the similarity of the things between the pooling layers essentially matches the next pooling layer. So you can just approximate it with the pooling layers.
AUDIENCE: I'm just curious about the method. Were you using linear regression? What were you using to determine--
MICHAEL BUICE: This is just computing correlations. There's no prediction here. I'm not making a model.
AUDIENCE: So it's like single features [INAUDIBLE].
MICHAEL BUICE: Yeah. You compute an image by image correlation. You take the correlation of that across two different things. That's it. There's no attempt at predicting anything here. This is not a predictive model. This is just simply asking how close two things are. Sir?
AUDIENCE: So if you think these are higher order neurons, can you compute what features the cells are responding to?
MICHAEL BUICE: I would love to do this. So one of the things we're trying to do now in response-- so something that-- I didn't highlight it. Hello. That's terrible. Whatever. So I didn't highlight it. But the things that generate the most reliable responses are basically drifting gratings and movies. The natural stuff tends to be very evocative. And one of the things that's limiting the models that we have is the fact that we're not generating enough events.
And so what we're trying to do now is use the advantage of calcium to go back to the same cells day after day after day to just show hundreds of minutes of dense movies with lots and lots of features. So if this perspective is correct, that I've been showing you, then we should actually be able to make a really good model with just hundreds of minutes of movies. And the feature specificity won't be a problem and we can actually answer that question, say, this cell responds to this feature and this cell responds to this feature by building a sufficiently complex model.
As much data as we have, that was just described as overwhelming a moment ago, it's still not anywhere near the kind of data you need to answer that question because the feature selection is actually relatively sparse. If you're used to thinking about the kind of data sets you need to fit some multi-layer model, we're not anywhere close to that, right? We have 118 images and three minutes of movie that's shown 10 times. So that's really, really sparse for a feature dense network in the modern sense.
This went in a different direction than I had planned, which is cool, but people should be aware that there is this other thing actually called Tutorial, and it is nothing but a homework assignment, if you will. So if you want to build up some chops using the data set, this will take you through some simple things like computing tuning curves. It's nothing complicated, but it sort of leaves it to you to solve everything. And when and if you get stuck, hello, there is a-- oh, it's not up here. I can put it up here. There's a solutions notebook that has everything filled out that you can refer to, because this one is essentially blank. It's just the questions, but those have the solutions.
And I'm here through Tuesday morning. So if you have any questions, want to go into anything in more detail, want me to help you deal with the SDK or whatever, I'm happy to help. So we're at noon here so I guess we should stop. So thanks a lot.