A Conversation with Prof. Blake Richards
May 2, 2019
On April 26, 2019, CBMM graduate student Martin Schrimpf took the opportunity to sit down and chat briefly with Prof. Blake Richards of the University of Toronto / McGill University.
[MUSIC PLAYING] MARTIN SCHRIMPF: Hi. My name is Martin Schrimpf. I'm a PhD student with Jim DiCarlo here at the Center for Brains, Minds, and Machines. And it's my pleasure to chat with Blake Richards today, who is a professor in the Department of Biological Sciences at the University of Toronto and will soon move to McGill. Blake, welcome to CBMM.
BLAKE RICHARDS: Thanks very much for having me, Martin.
MARTIN SCHRIMPF: So to start, why don't you tell us a little bit about your scientific background and how you came to work on the kinds of questions you work on today?
BLAKE RICHARDS: The genesis of my interest in the kinds of questions that I work on today was, in fact, my experience as an undergraduate student. I was originally studying in a program called Cognitive Science and Artificial Intelligence at the University of Toronto. I wound up doing that after my initial plan, which was to be a rave promoter, failed. So my father, after my friends and I lost a bunch of money on a particular rave, said, why don't you go actually take school seriously?
So I did that and I entered this program. And this Cognitive Science and Artificial Intelligence program exposed me to the field of neural networks, which was at that time kind of an old field that had died and was out of favor. But it was nonetheless very compelling to me. I found it very interesting. And I found it so interesting, actually, that I decided I wanted to understand a little bit more about how real neurons worked and then maybe circle back to neural networks.
So I ended up going to the University of Oxford for their neuroscience program. They have a four-year PhD program where the first year lets students from other disciplines get up to speed on neuroscience. And I did that and then was able to delve into more physiological-level neuroscience. I ended up studying synaptic plasticity in xenopus tadpoles with Colin Akerman and did a lot of patch clamp experiments and STDP experiments and that kind of stuff.
But I was always still interested in the theory of neural networks, and I wanted to circle back to that and start to do experiments that actually addressed that. So for my postdoc, I ended up working with Paul Frankland at SickKids hospital in Toronto. And I worked with him because I was very interested in a particular theory in neural networks called complementary learning systems, which was this idea that our brain had effectively two different neural networks that learned at different speeds because they accomplish different goals. And I wanted to test some of the predictions that came out of that. So in Paul's lab, I did some behavioral tests of some of the ideas that came out of complementary learning systems theory.
And from there I went and started my own lab. And when I started my own lab, my original plan was to continue to mix experiments and theory kind of 50/50, though I rapidly discovered how much overhead is actually involved in running a wet lab. And at some point, I decided to let the wet lab die off, because it turns out I really enjoyed the theory.
So now I'm back to kind of focusing on neural networks and thinking about how they can help us to understand the brain. And I still collaborate a lot with experimentalists, and we're working towards trying to actually see if we can use neural networks as a framework for understanding the computations that occur in real brains.
MARTIN SCHRIMPF: So even though you're [INAUDIBLE] you still have to deal with recordings and with learning. You've chosen a really tough problem there.
BLAKE RICHARDS: Yes.
MARTIN SCHRIMPF: When we record from neurons in visual cortex, we can just repeat the image and that's all good. But when you deal with learning, they learn it once and that's it. There's no more data for you. So how do you deal with that?
BLAKE RICHARDS: In terms of the neural networks? Like they learn it once and then there's no more data, or do you mean in terms of the experiments?
MARTIN SCHRIMPF: Yeah, in terms of the experiment.
BLAKE RICHARDS: So in terms of the experimental side of things, I think that the study of what goes on during learning is something that some labs have done in neuroscience but I think should be done a lot more. Now, one of the difficulties there is that probably a lot of the most interesting learning occurs early in life. And this is maybe one area where hopefully, as technologies develop, we'll be able to observe learning in early life more and more. But even in the adult, I think you can actually observe a lot of interesting learning phenomena in an experimental situation.
So on the infant side, I'll just say that when I did my xenopus tadpole work, you could see changes in receptive fields within five or 10 minutes as a result of stimuli that you gave to the tadpoles. In an adult, that's not going to be quite the same. But nonetheless, many groups report, and certainly you see behaviorally, changes in the adult over the scale of an hour or two. And so that's really how we're trying to approach this.
And both in terms of the work I did with Paul and the work we're now doing with some of our collaborators at, say, the Allen Institute for Brain Science and stuff like that, the idea is that we're trying to observe the dynamics of learning over the course of the hour or two that you have with an animal, an adult animal. And I think it's possible. It's not ideal, I agree. There's a hard problem, which is that you are going to just get this tiny little snapshot of the system changing. But you gotta start somewhere.
MARTIN SCHRIMPF: So in your mind, what do you think are the most promising hypotheses of how learning really occurs in the brain, and what experiments can be done to [INAUDIBLE] some of them?
BLAKE RICHARDS: So I think that one of the most important things for understanding how learning works, which we have sometimes incorporated into neuroscience but perhaps not often enough, is the question of how you would actually guarantee that an animal gets better at something. Now, let me put that into slightly more quantitative terms. When we say that an animal is learning, that implies something normative. Learning means you're getting better at something. Otherwise you're just changing.
Now, what is that something? Typically in machine learning and neural networks, we quantify that something with loss functions. So the system gets better if the loss function goes down. And I think that that same approach has to be applied to neuroscience, in fact. We need to think about what loss functions are at play in the brain itself. And we have to ask how does the brain ensure that when it's learning it can reduce that loss function?
So the example I think of is my son, who's seven years old, is learning to play piano right now. And he's really come a long way in the two years that he's been playing it. He obviously has some internal representation of what it means to play the piece well. He knows what it's supposed to sound like. And he's got some way in his brain that we don't fully understand yet of ensuring that the phenotypic changes that happen over the course of his learning make him get better according to that internal metric that he has of what he's supposed to sound like.
Now, that question of how you guarantee that your loss function goes down is, I think, where maybe neuroscientists don't think about it enough. That's actually a really hard problem if you just try to rely on very simple plasticity rules. And that's where I think that probably the key to learning in the real brain, as in artificial neural networks, is to have some way of doing something like estimating the gradient of the loss function and using that information to update your phenotype. Because when you're working in a really high-dimensional space, it's very hard to actually improve without some guidance like that.
So I think that the key to learning is probably that we've got internal algorithms that are trying to calculate how to get better at a particular thing according to whatever loss function that is for that thing. And what we're maybe missing from our knowledge about learning in the brain is precisely how those bits about gradients and stuff get computed. And that's a large part of what I'm trying to investigate in my lab.
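The point about high-dimensional spaces can be made concrete with a toy calculation (an editorial illustration, not anything from the interview itself): on a 1,000-dimensional quadratic loss, a step guided by the gradient reliably improves, while an unguided random step of the same size, i.e. "just changing," typically does not.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1000                       # dimensionality of the "phenotype"
w = rng.normal(size=d)         # current parameters
w_star = np.zeros(d)           # optimum of the toy loss

def loss(w):
    """Simple quadratic loss: squared distance to the optimum."""
    return float(np.sum((w - w_star) ** 2))

lr = 0.1
grad = 2.0 * (w - w_star)      # exact gradient of the quadratic loss

# Gradient-guided step.
w_grad = w - lr * grad

# Random step of the same length: change without guidance.
direction = rng.normal(size=d)
direction *= np.linalg.norm(lr * grad) / np.linalg.norm(direction)
w_rand = w - direction

print(loss(w), loss(w_grad), loss(w_rand))
```

In high dimensions a random direction is almost orthogonal to the gradient, so its expected improvement is near zero, which is the sense in which improvement without gradient-like guidance is hard.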
MARTIN SCHRIMPF: So with [INAUDIBLE] a lot of your focus is on finding a biologically plausible algorithm for learning and for the propagation of gradients. And you're strongly being informed by progress in deep learning. However, there's also a lot of pushback against these kinds of ideas, about asymmetry of weights, and where do the labels come from, and so forth. So how does your research address those questions or concerns?
BLAKE RICHARDS: Well, so some of those questions are very important questions that we are explicitly trying to address. So say with the question of asymmetry of weights. One of the things that we're working on right now is how can you actually use spikes to help you learn symmetric weights? And we've got some interesting results on that that I will talk about later today. But some of the other parts of that, I think, are actually misunderstandings about what's important in neural networks.
So take the question of labels. Many people have said to me, oh, well, but the brain doesn't have labels. The labels are actually inconsequential to how deep networks operate. Deep networks are not designed just to learn labeled data. It just so happens that labeled data is an easy way of determining whether or not your network is learning well. That's really why we use labels in supervised learning, and ImageNet and CIFAR-10 and all of these data sets. It's because it's an easy way to assess whether the learning is effective.
The actual labels themselves, though, are just a tool to get that quantification of how well you're doing. And really, the core principles are that you want to learn a hierarchical system, and you want to do so using end-to-end optimization that at least in part follows the gradient of your loss function. Those are the core pieces. And so it's when you have something that prevents you from using those core tools, like the symmetry of weights, that you have to step back and think, well, how could the brain deal with that issue?
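One published line of work on the weight-symmetry ("weight transport") problem is feedback alignment (Lillicrap et al.): errors are propagated backwards through a fixed random matrix instead of the transpose of the forward weights, and the forward weights tend to align with that feedback pathway during training. Below is a minimal sketch on a toy linear network with made-up dimensions and synthetic data; it is not the specific spike-based approach mentioned in the interview.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: inputs X and targets Y from a random linear teacher.
n, d_in, d_hid, d_out = 200, 5, 10, 3
X = rng.normal(size=(d_in, n))
T = rng.normal(size=(d_out, d_in)) / np.sqrt(d_in)
Y = T @ X

# Two-layer linear student network.
W1 = rng.normal(size=(d_hid, d_in)) * 0.1
W2 = rng.normal(size=(d_out, d_hid)) * 0.1

# Fixed random feedback matrix: used in place of W2.T when sending the
# error backwards, so no symmetric "weight transport" is required.
B = rng.normal(size=(d_hid, d_out))

lr, losses = 0.02, []
for _ in range(500):
    H = W1 @ X                 # hidden activity
    E = W2 @ H - Y             # output error
    losses.append(float(np.mean(E ** 2)))
    dW2 = E @ H.T / n          # same as backprop for the top layer
    dH = B @ E                 # feedback alignment: B instead of W2.T
    dW1 = dH @ X.T / n
    W2 -= lr * dW2
    W1 -= lr * dW1

print(losses[0], losses[-1])
```

Even though B is random and never updated, the loss falls, because W2 drifts into alignment with B's transpose and the random feedback becomes a useful training signal.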
MARTIN SCHRIMPF: So in addition to the gradients, you already brought up earlier that there has to be some version of a loss function or maybe multiple loss functions. Do you have ideas of where those might occur in the brain?
BLAKE RICHARDS: Well, so one of the things that's interesting, though, is that you don't necessarily have to explicitly represent the loss function in order to follow the gradient of it. So we've known that for a long time with respect to many unsupervised learning algorithms for doing things like, for example, sparse representation learning or [INAUDIBLE] machine learning.
There's no actual representation of the loss function anywhere in the network. It's just that you can demonstrate that the plasticity rules follow the gradient of that loss function. And I suspect that often that is what's happening in the brain as well. We might not represent the loss function explicitly; instead, we have plasticity mechanisms that reduce it.
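A classic example of a rule with this property is Oja's rule: a local, Hebbian-style plasticity rule that nowhere computes or stores a loss, yet can be shown to descend a reconstruction objective, converging to the first principal component of its inputs. A minimal sketch on synthetic 2-D data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D inputs whose principal axis is the first coordinate
# (variance 3.0 vs 0.5), so the top eigenvector is [1, 0].
X = rng.normal(size=(5000, 2)) * np.sqrt([3.0, 0.5])

w = rng.normal(size=2)         # synaptic weight vector
lr = 0.01
for x in X:
    y = w @ x                  # postsynaptic activity
    # Oja's rule: Hebbian term y*x plus a local normalization term.
    # No loss is represented anywhere, but the rule follows the
    # gradient of a PCA-style objective.
    w += lr * y * (x - y * w)

print(w)
```

After training, w is approximately a unit vector along the high-variance direction, exactly what explicit gradient descent on the PCA objective would find.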
But sometimes I do think we actually have a representation of the loss function. And I think we can even introspectively see when we have that, like I was saying with my son playing the piano. We sometimes know when we're doing poorly. We've all had that experience. I've had far too many experiences this way: when you're playing sports, you do something and you immediately know that you're screwed. You've totally missed the goal. You've totally missed the basket, whatever. It's not even close. And you don't even have to wait to see the outcome. You just know that you sucked in that moment.
And the reason is, I think, because you've got an internal representation of your loss function. So you know what you were supposed to be doing. You know you have some internal measure of the difference between what you were doing and what you were supposed to be doing. And you can use that information to then help you learn. And I think that probably a big part of the sort of learning to learn that we do early in development and throughout our lives when we gain expertise on something is, in fact, about learning those loss functions. You learn what it is to do well on something.
And I think about this with athletes, again, a lot. When athletes are training, they're often told to visualize doing the thing properly and to think about it. And I think that's actually a way of encouraging their brains to develop a good representation that will allow them to actually capture the loss function that they need to be descending on to do that task.
One other thing. I realized I didn't address part of your question, which was where are the loss functions. And I think the answer to that is they're probably in many different places depending upon the task you're doing. So motor tasks are going to have loss functions that are represented by one part of the brain and maybe something like an artistic task would have some other loss function potentially. Who knows?
MARTIN SCHRIMPF: [INAUDIBLE] the discussion so far, we've already often crossed over a little bit to machine learning. So I wonder, what do you think are the biggest steps neuroscience can take right now to actually inform the next generation of machine learning algorithms with respect to learning?
BLAKE RICHARDS: Yeah, so that's a good question. I think that there's probably one primary answer to that, and it's just a single pair of words: inductive biases. We really can say with confidence that the brain has in place inductive biases that allow it to be particularly good at learning particular types of tasks.
We know from a variety of theoretical results, even if this is very broad, from the no-free-lunch theorem and other results on generalization, that you can't have a general-purpose learner that's going to be good at learning anything. You always have to introduce some bias towards what it's going to learn. Now, the extent to which you have a bias, I suspect, is different depending upon the species that you look at.
So I suspect C. elegans and Drosophila are a lot more biased in terms of the type of tasks that they can learn than humans are. Humans can apparently learn almost anything with enough practice and enough motivation. But at the same time, humans are better at learning certain things.
Now, why does this all matter for AI? The reason it matters for AI is because when we talk about artificial intelligence, presumably what we mean is actually computers that do the kind of stuff that we do. Because really, we've had computers that were smarter than us in other ways for decades. Computers were much better than us at doing mathematical proofs back in the 1960s.
So when we say artificial intelligence, what we actually mean is, can we get a computer to have the sorts of abilities that humans and animals have? This is an idea that Yoshua Bengio and Yann LeCun articulated in an article back in 2007. They called this the AI set: the set of tasks that you care about for AI. And it presumably has a lot of overlap with what humans are actually good at. Not necessarily complete overlap, but a lot of overlap.
So when you're trying to design artificial intelligence that's good at learning in the AI set, it's probably actually going to be very helpful to look at what inductive biases are built into the human brain. Because it is already biased towards learning those tasks in the AI set quite well. And we've already seen that in deep learning. The hierarchical structure of deep networks is inspired by the brain.
The convolutional structure of deep networks is in part, it's obviously not a perfect match, but is in part inspired by the brain. There are all sorts of inductive biases that are probably still to be discovered in terms of how our brains operate that could do a lot in helping us to build better AI, I think.
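The convolutional bias can be made concrete with a parameter count. This sketch (illustrative numbers, not from the interview) compares a fully connected layer to a convolutional one, each mapping a hypothetical 32x32 grayscale image to a 32x32 output map:

```python
# Parameter counts for one layer mapping a 32x32 image to a 32x32
# feature map.
height = width = 32
n_pixels = height * width

# Fully connected: every output unit has its own weight to every
# input pixel.
fc_params = n_pixels * n_pixels          # 1024 * 1024 = 1_048_576

# Convolutional: a single 3x3 kernel shared across all positions,
# encoding the inductive bias that visual structure is local and
# roughly translation-invariant.
conv_params = 3 * 3                      # 9

print(fc_params, conv_params)
```

The bias buys a reduction of over five orders of magnitude in parameters for this layer, which is part of why convolutional networks learn visual tasks from far less data than unstructured ones.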
MARTIN SCHRIMPF: In addition to the arrow going from neuroscience towards AI, this is the reverse arrow, which I think especially your lab is taking up on. So do you see your lab more as neuroscience inspired AI or as AI inspired neuroscience?
BLAKE RICHARDS: We are now kind of working on both, but I think over the last few years it's been more AI inspired neuroscience. And the reason we've been doing that is because I really do believe that there are general principles of intelligence and general principles of learning that are applicable to any agent, whether a natural or artificial agent. And AI is good for exposing some of those general principles precisely because when you're doing AI, your only goal is to get something that works.
And in the course of trying to get something that works, you'll end up discovering something about a very broad principle. And that thing we were discussing earlier regarding gradient descent is an example. It turns out that things work a lot better when you're able to descend a gradient than when you're not. And I think that's a general principle. And that has informed the direction I've gone in neuroscience.
Given that I think that's a general principle, I suspect it applies equally to the brain as it does to artificial neural networks. And so I think it's reasonable to try to look in the brain to see how it might be solving that same problem that we've already solved for ANNs.
MARTIN SCHRIMPF: You were recently also named one of 29 CIFAR AI Chairs. Could you explain to us in some more detail how your neuroscientific perspective influences the CIFAR AI strategy?
BLAKE RICHARDS: So CIFAR is a very unique organization and I think a really special one. And I'm not just saying that because they provide me with money. CIFAR has a very interesting approach to funding research. What they do is they have what are called programs. And if you're a member of the program, you get a small amount of money that's totally unrestricted every year.
But in exchange for that, you have to come to a few meetings every year where you get together with all the other members of your program and talk about things. And the programs are structured to be interdisciplinary, to bring people together from many different perspectives who nonetheless are all interested in the same problems. Because obviously, if they're not interested in the same problems, it's not going to be useful.
But CIFAR has one particular program called the Learning in Machines and Brains program, which I'm a part of. And the Learning in Machines and Brains program is founded on the idea, as I've just articulated, that there are general principles of intelligence that apply equally to brains and to computers and that if you bring people who study neuroscience together with people who study computer science, you're going to go further in terms of what you're actually able to achieve.
And in many ways, that vision has already been borne out for CIFAR. They were funding people like Geoff Hinton and Yoshua Bengio and Yann LeCun long before they became trendy and came to dominate the AI field. And the program that Geoff started long ago included many neuroscientists: Sebastian Seung, Bruno Olshausen, et cetera. They would come together at these meetings to think about general principles of intelligence.
So I have been brought into that program precisely because they're still following that philosophy that you want neuroscientists there interacting with computer scientists. And in some ways, I'm also a computer scientist. My new job at McGill is actually a joint position in computer science and neurology. And so I do view myself as a computer scientist but a computer scientist who comes at it from the perspective of what can we learn from the brain and how can we also inform our understanding of the brain with computers? And because that fits well with CIFAR's general philosophy with this program, they have welcomed me into their fold.
MARTIN SCHRIMPF: Awesome. So I'm sure many students will watch this video and [INAUDIBLE] really excited about all the cool work you're doing. What piece of advice would you give them? What do you look for in a student or potential collaborators?
BLAKE RICHARDS: So I think the most important thing, whether it's a student or a collaborator or anyone, is actually just a combination of enthusiasm and having tried a bunch of things. So certainly with students and postdocs, one of the things that has always made me immediately think, oh, I'd really like you in my lab, is when they come to me and show me a project that they actually did on their own because they were interested in it, because they both had the enthusiasm and they have this instinct to try different things and really explore the parameters a bit.
Because I do think that science is inevitably a path with many dead ends. And so if you get overly focused on just going on one particular path, you're going to struggle occasionally. Sometimes it'll work out and it'll be amazing but other times it won't. And so my advice to many students is don't get overly restricted on a particular project. That's not to say that you don't want to do a lot of work on a project.
But I think sometimes what people fail to recognize is that the time that you actually want to focus in on a project and really just do that project alone is once you've already got some initial results to tell you that this is something special. And otherwise, it's good to have your hand in many different pots. You need to focus in once you've realized which pot has the honey in it, as it were. But before that, really exploring things and being excited to try many different ideas is super important.
MARTIN SCHRIMPF: It was a pleasure to have you here with us, Blake. Thank you very much.
BLAKE RICHARDS: Thank you.
MARTIN SCHRIMPF: Later, Blake will give a talk about all the cool research he's been telling us about with a little more detail. We'll make sure to post that to [INAUDIBLE] as well. And thank you for watching and see you next time.