Laura Schulz: Cognitive Development and Commonsense Reasoning, Part 2, and Joshua Tenenbaum: Machine vs. Human Learning and Development of Intuitive Physics
June 6, 2014
All Captioned Videos Brains, Minds and Machines Summer Course 2014
Topics: (Laura Schulz) Inferential economics; learning from instruction vs. exploration (Gweon, Pelton, Schulz, Cognitive Science 2011); rational learning through play
(Joshua Tenenbaum): General introduction to the CBMM research thrust on Development of Intelligence; introduction to the concept of probabilistic programs; learning as theory building; learning physics from dynamic scenes (Ullman, Stuhlmuller, Goodman, Tenenbaum, in prep 2014); hierarchical modeling approach, from meta-theories to theories to events; stages of children's development of intuitive physics concepts
PRESENTER: So Laura's going to not finish her full lecture, but just give you one or two little highlights that will help [? key ?] part of the discussion later. And then I'm going to do likewise. I'm going to just wrap up a couple of loose ends from the lecture that I didn't quite get to finish last time that are all oriented towards what are some of the key phenomena and modeling questions for modeling children's learning?
And then Tomer is going to talk more. Tomer and Laura are going to have a little structured exchange. What are some, again, challenges and ways to make progress on modeling, basically the kinds of hard problems of learning that you've heard about? And then we hope that will just actually turn into more of a discussion.
And then we'll take a break and then come back and have a panel, bringing in some of the more neuro people and saying OK, what are the prospects for making links between the phenomena of children's learning and the models that are oriented towards addressing the problems of children's learning at the more abstract computational level, and the phenomena involved with learning at the neural level? And I hope that that will be one of the more interesting kinds of interactions in this summer school environment. So Tommy and maybe Gabriel, is Gabriel here? Some of our more neuro-oriented faculty were going to participate in that panel. And if we finish in time by 4:00 Laura will. OK?
LAURA SCHULZ: OK, So I'm just going to like I said, pick up where I left off. But I'll make a tiny, tiny way in. And in particular I want to return to this idea that there are real trade offs in learning. And that one of the reasons learning is hard is because information is costly. And it's hard to pin.
So it's costly for the learner because of course, we can't learn everything there is to learn. But it's also costly for teachers because they can't communicate all the information they might want to provide. And given the sort of adjustments between these two, you might want to have a kind of learner who assumes that the evidence that the teacher's providing is helpful.
And by helpful here I mean it's likely to distinguish the correct hypothesis from all of the other hypotheses that are also true and maybe consistent with the evidence but are not what the teacher wants you to learn. And the teacher should likewise assume that the learner knows that the teacher is going to pick evidence that's representative and distinctive and should make inferences accordingly.
So to flesh that out a little bit, imagine I tell you there are 15 states in the United States. Well that is true, right? There are 15 states in the United States. But it doesn't really distinguish that kind of information from the [INAUDIBLE] that there's 16, 17, 18, 19 states. If you don't assume that the teacher's providing maximally informative evidence you have to entertain all of those kinds of possibilities.
And if the teacher doesn't assume that the learner knows that she's being informative, then if she wants to say, well look, there's 50 states in the United States, she'd have to explicitly rule out the rest: and by the way, not 51, not 52, not 53. Right? So it's helpful if you think that what the teacher is doing is providing evidence that is representative of the hypothesis.
This predicts some trade offs between instruction and exploration. And we're going to make this very simple in the case of children. We're going to say if a teacher demonstrates a property of a toy, say this toy has one thing that it does. Well, then I should really assume that that is all the toy does. It does just that one thing.
Because if it did more, the teacher should have demonstrated those things too, right? And what that means is that in these conditions, where you think you're getting evidence from someone who is knowledgeable and helpful, you really should constrain your exploration just to the instructed information. In a sense that's what teaching is supposed to do. It's supposed to constrain the hypothesis space.
But it's also very specific to a condition where it's warranted to believe that the evidence is being sampled under those assumptions, where the person who is providing the information is in fact knowledgeable and is in fact being helpful. If somebody naive comes up and says, hey look, this toy did this thing, you may be much less likely to say, and it doesn't do anything else. Or if you know, for instance, that the teacher is interrupted, you know, says it does this one thing, hang on, I've got to go make a phone call, then you shouldn't make the inference that the information that has been provided is provided under the assumption that they're in fact providing distinctive information.
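The inference described above can be sketched as a small Bayesian calculation. This is only an illustration, not the model from the study: the uniform prior over one to four functions, the `omit` probability, and all function names are assumptions made for the example. A helpful, knowledgeable teacher is assumed to rarely leave a function out, while an accidental or interrupted demonstration is assumed to say nothing about how many functions remain.

```python
from fractions import Fraction

# Hypotheses: the toy has k functions, k = 1..4 (uniform prior; an assumption).
HYPOTHESES = [1, 2, 3, 4]
PRIOR = {k: Fraction(1, len(HYPOTHESES)) for k in HYPOTHESES}


def pedagogical_likelihood(k, omit=Fraction(1, 10)):
    """P(teacher shows exactly one function | toy has k functions).

    Assumes a knowledgeable, helpful teacher who leaves out each
    additional function only with small probability `omit`.
    """
    return omit ** (k - 1)


def uninformative_likelihood(k):
    """Accidental or interrupted demo: seeing one function is assumed
    equally likely however many functions the toy has."""
    return Fraction(1)


def posterior(prior, likelihood):
    """Bayes' rule over a finite hypothesis space."""
    unnormalized = {h: p * likelihood(h) for h, p in prior.items()}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}


pedagogical = posterior(PRIOR, pedagogical_likelihood)
accidental = posterior(PRIOR, uninformative_likelihood)

# Belief that the toy does just the one demonstrated thing.
print(float(pedagogical[1]))
print(float(accidental[1]))
```

Under these assumed numbers the learner's belief that the toy does only one thing is about 0.90 after a pedagogical demonstration and stays at the prior of 0.25 after an accidental one, which is the qualitative pattern the prediction relies on.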
So you can spell this out a little bit by presenting kids with identical evidence under different kinds of assumptions about how that evidence has been generated. And to do this we used a toy. And this toy actually had four interesting properties. A squeaker, a light, it played music. There was a hidden mirror.
But we're only ever going to show the children this property, that it squeaks. And again, you can do it in four different ways. By pedagogical here we mean something very lean, just that the person providing the evidence is indeed knowledgeable and is indeed helpful which here just means watch this. I'm going to show you my toy while I do that pedagogical demonstration because I presumably know what I'm doing. And I'm conveying the information with the intention of helping you.
And we compare that condition with three other conditions. One case where you show the same demonstration but accidentally. Look at this, I found this here. Wow, see that?
No demo at all, where you just present the whole toy. And an interrupting condition, which is just like the pedagogical condition except that, as I said, it disrupts the implicature that I have now been helpful in communicating all the information that's relevant.
So I'm going to say, watch this. I'm going to show you my toy. Excuse me I got to go make a phone call. And the prediction here is that the children should constrain their exploration of the toy.
When you leave and let them play by themselves, in this case they should make the inference that this is all the toy does. And they should be much less likely to make the inference in any of the other conditions. So let me go ahead and show you a child in a pedagogical condition.
If you find yourself doing this in your research program, think about it. And of course, he has no idea, right? Because he hasn't found it. And that's in general indeed what we found, that children performed many fewer actions or discovered many fewer properties of the toy in the pedagogical condition than in the other conditions. Now, you might be reasonably worried that what kids are making is not an inductive inference about the properties of the toy. Maybe they're making some kind of permission inference. You showed them one thing so that's all they're supposed to do, regardless of what else is actually present in the toy.
But we actually know that's not the case. I can very quickly tell you why. But actually I think the best evidence of why is that some students [? involved ?] with an undergraduate project actually tried it. They explicitly told the kids this is what you're supposed to do with the toy as opposed to let me show you what my toy does. And the permission constraint did not constrain their exploration. The kids explored more problems.
But we have some better evidence than that, that the kids really do think it's a function of the toy. I'm going to walk you through that piece very quickly also and then just end on a small note. But the point here is that because there are costs to both learning and teaching, it is sensible to make some assumptions about how the teacher is limiting their costs and also the costs to the learner.
And those kinds of trade offs are, that child is very smart for a five year old, right? It's making as sensible an inference as you might make in other forms of pragmatic communication. Like if I said, do you have two sisters? And you say yes, I'm not, unless I'm in a court of law, going to follow up with "and any more than two?" Right?
This kind of inputs implicatures that help from this, they're very important to how you might adjust for this. Really quickly, I want to say a few more things. Of course there are many good reasons why teachers might provide limited information as I suggested. Evidence supports generalization. If I show you one rattle squeaks I don't need to point out every other rattle that squeaks.
Also, of course, a lot of times evidence, like for instance, what we're doing today might be too much. Right? We're just going on and on with more information, perhaps, than the learner can handle. So there are good cases and good reasons where teachers should provide limited information.
But in some contexts, like for instance, when we told them there was only one function of a toy that in fact had many functions, that didn't fall into these categories. That was misleading information. Right?
We provided information that was like, that induced the wrong hypothesis in the learner. And those are the kinds of cases that we think of as sins of omission, right? You're providing evidence that is likely to produce exactly the wrong hypothesis. Like if you were from another planet and they said there were 15 states in the United States, right? It might be true. But it's not helpful.
So we wanted to know whether kids would actively make these judgments. If they knew the truth, the ground truth, would they judge teachers accordingly? So this is some lovely work by [? Yolan, ?] a follow-up to a previous study, where she showed children a toy, a different toy. And it had one thing that worked here, this wind-up mechanism. And the child played with it and learned that for themselves.
Another toy looks exactly identical, but we, this time, enabled all the functions. And in this case the child got to play and learn that the toy did four things. OK? So the child knows the ground truth here.
And then we have a teacher. And the teacher's going to go ahead and teach on the toy and always going to do the same thing, always going to teach the wind-up mechanism. Right? And the question is do the children judge the teacher differently in the two conditions? The teacher's always doing the same thing but the child knows, in one case, that evidence, although true, is misleading. And in the other case it isn't.
And the answer, of course, and I'll skip through the details here. We anchored it on teachers who were correct and incorrect. And I'll just show you that indeed they do think that the teacher, despite identical demonstrations, is much worse in the one condition than the other. So they really do recognize when the information fails to be informative.
I can't resist. So I'm going to show you one more study. And show you a final follow up.
You might want to know how knowing that a teacher is being misleading affects the kids' own exploration. And they comment. What do they do if they think that the teachers are insufficiently informative?
So again, we ran the kids in a condition where they knew the properties of the toy. And the toy teacher taught the students the toy. But this time instead of rating the teacher, what we did is we let the children then play with this toy themselves. And this same toy teacher who had taught them about this toy and this toy said, now here's another toy and showed them the squeaker just like before.
And for this experiment we introduced one more toy which actually did have four functions. So now I'm going to walk you through, more slowly, the logic of the experiment. In the teach-one-of-one condition, the teacher has just shown Elmo that there is one function. And the child knows there's one function. And when that teacher goes to teach them about the squeaker toy and says look what it does, the children should trust the teacher, say that toy probably does one thing, and do what they did before, which was constrain their exploration to that function.
And in the teach-one-of-four condition, where the child knows that the teacher was misleading and provided insufficient information, then the children, if they're sensitive to this, should not trust the teacher. They should say, well, they were wrong about this one. They didn't respect the [INAUDIBLE]. Or the same one thing, I should no longer hold that inference. I should assume that this toy actually has many properties.
And the difficulty with this is we don't know if children are actually making an inference about the way the evidence is sampled for them, or if they're just saying well, look, here's a toy that did one thing. So this toy probably only does one thing. Here's a toy that did four things. So maybe this toy does four things.
So to rule out that possibility, we have this toy here. And in this case, the toy does four things. And the teacher accurately showed Elmo all four things. So when the teacher goes ahead and says about this toy, look at the squeaker, the children, if they're going from the teacher and not from the toy, should go ahead and constrain their exploration. Is that clear? I know it's a lot going very quickly.
PRESENTER: So do you do both of these?
LAURA SCHULZ: So we did all between subjects there's kids in each, 16 kids in each one of these conditions. And what you'll see is in fact, that's what happens. So when the teacher was reliable the kids go ahead and constrain their exploration in the new toy. They respect the inference. So if you say one there's probably only one.
But in the case where the teacher was not previously reliable, the kids explore more broadly. So in some sense they're compensating for an inadequate instruction by broader exploration. I'm going to end with this and skip a bunch of other slides and make one final point here, which is somewhere.
I've suggested that children engage in a lot of sophisticated [INAUDIBLE] practices. They're limited, however, by information processing constraints, by world knowledge, and also these kinds of real world costs and how you're going to make inductive trade offs, right? You can be very smart for a five year old.
You can make the best inferences you've got. And that information could be misleading either because your original sample of evidence wasn't representative or because someone intentionally misled you about the kind of evidence, was a poor teacher in some sense. And all of those things are going to place limits on how you learn.
And a lot of those are really because it's costly to learn. It's costly to gain information. It's costly to communicate it. It's costly to process it.
And what this might leave you with is a sense of OK, well, we understand a lot about what happens in terms of how children go about learning. We have some pretty good computational models for a lot of this. Even the things that look like limits on learning in terms of information processing constraints or cost, we had this utility calculus, we can begin to make a lot of headway understanding learning.
And I think that that's true and promising and exciting. But I don't want you to think oh, we're pretty close to solving everything there is to know about children and learning and a few more summer classes and you can all go home. Because to give you a sense of just how broad the actual space of the behavior and learning is, I think I need to return to a point that Emily has been making a few times.
She sent it around in an email. And she asked the other day, well, you know, this is all very goal directed, right? You're trying to learn this. You're trying to learn that or there's like, real uncertainty and you know what you're trying to do.
But a lot our behavior doesn't look that directed. Right? A lot of what we do looks like exploration but not just exploration in conditions where I can say, oh well, the base rate of this is different than that and so there's ambiguity here and you know, you should do this.
It looks like we're just kind of rushing around doing things, trying things out, thinking. And Tomer and I are going to talk a lot more about this later. But I want to point it out because it's a very distinctive feature of our most intelligent creatures. And our most intelligent creatures, when they're at the time in life when they're solving all the hard problems of cognitive science, all those hard problems of that moral planning, all the reasoning, moral reasoning, they're doing that between zero and five.
What are they doing with their time when they're playing? And what is play? Well, it has some of these properties, but it doesn't look like all of them. And to give you just a sense of how-- and really they are playing all the time. And they're not just doing exploratory play. They're doing pretend and imaginative play, which opens up all kinds of cans of worms.
And I think it really genuinely is a hard problem. It's a hard problem to explain why an organism who is solving all the problems that our machines and computers can't solve is doing this in basically all of its available free time, right? So what is this? And what does it really look like?
I'm going to show you outtakes from some work in my dissertation. I was asking kids a simple problem to decide which of two gears was making another gear move. And they did many things that you might expect them to do if they're sensitive to patterns of evidence.
But this is the rest of what they did. And I just want to leave and end on that note. Sometimes it generated useful evidence but in a fairly chaotic manner. So I really want to leave you with this picture because you might not be laughing at this for your dissertation. I was trying to bridge a gap, which I think is a deep, important one and a remaining, enduring one, between behavior, between play, between what human beings are actually like in the world, and our really elegant computational models of what learning is.
And that gap, although we are beginning to close it in these various ways. I think this really makes clear like, all the things we do not know. So I'd like to end there and I'll turn it over to Josh. Thanks very much.
PRESENTER: What about the final frame where they actually get it right?
LAURA SCHULZ: That, that's the boring stuff. This is the future.
JOSHUA TENENBAUM: So I just want to tie up a few loose ends and kind of just summarize where we have come to here, right? So this is a slide I think I showed you in the very first lecture that I gave. This idea that in some ways animates the whole summer school, that animates this exciting convergence of fields, neuroscience, cognitive science, and machine learning, that in some form, intelligence is making sense of data on a grand scale.
Something like statistics on a grand scale. At least that's something that all these different fields have made some progress with that idea. That's the idea that's the intersection of the Venn diagram, if you like. But a lot of what we're getting at here are things that go beyond this, ways in which when you look at say, for example, intelligence and learning in young children, from a certain behavioral and computational perspective, it doesn't necessarily look so simple like this picture.
I want to kind of contrast this with a view that says, OK, if only we understand how to find structure in data using this basic idea of statistics but just scaled up on a grander and grander scale, you could say this is kind of the promise of deep learning. This is just from an article in The New York Times a year or two ago, which is a really nice article, one of the first popular press articles on deep learning. It highlighted a lot of the recent work of Geoff Hinton, and it ends with this nice quote from Hinton. "The point about this approach is that it scales beautifully." I mean, of course it's inspired by what the brain does.
And you can build great technology. But it scales. "Basically you just need to keep making it bigger and faster and it will get better. There's no looking back now." So you might ask, is the answer to intelligence just taking what worked here and scaling it up to more data, more cores, more hidden layers and so on?
And I think the point of this whole thrust of this center and what we've been trying to do here for the last couple of days is that no, that isn't enough, in that if you, again, remember, if you look at children's cognitive development you have two insights which at least on their face, and also I think when you dig more deeply, call that into question. They say it's not just about finding patterns in data on a bigger and bigger scale. And that has to do with these two insights.
One is that pretty much as early as you can look, children are not just finding patterns in data. But they're actually building real theories. Our intuitive physics, our intuitive psychology which act, in some important ways, like scientific theories even from early infancy.
And then insight two, which has been more the focus of the more recent lectures here and the focus of Laura's research in particular. This is just a couple of frames of her work. Now you've heard about a lot of this great stuff.
That it's not just our knowledge of something like these abstract systems of concepts and laws like scientific theory, but the way we build it, the way we go beyond what we start with and how our knowledge grows is something like the kind of theory building you see in science. And I sort of wanted to contrast here this idea of the child as scientist, which you've heard about from Laura, with the idea of the child as data analyst in a sense.
Like you could say the big data paradigm, and the idea that that's the basis of intelligence, that would be like saying, well if just more and more data analysis was going to be the basis of knowledge, then that should take you where you need to go. And we all know, as scientists, that data analysis is a big part of what we do. Right?
But it's not the only thing we do. And one way to view all those bullet points that you saw from Laura, all these activities that scientists do, the first one or two are like data analysis. And then maybe the third or fourth bullet point is fancy data analysis. Like when you saw hierarchical Bayes, when we talked about learning at multiple levels of abstraction and second order generalizations, that's kind of like fancy data analysis.
And it's already there. The basic math for that is already in an advanced Bayesian statistics text. But think about all the activities of science and children's learning that are not just data analysis, all these things that you've been seeing from Laura. And it's a real challenge. How do you make sense of those computationally?
So this has motivated our turn to thinking about probabilistic programs as a knowledge representation, sort of an answer to the first question. What kind of knowledge could explain children's early abstract theories of physics and psychology, as well as the rich ways that these systems develop over the next few years of life? So let's say you buy the idea that you're going to use something like probabilistic programs to capture our knowledge of intuitive physics, intuitive psychology, these ideas of the naive utility calculus, for example, and how notions of efficiency, goal-directedness, and planning are captured in these programs that map beliefs and desires to actions.
If that's the form of our knowledge then what does learning have to look like? Well, it has to look like something like programming your own brain. Laura referred to that a little bit at the end the session before lunch. And I think that's exactly right.
I think, in a sense, that the formal activity that scientists do, which is more like children's learning, which I think has more of a hope to get at the kinds of activities that Laura was talking about, beyond just data analysis is programming. One of the ways we analyze our data is by writing programs. But we write programs for all sorts of things. Like these days in most of our areas of science our scientific theory basically takes the form of a program.
And if it's a very simple program we can analyze it mathematically, figure out its behavior. But usually if we're interested in the complex natural world you need to simulate it to figure it out. Like Emily's great video, has everybody seen that? That project video, you sent that to everyone, right? From your [? woodsoul ?] project from before.
EMILY: I did it on the Facebook page.
JOSHUA TENENBAUM: You put it on the Facebook. So if you haven't checked it out, check this out. It's a great example of a [? Woodsoul ?] project. And basically you see the simple scientific theory of how the squid's skin works. And you formulate it as a computer program and study its behavior by simulation. So that's sort of the modern view of what scientific activity is about.
You didn't really, in that project, do a lot of formal data analysis. You just made a theory. You instantiated the mechanism in a simple kind of a random field or something like that.
You simulated. You saw, OK, it kind of looked like the situation though, and add another mechanism. And it's that kind of science, in a sense, which I think is one paradigm for children's curiosity driven exploration of the phenomena in their world. So how can we formalize that process? How could we describe what you were doing, that kind of model building, computationally?
Well, I'll just indicate a little bit about how we've been thinking about it and then turn it over more to Tomer and Laura to talk about it. But this is the part of the summer school where mostly we don't really know. We get very quickly to the limits of our capacity.
But the basic idea, which I'll motivate here by just giving you examples of a couple of scientific theories which you might be familiar with, is again, this idea that learning is a kind of theory building which formally is a kind of programming or writing programs, modifying your programs, writing code libraries or something. So think again about how did Darwin come to his theory of evolution by natural selection? Or Newton come to his law of universal gravitation, or Mendel come to his theories of genetics by looking at pea pods and so on?
So with Mendel and his pea pods or Darwin and his finches in Galapagos or Newton and the sparse noisy data he had of orbits of planets and dropping some apples and so on, right, was it a kind of data analysis? Well, they all had some data. There wasn't a lot of extensive statistical modeling. It was mostly kind of like what Emily was doing. You look at the data. You try to say what could possibly explain this?
Now, none of them had the formal tools of computer programs. But I would say that if you look at the structure of Mendel's classical genetics or if you look at the structure of Darwin's account of evolution via mutation and natural selection, they're described in words. But they're basically probabilistic programs.
They are stochastic processes with abstract structure that, when run forward, would produce something that looks like a pattern of data that they have to see. And actually, you could say the same thing about Newton's law of gravitation. Now, his probabilistic programs-- they were simple enough. They were in the form of equations-- that if you were also as brilliant as Newton and happened to also invent calculus, you could analyze their behavior analytically. You didn't have to simulate, fortunately, because he didn't have computer programs.
But he still had to do a lot of probabilistic analysis. For example, he couldn't estimate the value of G, the gravitational constant, in a law of gravitation with the data he had. He could at best make a guess it was within an order of magnitude. And it's quite interesting that he was still able to basically argue and convince himself and convince others and he basically writes that this same law described how the planets go around the sun, how the moon goes around the earth, how an apple drops when you drop it on Earth.
And even though he couldn't pin down the value of G, he could make the right kinds of order of magnitude approximate arguments that showed that it seemed like it kind of worked. It kind of made sense. So how do we describe that? Well, one thing that we've been exploring is this idea that this kind of theory building is something like hierarchical Bayesian inference with probabilistic programs.
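To make the earlier remark about Mendel concrete, here is a minimal sketch of classical genetics written as a probabilistic program: a stochastic process that, run forward, produces a pattern of data. It is an illustration only; the single-gene cross between two heterozygous parents, the allele labels, and the function names are assumptions chosen for the example, not anything shown in the lecture.

```python
import random


def offspring_phenotype(rng):
    """One pea from a cross of two heterozygous (Aa) parents.

    Each parent passes on one of its two alleles at random; the
    dominant trait shows if at least one 'A' is inherited.
    """
    from_mother = rng.choice("Aa")
    from_father = rng.choice("Aa")
    return "dominant" if "A" in (from_mother, from_father) else "recessive"


def simulate_cross(n_offspring, seed=0):
    """Run the generative program forward and tally the phenotypes."""
    rng = random.Random(seed)
    counts = {"dominant": 0, "recessive": 0}
    for _ in range(n_offspring):
        counts[offspring_phenotype(rng)] += 1
    return counts


counts = simulate_cross(10_000)
print(counts)
print(counts["dominant"] / counts["recessive"])  # close to Mendel's 3:1 ratio
```

The program never mentions the 3:1 ratio; the ratio is something the abstract structure generates when simulated, which is the sense in which a theory stated in words can be read as a program.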
So I'll just show one example of this. This is Tomer's work. And I think it's satisfying as far as it goes. But it doesn't go nearly far enough and that's the challenge that we're talking about. So this also addresses some of the questions that came up at the end of my last lecture where we were talking a lot about models of intuitive physics.
And people said, what about alien physics or what if the laws of physics were different? Do people learn those? So that's what we've been doing here.
And this is both a modeling project but also an experimental project. So far only with adults, though we're hoping to take it to kids, where we show people things like this. So these are brief displays. Again in sort of our hockey puck world. So you're like, looking down at an air hockey table from the top. And just watch this movie again.
You'll get the sense that the balls are sort of bouncing off each other in kind of basically Newtonian ways. But there's something kind of interesting going on. Like, any idea of the physics of this world? Actually, this one, is this one interesting?
Yeah, anything slightly unusual here? Anything that isn't just purely inertial collision? Yeah, it looks like there's some force of attraction, maybe between the yellow ones, right?
What about between the red ones? No, I don't think so. Am I right? Yeah, OK. Sorry.
Here, I'll just show you a whole bunch of other videos. So we showed people many different videos, each just a few seconds long, with different kinds of physics. And you can see that sometimes there's funny forces between objects of different colors. Some of the objects are designed to look more, have sort of hidden properties like mass. Some of them are more massive than others.
These patches of color have different properties. Like some of them are kind of rougher or smoother patches. So you might, looking at this, make inferences that, for example, I don't know can you tell, does one of these patches here look like it has sort of high surface friction? If you can see it. I think the green one. Right? OK.
Yeah, that turns out to be actually a relatively hard property to infer. But actually people can infer all these things. They can infer them, with just viewing each of these videos a couple of times, at rates much higher than chance. You can infer whether there's attraction or repulsion between objects of different kinds, which ones are more or less massive, which surfaces have more or less friction. And you can model that, to some extent, by combining the tools that you've seen.
First of all, a hierarchical Bayesian model. Here I'm showing three levels in this hierarchy. Like each level is generating the hypotheses at the lower level.
And the idea of a probabilistic program because each level is described, in this case, by some kind of probabilistic code where the very highest level is like, basically Newtonian mechanics. And one hypothesis-- it may be a slightly crazy hypothesis, but it's one we've been exploring-- is that something like that might be either innate or sort of at least the core set of concepts of the physical domain. In the same way that like, some version of utility calculus seems to be the core organizing principle of psychology, naive psychology, something like f equals ma seem to be a kind of core organizing principle of intuitive physics of objects.
But you could ask could we learn that? But at least in this model we're just putting that in, not sort of setting up the whole model. And then at the bottom level you have these dynamic events, of these objects moving around bouncing off each other. And the interesting inferences are going on at the middle level here, where we have particular force laws that define, like, say, that red ones are attracted to blue ones or that they repel yellow ones.
And particular parameters of objects or surfaces like their masses or their coefficients of friction. And then the idea would be, in a sense, that that sort of metatheory sets up a hypothesis space of particular theories, particular forces and parameters that then themselves generate expectations about the events you see. And if you have the top and if you have the bottom, then you can make Bayesian inferences to kind of make good guesses about what's going on in the middle. And what that means includes all the different kinds of things that you as a sort of computational scientist might be doing when you are writing or modifying code to describe the world.
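As an illustration of the three-level picture just described, here is a minimal sketch in Python. It is not the model from the work discussed in the lecture; the one-dimensional two-puck world, the constant-magnitude force, the candidate masses, the noise level, and every name in the code are assumptions chosen to keep the example small. The top level fixes F = ma, the middle level is a small hypothesis space of force laws and masses, and the bottom level is a noisy observed trajectory.

```python
import math
import random

# Middle level (assumed for illustration): candidate force laws and masses.
FORCE_LAWS = {"attract": 1.0, "none": 0.0, "repel": -1.0}
MASSES = [1.0, 3.0]


def simulate(force_sign, mass2, steps=20, dt=0.1, strength=0.3):
    """Top level: F = m * a, integrated with simple Euler steps (1-D, two pucks)."""
    x1, x2, v1, v2 = 0.0, 2.0, 0.2, -0.1
    mass1 = 1.0
    path = []
    for _ in range(steps):
        direction = 1.0 if x2 > x1 else -1.0  # points from puck 1 to puck 2
        force_on_1 = force_sign * strength * direction
        v1 += (force_on_1 / mass1) * dt
        v2 += (-force_on_1 / mass2) * dt  # equal and opposite force on puck 2
        x1 += v1 * dt
        x2 += v2 * dt
        path.append((x1, x2))
    return path


def log_likelihood(observed, predicted, sigma):
    """Gaussian observation noise on every position."""
    total = 0.0
    for (o1, o2), (p1, p2) in zip(observed, predicted):
        total += -0.5 * ((o1 - p1) / sigma) ** 2
        total += -0.5 * ((o2 - p2) / sigma) ** 2
    return total


def posterior(observed, sigma=0.02):
    """Uniform prior over (force law, mass) pairs, normalized in log space."""
    scores = {}
    for law, sign in FORCE_LAWS.items():
        for mass in MASSES:
            scores[(law, mass)] = log_likelihood(observed, simulate(sign, mass), sigma)
    top = max(scores.values())
    weights = {h: math.exp(s - top) for h, s in scores.items()}
    norm = sum(weights.values())
    return {h: w / norm for h, w in weights.items()}


# Bottom level: a noisy "video" generated from a hidden theory.
rng = random.Random(0)
true_path = simulate(FORCE_LAWS["attract"], 3.0)
observed = [(a + rng.gauss(0, 0.02), b + rng.gauss(0, 0.02)) for a, b in true_path]

posterior_over_theories = posterior(observed)
best = max(posterior_over_theories, key=posterior_over_theories.get)
print(best)
```

With the top level fixed and the bottom level observed, the posterior concentrates on the hidden middle-level theory. The exhaustive enumeration used here only works because the assumed hypothesis space is tiny.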
So one of the things you might do, one of the simplest things you might do to your code is tweak the parameters. Tune the parameters of a piece of code that's already there to better fit phenomena you're trying to model. And that kind of parameter tuning, you know, that's not so different from learning in neural networks or maybe the kind of learning that we have in [INAUDIBLE].
That might be like, kind of tuning the parameter in the code that refers to the mass of a certain object. But other things you can do to your code are write new functions, right? Don't just modify one you have, but write a new function. Or you could modify the form of the function. That's an intermediate sort of step of [INAUDIBLE] you need.
Modify the form of the function but don't write a new function. Or you could actually write a new function that describes some new force law or some new term of interaction like, again, I'm referring to Emily's video, which maybe you haven't all seen. You should all watch it.
The way she describes it is she basically, like, says OK, I want to describe these cool patterns of how the squid's skin seems to change in space and time. And she sort of posits one kind of law or one function. And then she says, oh that does something but not enough. We need to add another function basically.
So that's a very standard thing we do in science. And we can talk about adding a new piece of code in order to better explain the world. And the idea is, at least in principle, we can think about all of those kinds of modifications as a kind of Bayesian inference. But now over a hypothesis space, which is not the parameters in an exponential family model or the weights of a deep network, but basically this sort of code that generates other code.
So I won't describe a lot of the details of how you make that work. But at least at that high level of abstraction, oh no, actually, Tomer will say a little bit about how you make that work in a bit. But at least at that high level of abstraction these kinds of models are able to account for the inferences that people make in this domain. And one of the challenges for the center that we've posed for ourselves is to see can we get this to also make sense of the phenomena of development.
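To make the contrast between tuning a parameter and adding a function concrete, here is a small Python sketch. It is an illustrative assumption, not the speaker's implementation: the library of force terms, the coefficient grid, the per-term cost in the prior, and all names are hypothetical. Each candidate "program" is a sum of terms from the library. Changing a nonzero coefficient is parameter tuning, while switching a term from absent to present is writing a new function. The search returns only the single highest-scoring program (a MAP estimate) rather than a full posterior.

```python
import itertools
import random

# Hypothetical library of force terms, each a function of distance r.
TERMS = {
    "constant": lambda r: 1.0,
    "spring": lambda r: r,
    "inverse_square": lambda r: 1.0 / (r * r),
}
COEFFS = [0.0, 0.5, 1.0, 2.0]  # 0.0 means the term is absent from the program


def predict(program, r):
    """Evaluate a program: a weighted sum of the terms it includes."""
    return sum(c * TERMS[name](r) for name, c in program.items())


def log_prior(program, cost_per_term=2.0):
    """Simpler programs (fewer terms) are preferred a priori."""
    return -cost_per_term * sum(1 for c in program.values() if c != 0.0)


def log_likelihood(program, data, sigma=0.1):
    """Gaussian noise on each observed force measurement."""
    return sum(-0.5 * ((f - predict(program, r)) / sigma) ** 2 for r, f in data)


def best_program(data):
    """Score every structure and parameter setting; return the MAP program."""
    names = list(TERMS)
    candidates = [
        dict(zip(names, cs))
        for cs in itertools.product(COEFFS, repeat=len(names))
    ]
    return max(candidates, key=lambda p: log_prior(p) + log_likelihood(p, data))


# Data from a hidden two-term law; a one-term program cannot fit it well.
rng = random.Random(1)
true_program = {"constant": 0.0, "spring": 1.0, "inverse_square": 2.0}
radii = [0.5 + 0.25 * i for i in range(12)]
data = [(r, predict(true_program, r) + rng.gauss(0, 0.1)) for r in radii]

found = best_program(data)
print(found)
```

The prior charges a fixed cost per included term, so a second function is only added when it improves the fit enough to pay for itself. Brute-force enumeration is feasible here only because the assumed space holds 64 programs.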
So if we look at again, the things now that mostly you saw from Laura's previous lecture from two days ago, where she showed some of the kinds of the stages of knowledge that are there in young infants, like in intuitive psychology, the early understanding of reaching as a goal-directed action, or the sort of efficiency inferences from her work with [INAUDIBLE], the thing I briefly referred to as the teams who emerge a little bit later maybe, one-and-a-half-year-olds' understanding about false belief, that if somebody hides an object inside the green box and while they're not looking the object is moved to the yellow box they'll go to look for it where they last saw it, namely in the green box, not where it actually is.
So this was sort of an increasing development of children's intuitive psychology. Or this was an example she showed from Renee [INAUDIBLE]'s work on understanding of intuitive physics and understanding of what factors determine physical support. And again, kids are coming to an increasingly sophisticated understanding of these things.
So what we'd like to be able to do or again, it's a question. Can this work? Can you describe each of these different stages of knowledge with one of these different kinds of probabilistic programs?
And then describe the transitions between them as some kind of evidence-driven learning or some kind. I mean again, it's not going to just be a simple passive process of tweaking the code to better fit the data. But it might be can we formalize the kinds of exploratory processes that Laura was talking about today as something like the child as hacker. Child is hacking on their own code to come up with better, more elegant, more explanatory computational models, probabilistic programs that could explain how these trajectories are developed.
That's at least the daring hypothesis we might want to explore. And we might want to try to test it using some of the cool kinds of developmental interventions that you heard a little bit about, like remember the sticky mitten study, where even a three-month-old, way younger than the original study by Amanda [INAUDIBLE] and colleagues shown up there on the top with understanding of goal-directed reaching, if you give the kid these sticky mittens with just a few minutes of experience, it seems like it lets them tap into something that was already there or somehow changes what they know. I mean, we don't really know what's going on.
But taking interventions like that, or things like shape [INAUDIBLE] training that I talked about a couple times ago, where you just give kids a little bit of experience, a couple of objects organized by shape for a few weeks, a couple of pairs of objects, but that's enough to then again give them the second order generalization. Renee [INAUDIBLE], studying intuitive physics, has done similar kinds of things, like giving kids a couple of examples of these sort of stability support things. She can basically, you know, get some kid to learn from just a few examples at 10 and 1/2 months what normally they only learn at 12 and 1/2 months. So we'd like to understand these kinds of interventions on children's experience. They seem to have a very rapid effect on what they know, what they seem to know. So we'd like to understand how that works from the same kind of computational framework.
I had a few slides on trying to do this. One previous successful example of this is in the work that Steve [INAUDIBLE] did in trying to model kids development of number. But remember you don't need this for what you guys are doing, right? So I'll just skip that and just sort of wrap up.
And saying here is where we are. I say this sort of ends what we know. Which is that we've taken these two insights, trying to describe early knowledge with these probabilistic programs for intuitive theories. And the growth of knowledge, learning, as something like a kind of program induction or program construction, program synthesis.
What we haven't talked about is, in some sense, what makes that so hard. I think Tomer will talk about this in his talk. Like what could be the actual processes by which you could modify your code and come to better intuitive theories and how that interacts with some of the issues in curiosity-driven exploration. And anyway, I should mostly just stop it.
Associated Research Thrust: