Andrew Saxe
- NIPS 2015 Symposium: Brains, Minds and Machines
Hallmarks of Deep Learning in the Brain
Anatomically, the brain is deep. Understanding the ramifications of depth for learning in the brain requires a clear theory of deep learning. I develop the theory of gradient descent learning in deep linear neural networks, which gives exact quantitative answers to fundamental questions such as how learning speed scales with depth, how unsupervised pretraining speeds learning, and how internal representations change across a deep network. Several key hallmarks of deep learning are consistent with behavioral and neural observations. The theory can be further specialized for specific experimental paradigms. Taking perceptual learning as an example, I show that a deep learning theory accounts for neural tuning changes across the cortical hierarchy, and predicts behavioral performance transfer to untrained tasks as a function of task precision, restricted position training, and learning time. Together, these findings suggest that depth may be a key factor constraining learning dynamics in the brain. A better scientific understanding should eventually contribute to engineering advances, and I discuss one example from this work: a class of scaled, orthogonal initializations which permit rapid training of very deep nonlinear networks. Joint work with Surya Ganguli and Jay McClelland.
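As a concrete illustration of the scaled orthogonal initialization mentioned in the abstract, here is a minimal NumPy sketch: it orthogonalizes a random Gaussian matrix via a QR decomposition and rescales it by a gain factor. The function name, the QR-based construction, and the example gain value are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def scaled_orthogonal(shape, gain=1.0, rng=None):
    """Return a gain-scaled (semi-)orthogonal weight matrix of the given shape."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = shape
    # QR of a tall Gaussian matrix yields a q with orthonormal columns.
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, r = np.linalg.qr(a)
    q *= np.sign(np.diag(r))  # fix the sign ambiguity of the QR factorization
    if rows < cols:
        q = q.T               # orthonormal rows instead of columns
    return gain * q

# Hypothetical usage: initialize every layer of a very deep network.
depth, width = 50, 128
weights = [scaled_orthogonal((width, width), gain=1.1) for _ in range(depth)]
```

With gain 1, a square orthogonal matrix preserves signal norms exactly from layer to layer, which is what allows depth without vanishing or exploding activity in the linear case; for nonlinear networks, a gain slightly above 1 (here 1.1, an illustrative choice) can compensate for the contraction of a saturating nonlinearity such as tanh.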