Deep learning networks have revolutionized the creation of artificial intelligence and have also led to new insights about neural processing in the brain.
Research on learning in deep networks has led to impressive performance by machines on tasks such as recognition in many domains. Theoretical studies such as those in this unit help us to better understand the behavior of these networks, why they perform so well, what their limitations are, and what fundamental questions remain to be answered.
Terrence Sejnowski reflects on the historical evolution of deep learning in Artificial Intelligence, from perceptrons to deep neural networks that play Go, detect thermal updrafts, control a social robot, and analyze complex neural data using methods that are revolutionizing neuroscience.
The reasons why deep learning works well for tasks such as recognition remain a mystery. Tomaso Poggio first uses approximation theory to formalize when and why deep networks are better than shallow networks, then draws on optimization theory to determine where complexity control lies in deep networks, and finally considers classical and modern regimes in learning theory and provides new foundations.
The study of deep learning lies at the intersection of AI and machine learning, physics, and neuroscience. Max Tegmark explores connections between physics and deep learning that can yield important insights about the theory and behavior of deep neural networks, such as their expressibility, efficiency, learnability, and robustness.
Haim Sompolinsky introduces neural representations of object manifolds, the problem of manifold classification, and properties of manifold geometry. He then describes applications to visual object manifolds in deep convolutional neural networks, applications to neural data, and few-shot learning of objects.
Lorenzo Rosasco first introduces statistical learning from a theoretical perspective, and then describes the construction of learning algorithms that are provably efficient so that they can achieve good results with a minimal amount of computation. Finally, he presents empirical results for the case study of large-scale kernel methods.
Constantinos Daskalakis illustrates common ways that datasets constructed for training neural networks can be biased, such as censoring and truncation of data samples, giving rise to biased models. Building connections to high-dimensional probability, harmonic analysis, and optimization can lead to computationally and statistically efficient methods for solving statistical learning tasks with truncated data in a way that reduces bias.
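To make the effect of truncation concrete, here is a minimal sketch, not the method presented in the lecture. It assumes a hypothetical one-dimensional linear model with Gaussian noise and a hypothetical rule that keeps only samples whose response exceeds a threshold. The function names, the slope, and the threshold are illustrative choices. An ordinary least-squares fit on the full sample is compared with the same naive fit on the truncated sample.

```python
import random


def ols_slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    return cov / var


def truncation_demo(n=20000, true_slope=2.0, threshold=0.0, seed=0):
    """Return (slope fitted on all samples, slope fitted on truncated samples).

    Assumed data model (illustrative only): y = true_slope * x + N(0, 1),
    with x drawn uniformly from [-1, 1].
    """
    rng = random.Random(seed)
    full = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = true_slope * x + rng.gauss(0.0, 1.0)
        full.append((x, y))

    # Truncation: samples with y <= threshold are never observed at all.
    truncated = [(x, y) for x, y in full if y > threshold]
    return ols_slope(full), ols_slope(truncated)


if __name__ == "__main__":
    full_slope, truncated_slope = truncation_demo()
    print(f"slope from full sample:      {full_slope:.3f}")
    print(f"slope from truncated sample: {truncated_slope:.3f}")
```

Under these assumptions the naive fit on the truncated sample is pulled toward zero, because low responses at small x are systematically missing. Correcting for this requires modeling the truncation itself, which is the kind of problem the lecture addresses.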
Additional information about the speakers’ research and publications can be found at these websites:
Banburski, A., Liao, Q., Miranda, B., Rosasco, L., De La Torre, F., Hidary, J., Poggio, T. (2018) Theory III: Dynamics and generalization in deep networks, arXiv:1903.04991
Chung, S., Cohen, U., Sompolinsky, H., Lee, D. D. (2018) Learning data manifolds with a cutting plane method, Neural Computation, 30(10), 2593-2615
Cohen, U., Chung, S., Lee, D. D., Sompolinsky, H. (2020) Separability and geometry of object manifolds in deep neural networks, Nature Communications, 11, 746
Daskalakis, C., Rohatgi, D., Zampetakis, M. (2020) Truncated linear regression in high dimensions, 34th Conference on Neural Information Processing Systems, arXiv:2007.14539
Poggio, T. A., Anselmi, F. (2016) Visual cortex and deep networks: Learning invariant representations, The MIT Press, Cambridge
Sejnowski, T. J. (2018) The deep learning revolution: Artificial intelligence meets human intelligence, The MIT Press, Cambridge
Tegmark, M. (2017) Life 3.0: Being human in the age of artificial intelligence, Alfred A. Knopf, New York