Flying in the face of recent rival studies, these scientists point to generalization as key to order-of-magnitude performance gains
by Gareth Halfacree
Researchers from the Massachusetts Institute of Technology (MIT) and Brown University have taken steps to open up the "black box" of machine learning — and say that the key to success may lie in generalization.
"This study provides one of the first theoretical analyses covering optimization, generalization, and approximation in deep networks and offers new insights into the properties that emerge during training," explains co-author Tomaso Poggio, the Eugene McDermott Professor at MIT. "Our results have the potential to advance our understanding of why deep learning works as well as it does."
Machine learning has proven outstanding at a range of tasks, from surprisingly convincing chatbots to autonomous vehicles. It comes, however, with a big caveat: it's not always clear how or why a machine learning system arrives at its outputs for a given input. Many networks operate as a black box, performing unknowable tasks on incoming data, but the researchers' work is helping to open that box and shine a light within.
The team's work focused on two network types: fully connected deep networks and convolutional neural networks (CNNs). A key part of their study involved investigating exactly what factors contribute to the state of "neural collapse," in which a network's training maps multiple examples of a class to a single template.
"Our analysis shows that neural collapse emerges from the minimization of the square loss with highly expressive deep neural networks," explains co-author and post-doctoral researcher Akshay Rangamani. "It also highlights the key roles played by weight decay regularization and stochastic gradient descent in driving solutions towards neural collapse..."
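For readers who want a feel for what "collapse" means in practice, the idea can be illustrated with a rough sketch that compares how tightly each class's features cluster around their class mean against how far apart the class means sit. This is not the researchers' method: the function name `collapse_ratio`, the toy data, and the choice of a simple scatter ratio are all assumptions made for illustration only.

```python
import numpy as np


def collapse_ratio(features, labels):
    """Within-class scatter divided by between-class scatter.

    Hypothetical helper for illustration: values near zero mean every
    example of a class sits almost exactly on its class mean (its "template").
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    global_mean = features.mean(axis=0)

    within = 0.0
    between = 0.0
    for c in np.unique(labels):
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        # Spread of this class's examples around their own mean.
        within += np.sum((class_feats - class_mean) ** 2)
        # Distance of this class's mean from the overall mean, weighted by class size.
        between += len(class_feats) * np.sum((class_mean - global_mean) ** 2)
    return within / between


# Toy data (assumed, not from the study): three classes, 50 examples each.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 50)
templates = np.eye(3)[labels]  # one fixed template vector per class

spread = templates + 0.5 * rng.normal(size=templates.shape)   # loosely clustered
tight = templates + 0.01 * rng.normal(size=templates.shape)   # nearly collapsed

print("loosely clustered:", collapse_ratio(spread, labels))
print("nearly collapsed: ", collapse_ratio(tight, labels))
```

In this toy setup the ratio shrinks toward zero as the examples of each class bunch up on a single point, which is the behavior the article describes as mapping multiple class examples to one template.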
Read the full article on Hackster.io's website using the link below.