On the Implicit Bias Towards Minimal Depth of Deep Neural Networks

Title	On the Implicit Bias Towards Minimal Depth of Deep Neural Networks
Publication Type	Journal Article
Year of Publication	2022
Authors	Galanti, T, Galanti, L
Journal	arXiv
Date Published	03/2022
Abstract	We study the implicit bias of gradient-based training methods to favor low-depth solutions when training deep neural networks. Recent results in the literature suggest that penultimate layer representations learned by a classifier over multiple classes exhibit a clustering property, called neural collapse. We demonstrate empirically that neural collapse extends beyond the penultimate layer and emerges in intermediate layers as well. In this regard, we hypothesize and empirically show that gradient-based methods are implicitly biased towards selecting neural networks of minimal depth for achieving this clustering property.
URL	https://arxiv.org/abs/2202.09028
