|Title||On the Implicit Bias Towards Minimal Depth of Deep Neural Networks|
|Publication Type||Journal Article|
|Year of Publication||2022|
|Authors||Galanti, T, Galanti, L|
We study the implicit bias of gradient-based training methods to favor low-depth solutions when training deep neural networks. Recent results in the literature suggest that the penultimate-layer representations learned by a classifier over multiple classes exhibit a clustering property, called neural collapse. We demonstrate empirically that neural collapse extends beyond the penultimate layer and emerges in intermediate layers as well. In this regard, we hypothesize, and empirically show, that gradient-based methods are implicitly biased towards selecting neural networks of minimal depth for achieving this clustering property.
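The clustering property referenced in the abstract is commonly quantified by comparing within-class to between-class variability of a layer's feature vectors. The sketch below is an illustrative metric of this kind, not the paper's exact measure; the function name `collapse_ratio` and the synthetic data are assumptions for demonstration.

```python
import numpy as np

def collapse_ratio(features, labels):
    """Rough neural-collapse indicator: ratio of within-class to
    between-class scatter of a layer's feature vectors.
    Values near 0 suggest tightly clustered (collapsed) features."""
    classes = np.unique(labels)
    global_mean = features.mean(axis=0)
    within, between = 0.0, 0.0
    for c in classes:
        fc = features[labels == c]          # features of class c
        mu_c = fc.mean(axis=0)              # class mean
        within += ((fc - mu_c) ** 2).sum()  # spread around class mean
        between += len(fc) * ((mu_c - global_mean) ** 2).sum()
    return within / between

# Synthetic demo: 3 well-separated, tight clusters of 16-d features.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 100)
centers = rng.normal(size=(3, 16)) * 10.0
features = centers[labels] + 0.1 * rng.normal(size=(300, 16))
print(collapse_ratio(features, labels))  # near zero for collapsed features
```

Applied to the activations of each intermediate layer over a training run, a metric of this form makes it possible to observe the depth at which clustering first emerges.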
- CBMM Funded