On the Implicit Bias Towards Minimal Depth of Deep Neural Networks

Publication Type: Journal Article
Year of Publication: 2022
Authors: Galanti, T, Galanti, L
Journal: arXiv
Date Published: 03/2022
Abstract

We study the implicit bias of gradient-based training methods to favor low-depth solutions when training deep neural networks. Recent results in the literature suggest that the penultimate-layer representations learned by a classifier over multiple classes exhibit a clustering property, called neural collapse. We demonstrate empirically that neural collapse extends beyond the penultimate layer and emerges in intermediate layers as well. In this regard, we hypothesize and empirically show that gradient-based methods are implicitly biased towards selecting neural networks of minimal depth for achieving this clustering property.
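The clustering property described above can be quantified per layer with a within-class-variance-to-between-class-distance ratio. Below is a minimal sketch (not the paper's code) of one such measure, the class-distance normalized variance, evaluated on synthetic features with numpy; the function name and the sample data are illustrative assumptions.

```python
import numpy as np

def cdnv(feats_a, feats_b):
    """Class-distance normalized variance between two classes.

    feats_a, feats_b: (n_samples, dim) arrays of activations at some layer,
    one array per class. Values near zero indicate tightly clustered
    (collapsed) class features relative to the distance between class means.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    var_a = ((feats_a - mu_a) ** 2).sum(axis=1).mean()
    var_b = ((feats_b - mu_b) ** 2).sum(axis=1).mean()
    return (var_a + var_b) / (2.0 * np.sum((mu_a - mu_b) ** 2))

rng = np.random.default_rng(0)
# Tight clusters around well-separated means: strong collapse, small ratio.
tight_a = rng.normal(0.0, 0.01, size=(100, 8))
tight_b = rng.normal(5.0, 0.01, size=(100, 8))
# Diffuse clusters around the same means: weak collapse, larger ratio.
loose_a = rng.normal(0.0, 1.0, size=(100, 8))
loose_b = rng.normal(5.0, 1.0, size=(100, 8))
print(cdnv(tight_a, tight_b) < cdnv(loose_a, loose_b))  # True
```

Tracking such a ratio across layers, rather than only at the penultimate layer, is the kind of measurement the abstract's intermediate-layer claim calls for.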

URL: https://arxiv.org/abs/2202.09028
Download: 2202.09028.pdf

CBMM Relationship: 

  • CBMM Funded