# Dynamics and Neural Collapse in Deep Classifiers trained with the Square Loss

Title: Dynamics and Neural Collapse in Deep Classifiers trained with the Square Loss

Publication Type: CBMM Memos

Year of Publication: 2021

Authors: Xu, M, Rangamani, A, Banburski, A, Liao, Q, Galanti, T, Poggio, T

Abstract: Here we consider a model of the dynamics of gradient flow under the square loss in overparametrized ReLU networks. We show that convergence to a solution with the maximum margin, which is the inverse of the product of the Frobenius norms of each layer weight matrix, is expected when normalization by a Lagrange multiplier (LM) is used together with Weight Decay (WD). We prove that SGD converges to solutions that have a bias towards 1) large margin and 2) low rank of the weight matrices. In addition, the solutions are predicted to show Neural Collapse. Non-vacuous bounds are shown for expected error based on empirical margin.
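The link the abstract draws between margin and the product of layer Frobenius norms can be illustrated with a small numerical sketch. This is not the memo's code: the bias-free architecture, the layer sizes, the random weights, and the names `relu_net`, `rho`, and `normalized` are all assumptions made for illustration. It only checks the positive homogeneity of a bias-free ReLU network, namely that the output equals the product of the layer norms times the output of the network with unit-norm layers.

```python
import numpy as np


def relu_net(weights, x):
    """Bias-free ReLU network: ReLU after every layer except the last."""
    h = x
    for W in weights[:-1]:
        h = np.maximum(W @ h, 0.0)
    return weights[-1] @ h


rng = np.random.default_rng(0)

# Hypothetical 3-layer network with a scalar output (sizes are arbitrary).
weights = [
    rng.standard_normal((8, 5)),
    rng.standard_normal((8, 8)),
    rng.standard_normal((1, 8)),
]
x = rng.standard_normal(5)

# np.linalg.norm on a 2-D array defaults to the Frobenius norm.
norms = [np.linalg.norm(W) for W in weights]
rho = float(np.prod(norms))  # product of the layer Frobenius norms

# Rescale every layer to unit Frobenius norm.
normalized = [W / n for W, n in zip(weights, norms)]

f = relu_net(weights, x)        # output of the original network
f_v = relu_net(normalized, x)   # output of the normalized network

# Positive homogeneity of ReLU: f(x) = rho * f_v(x).
assert np.allclose(f, rho * f_v)
```

Under these assumptions, if the network fits a label `y = ±1` exactly, so that `y * f(x) = 1`, then the normalized network satisfies `y * f_v(x) = 1 / rho`. This is one way to read the statement that the margin is the inverse of the product of the Frobenius norms.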
