%0 Generic
%D 2020
%T Implicit dynamic regularization in deep networks
%A Tomaso Poggio
%A Qianli Liao
%A Mengjia Xu
%X
The square loss has been observed to perform well in classification tasks, at least as well as cross-entropy. However, a theoretical justification is lacking. Here we develop a theoretical analysis for the square loss that complements the existing asymptotic analysis for the exponential loss.
%8 08/2020
%2