%0 Report %D 2017 %T Fisher-Rao Metric, Geometry, and Complexity of Neural Networks %A Liang, Tengyuan %A Poggio, Tomaso %A Rakhlin, Alexander %A Stokes, James %K capacity control %K deep learning %K Fisher-Rao metric %K generalization error %K information geometry %K invariance %K natural gradient %K ReLU activation %K statistical learning theory %X

We study the relationship between geometry and capacity measures for deep neural networks from an invariance viewpoint. We introduce a new notion of capacity, the Fisher-Rao norm, that possesses desirable invariance properties and is motivated by information geometry. We discover an analytical characterization of the new capacity measure, through which we establish norm-comparison inequalities and further show that the new measure serves as an umbrella for several existing norm-based complexity measures. We discuss upper bounds on the generalization error induced by the proposed measure. Extensive numerical experiments on CIFAR-10 support our theoretical findings. Our theoretical analysis rests on a key structural lemma about partial derivatives of multi-layer rectifier networks.

%B arXiv.org %8 11/2017 %G eng %U https://arxiv.org/abs/1711.01530
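
For readers scanning the record, a minimal LaTeX sketch of the capacity measure the abstract names, assuming the standard information-geometric definition of the Fisher information matrix under a loss \ell; the paper's exact normalization and regularity conditions may differ:

% Hedged sketch (not the paper's verbatim definition): Fisher-Rao norm of
% parameters \theta under the Fisher information matrix I(\theta).
\[
  \|\theta\|_{\mathrm{fr}}^{2} \;=\; \theta^{\top} I(\theta)\,\theta,
  \qquad
  I(\theta) \;=\; \mathbb{E}_{(X,Y)}\!\left[
      \nabla_{\theta}\,\ell\big(f_{\theta}(X),Y\big)\,
      \nabla_{\theta}\,\ell\big(f_{\theta}(X),Y\big)^{\top}
  \right].
\]

The structural lemma on partial derivatives of multi-layer rectifier networks mentioned in the abstract is, in essence, an Euler-homogeneity identity. Assuming a bias-free ReLU network with L hidden layers (hence L+1 weight matrices), the output is positively homogeneous of degree L+1 in the full parameter vector \theta, so Euler's theorem would give

\[
  \sum_{i} \theta_{i}\,\frac{\partial f_{\theta}(x)}{\partial \theta_{i}}
  \;=\; (L+1)\, f_{\theta}(x).
\]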