We show that deep networks are better than shallow networks at approximating functions that can be expressed as a composition of functions described by a directed acyclic graph, because the deep networks can be designed to have the same compositional structure, while a shallow network cannot exploit this knowledge. Thus, the blessing of compositionality mitigates the curse of dimensionality. On the other hand, a theorem called good propagation of errors allows us to {\textquotedblleft}lift{\textquotedblright} theorems about shallow networks to those about deep networks with an appropriate choice of norms, smoothness, etc. We illustrate this in three contexts where each channel in the deep network calculates a spherical polynomial, a non-smooth ReLU network, or another zonal function network closely related to the ReLU network.

},
author = {Mhaskar, H. N. and T. Poggio}
}
@article {3662,
title = {Deep vs. shallow networks: An approximation theory perspective},
journal = {Analysis and Applications},
volume = {14},
year = {2016},
month = {01/2016},
pages = {829 - 848},
abstract = {The paper briefly reviews several recent results on hierarchical architectures for learning from examples, that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in function approximation problems than shallow, one-hidden layer architectures. The paper announces new results for a non-smooth activation function {\textemdash} the ReLU function {\textemdash} used in present-day neural networks, as well as for the Gaussian networks. We propose a new definition of \emph{relative dimension} to encapsulate different notions of sparsity of a function class that can possibly be exploited by deep networks but not by shallow ones to drastically reduce the complexity required for approximation and learning.

},
keywords = {blessed representation, deep and shallow networks, Gaussian networks, ReLU networks},
issn = {0219-5305},
doi = {10.1142/S0219530516400042},
url = {http://www.worldscientific.com/doi/abs/10.1142/S0219530516400042},
author = {Mhaskar, H. N. and Tomaso Poggio}
}