A comparison of deep networks with ReLU activation function and linear spline-type methods

Konstantin Eckle, Anselm Johannes Schmidt-Hieber

Research output: Contribution to journal › Article › Academic › peer-review

44 Citations (Scopus)

Abstract

Deep neural networks (DNNs) generate much richer function spaces than shallow networks. However, since the function spaces induced by shallow networks have several approximation-theoretic drawbacks, this richness alone does not necessarily explain the success of deep networks. In this article we take another route by comparing the expressive power of DNNs with ReLU activation function to linear spline methods. We show that MARS (multivariate adaptive regression splines) can be improperly learned by DNNs in the sense that for any given function that can be expressed as a function in MARS with M parameters there exists a multilayer neural network with O(M log(M/ε)) parameters that approximates this function up to sup-norm error ε. We show a similar result for expansions with respect to the Faber–Schauder system. Based on this, we derive risk comparison inequalities that bound the statistical risk of fitting a neural network by the statistical risk of spline-based methods. This shows that deep networks perform better than, or only slightly worse than, the considered spline methods. We provide a constructive proof for the function approximations.
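The connection exploited here starts from a simple fact: each univariate MARS basis function is a hinge (x − t)₊, which coincides exactly with a single ReLU neuron; it is the products of hinges in MARS that require the O(M log(M/ε)) approximation by deeper networks. The sketch below illustrates only the exact hinge-to-neuron correspondence; the knot value t = 0.5 is an arbitrary illustrative choice, not taken from the paper.

```python
# A MARS hinge basis function (x - t)_+ equals one ReLU neuron
# ReLU(w*x + b) with weight w = 1 and bias b = -t.

def mars_hinge(x, t):
    # MARS truncated linear basis: (x - t)_+
    return max(0.0, x - t)

def relu_neuron(x, w, b):
    # Single hidden unit with ReLU activation
    return max(0.0, w * x + b)

# The two functions agree pointwise for the knot t = 0.5.
for x in [-1.0, 0.0, 0.5, 0.75, 2.0]:
    assert mars_hinge(x, 0.5) == relu_neuron(x, 1.0, -0.5)
```

Products of such hinges, by contrast, are not single neurons; approximating them is where the logarithmic factor in the parameter count arises.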
Original language: English
Pages (from-to): 232-242
Journal: Neural Networks
Volume: 110
DOIs
Publication status: Published - 1 Feb 2019

Keywords

  • Deep neural networks
  • Nonparametric regression
  • Splines
  • MARS
  • Faber–Schauder system
  • Rates of convergence

