Heiss, J. M. (2019). Implicit regularization for artificial neural networks [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2019.69320
The main result is a rigorous proof that artificial neural networks without explicit regularization, when trained by gradient descent, implicitly regularize the integral of the squared second derivative \(\int (f''(x))^2\,dx\): under certain conditions they solve very precisely the smoothing spline regression problem
\[
\hat f \;:=\; \operatorname*{arg\,min}_{f \in C^2} \left( \sum_{i=1}^{N} \bigl(f(x_i^{\mathrm{train}}) - y_i^{\mathrm{train}}\bigr)^2 \;+\; \lambda \int \bigl(f''(x)\bigr)^2 \, dx \right).
\]
Artificial neural networks are often used in machine learning to estimate an unknown function \(f^{\mathrm{true}}\) from only finitely many observed data points. There are many methods that guarantee the convergence of the estimated function to the true function \(f^{\mathrm{true}}\) as the number of samples tends to infinity, but in practice almost always only a finite number \(N\) of samples is available. Given a finite number of data points, there are infinitely many functions that fit perfectly through the \(N\) data points yet generalize arbitrarily badly. Therefore one needs some regularization to find a suitable function. With the help of the main theorem one can resolve the paradox of why training neural networks without explicit regularization works surprisingly well under certain conditions (in the case of 1-dimensional wide ReLU randomized shallow neural networks).
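The training setup described in the abstract can be sketched numerically: a wide randomized shallow ReLU network (hidden-layer weights drawn once at random and frozen, only the output weights trained) is fitted to 1-dimensional data by plain gradient descent on the unregularized squared loss. This is a minimal sketch under assumed data and hyperparameters (the sine data, the width of 500, the step-size rule are all illustrative, not taken from the thesis); it only demonstrates the training procedure, not the thesis's convergence proof or the comparison with the smoothing spline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D training data (N = 8 points); purely illustrative.
x_train = np.linspace(-1.0, 1.0, 8)
y_train = np.sin(3.0 * x_train)
N = len(x_train)

# Randomized shallow ReLU network: hidden slopes and biases are drawn
# once at random and frozen; only the output weights are trained.
n_hidden = 500
a = rng.normal(size=n_hidden)            # fixed hidden slopes
b = rng.uniform(-1.5, 1.5, n_hidden)     # fixed hidden biases (kink positions)

def features(x):
    # ReLU feature map: phi_k(x) = max(a_k * x + b_k, 0)
    return np.maximum(np.outer(x, a) + b, 0.0)

Phi = features(x_train)                  # (N, n_hidden) design matrix
w = np.zeros(n_hidden)                   # trainable output weights, init at 0

# Plain gradient descent on the unregularized squared loss.  The step size
# is bounded via trace(Phi Phi^T), which dominates the largest eigenvalue,
# so the iteration cannot diverge.
lr = 0.5 * N / np.sum(Phi**2)
for _ in range(100_000):
    resid = Phi @ w - y_train
    w -= lr * (2.0 / N) * (Phi.T @ resid)

mse = float(np.mean((Phi @ w - y_train) ** 2))
print(mse)  # training loss is driven close to zero despite no penalty term
```

Under the conditions of the main theorem, the function learned this way would approximately also minimize the smoothing spline objective above; checking that would require comparing against an explicit smoothing spline fit, which is beyond this sketch.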