On the Effect of Initialization: The Scaling Path of 2-Layer Neural Networks

Neumayer, Sebastian Jonas; Chizat, Lenaic; Unser, Michael

2024

Abstract

In supervised learning, the regularization path is sometimes used as a convenient theoretical proxy for the optimization path of gradient descent initialized from zero. In this paper, we study a modification of the regularization path for infinite-width 2-layer ReLU neural networks with nonzero initial distribution of the weights at different scales. By exploiting a link with unbalanced optimal-transport theory, we show that, despite the non-convexity of the 2-layer network training, this problem admits an infinite-dimensional convex counterpart. We formulate the corresponding functional-optimization problem and investigate its main properties. In particular, we show that, as the scale of the initialization ranges between 0 and +infinity, the associated path interpolates continuously between the so-called kernel and rich regimes. Numerical experiments confirm that, in our setting, the scaling path and the final states of the optimization path behave similarly, even beyond these extreme points.

Details

Title On the Effect of Initialization: The Scaling Path of 2-Layer Neural Networks

Author(s) Neumayer, Sebastian Jonas ; Chizat, Lenaic ; Unser, Michael

Published in Journal Of Machine Learning Research

Volume 25

Pages 15

Date 2024-01-01

ISSN 1532-4435

Keywords

Gradient-Descent Training; Regularization Path; Neural Tangent Kernel; Gamma-Convergence; Hellinger-Kantorovich Distance

Laboratories DOLA
LIB

Record Appears in Scientific production and competences > SB - School of Basic Sciences > MATH - Institute of Mathematics > DOLA - Chair of Dynamics of Learning Algorithms
Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIB - Biomedical Imaging Group
Scientific production and competences > Euler Center for Signal Processing
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Grant European Research Council (ERC) under European Union's Horizon 2020 (H2020)
101020573

Record creation date 2024-03-18
