On the symmetries in the dynamics of wide two-layer neural networks

Hajjar, Karl; Chizat, Lenaic

doi:10.3934/era.2023112

2023

Abstract

We consider the idealized setting of gradient flow on the population risk for infinitely wide two-layer ReLU neural networks (without bias), and study the effect of symmetries on the learned parameters and predictors. We first describe a general class of symmetries which, when satisfied by the target function f* and the input distribution, are preserved by the dynamics. We then study more specific cases. When f* is odd, we show that the dynamics of the predictor reduces to that of a (non-linearly parameterized) linear predictor, and its exponential convergence can be guaranteed. When f* has a low-dimensional structure, we prove that the gradient flow PDE reduces to a lower-dimensional PDE. Furthermore, we present informal and numerical arguments that suggest that the input neurons align with the lower-dimensional structure of the problem.
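
As an informal illustration of why a linear predictor can appear in the odd case, the sketch below checks an elementary identity: since relu(z) - relu(-z) = z, the odd part of any bias-free two-layer ReLU network is linear in its input. This is not the paper's argument or code. The finite width `m`, the 1/m output scaling, the Gaussian parameters, and all function names are assumptions made only for this example.

```python
import random


def relu(z):
    return max(z, 0.0)


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def network(x, W, a):
    # Bias-free two-layer ReLU network (assumed 1/m scaling):
    # f(x) = (1/m) * sum_j a_j * relu(<w_j, x>)
    return sum(aj * relu(dot(wj, x)) for aj, wj in zip(a, W)) / len(a)


def odd_part(x, W, a):
    # Odd part of f at x: (f(x) - f(-x)) / 2
    neg_x = [-xi for xi in x]
    return 0.5 * (network(x, W, a) - network(neg_x, W, a))


random.seed(0)
m, d = 200, 5  # hypothetical width and input dimension
W = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
a = [random.gauss(0.0, 1.0) for _ in range(m)]

# Linear predictor induced by the parameters: beta = (1 / 2m) * sum_j a_j * w_j
beta = [sum(a[j] * W[j][k] for j in range(m)) / (2 * m) for k in range(d)]

x = [random.gauss(0.0, 1.0) for _ in range(d)]
gap = abs(odd_part(x, W, a) - dot(beta, x))
print(gap)  # difference between the odd part and the linear map, up to rounding
```

The vector `beta` depends non-linearly on the parameters (a product of output and input weights) while the resulting map is linear in `x`, which is one way to read "non-linearly parameterized linear predictor".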

Details

Title On the symmetries in the dynamics of wide two-layer neural networks

Author(s) Hajjar, Karl ; Chizat, Lenaic

Published in Electronic Research Archive

Volume 31

Issue 4

Pages 2175-2212

Date 2023-01-01

Publisher Springfield, AMER INST MATHEMATICAL SCIENCES-AIMS

ISSN 2688-1594

Keywords

neural networks; gradient descent; infinite-width limit; representation learning

DOI https://doi.org/10.3934/era.2023112

Laboratories DOLA

Record Appears in Scientific production and competences > SB - School of Basic Sciences > MATH - Institute of Mathematics > DOLA - Chair of Dynamics of Learning Algorithms
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2023-03-27