Abstract

This thesis focuses on two selected learning problems: 1) statistical inference on graph models, and 2) gradient descent on neural networks, with the common objective of defining and analyzing the measures that characterize their fundamental limits.

In the first part of the thesis, we consider spin synchronization problems on graphs, which consist of reconstructing a vector of n independent spins living on the vertices of a graph from noisy observations of their interactions on the edges. In particular, we consider synchronization models with binary erasure channel (BEC) side-information, where the spins of a small fraction of nodes are revealed, and investigate how the addition of such side-information influences the correlations between spins at distant sites. We show that on trees, whenever spins at distant sites are nearly independent given the edge observations, they remain nearly independent given the edge observations and the side-information. We conjecture that this holds for any graph. On the other hand, Kanade et al. (2014) conjectured that, on regular trees and on Galton-Watson trees, whenever any small fraction of node labels is revealed, the boundary at infinite depth becomes ineffective for detecting the root bit, even in the reconstruction regime. We explain how this can be used to compute the limiting entropy of the sparse Stochastic Block Model (SBM) with two symmetric communities. Finally, we show that the latter conjecture does not hold for every tree.

In the second part of the thesis, we consider the problem of learning Boolean target functions with gradient descent (GD) on fully connected neural networks. We introduce the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function, and prove that if a network and a target do not have a noticeable INAL, then noisy gradient descent on a fully connected network with i.i.d. Gaussian initialization cannot learn the target in polynomial time. We show that for finite-depth networks trained with the correlation loss, the result extends beyond Boolean inputs. Moreover, we prove that in a similar setting, the generalization error can be lower-bounded in terms of the noise-stability of the target function, supporting a conjecture made by Zhang et al. (2021). We then show that in the distribution-shift setting, when the data withholding corresponds to freezing a single feature, the generalization error admits a tight characterization in terms of the Boolean influence for several relevant architectures. This is proved for linear models and supported experimentally on other models such as MLPs and Transformers. In particular, this puts forward the hypothesis that for such architectures, when learning logical functions, GD tends to have an implicit bias towards low-degree representations. Finally, we consider a 'curriculum learning' (CL) strategy for learning k-parities on d bits of a binary string. We show that a wise choice of training examples, involving two or more product distributions, allows learning k-parities in d^{O(1)} time with a fully connected neural network trained with GD. We further show that for another class of functions, namely the 'Hamming mixtures', CL strategies involving a bounded number of product distributions are not beneficial.
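To make the synchronization-with-side-information setting of the first part concrete, the following is a minimal simulation sketch, not the thesis's formal setup: i.i.d. uniform spins on a regular tree, edge observations given by the product of endpoint spins flipped with probability delta, BEC side-information revealing each spin independently with probability eps, and belief propagation for the root posterior. All parameter names and values (depth, arity, delta, eps) are illustrative choices.

```python
import random

def sample_tree(depth, arity, delta, eps):
    """Sample the model on a complete `arity`-ary tree of the given depth.

    Nodes are indexed so that the children of v are arity*v + 1, ..., arity*v + arity.
    Returns (n, spins, edge_obs, revealed).
    """
    n = sum(arity ** d for d in range(depth + 1))
    spins = [random.choice([-1, 1]) for _ in range(n)]
    edge_obs, revealed = {}, {}
    for v in range(n):
        # BEC side-information: reveal the spin of v with probability eps.
        revealed[v] = spins[v] if random.random() < eps else None
        for c in range(arity * v + 1, arity * v + arity + 1):
            if c < n:
                # Noisy edge observation: product of endpoint spins,
                # flipped with probability delta.
                flip = -1 if random.random() < delta else 1
                edge_obs[(v, c)] = spins[v] * spins[c] * flip
    return n, spins, edge_obs, revealed

def posterior(v, n, arity, delta, edge_obs, revealed):
    """Belief propagation on the tree: P(x_v = +1 | subtree of v)."""
    belief = {+1: 1.0, -1: 1.0}
    if revealed[v] is not None:
        belief = {+1: float(revealed[v] == 1), -1: float(revealed[v] == -1)}
    for c in range(arity * v + 1, arity * v + arity + 1):
        if c >= n:
            continue
        pc = posterior(c, n, arity, delta, edge_obs, revealed)
        child_belief = {+1: pc, -1: 1.0 - pc}
        for s in (+1, -1):
            # Sum over the child's spin, weighted by the edge-noise channel.
            belief[s] *= sum(
                ((1 - delta) if edge_obs[(v, c)] == s * t else delta) * child_belief[t]
                for t in (+1, -1)
            )
    return belief[+1] / (belief[+1] + belief[-1])

random.seed(0)
n, spins, obs, rev = sample_tree(depth=8, arity=2, delta=0.15, eps=0.05)
p_root = posterior(0, n, 2, 0.15, obs, rev)
print(f"true root spin: {spins[0]:+d}, posterior P(x_root = +1): {p_root:.3f}")
```

Comparing the root posterior with eps = 0 against eps > 0, over many samples, gives an empirical handle on how much the revealed spins sharpen the root estimate.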
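For the INAL result of the second part, the following is a hypothetical Monte Carlo proxy, not the thesis's formal definition: it estimates, over random i.i.d. Gaussian initializations, the mean squared correlation between a Boolean target and each first-layer post-activation neuron of a one-hidden-layer network, and reports the largest value. The width, sample counts, and the two example targets are illustrative.

```python
import torch

def inal_proxy(target, d, width=256, n_inputs=4096, n_inits=16):
    """Largest (over neurons) average squared target-neuron correlation at init."""
    scores = torch.zeros(width)
    for _ in range(n_inits):
        # Uniform Boolean inputs on {-1, +1}^d.
        x = torch.randint(0, 2, (n_inputs, d)).float() * 2 - 1
        y = target(x)                            # target values, shape (n_inputs,)
        w = torch.randn(d, width) / d ** 0.5     # i.i.d. Gaussian initialization
        acts = torch.relu(x @ w)                 # first-layer neuron outputs
        corr = (acts * y[:, None]).mean(dim=0)   # estimate of E_x[f(x) * neuron(x)]
        scores += corr ** 2 / n_inits
    return scores.max().item()

full_parity = lambda x: x.prod(dim=1)                 # high-degree: near-zero proxy
majority = lambda x: torch.sign(x.sum(dim=1) + 0.5)   # correlates with linear parts

print(inal_proxy(full_parity, d=11))
print(inal_proxy(majority, d=11))
```

Under this proxy, the full parity (which the lower bound covers) shows a vanishing alignment at initialization, while majority shows a noticeable one.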
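The distribution-shift result is stated in terms of the Boolean influence, which is the standard quantity Inf_i(f) = P_x[f(x) != f(x^(i))], where x is uniform on {-1,+1}^d and x^(i) flips coordinate i. A short Monte Carlo estimator, with an illustrative sample size:

```python
import torch

def influence(f, d, i, n=100_000):
    """Monte Carlo estimate of the Boolean influence of coordinate i on f."""
    x = torch.randint(0, 2, (n, d)).float() * 2 - 1
    x_flip = x.clone()
    x_flip[:, i] *= -1                      # flip coordinate i
    return (f(x) != f(x_flip)).float().mean().item()

# Example: for the parity of the first 3 of 8 bits, each of those 3 bits has
# influence 1 and the remaining bits have influence 0.
parity3 = lambda x: x[:, :3].prod(dim=1)
print([round(influence(parity3, d=8, i=i), 2) for i in range(8)])
```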
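Finally, for the curriculum-learning result, the following is a minimal two-phase sketch loosely following the idea of mixing product distributions: the first phase draws biased Rademacher bits (P[x_j = +1] = p != 1/2), under which a k-parity correlates with low-degree functions, and the second phase switches to the uniform distribution, training with the correlation loss throughout. The architecture, bias p, step counts, and learning rate are all illustrative and are not the thesis's exact schedule.

```python
import torch

torch.manual_seed(0)
d, k = 30, 5
parity = lambda x: x[:, :k].prod(dim=1)   # k-parity on the first k of d bits

net = torch.nn.Sequential(
    torch.nn.Linear(d, 512), torch.nn.ReLU(), torch.nn.Linear(512, 1)
)
opt = torch.optim.SGD(net.parameters(), lr=0.01)

def batch(n, p):
    """n inputs with i.i.d. bits equal to +1 with probability p."""
    return (torch.rand(n, d) < p).float() * 2 - 1

for step in range(4000):
    p = 0.9 if step < 2000 else 0.5                  # curriculum: biased, then uniform
    x = batch(256, p)
    loss = -(net(x).squeeze(1) * parity(x)).mean()   # correlation loss
    opt.zero_grad()
    loss.backward()
    opt.step()

x_test = batch(4096, 0.5)
acc = (torch.sign(net(x_test).squeeze(1)) == parity(x_test)).float().mean()
print(f"uniform-distribution test accuracy: {acc:.3f}")
```

Rerunning the same loop with p fixed to 0.5 throughout gives a simple baseline against which to compare the curriculum's effect.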
