How deep convolutional neural networks lose spatial information with training

Tomasini, Umberto Maria; Petrini, Leonardo; Cagnetta, Francesco; Wyart, Matthieu

doi:10.1088/2632-2153/ad092c

2023

Abstract

A central question of machine learning is how deep nets manage to learn tasks in high dimensions. An appealing hypothesis is that they achieve this feat by building a representation of the data where information irrelevant to the task is lost. For image datasets, this view is supported by the observation that after (and not before) training, the neural representation becomes less and less sensitive to diffeomorphisms acting on images as the signal propagates through the network. This loss of sensitivity correlates with performance and, surprisingly, with a gain of sensitivity to white noise acquired during training. Which mechanisms learned by convolutional neural networks (CNNs) are responsible for these phenomena? In particular, why is the sensitivity to noise heightened with training? Our approach consists of two steps. (1) Analyzing the layer-wise representations of trained CNNs, we disentangle the role of spatial pooling, as opposed to channel pooling, in decreasing their sensitivity to image diffeomorphisms while increasing their sensitivity to noise. (2) We introduce model scale-detection tasks, which qualitatively reproduce the phenomena reported in our empirical analysis. In these models we can assess quantitatively how spatial pooling affects these sensitivities. We find that the increased sensitivity to noise observed in deep ReLU networks is a mechanistic consequence of the perturbing noise piling up during spatial pooling, after being rectified by ReLU units. Using odd activation functions like tanh drastically reduces the CNNs' sensitivity to noise.
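The last two sentences of the abstract can be pictured with a toy calculation. The sketch below is not the authors' scale-detection model or their analysis; it only assumes zero-mean Gaussian noise, a pointwise activation, and average pooling over many positions. The function name `pooled_response` and its parameters are hypothetical, chosen for illustration. Rectified zero-mean noise has a positive mean (for unit-variance Gaussian noise, 1/sqrt(2*pi), about 0.399), so spatial averaging does not cancel it, whereas an odd activation such as tanh keeps the pooled noise centered at zero.

```python
import math
import random


def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)


def pooled_response(activation, noise_std=1.0, pool_size=4096, seed=0):
    """Average-pool activation(noise) over pool_size spatial positions.

    Toy setting (an assumption, not the paper's model): the clean signal
    is zero everywhere, so the output is the pooled response to noise alone.
    """
    rng = random.Random(seed)
    values = [activation(rng.gauss(0.0, noise_std)) for _ in range(pool_size)]
    return sum(values) / pool_size


# ReLU: rectification gives the noise a positive mean that survives pooling.
relu_mean = pooled_response(relu)
# tanh: an odd function of symmetric noise averages out to roughly zero.
tanh_mean = pooled_response(math.tanh)

print(f"pooled ReLU(noise): {relu_mean:.3f}")
print(f"pooled tanh(noise): {tanh_mean:.3f}")
```

Increasing `pool_size` shrinks the fluctuations of both averages, but only the tanh average shrinks toward zero; the ReLU average settles near its nonzero mean, which is one way to read "noise piling up during spatial pooling".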

Details

Title How deep convolutional neural networks lose spatial information with training

Author(s) Tomasini, Umberto Maria ; Petrini, Leonardo ; Cagnetta, Francesco ; Wyart, Matthieu

Published in Machine Learning: Science and Technology

Volume 4

Issue 4

Pages 045026

Date 2023-12-01

Publisher IOP Publishing Ltd, Bristol

ISSN 2632-2153

Keywords

Deep Learning Theory; Convolutional Neural Networks; Curse Of Dimensionality; Representation Learning; Feature Learning; Learning Invariants

DOI https://doi.org/10.1088/2632-2153/ad092c

Laboratories PCSL

Record Appears in Scientific production and competences > SB - School of Basic Sciences > IPHYS - Institute of Physics > PCSL - Physics of complex systems laboratory
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Grant Simons Foundation (http://dx.doi.org/10.13039/100000893): 454953

Record creation date 2024-02-20
