Abstract

Visual perception is indispensable for many real-world applications. However, perception models deployed in the real world encounter numerous and unpredictable distribution shifts, for example, changes in geographic location, motion blur, and adverse weather conditions, among many others. Thus, to be useful in practice, these models need to generalize to the complex distribution shifts that can occur. This thesis focuses on three directions aimed at achieving this goal.

For the first direction, we introduce two robustness mechanisms. Both are training-time mechanisms: inductive biases are incorporated during training, and the model weights remain frozen at test time. The first mechanism ensembles predictions from a diverse set of cues. Since each cue responds differently to a given distribution shift, we adopt a principled way of merging their predictions (see the sketch after this abstract) and show that it can yield a robust final prediction. The second mechanism is motivated by the rigidity and biases of existing datasets; examples of such biases include an over-representation of scenes from developed countries and of professional photographs. Here, we control pre-trained generative models to generate targeted training data that accounts for these biases and can be used to fine-tune our models.

Training-time robustness mechanisms attempt to anticipate the shifts that can occur. However, distribution shifts can be unpredictable, and models may return unreliable predictions when a shift was not accounted for at training time. Thus, for the second direction, we incorporate test-time adaptation mechanisms so that models can adapt to shifts as they occur. To do so, we create a closed-loop system that learns to use feedback signals computed from the environment, and we show that this system adapts efficiently at test time.

For the last direction, we introduce a benchmark for testing models on realistic shifts. These shifts are obtained from a set of image transformations that take the geometry of the scene into account and are therefore more likely to occur in the real world. We show that they expose vulnerabilities of existing models.
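As an illustration of the first mechanism, the following is a minimal sketch of one principled way to merge per-cue predictions: inverse-variance weighting, where each cue's prediction is weighted by its estimated uncertainty, so cues destabilized by a distribution shift contribute less. The thesis does not specify this exact merging rule; the function merge_predictions and the toy data below are hypothetical, and the sketch should be read as an assumption-laden illustration rather than the actual method.

import numpy as np

def merge_predictions(preds, sigmas):
    # Inverse-variance weighted average of per-cue predictions.
    # preds:  list of arrays, one prediction per cue (same shape),
    #         e.g., per-pixel depth estimates from different cues.
    # sigmas: list of arrays, predicted standard deviations per cue.
    preds = np.stack(preds)                        # (n_cues, ...)
    weights = 1.0 / (np.stack(sigmas) ** 2 + 1e-8)  # inverse variance
    weights /= weights.sum(axis=0, keepdims=True)   # normalize over cues
    return (weights * preds).sum(axis=0)

# Toy usage: three cues predicting the same 2x2 map; the third cue is
# unreliable (high sigma) and is down-weighted in the merged output.
preds  = [np.full((2, 2), v) for v in (1.0, 1.1, 5.0)]
sigmas = [np.full((2, 2), s) for s in (0.1, 0.1, 2.0)]
print(merge_predictions(preds, sigmas))  # close to 1.05, not pulled to 5.0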
