Neural ADMIXTURE for rapid genomic clustering

Mantes, Albert Dominguez; Montserrat, Daniel Mas; Bustamante, Carlos D.; Giro-i-Nieto, Xavier; Ioannidis, Alexander G.

doi:10.1038/s43588-023-00482-7

2023

Abstract

Characterizing the genetic structure of large cohorts has become increasingly important as genetic studies extend to massive, increasingly diverse biobanks. Popular methods decompose individual genomes into fractional cluster assignments with each cluster representing a vector of DNA variant frequencies. However, with rapidly increasing biobank sizes, these methods have become computationally intractable. Here we present Neural ADMIXTURE, a neural network autoencoder that follows the same modeling assumptions as the current standard algorithm, ADMIXTURE, while reducing the compute time by orders of magnitude, surpassing even the fastest alternatives. One month of continuous compute using ADMIXTURE can be reduced to just hours with Neural ADMIXTURE. A multi-head approach allows Neural ADMIXTURE to offer even further acceleration by computing multiple cluster numbers in a single run. Furthermore, the models can be stored, allowing cluster assignment to be performed on new data in linear time without needing to share the training samples.

Neural ADMIXTURE is a neural-network-based, interpretable autoencoder that performs rapid genomic clustering in biobank-scale databases.
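
The modeling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name, layer sizes, tanh hidden layer, sigmoid parameterization of the frequencies, and random untrained weights are all assumptions made for illustration. Each head maps a genotype vector to fractional cluster assignments Q (rows summing to 1) for one cluster number K, and a linear decoder whose weights F lie in [0, 1] plays the role of the per-cluster variant frequencies, so the reconstruction is Q·F.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax: each row becomes a set of fractions summing to 1.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MultiHeadAdmixtureSketch:
    """Hypothetical, untrained forward pass; names and sizes are illustrative."""

    def __init__(self, n_variants, ks, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        # Encoder layer shared by all heads (assumed design).
        self.w_shared = rng.normal(scale=0.1, size=(n_variants, hidden))
        # One encoder head per cluster number K.
        self.heads = {k: rng.normal(scale=0.1, size=(hidden, k)) for k in ks}
        # One decoder per K; sigmoid keeps each frequency in (0, 1) (assumed choice).
        self.f_logits = {k: rng.normal(size=(k, n_variants)) for k in ks}

    def assign(self, x):
        # Fractional cluster assignments Q for every K from a single forward pass.
        h = np.tanh(x @ self.w_shared)
        return {k: softmax(h @ w) for k, w in self.heads.items()}

    def reconstruct(self, x):
        # Reconstruction Q @ F: a convex mix of per-cluster variant frequencies.
        q = self.assign(x)
        return {k: q[k] @ sigmoid(self.f_logits[k]) for k in q}

# Toy data: 5 samples, 20 variants, genotype counts 0/1/2 scaled to [0, 1].
rng = np.random.default_rng(1)
genotypes = rng.integers(0, 3, size=(5, 20)) / 2.0

model = MultiHeadAdmixtureSketch(n_variants=20, ks=(2, 3))
q = model.assign(genotypes)
x_hat = model.reconstruct(genotypes)
```

In this sketch, assigning new samples needs only the stored weights and one forward pass per sample, which is why the cost grows linearly with the number of new samples and why the training genotypes do not have to be shared.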

Details

Title Neural ADMIXTURE for rapid genomic clustering

Author(s) Mantes, Albert Dominguez ; Montserrat, Daniel Mas ; Bustamante, Carlos D. ; Giro-i-Nieto, Xavier ; Ioannidis, Alexander G.

Published in Nature Computational Science

Date 2023-07-06

Publisher London, Springer Nature

ISSN 2662-8457

Keywords

population-structure; inference; ancestry; models

DOI https://doi.org/10.1038/s43588-023-00482-7

Laboratories UPLAMANNO
GR-WEIGERT

Record Appears in Scientific production and competences > SV - School of Life Sciences > IBI-SV - Interfaculty Institute of Bioengineering > GR-WEIGERT - Weigert Group
Scientific production and competences > SV - School of Life Sciences > BMI - Brain Mind Institute > UPLAMANNO - Prof. La Manno Group
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2023-07-31
