Simulating the Large-Scale Erosion of Genomic Privacy Over Time

Backes, Michael; Berrang, Pascal; Humbert, Mathias; Shen, Xiaoyu; Wolf, Verena

doi:10.1109/TCBB.2018.2859380

2018

Abstract

The dramatically decreasing cost of DNA sequencing has led more than a million people to have their genotypes sequenced. Moreover, these individuals increasingly make their genomic data publicly available, thereby creating privacy threats for themselves and their relatives because of their DNA similarities. More generally, an entity that gains access to a significant fraction of sequenced genotypes might be able to infer even the genomes of unsequenced individuals. In this paper, we propose a simulation-based model for quantifying the impact of continuously sequencing and publicizing personal genomic data on a population's genomic privacy. Our simulation probabilistically models data sharing and takes into account events such as migration and interracial mating. As an example, we instantiate our simulation with a sample population of 1,000 individuals and evaluate privacy under multiple settings over 6,000 genomic variants and a subset of phenotype-related variants. Our findings demonstrate that an increasing sharing rate in the future has a substantial negative effect on the privacy of all older generations. Moreover, we find that mixed populations face a less severe erosion of privacy over time than more homogeneous populations. Finally, we demonstrate that genomic-data sharing can be much more detrimental to the privacy of the phenotype-related variants.

Details

Title Simulating the Large-Scale Erosion of Genomic Privacy Over Time

Author(s) Backes, Michael ; Berrang, Pascal ; Humbert, Mathias ; Shen, Xiaoyu ; Wolf, Verena

Published in IEEE/ACM Transactions on Computational Biology and Bioinformatics

Volume 15

Issue 5

Pages 1405-1412

Date 2018-09-01

Publisher Los Alamitos, IEEE Computer Society

ISSN 1545-5963
1557-9964

Keywords

genomic privacy; simulations; inference; graphical models; complex pedigrees; genetic analyses; genotypes

Note 3rd International Workshop on Genome Privacy and Security (GenoPri), Chicago, IL, Nov 12, 2016

DOI https://doi.org/10.1109/TCBB.2018.2859380

Laboratories ISC
LDS

Record creation date 2018-12-13