Enabling Privacy-Preserving GWASs in Heterogeneous Human Populations
Author(s)
Sahinalp, Cenk; Simmons, Sean Kenneth; Berger Leighton, Bonnie
Publisher with Creative Commons License
Creative Commons Attribution
Abstract
The proliferation of large genomic databases offers the potential to perform increasingly larger-scale genome-wide association studies (GWASs). Due to privacy concerns, however, access to these data is limited, greatly reducing their usefulness for research. Here, we introduce a computational framework for performing GWASs that adapts principles of differential privacy, a cryptographic theory that facilitates secure analysis of sensitive data, to both protect private phenotype information (e.g., disease status) and correct for population stratification. This framework enables us to produce privacy-preserving GWAS results based on EIGENSTRAT and linear mixed model (LMM)-based statistics, both of which correct for population stratification. We test our differentially private statistics, PrivSTRAT and PrivLMM, on simulated and real GWAS datasets and find they are able to protect privacy while returning meaningful results. Our framework can be used to securely query private genomic datasets to discover which specific genomic alterations may be associated with a disease, thus increasing the availability of these valuable datasets.
Date issued
2016-07
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Mathematics
Journal
Cell Systems
Publisher
Elsevier
Citation
Simmons, Sean et al. “Enabling Privacy-Preserving GWASs in Heterogeneous Human Populations.” Cell Systems 3, 1 (July 2016): 54–61
Version: Final published version
ISSN
2405-4712