My sister's keeper?: genomic research and the identifiability of siblings

Cassa, Christopher A; Schmidt, Brian; Kohane, Isaac S; Mandl, Kenneth D

Terms of use

Creative Commons Attribution http://creativecommons.org/licenses/by/2.0

Abstract

Background: Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified. Methods: We provide a framework for measuring the risk to siblings of a patient's SNP genotype disclosure, and demonstrate that sibling SNP genotypes can be inferred with substantial accuracy. Results: Extending this inference technique, we determine that a very low number of matches at commonly varying SNPs is sufficient to confirm sib-ship, demonstrating that published sequence data can reliably be used to derive sibling identities. Using HapMap trio data, at SNPs where one child is homozygotic major with a minor allele frequency ≤ 0.20 (N = 452684, 65.1%), we achieve 91.9% inference accuracy for sibling genotypes. Conclusion: These findings demonstrate that substantial discrimination and privacy risks arise from use of inferred familial genomic data.
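
To illustrate why one child's genotype carries information about a sibling's, the following minimal sketch computes the probability of each sibling genotype at a single biallelic SNP, given the known genotype of one child and the minor allele frequency. It is not the authors' implementation. It assumes Hardy-Weinberg proportions for the unobserved parents, random mating, and Mendelian transmission, and all function names are hypothetical. Genotypes are coded as the number of minor alleles (0, 1, or 2).

```python
from itertools import product


def hwe_prior(q):
    """Genotype frequencies [P(0), P(1), P(2)] under Hardy-Weinberg,
    where q is the minor allele frequency (assumption: HWE holds)."""
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def child_given_parents(g1, g2):
    """Mendelian distribution of a child's genotype given parental
    genotypes g1 and g2 (each a minor-allele count of 0, 1, or 2)."""
    t1, t2 = g1 / 2, g2 / 2  # chance each parent transmits the minor allele
    return [(1 - t1) * (1 - t2), t1 * (1 - t2) + (1 - t1) * t2, t1 * t2]


def sibling_genotype_posterior(known_child, q):
    """P(sibling genotype | known child's genotype), marginalising over
    the nine possible unobserved parental genotype pairs."""
    prior = hwe_prior(q)
    joint = [0.0, 0.0, 0.0]
    for g1, g2 in product(range(3), repeat=2):
        child = child_given_parents(g1, g2)
        # Weight of this parent pair, given that it produced the known child.
        weight = prior[g1] * prior[g2] * child[known_child]
        for s in range(3):
            joint[s] += weight * child[s]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("known genotype has zero probability at this frequency")
    return [x / total for x in joint]


if __name__ == "__main__":
    # One child is homozygous for the major allele; minor allele frequency 0.20.
    posterior = sibling_genotype_posterior(0, 0.20)
    best_guess = max(range(3), key=lambda s: posterior[s])
    print(posterior, best_guess)
```

Under these assumptions, when the known child is homozygous for the major allele, the sibling shares that genotype with probability (1 + p)^2 / 4, where p is the major allele frequency. At a minor allele frequency of 0.20 this is about 0.81, and it rises as the minor allele becomes rarer. The 91.9% figure in the abstract is an empirical result over HapMap trios and is not reproduced by this sketch.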

Date issued

2008-07

URI

http://hdl.handle.net/1721.1/49471

Department

Harvard University--MIT Division of Health Sciences and Technology; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Civil and Environmental Engineering

Journal

BMC Medical Genomics

Publisher

BioMed Central Ltd.

Citation

Cassa, Christopher, Brian Schmidt, Isaac Kohane, and Kenneth Mandl. 2008. My sister's keeper?: genomic research and the identifiability of siblings. BMC Medical Genomics 1, no. 1: 32.

Version: Final published version

ISSN

1755-8794

Collections

MIT Open Access Articles

DSpace@MIT