My sister's keeper?: genomic research and the identifiability of siblings
Author(s)
Cassa, Christopher A.; Schmidt, Brian; Kohane, Isaac; Mandl, Kenneth D.
Terms of use
Creative Commons Attribution
Abstract
Background
Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified.
Methods
We provide a framework for measuring the risk to siblings of a patient's SNP genotype disclosure, and demonstrate that sibling SNP genotypes can be inferred with substantial accuracy.
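To illustrate the kind of inference described here, the following is a minimal sketch and not the authors' framework. It assumes a single biallelic SNP, Hardy-Weinberg proportions in the parents, random mating, Mendelian transmission, and no genotyping error. Genotypes are coded as the number of minor alleles (0, 1, or 2); the function names and the parameter `q` (minor allele frequency) are hypothetical.

```python
from itertools import product


def hwe_prior(q):
    """Genotype frequencies (0, 1, 2 minor alleles) under Hardy-Weinberg."""
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def child_dist(g_mother, g_father):
    """Distribution of a child's genotype given both parental genotypes."""
    tm = g_mother / 2.0  # chance the mother transmits the minor allele
    tf = g_father / 2.0  # chance the father transmits the minor allele
    return [(1 - tm) * (1 - tf), tm * (1 - tf) + (1 - tm) * tf, tm * tf]


def sibling_genotype_posterior(proband_genotype, q):
    """P(sibling genotype | proband genotype), summing over parental genotypes."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    prior = hwe_prior(q)
    joint = [0.0, 0.0, 0.0]
    for gm, gf in product(range(3), repeat=2):
        dist = child_dist(gm, gf)
        # Weight of this parental pair, given the proband's observed genotype.
        weight = prior[gm] * prior[gf] * dist[proband_genotype]
        for s in range(3):
            # Siblings are independent draws once the parents are fixed.
            joint[s] += weight * dist[s]
    total = sum(joint)
    return [x / total for x in joint]


if __name__ == "__main__":
    # Proband homozygous for the major allele at a SNP with q = 0.20.
    print(sibling_genotype_posterior(0, 0.20))
```

Under these assumptions, the probability that a sibling is also homozygous for the major allele reduces to (1 + p)^2 / 4, where p = 1 - q is the major allele frequency. This idealized value is a property of the model and is not a substitute for the empirical accuracy the paper reports on HapMap data.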
Results
Extending this inference technique, we determine that a very low number of matches at commonly varying SNPs is sufficient to confirm sib-ship, demonstrating that published sequence data can reliably be used to derive sibling identities. Using HapMap trio data, at SNPs with a minor allele frequency ≤ 0.20 where one child is homozygous for the major allele (N = 452,684; 65.1%), we achieve 91.9% inference accuracy for sibling genotypes.
Conclusion
These findings demonstrate that substantial discrimination and privacy risks arise from use of inferred familial genomic data.
Date issued
2008-07
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Journal
BMC Medical Genomics
Publisher
BioMed Central Ltd.
Citation
Cassa, Christopher et al. “My sister's keeper?: genomic research and the identifiability of siblings.” BMC Medical Genomics 1.1 (2008): 32.
Version: Final published version
ISSN
1755-8794