My sister's keeper?: genomic research and the identifiability of siblings
Author(s)
Cassa, Christopher A.; Schmidt, Brian; Kohane, Isaac; Mandl, Kenneth D.
Publisher with Creative Commons License
Creative Commons Attribution
Abstract
Background: Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial
information these data contain has not been quantified.
Methods: We provide a framework for measuring the risk to siblings of a patient's SNP genotype
disclosure, and demonstrate that sibling SNP genotypes can be inferred with substantial accuracy.
Results: Extending this inference technique, we determine that a very low number of matches at
commonly varying SNPs is sufficient to confirm sib-ship, demonstrating that published sequence
data can reliably be used to derive sibling identities. Using HapMap trio data, at SNPs with a
minor allele frequency ≤ 0.20 where one child is homozygous for the major allele (N = 452,684; 65.1%),
we achieve 91.9% inference accuracy for sibling genotypes.
Conclusion: These findings demonstrate that substantial discrimination and privacy risks arise
from use of inferred familial genomic data.
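
The kind of sibling inference described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes Hardy-Weinberg proportions in the parents, random mating, and a single biallelic SNP, and the function names (`hwe_prior`, `child_dist`, `sibling_posterior`) are hypothetical. It computes the probability of each sibling genotype given one child's genotype and the minor allele frequency, and makes no attempt to reproduce the HapMap figures above.

```python
from itertools import product


def hwe_prior(q):
    """Parental genotype probabilities (0, 1 or 2 minor alleles) under Hardy-Weinberg."""
    p = 1.0 - q
    return {0: p * p, 1: 2 * p * q, 2: q * q}


def child_dist(gm, gf):
    """Genotype distribution of a child given both parents' minor-allele counts."""
    tm, tf = gm / 2.0, gf / 2.0  # chance each parent transmits the minor allele
    return {
        0: (1 - tm) * (1 - tf),
        1: tm * (1 - tf) + (1 - tm) * tf,
        2: tm * tf,
    }


def sibling_posterior(proband, q):
    """P(sibling genotype | proband genotype), summing over unobserved parents.

    Assumption: parents are drawn independently from Hardy-Weinberg proportions.
    """
    prior = hwe_prior(q)
    joint = {0: 0.0, 1: 0.0, 2: 0.0}
    for gm, gf in product(range(3), repeat=2):
        weight = prior[gm] * prior[gf]
        dist = child_dist(gm, gf)
        for sib in joint:
            # Siblings are independent draws once the parents are fixed.
            joint[sib] += weight * dist[proband] * dist[sib]
    total = sum(joint.values())
    return {sib: value / total for sib, value in joint.items()}


# Proband homozygous for the major allele at a SNP with minor allele frequency 0.20.
posterior = sibling_posterior(proband=0, q=0.20)
most_likely = max(posterior, key=posterior.get)
```

Under these assumptions, the probability that the sibling is also homozygous for the major allele has the closed form (1 + p)^2 / 4, where p is the major allele frequency. This is about 0.81 for p = 0.80 and rises as the minor allele becomes rarer.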
Date issued
2008-07
Department
Harvard University--MIT Division of Health Sciences and Technology; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Journal
BMC Medical Genomics
Publisher
BioMed Central Ltd.
Citation
Cassa, Christopher, Brian Schmidt, Isaac Kohane, and Kenneth Mandl. 2008. My sister's keeper?: genomic research and the identifiability of siblings. BMC Medical Genomics 1, no. 1: 32.
Version: Final published version
ISSN
1755-8794