My sister's keeper?: genomic research and the identifiability of siblings
Author(s): Cassa, Christopher A.; Schmidt, Brian; Kohane, Isaac; Mandl, Kenneth D.
Background: Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified. Methods: We provide a framework for measuring the risk to siblings of a patient's SNP genotype disclosure, and demonstrate that sibling SNP genotypes can be inferred with substantial accuracy. Results: Extending this inference technique, we determine that a very low number of matches at commonly varying SNPs is sufficient to confirm sib-ship, demonstrating that published sequence data can reliably be used to derive sibling identities. Using HapMap trio data, at SNPs where one child is homozygotic major and the minor allele frequency is ≤ 0.20 (N = 452,684; 65.1%), we achieve 91.9% inference accuracy for sibling genotypes. Conclusion: These findings demonstrate that substantial discrimination and privacy risks arise from use of inferred familial genomic data.
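As a rough illustration of why a sibling's genotype is predictable at such SNPs, the sketch below applies a standard identity-by-descent calculation. It is not necessarily the framework used in the paper: it assumes Hardy-Weinberg equilibrium, random mating, and full siblings, and the function name `sibling_genotype_probs` is hypothetical. Given one child who is homozygous major at a SNP with minor allele frequency `maf`, it returns the probability of each genotype in a sibling.

```python
def sibling_genotype_probs(maf: float) -> dict[str, float]:
    """Genotype probabilities for a full sibling of a homozygous-major child.

    Illustrative assumptions (not taken from the paper): Hardy-Weinberg
    equilibrium, random mating, and no genotyping error.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")

    q = maf          # minor allele frequency
    p = 1.0 - q      # major allele frequency

    # Full siblings share 0, 1, or 2 alleles identical by descent (IBD)
    # with probabilities 1/4, 1/2, and 1/4 respectively.
    ibd0, ibd1, ibd2 = 0.25, 0.5, 0.25

    return {
        # IBD2: identical genotype; IBD1: one shared major allele plus a
        # random allele; IBD0: two random alleles from the population.
        "major/major": ibd2 * 1.0 + ibd1 * p + ibd0 * p * p,
        "major/minor": ibd1 * q + ibd0 * 2.0 * p * q,
        "minor/minor": ibd0 * q * q,
    }


if __name__ == "__main__":
    for maf in (0.05, 0.10, 0.20):
        probs = sibling_genotype_probs(maf)
        print(maf, {g: round(v, 4) for g, v in probs.items()})
```

Under these assumptions the probability that the sibling is also homozygous major simplifies to (1 + p)² / 4, which rises as the minor allele becomes rarer. This is consistent in direction with the high inference accuracy reported for SNPs with minor allele frequency ≤ 0.20, though it does not reproduce the paper's empirical figure.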
Department: Harvard University--MIT Division of Health Sciences and Technology; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
BMC Medical Genomics
BioMed Central Ltd.
Cassa, Christopher, Brian Schmidt, Isaac Kohane, and Kenneth Mandl. 2008. My sister's keeper?: genomic research and the identifiability of siblings. BMC Medical Genomics 1, no. 1: 32.
Final published version