Privacy and identifiability in clinical research, personalized medicine, and public health surveillance

Author(s)

Cassa, Christopher A

Other Contributors

Harvard University--MIT Division of Health Sciences and Technology.

Advisor

Peter Szolovits.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Electronic transmission of protected health information has become pervasive in research, clinical, and public health investigations, posing substantial risk to patient privacy. From clinical genetic screenings to publication of data in research studies, these activities have the potential to disclose identity, medical conditions, and hereditary data. To enable an era of personalized medicine, many research studies are attempting to correlate individual clinical outcomes with genomic data, leading to thousands of new investigations. Critical to the success of many of these studies is research participation by individuals who are willing to share their genotypic and clinical data with investigators, which necessitates methods and policies that preserve privacy in such disclosures. We explore quantitative models that allow research participants, patients, and investigators to fully understand these complex privacy risks when disclosing medical data. This modeling will improve the informed consent and risk assessment process for both demographic and medical data, each with distinct domain-specific scenarios. We first discuss the disclosure risk for genomic data, investigating both the risk of re-identification for SNPs and mutations and the disclosure impact on family members. Next, we examine the de-identification and anonymization of geospatial datasets containing patient home addresses, using mathematical skewing algorithms as well as a linear programming approach. Finally, we consider the re-identification potential of geospatial data, commonly shared both in textual form and in printed maps in journals and public health practice. We also explore methods to quantify the anonymity afforded by these anonymization techniques.

Description

Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, 2008.

This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.

Includes bibliographical references (p. 191-200).

Date issued

2008

URI

http://hdl.handle.net/1721.1/45624

Department

Harvard University--MIT Division of Health Sciences and Technology

Publisher

Massachusetts Institute of Technology

Keywords

Harvard University--MIT Division of Health Sciences and Technology.

Collections

Doctoral Theses