MIT Libraries, DSpace@MIT


Clinical Text De-identification Using Large Language Models: Insights from Organ Procurement Data

Author(s)
Dahleh, Omar
Advisor
Ghassemi, Marzyeh
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
This thesis presents a novel approach to de-identifying clinical notes from Organ Procurement Organization (OPO) records using natural language processing (NLP). Specifically, we use in-context learning with large language models (LLMs) to identify and remove protected health information (PHI) while aiming to preserve high data utility after redaction. We systematically evaluate the LLM-based method against established baselines, including traditional Named Entity Recognition (NER) and rule-based systems. Through a series of experiments, we assess the strengths and limitations of each method in terms of precision and recall. This work will contribute to a uniquely extensive dataset comprising millions of de-identified OPO clinical notes, which will facilitate ethical healthcare research and enhance compliance with contemporary data protection standards. Ultimately, this dataset holds significant potential for improving processes and outcomes in organ donation and procurement.
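
To make the evaluation terms in the abstract concrete, the following is a minimal sketch of how a toy rule-based baseline could be scored for precision and recall against gold PHI spans. It is not taken from the thesis: the regex patterns, the function names (`find_phi_spans`, `redact`, `precision_recall`), the example note, and the exact-match span scoring are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for a toy rule-based baseline (not the thesis's rules).
PHI_PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def find_phi_spans(text):
    """Return the set of (start, end, label) spans matched by the toy rules."""
    spans = set()
    for label, pattern in PHI_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.add((match.start(), match.end(), label))
    return spans


def redact(text, spans):
    """Replace each span with a [LABEL] placeholder, working right to left
    so that earlier character offsets stay valid."""
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


def precision_recall(predicted, gold):
    """Exact-match span scoring: a prediction counts only if start, end,
    and label all agree with a gold span (an assumed scoring convention)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def gold_span(text, phrase, label):
    """Helper for building illustrative gold annotations from a substring."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


# Invented example note; the name is PHI that the toy rules cannot catch.
note = "Donor admitted 03/14/2021; family contact 617-555-0123, nurse J. Smith."
gold = {
    gold_span(note, "03/14/2021", "DATE"),
    gold_span(note, "617-555-0123", "PHONE"),
    gold_span(note, "J. Smith", "NAME"),
}

predicted = find_phi_spans(note)
redacted = redact(note, predicted)
precision, recall = precision_recall(predicted, gold)
print(redacted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this invented example the rules find the date and phone number but miss the name, so precision is 1.0 while recall is 2/3. That gap is the kind of trade-off the abstract's comparison of rule-based, NER, and LLM-based methods is meant to measure.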
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162739
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted.