DSpace@MIT (MIT Libraries)


Few Shot Learning for Rare Disease Diagnosis

Author(s)
Alsentzer, Emily
Advisor
Kohane, Isaac
Szolovits, Peter
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Rare diseases affect 300-400 million people worldwide, yet each disease has very low prevalence, affecting no more than 50 per 100,000 individuals. Many patients with rare genetic conditions remain undiagnosed due to clinicians' lack of experience with the individual diseases and the considerable heterogeneity of clinical presentations. Machine-assisted diagnosis offers the opportunity to shorten the diagnostic delays for rare disease patients. Recent advances in deep learning have considerably improved the accuracy of medical diagnosis. However, much of the success thus far is contingent on the availability of large annotated datasets containing thousands of examples per condition for training machine learning models. Machine-assisted diagnosis of rare diseases presents unique challenges: approaches must learn from limited data and extrapolate beyond the training distribution to novel genetic conditions. The goal of this thesis is to develop few-shot learning methods that can overcome the data limitations of deep learning approaches to diagnose patients with rare genetic conditions. Motivated by the need to infuse external knowledge into models, we first develop novel graph neural network methods for subgraph representation learning that encode how subgraphs (e.g., a set of patient phenotypes) relate to a larger knowledge graph. To address the issue of data scarcity, we next develop a framework for simulating realistic rare disease patients with novel genetic conditions and demonstrate how these simulated patients are similar to real rare disease patients. Finally, we leverage these advances to develop SHEPHERD, a few-shot method for diagnosis of patients with rare genetic conditions in the Undiagnosed Diseases Network. SHEPHERD reasons over biomedical knowledge via geometric deep learning to learn generalizable representations of rare disease patients. SHEPHERD can support multiple facets of the rare disease diagnosis process: performing causal gene discovery, retrieving "patients-like-me" with the same causal gene or disease, and providing interpretable characterizations of novel disease presentations. Our work illustrates the potential for deep learning methods to rapidly accelerate molecular diagnosis and shorten the diagnostic odyssey for rare disease patients.
Date issued
2022-09
URI
https://hdl.handle.net/1721.1/147431
Department
Harvard-MIT Program in Health Sciences and Technology
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses
