MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Uncertainty and Generality of Transfer Learning Models in Predicting Signaling History

Author(s)
Lu, Claire
Thumbnail
DownloadThesis PDF (3.866Mb)
Advisor
Li, Pulin
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Metadata
Show full item record
Abstract
Proper cell-cell communication is essential for multicellular development, from embryogenesis to stem cell differentiation. To map these networks, we developed IRIS (Intracellular Response to Infer Signaling state), a semi-supervised deep learning method that fits conditional variational autoencoders (CVAE) to single-cell RNA sequencing (scRNA-seq) data. IRIS is able to annotate cellular signaling states of individual cells using only their gene expression. Currently, IRIS has been validated in developmental contexts, including gastrulation, early endoderm organogenesis, and mesoderm lineages in mouse embryos. However, its predictions often show extremely high or extremely low confidence, suggesting a need for methods to prevent overconfidence and better account for uncertainty. To generalize IRIS to broader cell-cell communication problems, we combined engineering and experimental approaches, integrating uncertainty quantification techniques with new biological datasets. We implemented three approaches for estimating uncertainty in IRIS predictions: stochastic sampling, Monte Carlo dropout, and ensemble prediction. These approaches were evaluated on two new endoderm and mesenchyme combinatorial perturbation screens. Across all methods, uncertainty values reliably reflected the varying difficulty of predicting different signaling pathways, driven by both biological complexity and dataset representation. Moreover, higher uncertainty was consistently associated with lower prediction accuracy, confirming uncertainty as a useful proxy for model confidence. All three methods identified similar high-uncertainty cell populations, supporting their consistency and validity. By incorporating uncertainty quantification into IRIS, we provide more robust and interpretable predictions that can guide future experiments and enhance the model’s applicability across diverse biological contexts.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162704
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.