Computational methods for analyzing and modeling gene regulation and 3D genome organization
Author(s)
Belyaeva, Anastasiya.
Download
1252628026-MIT.pdf (72.98Mb)
Other Contributors
Massachusetts Institute of Technology. Computational and Systems Biology Program.
Advisor
Caroline Uhler.
Abstract
Biological processes from differentiation to disease progression are governed by gene regulatory mechanisms. Currently, large-scale omics and imaging data sets are being collected to characterize gene regulation at every level. Such data sets present new opportunities and challenges for extracting biological insights and elucidating the gene regulatory logic of cells. In this thesis, I present computational methods for the analysis and integration of various data types used for cell profiling. Specifically, I focus on analyzing gene expression and linking it with the 3D organization of the genome. First, I describe methodologies for elucidating gene regulatory mechanisms by considering multiple data modalities. I design a computational framework for identifying colocalized and coregulated chromosome regions by integrating gene expression and epigenetic marks with 3D interactions using network analysis. Then, I provide a general framework for data integration using autoencoders and apply it to integrate and translate between gene expression and chromatin images of naive T-cells. Second, I describe methods for analyzing single modalities, namely gene expression data and contact frequency data, which measure the spatial organization of the genome. Given the important role of 3D genome organization in gene regulation, I present a methodology for reconstructing the 3D diploid conformation of the genome from contact frequency data. Given the ubiquity of gene expression data, the recent advances in single-cell RNA-sequencing technologies, and the need for causal modeling of gene regulatory mechanisms, I then describe an algorithm and a software tool, difference causal inference (DCI), for learning causal gene regulatory networks from gene expression data. DCI addresses the problem of directly learning differences between causal gene regulatory networks given gene expression data from two related conditions.
Finally, I shift my focus from basic biology to drug discovery. Given the current COVID-19 pandemic, I present a computational drug repurposing platform that enables the identification of FDA-approved compounds for drug repurposing and the investigation of potential causal drug mechanisms. This framework relies on identifying drugs that reverse the signature of the infection in the space learned by an autoencoder and then uses causal inference to identify putative drug mechanisms.
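The idea of a drug "reversing the signature of the infection" in a learned space can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the latent embeddings have already been produced by some autoencoder's encoder, and the function names (`signature`, `rank_by_reversal`), the toy numbers, and the use of negative cosine similarity as the reversal score are all hypothetical choices made for illustration. The causal inference step mentioned above is not covered.

```python
import numpy as np

def signature(condition, baseline):
    """Mean shift between two sets of latent embeddings (rows = samples)."""
    return condition.mean(axis=0) - baseline.mean(axis=0)

def rank_by_reversal(disease_sig, drug_sigs):
    """Sort drugs so that signatures most opposed to the disease come first.

    The score is the negative cosine similarity (an assumed choice):
    +1 means the drug shift points exactly against the disease shift.
    """
    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    scores = {name: -cosine(disease_sig, sig) for name, sig in drug_sigs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy 2-D latent embeddings (hypothetical values, not real data).
control = np.array([[0.0, 0.0], [0.2, -0.2]])
infected = np.array([[1.0, 1.0], [1.2, 0.8]])
disease_sig = signature(infected, control)

# Hypothetical per-drug shifts (treated minus untreated) in the same space.
drug_sigs = {
    "drug_A": np.array([-1.0, -0.9]),  # roughly opposite to the disease shift
    "drug_B": np.array([0.5, 0.5]),    # same direction as the disease shift
    "drug_C": np.array([1.0, -1.0]),   # orthogonal to the disease shift
}

ranking = rank_by_reversal(disease_sig, drug_sigs)
for name, score in ranking:
    print(f"{name}: reversal score {score:.3f}")
```

In this toy setting the drug whose shift points against the infection shift ranks first and the one that mimics the infection ranks last; a real analysis would need replicate-aware statistics rather than a single cosine score.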
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Computational and Systems Biology Program, February, 2021
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 261-281).
Date issued
2021
Department
Massachusetts Institute of Technology. Computational and Systems Biology Program
Publisher
Massachusetts Institute of Technology
Keywords
Computational and Systems Biology Program.