dc.contributor.advisor | Gifford, David K. | |
dc.contributor.author | Krismer, Konstantin | |
dc.date.accessioned | 2022-02-07T15:25:48Z | |
dc.date.available | 2022-02-07T15:25:48Z | |
dc.date.issued | 2021-09 | |
dc.date.submitted | 2021-11-17T22:09:43.200Z | |
dc.identifier.uri | https://hdl.handle.net/1721.1/140131 | |
dc.description.abstract | Many advances in functional genomics and in biology more broadly can be attributed to the rise of massively parallel sequencing technology and its derivatives. As the volume of sequencing and other high-throughput experimental data increases exponentially, so does the need for computational methods to analyze and condense these vast amounts of data, and to help explain the underlying phenomena. In this thesis, I describe five projects that introduce novel techniques and methods in functional genomics.
The first project introduces a simulation-based framework to investigate neural network architectures that are trained on biological sequence data, as is common in functional genomics. The second project describes a two-pronged approach to study the determinants of cell type-specific chromatin accessibility, with an ensemble of neural networks trained on DNase-seq data to predict chromatin accessibility, and MIAA, the multiplexed integrated accessibility assay, to experimentally validate these in silico predictions. The third project presents a method to identify long-range genomic interactions from ChIA-PET and HiChIP data. Enabled by this work, the fourth project aims to provide a means to identify reproducible long-range genomic interactions. We continue the analysis of long-range interactions in the fifth project by performing co-enrichment analysis of transcription factor sequence motifs.
Collectively, these methods provide new approaches to a range of problems in functional genomics, from finding appropriate neural network architectures for sequence-based prediction tasks to uncovering patterns in long-range genomic interactions. | |
dc.publisher | Massachusetts Institute of Technology | |
dc.rights | In Copyright - Educational Use Permitted | |
dc.rights | Copyright MIT | |
dc.rights.uri | http://rightsstatements.org/page/InC-EDU/1.0/ | |
dc.title | Principled Methods and Models for Deep Learning Based Functional Genomics | |
dc.type | Thesis | |
dc.description.degree | Ph.D. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Biological Engineering | |
dc.identifier.orcid | https://orcid.org/0000-0001-8994-3416 | |
mit.thesis.degree | Doctoral | |
thesis.degree.name | Doctor of Philosophy | |