dc.contributor.advisor: Manolis Kellis. [en_US]
dc.contributor.author: Herr, Taylor (Taylor J.) [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2020-04-13T18:22:23Z
dc.date.available: 2020-04-13T18:22:23Z
dc.date.copyright: 2017 [en_US]
dc.date.issued: 2019 [en_US]
dc.identifier.uri: https://hdl.handle.net/1721.1/124573
dc.description: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. [en_US]
dc.description: Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019 [en_US]
dc.description: Cataloged from student-submitted PDF version of thesis. "June 2019." [en_US]
dc.description: Includes bibliographical references (pages 89-91). [en_US]
dc.description.abstract: Disease-associated nucleotides lie primarily in non-coding regions, increasing the urgency of understanding how gene-regulatory circuitry impacts human disease. Here, we use the increasing availability of functional genomics datasets and models elucidating how regulatory proteins control genes, to evaluate the impact of genetic variants on the activity of diverse regulators. First, we generate a comprehensive compendium of predicted binding intensities across the entire genome for over 500 transcription factors. Second, we create a novel dataset to connect how these binding intensities change in the context of disease datasets. Third, we develop a statistical framework to integrate these two datasets using dimensionality reduction, latent cluster discovery, and topic modeling. We use these techniques to show that regulatory proteins with analogous biological functions share similar global changes in binding due to genome-wide genetic variation. We also use our framework to discover a latent set of topics behind all genomic locations in chromosome 1, to link the locations in each of the topic clusters with a class of related diseases, and to show that relevant biological processes are statistically enriched in the genomic locations most related to each cluster. [en_US]
dc.description.statementofresponsibility: by Taylor Herr. [en_US]
dc.format.extent: 91 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Dissecting the gene-regulatory circuitry of disease-associated genetic variants [en_US]
dc.type: Thesis [en_US]
dc.description.degree: M. Eng. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.identifier.oclc: 1149038812 [en_US]
dc.description.collection: M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science [en_US]
dspace.imported: 2020-04-13T18:21:54Z [en_US]
mit.thesis.degree: Master [en_US]
mit.thesis.department: EECS [en_US]

