DSpace@MIT


Simultaneous computational discovery of DNA regulatory motifs and transcription factor binding constraints at high spatial resolution

Author(s)
Guo, Yuchun
Alternative title
Simultaneous computational discovery of Deoxyribonucleic acid regulatory motifs and transcription factor binding constraints at high spatial resolution
Other Contributors
Massachusetts Institute of Technology. Computational and Systems Biology Program.
Advisor
David K. Gifford.
Terms of use
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
I present three novel computational methods to address the challenge of identifying protein-DNA interactions at high spatial resolution from noisy ChIP-Seq data. I first present the genome positioning system (GPS) algorithm, which predicts protein-DNA interaction events from ChIP-Seq data using a single-base resolution generative probabilistic model. Using synthetic and actual ChIP-Seq data, I show that GPS improves the effective spatial resolution and the accuracy of resolving proximal binding events compared with existing methods. Second, I present the k-mer set motif (KSM) representation and the k-mer motif alignment and clustering (KMAC) method, which discovers DNA-binding motifs from ChIP-Seq-derived sequences. I demonstrate that the KSM model is more predictive than the widely used position weight matrix model, and that KMAC outperforms other existing motif discovery programs in recovering known motifs from a large collection of human ChIP-Seq experiments. Finally, I present an integrative method, genome-wide event finding and motif discovery (GEM), which models ChIP data with explanatory motifs and binding events at high spatial resolution. The GEM model links binding event discovery and motif discovery with positional priors in the context of a generative probabilistic model of ChIP data and genome sequence. I show that GEM further improves upon previous methods for processing ChIP-Seq and ChIP-exo data to yield unsurpassed spatial resolution and discovery of proximal binding events. GEM enables a systematic analysis of in vivo transcription factor binding to discover hundreds of spatial binding constraints between factors in human and mouse cells, including known factor pairs and novel pairs such as c-Fos:c-Jun/USF1, CTCF/Egr1, and HNF4a/FOXA1. I also discovered a complex spatial binding relationship involving six key regulatory factors in mouse embryonic stem (ES) cells that is likely to be functional in ES cell gene regulation.
Such computational discoveries propose testable models for regulatory factor interactions that will help elucidate genome function and the implementation of combinatorial control.
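The abstract contrasts a k-mer set representation of a motif with the position weight matrix (PWM) model. The sketch below is only a toy illustration of that contrast and is not the thesis's KSM or KMAC: the function names, the example sites, the pseudocount, the uniform background, and the simple "count matching k-mers" rule are all assumptions made for illustration, and reverse-complement matches are ignored.

```python
import math

BASES = "ACGT"

def column_counts(sites):
    """Count bases per position across equal-length aligned sites."""
    width = len(sites[0])
    return [
        {base: sum(1 for site in sites if site[i] == base) for base in BASES}
        for i in range(width)
    ]

def pwm_log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert per-position counts into a log2-odds PWM (assumed uniform background)."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + len(BASES) * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm

def best_pwm_score(pwm, sequence):
    """Best PWM score over all windows; -inf if the sequence is too short."""
    width = len(pwm)
    scores = [
        sum(pwm[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    ]
    return max(scores, default=float("-inf"))

def kmer_set_hits(kmers, sequence):
    """Count how many distinct k-mers of the set occur in the sequence."""
    return sum(1 for kmer in kmers if kmer in sequence)

# Hypothetical aligned binding sites, used only to build the two toy models.
EXAMPLE_SITES = ["TGACTCA", "TGAGTCA", "TGACTCA"]
EXAMPLE_PWM = pwm_log_odds(column_counts(EXAMPLE_SITES))
EXAMPLE_KMERS = set(EXAMPLE_SITES)
```

In this toy setup, a PWM scores every window by summing independent per-position weights, whereas the k-mer set only recognizes exact words it contains, so it can retain dependencies between positions that a PWM averages away. For example, `kmer_set_hits(EXAMPLE_KMERS, "AATGACTCAGG")` counts one matching word, and `best_pwm_score(EXAMPLE_PWM, ...)` ranks that sequence above a poly-A sequence of the same length.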
Description
Thesis (Ph. D.)--Massachusetts Institute of Technology, Computational and Systems Biology Program, 2012.
 
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
 
Cataloged from student-submitted PDF version of thesis.
 
Includes bibliographical references (p. 126-135).
 
Date issued
2012
URI
http://hdl.handle.net/1721.1/77640
Department
Massachusetts Institute of Technology. Computational and Systems Biology Program
Publisher
Massachusetts Institute of Technology
Keywords
Computational and Systems Biology Program.

Collections
  • Doctoral Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted.