Characterizing and predicting enhancers in the human genome
Author(s)
Roytman, Megan (Megan D.)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Matthew L. Eaton.
Abstract
Characterizing the functions of sequences in the human genome is crucial for the study and treatment of human disease. Although approximately 5% of the human genome is known to be conserved, about 40% of these conserved sequences have yet to be characterized, and many of them may be important players in human disease pathways (1). Experimental and computational techniques have been developed that use histone modifications to segment the human genome into 25 different chromatin states, including states corresponding to functional sequences such as promoters and enhancers (4). However, the availability of these data is limited: the assays have been performed on only a small number of cell types, and the distribution of chromatin states varies across cell types. We therefore took a computational rather than experimental approach to discovering regulatory regions. We characterized the nucleotide content, regulatory motif content, conservation, gene distance, and human variation patterns of a subset of these regulatory sequences. By training a generalized linear classifier on these data, we created a predictor for enhancer sequences that achieved 70% accuracy.
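As a rough illustration of what training a generalized linear classifier on sequence features can look like, the sketch below fits a logistic regression (one common generalized linear model) to two simple nucleotide-content features. It is not the thesis's method: the feature choice (GC fraction and CpG dinucleotide fraction), the toy sequences, their labels, and every function name are assumptions made up for this example, and the labels carry no biological meaning.

```python
import math

def nucleotide_features(seq):
    """Return [GC fraction, CpG dinucleotide fraction] for a DNA string.
    Hypothetical feature set chosen only for illustration."""
    seq = seq.upper()
    n = len(seq)
    gc = sum(1 for base in seq if base in "GC") / n
    cpg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG") / max(n - 1, 1)
    return [gc, cpg]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights by batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    m = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this example under the current weights.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / m for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b

def predict(w, b, x):
    """Return 1 if the modeled probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Toy, invented data: label 1 = GC-rich, label 0 = AT-rich (arbitrary labels).
sequences = ["GCGCGGCCGCGC", "CCGCGGGCGCCG", "GGCGCCGCGGCA",
             "ATATTTAAATAT", "TTAATATATTAA", "AATTTATAAATG"]
labels = [1, 1, 1, 0, 0, 0]

X = [nucleotide_features(s) for s in sequences]
w, b = train_logistic(X, labels)
train_accuracy = sum(predict(w, b, x) == y for x, y in zip(X, labels)) / len(labels)
```

In a realistic setting the feature vector would also need to cover the other properties named in the abstract (motif content, conservation, gene distance, variation), and accuracy would be measured on held-out sequences rather than on the training set.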
Description
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (page 31).
Date issued
2013
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.