MIT Libraries, DSpace@MIT


A computational framework for the identification, cataloging, and classification of evolutionary conserved genomic DNA

Author(s)
Saluja, Sunil K. (Sunil Kumar), 1968-
Other Contributors
Harvard University--MIT Division of Health Sciences and Technology.
Advisor
Isaac S. Kohane.
Terms of use
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
Evolutionarily conserved genomic regions (ecores) are understudied, yet they comprise a very large percentage of the human genome. Highly conserved human-mouse non-coding ecores, for example, are more abundant within the human genome than the regions currently estimated to encode proteins. Subsets of these ecores also exhibit conservation that extends across several species. These genomic regions have survived millions of years of evolution even though they do not appear to encode proteins directly, and their survival compels us to investigate their potential function. Developing a computational framework for the classification and clustering of these regions may be the first step in understanding their function. The need for a standardized framework is underscored by the explosive growth in the number of publicly available, fully sequenced genomes and by the diverse set of methodologies used to generate cross-species alignments. This project describes the design and implementation of a system for the identification, classification, and cataloging of ecores across multiple species. A key feature of this system is its ability to quickly incorporate new genomes and assemblies as they become available. Additionally, the system provides investigators with a feature-rich user interface that facilitates the retrieval of ecores based on a wide range of parameters. The system returns a dynamically annotated list of evolutionarily conserved regions, which is used as input to several classification schemes aimed at identifying families of ecores that share similar features, including depth of evolutionary conservation, position relative to known genes, sequence similarity, and content of transcription factor binding sites. Families of ecores have already been retrieved by the system and clustered using this feature space, and are currently awaiting biological validation.
 
Description
Thesis (S.M.)--Harvard-MIT Division of Health Sciences and Technology, 2004.
 
Includes bibliographical references (leaves 27-29).
 
Date issued
2004
URI
http://hdl.handle.net/1721.1/28590
Department
Harvard University--MIT Division of Health Sciences and Technology
Publisher
Massachusetts Institute of Technology
Keywords
Harvard University--MIT Division of Health Sciences and Technology.

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted.