Comparative gene identification in mammalian, fly, and fungal genomes
Author(s)
Lin, Michael F. (Michael Fong-Jay)
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Manolis Kellis.
Abstract
An important step in genome interpretation is the accurate identification of protein-coding genes. One approach to gene identification is comparative analysis of the genomes of several related species, to find genes that have been conserved by natural selection over millions of years of evolution. I develop general computational methods that combine statistical analysis of genome sequence alignments with classification algorithms in order to detect the distinctive signatures of protein-coding DNA sequence evolution. I implement these methods as a software system, which I then use to identify previously unknown genes, and cast doubt on some existing gene annotations, in the genomes of the fungi Saccharomyces cerevisiae and Candida albicans, the fruit fly Drosophila melanogaster, and the human. These methods perform competitively with the best existing de novo gene identification systems, and are practically applicable to the goal of improving existing gene annotations through comparative genomics.
Description
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (leaves 55-56).
Date issued
2006
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.