Show simple item record

dc.contributor.advisor: Bonnie Berger. [en_US]
dc.contributor.author: Schmid, Patrick R. (Patrick Raphael) [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2012-07-02T14:20:01Z
dc.date.available: 2012-07-02T14:20:01Z
dc.date.copyright: 2012 [en_US]
dc.date.issued: 2012 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/71280
dc.description: Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. [en_US]
dc.description: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (p. 343-356). [en_US]
dc.description.abstract: Although there are a variety of high-throughput technologies used to perform biological experiments, DNA microarrays have become a standard tool in the modern biologist's arsenal. Microarray experiments provide measurements of thousands of genes simultaneously, and offer a snapshot view of transcriptomic activity. With the rapid growth of public availability of transcriptomic data, there is increasing recognition that large sets of such data can be mined to better understand disease states and mechanisms. Unfortunately, several challenges arise when attempting to perform such large-scale analyses. For instance, the public repositories to which the data are submitted were designed around the simple task of storage rather than that of data mining. As such, the seemingly simple task of obtaining all data relating to a particular disease becomes arduous. Furthermore, prior gene expression analyses, both large and small, have been dichotomous in nature, in which phenotypes are compared using clearly defined controls. Such approaches may require arbitrary decisions about what are considered "normal" phenotypes, and what each phenotype should be compared to. Addressing these issues, we introduce methods for creating a large curated gene expression database geared towards data mining, and explore methods for efficiently expanding this database using active learning. Leveraging our curated expression database, we adopt a holistic approach in which we characterize phenotypes in the context of a myriad of tissues and diseases. We introduce scalable methods that associate expression patterns with phenotypes in order to assign phenotype labels to new expression samples and to select phenotypically meaningful gene signatures. By using a nonparametric statistical approach, we identify signatures that are more precise than those from existing approaches and accurately reveal biological processes that are hidden in case vs. control studies. We conclude the work by exploring the applicability of the heterogeneous expression database in analyzing clinical drugs for the purpose of drug repurposing. [en_US]
dc.description.statementofresponsibility: by Patrick Raphael Schmid. [en_US]
dc.format.extent: 356 p. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Beyond differential expression : methods and tools for mining the transcriptomic landscape of human tissue and disease [en_US]
dc.title.alternative: Methods and tools for mining the transcriptomic landscape of human tissue and disease [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Ph.D. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 795581540 [en_US]

