dc.contributor.advisor: David K. Gifford. [en_US]
dc.contributor.author: Edwards, Matthew Douglas [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2016-12-05T19:11:07Z
dc.date.available: 2016-12-05T19:11:07Z
dc.date.copyright: 2016 [en_US]
dc.date.issued: 2016 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/105572
dc.description: Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. [en_US]
dc.description: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. [en_US]
dc.description: Cataloged from student-submitted PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 97-105). [en_US]
dc.description.abstract: Modern genetics has been transformed by a dramatic explosion of data. As sample sizes and the number of measured data types grow, the need for computational methods tailored to deal with these noisy and complex datasets increases. In this thesis, we develop and apply integrated computational and biological approaches for two genetic problems. First, we build a statistical model for genetic mapping using pooled sequencing, a powerful and efficient technique for rapidly unraveling the genetic basis of complex traits. Our approach explicitly models the pooling process and genetic parameters underlying the noisy observed data, and we use it to calculate accurate intervals that contain the targeted regions of interest. We show that our model outperforms simpler alternatives that do not use all available marker data in a principled way. We apply this model to study several phenotypes in yeast, including the genetic basis of the surprising phenomenon of strain-specific essential genes. We demonstrate the complex genetic basis of many of these strain-specific viability phenotypes and uncover the influence of an inherited virus in modifying their effects. Second, we design a statistical model that uses additional functional information describing large sets of genetic variants in order to predict which variants are likely to cause phenotypic changes. Our technique is able to learn complicated relationships between candidate features and can accommodate the additional noise introduced by training on groups of candidate variants, instead of single labeled variants. We apply this model to a large genetic mapping study in yeast by collecting multiple genome-wide functional measurements. By using our model, we demonstrate the importance of several molecular phenotypes in predicting genetic impact.
The common themes in this thesis are the development of computational models that accurately reflect the underlying biological processes and the integration of carefully controlled biological experiments to test and utilize our new models. [en_US]
dc.description.statementofresponsibility: by Matthew Douglas Edwards. [en_US]
dc.format.extent: 105 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Information-sharing models for computational genetics [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Ph. D. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 964446928 [en_US]

