Information-sharing models for computational genetics

Edwards, Matthew Douglas

dc.contributor.advisor	David K. Gifford.	en_US
dc.contributor.author	Edwards, Matthew Douglas	en_US
dc.contributor.other	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2016-12-05T19:11:07Z
dc.date.available	2016-12-05T19:11:07Z
dc.date.copyright	2016	en_US
dc.date.issued	2016	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/105572
dc.description	Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.	en_US
dc.description	This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.	en_US
dc.description	Cataloged from student-submitted PDF version of thesis.	en_US
dc.description	Includes bibliographical references (pages 97-105).	en_US
dc.description.abstract	Modern genetics has been transformed by a dramatic explosion of data. As sample sizes and the number of measured data types grow, the need for computational methods tailored to deal with these noisy and complex datasets increases. In this thesis, we develop and apply integrated computational and biological approaches for two genetic problems. First, we build a statistical model for genetic mapping using pooled sequencing, a powerful and efficient technique for rapidly unraveling the genetic basis of complex traits. Our approach explicitly models the pooling process and genetic parameters underlying the noisy observed data, and we use it to calculate accurate intervals that contain the targeted regions of interest. We show that our model outperforms simpler alternatives that do not use all available marker data in a principled way. We apply this model to study several phenotypes in yeast, including the genetic basis of the surprising phenomenon of strain-specific essential genes. We demonstrate the complex genetic basis of many of these strain-specific viability phenotypes and uncover the influence of an inherited virus in modifying their effects. Second, we design a statistical model that uses additional functional information describing large sets of genetic variants in order to predict which variants are likely to cause phenotypic changes. Our technique is able to learn complicated relationships between candidate features and can accommodate the additional noise introduced by training on groups of candidate variants, instead of single labeled variants. We apply this model to a large genetic mapping study in yeast by collecting multiple genome-wide functional measurements. By using our model, we demonstrate the importance of several molecular phenotypes in predicting genetic impact. 
The common themes in this thesis are the development of computational models that accurately reflect the underlying biological processes and the integration of carefully controlled biological experiments to test and utilize our new models.	en_US
dc.description.statementofresponsibility	by Matthew Douglas Edwards.	en_US
dc.format.extent	105 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Information-sharing models for computational genetics	en_US
dc.type	Thesis	en_US
dc.description.degree	Ph. D.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	964446928	en_US

Files in this item

Name: 964446928-MIT.pdf
Size: 7.350Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Doctoral Theses
