dc.contributor.advisor | Tamara Broderick. | en_US |
dc.contributor.author | Masoero, Lorenzo. | en_US |
dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
dc.date.accessioned | 2019-07-17T20:59:23Z | |
dc.date.available | 2019-07-17T20:59:23Z | |
dc.date.copyright | 2019 | en_US |
dc.date.issued | 2019 | en_US |
dc.identifier.uri | https://hdl.handle.net/1721.1/121737 | |
dc.description | Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019 | en_US |
dc.description | Cataloged from PDF version of thesis. | en_US |
dc.description | Includes bibliographical references (pages 75-83). | en_US |
dc.description.abstract | The recent availability of large genomic studies, with tens of thousands of observations, opens up the intriguing possibility of investigating and understanding the effect of rare genetic variants in human biological evolution, as well as their impact on the development of rare diseases. To do so, it is imperative to develop a statistical framework to assess what fraction of the overall variation present in the human genome is not yet captured by available datasets. In this thesis we introduce a novel and rigorous methodology to estimate how many new variants are yet to be observed in the context of genomic projects, using a nonparametric Bayesian hierarchical approach that allows us to perform prediction tasks that jointly handle multiple subpopulations. Moreover, our method performs well on extremely small as well as very large datasets, a desirable property given the variability in size of available datasets. As a byproduct of the Bayesian formulation, our estimation procedure also naturally provides uncertainty quantification for the estimates it produces. | en_US |
dc.description.statementofresponsibility | by Lorenzo Masoero. | en_US |
dc.format.extent | 83 pages | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Massachusetts Institute of Technology | en_US |
dc.rights | MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. | en_US |
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
dc.subject | Electrical Engineering and Computer Science. | en_US |
dc.title | Genomic variety estimation with Bayesian nonparametric hierarchies | en_US |
dc.type | Thesis | en_US |
dc.description.degree | S.M. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.identifier.oclc | 1102050333 | en_US |
dc.description.collection | S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science | en_US |
dspace.imported | 2019-07-17T20:59:20Z | en_US |
mit.thesis.degree | Master | en_US |
mit.thesis.department | EECS | en_US |