dc.contributor.advisor: Yaniv Erlich and Mark J. Daly. [en_US]
dc.contributor.author: Gymrek, Melissa A [en_US]
dc.contributor.other: Harvard--MIT Program in Health Sciences and Technology. [en_US]
dc.date.accessioned: 2016-07-01T18:46:05Z
dc.date.available: 2016-07-01T18:46:05Z
dc.date.copyright: 2016 [en_US]
dc.date.issued: 2016 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/103501
dc.description: Thesis: Ph. D., Harvard-MIT Program in Health Sciences and Technology, 2016. [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 225-251). [en_US]
dc.description.abstract: A central goal in genomics is to understand the genetic variants that underlie molecular changes and lead to disease. Recent studies have identified thousands of genetic loci associated with human phenotypes. These have primarily analyzed point mutations, ignoring more complex types of variation. Here we focus on Short Tandem Repeats (STRs) as a model for complex variation. STRs are composed of repeating motifs of 1-6 bp that span over 1% of the human genome. The level of STR variation and its effect on phenotypes remain mostly uncharted, mainly due to the difficulty in accurately genotyping STRs on a large scale. To overcome bioinformatic challenges in STR genotyping, we developed lobSTR, an algorithm for profiling STRs from high-throughput sequencing data. lobSTR employs a unique mapping strategy to rapidly align repetitive reads, and uses statistical learning techniques to account for STR-specific noise patterns. We applied lobSTR to generate the largest and highest-quality STR catalog to date. This provided the first characterization of more than a million loci and gave novel insights into population-wide trends of STR variation. We used our catalog to conduct a genome-wide analysis of the contribution of STRs to gene expression in humans. This revealed that STRs explain 10-15% of the cis heritability of expression mediated by common variants and potentially play a role in various clinically relevant conditions. Overall, these studies highlight the contribution of STRs to the genetic architecture of quantitative traits. We anticipate that integrating repetitive elements, specifically STRs, into genome-wide analyses will lead to the discovery of new genetic variants relevant to human conditions. [en_US]
dc.description.statementofresponsibility: by Melissa A. Gymrek. [en_US]
dc.format.extent: 251 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Harvard--MIT Program in Health Sciences and Technology. [en_US]
dc.title: Characterizing variation at short tandem repeats and their role in human genome regulation [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Ph. D. [en_US]
dc.contributor.department: Harvard University--MIT Division of Health Sciences and Technology
dc.identifier.oclc: 952429452 [en_US]

