Algorithms for genomics and genetics : compression-accelerated search and admixture analysis
Author(s)
Loh, Po-Ru
Other Contributors
Massachusetts Institute of Technology. Department of Mathematics.
Advisor
Bonnie Berger.
Abstract
Rapid advances in next-generation sequencing technologies are revolutionizing genomics, with data sets at the scale of thousands of human genomes fast becoming the norm. These technological leaps promise to enable corresponding advances in biology and medicine, but the deluge of raw data poses substantial mathematical, computational and statistical challenges that must first be overcome. This thesis consists of two research thrusts along these lines. First, we propose an algorithmic framework, "compressive genomics," that accelerates bioinformatic computations through analysis-aware compression. We demonstrate this methodology with proof-of-concept implementations of compression-accelerated search (CaBLAST and CaBLAT). Second, we develop new computational tools for investigating population admixture, a phenomenon of importance in understanding demographic histories of human populations and facilitating association mapping of disease genes. Our recently released ALDER and MixMapper software packages provide fast, sensitive, and robust methods for detecting and analyzing signatures of admixture created by genetic drift and recombination on genome-wide, large-sample scales.
Description
Thesis (Ph. D.)--Massachusetts Institute of Technology, Department of Mathematics, 2013. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 133-139).
Date issued
2013
Department
Massachusetts Institute of Technology. Department of Mathematics
Publisher
Massachusetts Institute of Technology
Keywords
Mathematics.