Joint base-calling of two DNA sequences with factor graphs
Alternative Title:
Joint base-calling of 2 deoxyribonucleic acid sequences with factor graphs
Author:
Shi, Xiaomeng, Ph. D. Massachusetts Institute of Technology
Abstract:
The advent of DNA sequencing has revolutionized biological research by providing virtual blueprints of living organisms and offering insights into complicated biochemical processes. DNA sequencing is a process encompassing both chemical reactions and signal processing techniques to identify the order of chemical bases in a DNA molecule. In this thesis, we focus on the base-calling stage, during which base order is estimated from data collected through electrophoresis and fluorescence detection. In particular, we examine the possibility of jointly base-calling two superposed DNA sequences by applying the sum-product algorithm on factor graphs. This approach allows a single electrophoresis experiment to process two sequences, using the same quantity of reagents and machine hours as for a single sequence. A practical heuristic is first used to estimate the peak parameters; the peaks are then separated into two sequences (major and minor) by passing messages on a factor graph. Base-calling on the major sequence alone yields accuracy commensurate with single-sequence approaches, and joint base-calling provides results for the minor sequence that, while of lesser quality, incur no additional cost and can ultimately be used in the genome assembly process.
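As a rough illustration of the message-passing step mentioned in the abstract, the following is a minimal sketch of the sum-product algorithm on a chain-shaped factor graph. It is not the model used in the thesis. The peak amplitudes, the Gaussian amplitude models for major and minor peaks, and the pairwise table that mildly favors alternating labels are all hypothetical values chosen only to show how messages yield per-peak marginals.

```python
import math

MAJOR, MINOR = 0, 1  # hypothetical state indices for the two sequences


def gaussian(x, mean, std):
    """Gaussian density, used here as a toy amplitude likelihood."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def sum_product_chain(unary, pairwise):
    """Exact marginals on a chain factor graph via sum-product.

    unary[i][s]    : local evidence for variable i being in state s
    pairwise[a][b] : compatibility of neighbouring states a (left), b (right)
    """
    n, k = len(unary), len(unary[0])

    # Forward pass: fwd[i] is the message arriving at variable i from the left.
    fwd = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        msg = [sum(fwd[i - 1][a] * unary[i - 1][a] * pairwise[a][b]
                   for a in range(k)) for b in range(k)]
        z = sum(msg)
        fwd[i] = [m / z for m in msg]  # normalise for numerical stability

    # Backward pass: bwd[i] is the message arriving at variable i from the right.
    bwd = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        msg = [sum(bwd[i + 1][b] * unary[i + 1][b] * pairwise[a][b]
                   for b in range(k)) for a in range(k)]
        z = sum(msg)
        bwd[i] = [m / z for m in msg]

    # Belief at each variable: product of incoming messages and local evidence.
    marginals = []
    for i in range(n):
        belief = [fwd[i][s] * unary[i][s] * bwd[i][s] for s in range(k)]
        z = sum(belief)
        marginals.append([b / z for b in belief])
    return marginals


# Hypothetical peak amplitudes, ordered by arrival time.
amplitudes = [1.05, 0.45, 0.95, 0.55, 0.75, 0.50]

# Assumed amplitude models: major peaks near 1.0, minor peaks near 0.5.
unary = [[gaussian(a, 1.0, 0.2), gaussian(a, 0.5, 0.2)] for a in amplitudes]

# Assumed pairwise factor: adjacent peaks are somewhat more likely to differ.
pairwise = [[0.3, 0.7],
            [0.7, 0.3]]

marginals = sum_product_chain(unary, pairwise)
labels = ["major" if m[MAJOR] >= m[MINOR] else "minor" for m in marginals]

for amp, m, label in zip(amplitudes, marginals, labels):
    print(f"amplitude={amp:.2f}  P(major)={m[MAJOR]:.3f}  ->  {label}")
```

On a chain, sum-product reduces to the forward-backward recursion and the marginals are exact. A graph with loops would need iterative message passing and would generally give only approximate marginals.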
Description:
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 63-65).