A Bayesian framework for statistical signal processing and knowledge discovery in proteomic engineering
Author(s)
Alterovitz, Gil, 1975-
Other Contributors
Harvard University--MIT Division of Health Sciences and Technology.
Advisor
Marco F. Ramoni and Isaac S. Kohane.
Abstract
Proteomics has been revolutionized in the last couple of years through integration of new mass spectrometry technologies such as Surface-Enhanced Laser Desorption/Ionization (SELDI) mass spectrometry. As data is generated in an increasingly rapid and automated manner, novel and application-specific computational methods will be needed to deal with all of this information. This work seeks to develop a Bayesian framework in mass-based proteomics for protein identification. Using the Bayesian framework in a statistical signal processing manner, mass spectrometry data is filtered and analyzed in order to estimate protein identity. This is done by a multi-stage process which compares probabilistic networks generated from mass spectrometry-based data with a mass-based network of protein interactions. In addition, such models can provide insight into features of existing models by identifying relevant proteins. This work finds that the search space of potential proteins can be reduced such that simple antibody-based tests can be used to validate protein identity. This is done with real proteins as a proof of concept. Regarding protein interaction networks, the largest human protein interaction meta-database was created as part of this project, containing over 162,000 interactions. A further contribution is the implementation of the massome network database of mass-based interactions, which is used in the protein identification process. This network is explored in terms of its potential usefulness for protein identification. The framework provides an approach to a number of core issues in proteomics. Besides providing these tools, it yields a novel way to approach statistical signal processing problems in this domain in a way that can be adapted as proteomics-based technologies mature.
Description
Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, February 2006. Includes bibliographical references (leaves 73-85).
Date issued
2006
Department
Harvard University--MIT Division of Health Sciences and Technology
Publisher
Massachusetts Institute of Technology
Keywords
Harvard University--MIT Division of Health Sciences and Technology.