DSpace@MIT


Multivariate methods for the statistical analysis of hyperdimensional high-content screening data

Author(s)
Rameseder, Jonathan
Other Contributors
Massachusetts Institute of Technology. Computational and Systems Biology Program.
Advisor
Michael B. Yaffe.
Terms of use
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
In the post-genomic era, greater emphasis has been placed on understanding the function of genes at the systems level. To meet these needs, biologists are creating larger and increasingly complex datasets. In recent years, high-content screening (HCS) using RNA interference (RNAi) or other perturbation techniques in combination with automated microscopy has emerged as a promising investigative tool to explore intricate biological processes. Image-based HC screens produce massive hyperdimensional datasets. To identify novel components of the DNA damage response (DDR) after ionizing radiation, we recently performed an image-based HC RNAi screen in an osteosarcoma cell line. Robust univariate hit identification methods and manual network analysis identified an isoform of BRD4, a bromodomain and extra-terminal domain family member, as an endogenous inhibitor of DDR signaling. However, despite the plethora of data generated from our and other HC screens, little progress has been made in analyzing HC data using multivariate computational methods that exploit the full richness of hyperdimensional data and identify more than just the most salient knockdown phenotypes to gain a detailed understanding of how gene products cooperate to regulate complex cellular processes. We developed a novel multivariate method using logistic regression models and least absolute shrinkage and selection operator (LASSO) regularization for analyzing hyperdimensional HC data. We applied this method to our HC screen to identify genes that exhibit subtle but consistent phenotypic changes upon knockdown that would have been missed by conventional univariate hit identification approaches. Our method automatically selects the most predictive features at the most predictive time points to facilitate the more efficient design of follow-up experiments and puts the identified hits in a network context using the Prize-Collecting Steiner Tree algorithm.
This method offers superior performance over the current gold standard for the analysis of HC RNAi screens. A surprising finding from our analysis is that training sets of genes involved in complex biological phenomena, when used to train predictive models, must be broken down into functionally coherent subsets in order to enhance new gene discovery. Additionally, we found that in the case of RNAi screening, statistical cell-to-cell variation in phenotypic responses in a well of cells targeted by a single shRNA is an important predictor of gene-dependent events.
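
To illustrate how logistic regression with LASSO (L1) regularization can select predictive features, here is a minimal sketch in plain Python. It is not the thesis implementation: the toy data, the function names (`fit_l1_logistic`, `make_toy_data`), the penalty strength `lam`, and the proximal-gradient solver are all assumptions chosen for illustration. Only the first two of six synthetic features carry signal; the L1 penalty shrinks the coefficients of uninformative features toward exactly zero.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam=0.05, step=0.1, n_iter=500):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    Minimizes mean logistic loss + lam * sum(|w_j|); the intercept is
    not penalized. All hyperparameter defaults are illustrative.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        # Gradient step on the smooth loss, then soft-threshold the weights.
        b -= step * grad_b / n
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def make_toy_data(n=200, p=6, seed=0):
    """Hypothetical data: only features 0 and 1 influence the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        xi = [rng.gauss(0, 1) for _ in range(p)]
        logit = 2.0 * xi[0] - 2.0 * xi[1]
        y.append(1 if rng.random() < sigmoid(logit) else 0)
        X.append(xi)
    return X, y


X, y = make_toy_data()
weights, intercept = fit_l1_logistic(X, y)
# Features with a nonzero coefficient are the ones the model "selects".
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected feature indices:", selected)
```

In a screening setting, each row would correspond to a knockdown and each column to an image-derived feature at a given time point, so the nonzero coefficients would indicate which feature and time-point combinations are worth measuring in follow-up experiments. A production analysis would more likely rely on an established solver than on this hand-written loop.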
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Computational and Systems Biology Program, 2014.
 
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
 
Cataloged from student-submitted PDF version of thesis.
 
Includes bibliographical references.
 
Date issued
2014
URI
http://hdl.handle.net/1721.1/92957
Department
Massachusetts Institute of Technology. Computational and Systems Biology Program
Publisher
Massachusetts Institute of Technology
Keywords
Computational and Systems Biology Program.

Collections
  • Doctoral Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.