MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Phenotype Prediction Using Regularized Regression on Genetic Data in the DREAM5 Systems Genetics B Challenge

Author(s)
Loh, Po-Ru; Tucker, George Jay; Berger, Bonnie
Thumbnail
DownloadLoh-2011-Phenotype Prediction.pdf (1.271Mb)
PUBLISHER_CC

Publisher with Creative Commons License

Creative Commons Attribution

Terms of use
Creative Commons Attribution http://creativecommons.org/licenses/by/2.5/
Metadata
Show full item record
Abstract
A major goal of large-scale genomics projects is to enable the use of data from high-throughput experimental methods to predict complex phenotypes such as disease susceptibility. The DREAM5 Systems Genetics B Challenge solicited algorithms to predict soybean plant resistance to the pathogen Phytophthora sojae from training sets including phenotype, genotype, and gene expression data. The challenge test set was divided into three subcategories, one requiring prediction based on only genotype data, another on only gene expression data, and the third on both genotype and gene expression data. Here we present our approach, primarily using regularized regression, which received the best-performer award for subchallenge B2 (gene expression only). We found that despite the availability of 941 genotype markers and 28,395 gene expression features, optimal models determined by cross-validation experiments typically used fewer than ten predictors, underscoring the importance of strong regularization in noisy datasets with far more features than samples. We also present substantial analysis of the training and test setup of the challenge, identifying high variance in performance on the gold standard test sets.
Date issued
2011-12
URI
http://hdl.handle.net/1721.1/69039
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Mathematics
Journal
PLoS ONE
Publisher
Public Library of Science
Citation
Loh, Po-Ru, George Tucker, and Bonnie Berger. “Phenotype Prediction Using Regularized Regression on Genetic Data in the DREAM5 Systems Genetics B Challenge.” Ed. Mark Isalan. PLoS ONE 6.12 (2011): e29095. Web. 8 Feb. 2012.
Version: Final published version
ISSN
1932-6203

Collections
  • MIT Open Access Articles

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.