
Selecting Relevant Genes with a Spectral Approach

Author(s)
Wolf, Lior; Shashua, Amnon; Mukherjee, Sayan
Download: MIT-CSAIL-TR-2004-003.ps (11806Kb)
Abstract
Array technologies have made it possible to record simultaneously the expression pattern of thousands of genes. A fundamental problem in the analysis of gene expression data is the identification of highly relevant genes that either discriminate between phenotypic labels or are important with respect to the cellular process studied in the experiment: for example, cell cycle or heat shock in yeast experiments, chemical or genetic perturbations of mammalian cell lines, and genes involved in class discovery for human tumors. In this paper we focus on the task of unsupervised gene selection. The problem of selecting a small subset of genes is particularly challenging because the datasets involved are typically characterized by a very small sample size (on the order of a few tens of tissue samples) and by a very large feature space, as the number of genes tends to be in the high thousands. We propose a model-independent approach which scores candidate gene selections using spectral properties of the candidate affinity matrix. The algorithm is very straightforward to implement yet has a number of remarkable properties which guarantee consistent sparse selections. To illustrate the value of our approach we applied our algorithm to five different datasets. The first consists of time course data from four well-studied hematopoietic cell lines (HL-60, Jurkat, NB4, and U937). The other four datasets include three well-studied treatment outcomes (large cell lymphoma, childhood medulloblastomas, breast tumors) and one unpublished dataset (lymph status). We compared our approach both with other unsupervised methods (SOM, PCA, GS) and with supervised methods (SNR, RMB, RFE). The results clearly show that our approach considerably outperforms all the other unsupervised approaches in our study, is competitive with supervised methods, and in some cases even outperforms supervised approaches.
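The abstract does not spell out the scoring function, so the following is only an illustrative sketch of what "scoring a candidate gene selection by spectral properties of its affinity matrix" could look like, not the authors' algorithm. It assumes a genes-by-samples expression matrix, builds a sample-by-sample affinity matrix from the selected genes, and measures how much of the spectrum is concentrated in the top `k` eigenvalues. The function name `spectral_score`, the unit-norm row scaling, the parameter `k`, and the toy data are all assumptions introduced for illustration.

```python
import numpy as np


def spectral_score(X, selected, k=2):
    """Illustrative score for a candidate gene subset (hypothetical helper).

    X        : (n_genes, n_samples) expression matrix.
    selected : indices of the candidate genes.
    k        : assumed number of sample clusters.

    Returns the fraction of squared spectral energy of the sample
    affinity matrix captured by its k largest eigenvalues (0.0 to 1.0).
    """
    idx = np.asarray(selected, dtype=int)
    M = X[idx].astype(float)

    # Assumption: scale each selected gene's profile to unit length so
    # that no single gene dominates the affinity matrix.
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    M = M / np.where(norms == 0, 1.0, norms)

    # Sample-by-sample affinity matrix (symmetric, positive semidefinite).
    A = M.T @ M

    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(A)
    total = np.sum(eigvals ** 2)
    if total == 0:
        return 0.0
    return float(np.sum(eigvals[-k:] ** 2) / total)


# Toy data (made up): 6 samples in two groups of 3.
# Genes 0-3 follow the group structure; genes 4-7 are random noise.
group_a = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
group_b = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
rng = np.random.default_rng(0)
X = np.vstack([group_a, group_a, group_b, group_b,
               rng.normal(size=(4, 6))])

informative_score = spectral_score(X, [0, 1, 2, 3], k=2)
noise_score = spectral_score(X, [4, 5, 6, 7], k=2)
print(informative_score, noise_score)
```

In this toy setup the subset that follows the two-group structure yields an affinity matrix whose spectrum sits entirely in two eigenvalues, so it scores higher than the noise subset. A real selection procedure would also need a search or optimization over subsets, which this sketch leaves out.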
Date issued
2004-01-27
URI
http://hdl.handle.net/1721.1/30444
Other identifiers
MIT-CSAIL-TR-2004-003
AIM-2004-002
CBCL-234
Series/Report no.
Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory
Keywords
AI

Collections
  • CSAIL Technical Reports (July 1, 2003 - present)

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.