
dc.contributor.author: Mukherjee, Sayan (en_US)
dc.contributor.author: Golland, Polina (en_US)
dc.contributor.author: Panchenko, Dmitry (en_US)
dc.date.accessioned: 2004-10-08T20:39:06Z
dc.date.available: 2004-10-08T20:39:06Z
dc.date.issued: 2003-08-28 (en_US)
dc.identifier.other: AIM-2003-019 (en_US)
dc.identifier.uri: http://hdl.handle.net/1721.1/6723
dc.description.abstract: We introduce and explore an approach to estimating statistical significance of classification accuracy, which is particularly useful in scientific applications of machine learning where high dimensionality of the data and the small number of training examples render most standard convergence bounds too loose to yield a meaningful guarantee of the generalization ability of the classifier. Instead, we estimate statistical significance of the observed classification accuracy, or the likelihood of observing such accuracy by chance due to spurious correlations of the high-dimensional data patterns with the class labels in the given training set. We adopt permutation testing, a non-parametric technique previously developed in classical statistics for hypothesis testing in the generative setting (i.e., comparing two probability distributions). We demonstrate the method on real examples from neuroimaging studies and DNA microarray analysis and suggest a theoretical analysis of the procedure that relates the asymptotic behavior of the test to the existing convergence bounds. (en_US)
dc.format.extent: 22 p. (en_US)
dc.format.extent: 1135156 bytes
dc.format.extent: 662639 bytes
dc.format.mimetype: application/postscript
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.relation.ispartofseries: AIM-2003-019 (en_US)
dc.subject: AI (en_US)
dc.subject: Classification (en_US)
dc.subject: Permutation testing (en_US)
dc.subject: Statistical significance. (en_US)
dc.title: Permutation Tests for Classification (en_US)
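
The permutation procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the nearest-centroid classifier, the leave-one-out accuracy estimate, the function names, and the default of 200 permutations are all hypothetical choices made to keep the example self-contained. The idea is to compare the accuracy observed with the true labels against the accuracies obtained after randomly permuting the labels, and to report the fraction of permutations that do at least as well.

```python
import random


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    The classifier is an assumed stand-in; any classifier could be used here.
    """
    n = len(X)
    correct = 0
    for i in range(n):
        # Accumulate per-class sums over every example except the held-out one.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            acc = sums.setdefault(y[j], [0.0] * len(X[j]))
            for k, value in enumerate(X[j]):
                acc[k] += value
            counts[y[j]] = counts.get(y[j], 0) + 1
        # Predict the label of the class whose centroid is closest.
        best_label, best_dist = None, float("inf")
        for label, acc in sums.items():
            dist = sum((x - s / counts[label]) ** 2 for x, s in zip(X[i], acc))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / n


def permutation_p_value(X, y, n_permutations=200, seed=0):
    """Estimate how often permuted labels match or beat the observed accuracy."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    labels = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # break any real link between data and labels
        if loo_accuracy(X, labels) >= observed:
            hits += 1
    # The +1 terms count the observed labeling itself as one permutation,
    # so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Synthetic data (an assumption for illustration): two shifted Gaussian classes.
    rng = random.Random(1)
    X = [[rng.gauss(5.0 * (i % 2), 1.0) for _ in range(5)] for i in range(20)]
    y = [i % 2 for i in range(20)]
    accuracy, p_value = permutation_p_value(X, y)
    print(f"observed accuracy = {accuracy:.2f}, estimated p-value = {p_value:.4f}")
```

A small p-value suggests that the observed accuracy is unlikely to arise from spurious correlations between the data and the labels alone. More permutations give a finer-grained estimate at proportionally higher cost.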


