Improving Multiclass Text Classification with the Support Vector Machine
Author(s)
Rennie, Jason D. M.; Rifkin, Ryan
Download
AIM-2001-026.ps (1.183Mb)
Abstract
We compare Naive Bayes and Support Vector Machines on the task of multiclass text classification. Using a variety of approaches to combine the underlying binary classifiers, we find that SVMs substantially outperform Naive Bayes. We present full multiclass results on two well-known text data sets, including the lowest error to date on both data sets. We develop a new indicator of binary performance to show that the SVM's lower multiclass error is a result of its improved binary performance. Furthermore, we demonstrate and explore the surprising result that one-vs-all classification performs favorably compared to other approaches even though it has no error-correcting properties.
Date issued
2001-10-16
Other identifiers
AIM-2001-026
CBCL-210
Series/Report no.
AIM-2001-026
CBCL-210
Keywords
AI, text classification, support vector machine, multiclass classification