Learning with confident examples: Rank pruning for robust classification with noisy labels
Author(s)
Chuang, Isaac L.; Wu, Tailin; Northcutt, Curtis G.
Open Access Policy
Terms of use
Creative Commons Attribution-NonCommercial-ShareAlike
Metadata
Abstract
P̃Ñ learning is the problem of binary classification when training examples may be mislabeled (flipped) uniformly with noise rate ρ1 for positive examples and ρ0 for negative examples. We propose Rank Pruning (RP) to solve P̃Ñ learning and the open problem of estimating the noise rates. Unlike prior solutions, RP is efficient and general, requiring O(T) for any unrestricted choice of probabilistic classifier with T fitting time. We prove RP achieves consistent noise estimation and equivalent expected risk as learning with uncorrupted labels in ideal conditions, and derive closed-form solutions when conditions are non-ideal. RP achieves state-of-the-art noise estimation and F1, error, and AUC-PR on both MNIST and CIFAR datasets, regardless of the amount of noise. To highlight, RP with a CNN classifier can predict whether an MNIST digit is a one or not with only 0.25% error, and 0.46% error across all digits, even when 50% of positive examples are mislabeled and 50% of observed positive labels are mislabeled negative examples.
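The pipeline the abstract describes (fit any probabilistic classifier on the noisy labels, use out-of-sample predicted probabilities to identify confident examples, prune the rest, and refit) can be sketched as follows. This is a minimal illustration, not the authors' released implementation: the per-class mean-probability thresholds below are a simplification standing in for the paper's noise-rate estimators, and the data, flip rates, and classifier choice are all assumptions made for the demo.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)

# Synthetic linearly separable data with uniformly flipped labels
# (rho1 and rho0 play the role of the unknown noise rates).
n = 2000
X = rng.normal(size=(n, 2))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)
rho1, rho0 = 0.3, 0.3
flip = rng.random(n) < np.where(y_true == 1, rho1, rho0)
s = np.where(flip, 1 - y_true, y_true)  # observed noisy labels

# Step 1: out-of-sample probabilities from a classifier fit on noisy labels.
p = cross_val_predict(LogisticRegression(), X, s, cv=3,
                      method="predict_proba")[:, 1]

# Step 2: keep only "confident" examples — observed positives whose
# probability exceeds the mean over observed positives, and observed
# negatives whose probability falls below the mean over observed negatives.
# (A simplification of the paper's threshold construction.)
lb1 = p[s == 1].mean()
ub0 = p[s == 0].mean()
keep = ((s == 1) & (p >= lb1)) | ((s == 0) & (p <= ub0))

# Step 3: refit on the pruned, confident subset only.
clf_pruned = LogisticRegression().fit(X[keep], s[keep])
acc = (clf_pruned.predict(X) == y_true).mean()
print(f"kept {keep.mean():.0%} of examples; accuracy vs. clean labels: {acc:.3f}")
```

Because the pruned subset is dominated by correctly labeled examples, the refit classifier recovers close to the clean decision boundary despite 30% label noise; the same wrapper works with any classifier exposing `predict_proba`, which is the sense in which the method is classifier-agnostic.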
Date issued
2017
Department
Massachusetts Institute of Technology. Department of Physics
Citation
Chuang, Isaac L., Wu, Tailin and Northcutt, Curtis G. 2017. "Learning with confident examples: Rank pruning for robust classification with noisy labels."
Version: Author's final manuscript