Learning with confident examples: Rank pruning for robust classification with noisy labels
Author(s)
Chuang, Isaac L.; Wu, Tailin; Northcutt, Curtis G.
Open Access Policy
Terms of use
Creative Commons Attribution-NonCommercial-ShareAlike
Metadata
Abstract
P̃Ñ learning is the problem of binary classification when training examples may be mislabeled (flipped) uniformly with noise rate ρ1 for positive examples and ρ0 for negative examples. We propose Rank Pruning (RP) to solve P̃Ñ learning and the open problem of estimating the noise rates. Unlike prior solutions, RP is efficient and general, requiring O(T) for any unrestricted choice of probabilistic classifier with T fitting time. We prove RP achieves consistent noise estimation and equivalent expected risk as learning with uncorrupted labels in ideal conditions, and derive closed-form solutions when conditions are non-ideal. RP achieves state-of-the-art noise estimation and F1, error, and AUC-PR on both MNIST and CIFAR datasets, regardless of the amount of noise. To highlight, RP with a CNN classifier can predict whether an MNIST digit is a one or not with only 0.25% error, and 0.46% error across all digits, even when 50% of positive examples are mislabeled and 50% of observed positive labels are mislabeled negative examples.
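The pipeline the abstract describes (fit any probabilistic classifier on the noisy labels, use out-of-sample predicted probabilities to identify confident examples, prune the rest, and refit) can be sketched as follows. This is a minimal illustration, not the authors' released implementation: the per-class mean-probability thresholds below are a simplification standing in for the paper's noise-rate estimators, and the data, flip rates, and classifier choice are all assumptions made for the demo.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)

# Synthetic linearly separable data with uniformly flipped labels
# (rho1 and rho0 play the role of the unknown noise rates).
n = 2000
X = rng.normal(size=(n, 2))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)
rho1, rho0 = 0.3, 0.3
flip = rng.random(n) < np.where(y_true == 1, rho1, rho0)
s = np.where(flip, 1 - y_true, y_true)  # observed noisy labels

# Step 1: out-of-sample probabilities from a classifier fit on noisy labels.
p = cross_val_predict(LogisticRegression(), X, s, cv=3,
                      method="predict_proba")[:, 1]

# Step 2: keep only "confident" examples — observed positives whose
# probability exceeds the mean over observed positives, and observed
# negatives whose probability falls below the mean over observed negatives.
# (A simplification of the paper's threshold construction.)
lb1 = p[s == 1].mean()
ub0 = p[s == 0].mean()
keep = ((s == 1) & (p >= lb1)) | ((s == 0) & (p <= ub0))

# Step 3: refit on the pruned, confident subset only.
clf_pruned = LogisticRegression().fit(X[keep], s[keep])
acc = (clf_pruned.predict(X) == y_true).mean()
print(f"kept {keep.mean():.0%} of examples; accuracy vs. clean labels: {acc:.3f}")
```

Because the pruned subset is dominated by correctly labeled examples, the refit classifier recovers close to the clean decision boundary despite 30% label noise; the same wrapper works with any classifier exposing `predict_proba`, which is the sense in which the method is classifier-agnostic.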
Date issued
2017
Department
Massachusetts Institute of Technology. Department of Physics
Citation
Chuang, Isaac L., Wu, Tailin and Northcutt, Curtis G. 2017. "Learning with confident examples: Rank pruning for robust classification with noisy labels."
Version: Author's final manuscript