On the Impossibility of Learning the Missing Mass
Author(s)
Ohannessian, Mesrob I.; Mossel, Elchanan
Download
entropy-21-00028-v2.pdf (860.6 KB)
Terms of use
Creative Commons Attribution (publisher version with Creative Commons license)
Abstract
This paper shows that one cannot learn the probability of rare events without imposing further structural assumptions. The event of interest is that of obtaining an outcome outside the coverage of an i.i.d. sample from a discrete distribution. The probability of this event is referred to as the "missing mass". The impossibility result can then be stated as: the missing mass is not distribution-free learnable in relative error. The proof is semi-constructive and relies on a coupling argument using a dithered geometric distribution. Via a reduction, this impossibility also extends to both discrete and continuous tail estimation. These results formalize the folklore that in order to predict rare events without restrictive modeling, one necessarily needs distributions with "heavy tails".

Keywords: missing mass; rare events; Good-Turing; light tails; heavy tails; no free lunch
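To make the central notion concrete: the missing mass of a sample is the total probability of all symbols not observed in it, and the Good-Turing estimator (named in the keywords) approximates it by the fraction of the sample made up of symbols seen exactly once. The sketch below is an illustration of that standard estimator only, not of the paper's impossibility construction; the function name and example data are ours.

```python
from collections import Counter

def good_turing_missing_mass(sample):
    """Good-Turing estimate of the missing mass:
    (number of symbols observed exactly once) / (sample size)."""
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)

# 7 draws; "b" and "d" each appear exactly once, so the
# estimated missing mass is 2/7.
sample = ["a", "a", "b", "c", "c", "c", "d"]
print(good_turing_missing_mass(sample))  # 2/7 ≈ 0.2857
```

The paper's point is that no estimator of this kind, Good-Turing included, can achieve small *relative* error uniformly over all discrete distributions: learning the missing mass in relative error is impossible without structural assumptions such as heavy tails.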
Date issued
2019-01
Department
Massachusetts Institute of Technology. Department of Mathematics
Journal
Entropy
Publisher
Multidisciplinary Digital Publishing Institute
Citation
Mossel, Elchanan and Mesrob Ohannessian. "On the Impossibility of Learning the Missing Mass." Entropy 21, 1 (January 2019): 28. © 2019 The Authors.
Version: Final published version
ISSN
1099-4300