dc.contributor.author: Ohannessian, Mesrob I.
dc.contributor.author: Dahleh, Munther A.
dc.date.accessioned: 2015-11-20T14:09:57Z
dc.date.available: 2015-11-20T14:09:57Z
dc.date.issued: 2012
dc.identifier.issn: 1938-7228
dc.identifier.uri: http://hdl.handle.net/1721.1/99945
dc.description.abstract: This paper studies the problem of estimating the probability of symbols that have occurred very rarely, in samples drawn independently from an unknown, possibly infinite, discrete distribution. In particular, we study the multiplicative consistency of estimators, defined as the ratio of the estimate to the true quantity converging to one. We first show that the classical Good-Turing estimator is not universally consistent in this sense, despite enjoying favorable additive properties. We then use Karamata's theory of regular variation to prove that regularly varying heavy tails are sufficient for consistency. At the core of this result is a multiplicative concentration that we establish both by extending the McAllester-Ortiz additive concentration for the missing mass to all rare probabilities and by exploiting regular variation. We also derive a family of estimators which, in addition to being consistent, address some of the shortcomings of the Good-Turing estimator. For example, they perform smoothing implicitly and have the absolute discounting structure of many heuristic algorithms. This also establishes a discrete parallel to extreme value theory, and many of the techniques therein can be adapted to the framework that we set forth. [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Grant 6922470) [en_US]
dc.description.sponsorship: United States. Office of Naval Research (Grant 6918937) [en_US]
dc.language.iso: en_US
dc.publisher: Journal of Machine Learning Research [en_US]
dc.relation.isversionof: http://jmlr.org/proceedings/papers/v23/ohannessian12.html [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: MIT web domain [en_US]
dc.title: Rare Probability Estimation under Regularly Varying Heavy Tails [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Ohannessian, Mesrob I., and Munther A. Dahleh. "Rare Probability Estimation under Regularly Varying Heavy Tails." Journal of Machine Learning Research: Workshop and Conference Proceedings 23 (2012), 21.1-21.24. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Institute for Data, Systems, and Society [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems [en_US]
dc.contributor.mitauthor: Ohannessian, Mesrob I. [en_US]
dc.contributor.mitauthor: Dahleh, Munther A. [en_US]
dc.relation.journal: Journal of Machine Learning Research: Workshop and Conference Proceedings [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Ohannessian, Mesrob I.; Dahleh, Munther A. [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-1470-2148
mit.license: OPEN_ACCESS_POLICY [en_US]
mit.metadata.status: Complete

