Estimating entropy of distributions in constant space
Author(s)
Indyk, Piotr
Download: Published version (558.4 KB)
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
We consider the task of estimating the entropy of k-ary distributions from samples in the streaming model, where space is limited. Our main contribution is an algorithm that requires O(k log²(1/ε) / ε³) samples and a constant O(1) memory words of space, and outputs a ±ε estimate of H(p). Without space limitations, the sample complexity has been established as S(k, ε) = Θ(k / (ε log k) + log² k / ε²), which is sub-linear in the domain size k, and the current algorithms that achieve optimal sample complexity also require nearly-linear space in k. Our algorithm partitions [0, 1] into intervals and estimates the entropy contribution of probability values in each interval. The intervals are designed to trade off the bias and variance of these estimates.
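A minimal, hypothetical Python sketch of the underlying idea: this is not the paper's algorithm, only an illustration of estimating H(p) = E_{x~p}[-log p(x)] from a sample oracle while keeping O(1) counters per round. The name `stream_sampler`, the toy distribution, and all parameter values below are assumptions for illustration; the paper's algorithm additionally partitions [0, 1] into intervals and tunes the counting effort per interval.

```python
import math
import random

def estimate_entropy(stream_sampler, num_rounds=2000, window=2000):
    """Illustrative sketch (not the paper's exact algorithm).

    Each round keeps O(1) words of state: draw one symbol x ~ p, then
    estimate p(x) by counting occurrences of x in a fresh window of
    samples. Since H(p) = E_{x ~ p}[-log p(x)], averaging -log p_hat(x)
    over rounds approximates H(p), up to the bias of the plug-in step.
    """
    total = 0.0
    for _ in range(num_rounds):
        x = stream_sampler()  # one fresh sample from p
        count = sum(stream_sampler() == x for _ in range(window))
        p_hat = max(count, 1) / window  # crude plug-in estimate of p(x)
        total += -math.log(p_hat)
    return total / num_rounds

# Usage with a toy 4-ary distribution (values chosen for illustration):
dist = [0.5, 0.25, 0.15, 0.1]
sampler = lambda: random.choices(range(4), weights=dist)[0]
print(estimate_entropy(sampler))  # true H(p) is about 1.21 nats
```

The fixed window length is exactly where this sketch is crude: rare symbols (small p(x)) need a long window for a low-variance count, while frequent symbols do not. This is the bias/variance trade-off that the paper's interval partition of [0, 1] is designed to manage.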
Date issued
2019-12
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Advances in Neural Information Processing Systems
Publisher
Neural Information Processing Systems Foundation
Citation
Acharya, Jayadev et al. “Estimating entropy of distributions in constant space.” Advances in Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, Neural Information Processing Systems Foundation, December 2019. © 2019 The Author(s)
Version: Final published version
ISSN
1049-5258