dc.contributor.author | Saad, Feras A | |
dc.contributor.author | Freer, Cameron E | |
dc.contributor.author | Rinard, Martin C | |
dc.contributor.author | Mansinghka, Vikash K | |
dc.date.accessioned | 2021-10-27T20:29:57Z | |
dc.date.available | 2021-10-27T20:29:57Z | |
dc.date.issued | 2020 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/135919 | |
dc.description.abstract | © 2020 Copyright held by the owner/author(s). This paper addresses a fundamental problem in random variate generation: given access to a random source that emits a stream of independent fair bits, what is the most accurate and entropy-efficient algorithm for sampling from a discrete probability distribution (p_1, ..., p_n), where the probabilities of the output distribution (p'_1, ..., p'_n) of the sampling algorithm must be specified using at most k bits of precision? We present a theoretical framework for formulating this problem and provide new techniques for finding sampling algorithms that are optimal both statistically (in the sense of sampling accuracy) and information-theoretically (in the sense of entropy consumption). We leverage these results to build a system that, for a broad family of measures of statistical accuracy, delivers a sampling algorithm whose expected entropy usage is minimal among those that induce the same distribution (i.e., is "entropy-optimal") and whose output distribution (p'_1, ..., p'_n) is a closest approximation to the target distribution (p_1, ..., p_n) among all entropy-optimal sampling algorithms that operate within the specified k-bit precision. This optimal approximate sampler is also a closer approximation than any (possibly entropy-suboptimal) sampler that consumes a bounded amount of entropy with the specified precision, a class which includes floating-point implementations of inversion sampling and related methods found in many software libraries. We evaluate the accuracy, entropy consumption, precision requirements, and wall-clock runtime of our optimal approximate sampling algorithms on a broad set of distributions, demonstrating the ways that they are superior to existing approximate samplers and establishing that they often consume significantly fewer resources than are needed by exact samplers. | |
dc.language.iso | en | |
dc.publisher | Association for Computing Machinery (ACM) | |
dc.relation.isversionof | 10.1145/3371104 | |
dc.rights | Creative Commons Attribution NonCommercial License 4.0 | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc/4.0/ | |
dc.source | ACM | |
dc.title | Optimal approximate sampling from discrete probability distributions | |
dc.type | Article | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences | |
dc.relation.journal | Proceedings of the ACM on Programming Languages | |
dc.eprint.version | Final published version | |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | |
dc.date.updated | 2021-03-22T15:19:17Z | |
dspace.orderedauthors | Saad, FA; Freer, CE; Rinard, MC; Mansinghka, VK | |
dspace.date.submission | 2021-03-22T15:19:19Z | |
mit.journal.volume | 4 | |
mit.journal.issue | POPL | |
mit.license | PUBLISHER_CC | |
mit.metadata.status | Authority Work and Publication Information Needed | |
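The abstract above concerns sampling a discrete distribution whose probabilities carry at most k bits of precision, using a stream of independent fair bits. As background, a minimal sketch of one classical entropy-efficient method in this setting, the Knuth-Yao discrete distribution generating (DDG) tree, is shown below. This is illustrative only and is not the paper's optimal construction; the function name, interface, and the use of `random.getrandbits` as the default bit source are assumptions.

```python
import random

def knuth_yao(weights, k, randbit=None):
    """Sample index i with probability weights[i] / 2**k from fair bits.

    Illustrative sketch (not the paper's algorithm): a lazy walk of the
    Knuth-Yao DDG tree. ``weights`` are nonnegative integers with
    sum(weights) == 2**k, i.e., probabilities with k bits of precision.
    ``randbit`` is a callable returning one fair bit.
    """
    if randbit is None:
        randbit = lambda: random.getrandbits(1)
    assert sum(weights) == 1 << k, "weights must sum to 2**k"
    for i, w in enumerate(weights):
        if w == 1 << k:
            return i  # degenerate case: outcome i has probability 1
    d = 0  # distance counter over nodes at the current tree depth
    for col in range(k):  # column col inspects bit col after the binary point
        d = 2 * d + randbit()  # descend one level, consuming one fair bit
        for row in range(len(weights) - 1, -1, -1):
            # outcome `row` owns a leaf at this depth iff this bit is 1
            d -= (weights[row] >> (k - 1 - col)) & 1
            if d < 0:
                return row
    # Unreachable: all probability mass is placed in leaves by depth k.
```

For example, `knuth_yao([2, 1, 1], 2)` samples outcomes 0, 1, 2 with probabilities 1/2, 1/4, 1/4, and on average consumes close to the entropy of the distribution in fair bits; the paper's contribution is characterizing and constructing samplers that are simultaneously entropy-optimal and closest to the target among all k-bit-precision samplers.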