dc.contributor.author | Diakonikolas, Ilias | |
dc.contributor.author | O'Donnell, Ryan | |
dc.contributor.author | Servedio, Rocco A. | |
dc.contributor.author | Tan, Li-Yang | |
dc.contributor.author | Daskalakis, Konstantinos | |
dc.date.accessioned | 2015-11-20T18:38:04Z | |
dc.date.available | 2015-11-20T18:38:04Z | |
dc.date.issued | 2013-10 | |
dc.identifier.isbn | 978-0-7695-5135-7 | |
dc.identifier.issn | 0272-5428 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/99970 | |
dc.description.abstract | Let S = X[subscript 1]+···+X[subscript n] be a sum of n independent integer random variables X[subscript i], where each X[subscript i] is supported on {0, 1, ..., k - 1} but otherwise may have an arbitrary distribution (in particular the X[subscript i]'s need not be identically distributed). How many samples are required to learn the distribution S to high accuracy? In this paper we show that the answer is completely independent of n, and moreover we give a computationally efficient algorithm which achieves this low sample complexity. More precisely, our algorithm learns any such S to ε-accuracy (with respect to the total variation distance between distributions) using poly(k, 1/ε) samples, independent of n. Its running time is poly(k, 1/ε) in the standard word RAM model. Thus we give a broad generalization of the main result of [DDS12b] which gave a similar learning result for the special case k = 2 (when the distribution S is a Poisson Binomial Distribution). Prior to this work, no nontrivial results were known for learning these distributions even in the case k = 3. A key difficulty is that, in contrast to the case of k = 2, sums of independent {0, 1, 2}-valued random variables may behave very differently from (discretized) normal distributions, and in fact may be rather complicated - they are not log-concave, they can be Θ(n)-modal, there is no relationship between Kolmogorov distance and total variation distance for the class, etc. Nevertheless, the heart of our learning result is a new limit theorem which characterizes what the sum of an arbitrary number of arbitrary independent {0, 1, ..., k-1}-valued random variables may look like. Previous limit theorems in this setting made strong assumptions on the “shift invariance” of the random variables X[subscript i] in order to force a discretized normal limit. We believe that our new limit theorem, as the first result for truly arbitrary sums of independent {0, 1, ..., k-1}-valued random variables, is of independent interest. | en_US |
dc.language.iso | en_US | |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US |
dc.relation.isversionof | http://dx.doi.org/10.1109/FOCS.2013.31 | en_US |
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
dc.source | Other repository | en_US |
dc.title | Learning Sums of Independent Integer Random Variables | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Daskalakis, Constantinos, Ilias Diakonikolas, Ryan O'Donnell, Rocco A. Servedio, and Li-Yang Tan. “Learning Sums of Independent Integer Random Variables.” 2013 IEEE 54th Annual Symposium on Foundations of Computer Science (October 2013). | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.contributor.mitauthor | Daskalakis, Konstantinos | en_US |
dc.relation.journal | Proceedings of the 2013 IEEE 54th Annual Symposium on Foundations of Computer Science | en_US |
dc.eprint.version | Author's final manuscript | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dspace.orderedauthors | Daskalakis, Constantinos; Diakonikolas, Ilias; O'Donnell, Ryan; Servedio, Rocco A.; Tan, Li-Yang | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-5451-0490 | |
mit.license | OPEN_ACCESS_POLICY | en_US |
mit.metadata.status | Complete | |