dc.contributor.author | Daskalakis, C | |
dc.contributor.author | Kamath, G | |
dc.contributor.author | Wright, J | |
dc.date.accessioned | 2022-06-17T14:32:34Z | |
dc.date.available | 2022-06-17T14:32:34Z | |
dc.date.issued | 2018-01-01 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/143461 | |
dc.description.abstract | © Copyright 2018 by SIAM. Given samples from an unknown distribution p and a description of a distribution q, are p and q close or far? This question of "identity testing" has received significant attention in the case of testing whether p and q are equal or far in total variation distance. However, in recent work [VV11a, ADK15, DP17], the following questions have been critical to solving problems at the frontiers of distribution testing: Alternative Distances: Can we test whether p and q are far in other distances, say Hellinger? Tolerance: Can we test when p and q are close, rather than equal? And if so, close in which distances? Motivated by these questions, we characterize the complexity of distribution testing under a variety of distances, including total variation, ℓ₂, Hellinger, Kullback-Leibler, and χ². For each pair of distances d1 and d2, we study the complexity of testing if p and q are close in d1 versus far in d2, with a focus on identifying which problems allow strongly sublinear testers (i.e., those with complexity O(n^(1-γ)) for some γ > 0, where n is the size of the support of the distributions p and q). We provide matching upper and lower bounds for each case. We also study these questions in the case where we only have samples from q (equivalence testing), showing qualitative differences from identity testing in terms of when tolerance can be achieved. Our algorithms fall into the classical paradigm of χ²-statistics, but require crucial changes to handle the challenges introduced by each distance we consider. Finally, we survey other recent results in an attempt to serve as a reference for the complexity of various distribution testing problems. | en_US |
dc.language.iso | en | |
dc.publisher | Society for Industrial and Applied Mathematics | en_US |
dc.relation.isversionof | 10.1137/1.9781611975031.175 | en_US |
dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
dc.source | SIAM | en_US |
dc.title | Which distribution distances are sublinearly testable? | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Daskalakis, C, Kamath, G and Wright, J. 2018. "Which distribution distances are sublinearly testable?" Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.contributor.department | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | |
dc.contributor.department | Massachusetts Institute of Technology. Plasma Science and Fusion Center | |
dc.relation.journal | Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms | en_US |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dc.date.updated | 2022-06-17T14:24:16Z | |
dspace.orderedauthors | Daskalakis, C; Kamath, G; Wright, J | en_US |
dspace.date.submission | 2022-06-17T14:24:17Z | |
mit.license | PUBLISHER_POLICY | |
mit.metadata.status | Authority Work and Publication Information Needed | en_US |