Testing Closeness of Discrete Distributions
Author(s)
Batu, Tugkan; Fortnow, Lance; Rubinfeld, Ronitt; Smith, Warren D.; White, Patrick
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Terms of use
Abstract
Given samples from two distributions over an $n$-element set, we wish to test whether these distributions are statistically close. We present an algorithm which uses a number of independent samples from each distribution that is sublinear in $n$, specifically $O(n^{2/3}\epsilon^{-8/3}\log n)$, runs in time linear in the sample size, makes no assumptions about the structure of the distributions, and distinguishes the case where the distance between the distributions is small (less than $\max\{\epsilon^{4/3}n^{-1/3}/32,\ \epsilon n^{-1/2}/4\}$) from the case where it is large (more than $\epsilon$) in $\ell_1$ distance. This result can be compared to the lower bound of $\Omega(n^{2/3}\epsilon^{-2/3})$ for this problem given by Valiant [2008].
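To give a feel for how closeness can be estimated from samples alone, the following is a minimal sketch of a collision-counting statistic. It is not the algorithm of the paper: it only estimates the squared $\ell_2$ distance between two unknown distributions p and q, using the identity that the squared distance equals the sum of the two self-collision probabilities minus twice the cross-collision probability. The function name `l2_squared_estimate` and its interface are assumptions made for illustration. The full $\ell_1$ tester, its acceptance thresholds, and its sample sizes involve further steps that the abstract does not describe and that are not reproduced here.

```python
from collections import Counter


def l2_squared_estimate(xs, ys):
    """Collision-based estimate of the squared l2 distance between p and q.

    xs: samples drawn independently from p
    ys: samples drawn independently from q (same length as xs)

    Illustrative sketch only (hypothetical helper, not the paper's tester).
    The estimate is unbiased, so it can come out slightly negative.
    """
    m = len(xs)
    if m != len(ys) or m < 2:
        raise ValueError("need two samples of equal size, at least 2 each")

    count_x, count_y = Counter(xs), Counter(ys)
    pairs = m * (m - 1) / 2  # number of unordered pairs within one sample

    # Self-collisions: pairs within one sample that hit the same element.
    # Divided by `pairs`, each estimates the sum of squared probabilities.
    self_x = sum(c * (c - 1) // 2 for c in count_x.values())
    self_y = sum(c * (c - 1) // 2 for c in count_y.values())

    # Cross-collisions: pairs (x_i, y_j) that hit the same element.
    # Divided by m * m, this estimates the inner product of p and q.
    cross = sum(c * count_y[v] for v, c in count_x.items())

    return self_x / pairs + self_y / pairs - 2 * cross / (m * m)
```

Because the statistic depends only on how often elements repeat, it can be computed in one pass over the samples with a hash table, which matches the spirit of the "time linear in the sample size" claim. Any threshold applied to the returned value would be a further assumption.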
Our algorithm has applications to the problem of testing whether a given Markov process is rapidly mixing. We present sublinear algorithms for several variants of this problem as well.
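One way to see the connection to Markov processes: the endpoint of a $t$-step walk from a fixed start state is a sample from a distribution over the states, so endpoints from different start states can be fed to a closeness test. The sketch below only draws such endpoint samples. The representation of the chain as a dictionary of transition probabilities and the names `walk_endpoint` and `endpoint_samples` are assumptions for illustration; the abstract does not specify how the mixing tests are constructed.

```python
import random


def walk_endpoint(P, start, t, rng):
    """Return the state reached after t steps of the chain P from `start`.

    P maps each state to a dict {next_state: probability}.
    (Hypothetical representation chosen for this sketch.)
    """
    state = start
    for _ in range(t):
        next_states, probs = zip(*P[state].items())
        state = rng.choices(next_states, weights=probs, k=1)[0]
    return state


def endpoint_samples(P, start, t, m, rng=None):
    """Draw m independent endpoints of t-step walks from `start`."""
    rng = rng or random.Random()
    return [walk_endpoint(P, start, t, rng) for _ in range(m)]
```

For example, `endpoint_samples(P, u, t, m)` and `endpoint_samples(P, v, t, m)` yield two sample sets whose closeness could then be examined for a pair of start states `u` and `v`.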
Date issued
2013-02
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Journal of the ACM
Publisher
Association for Computing Machinery (ACM)
Citation
Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. 2013. Testing Closeness of Discrete Distributions. J. ACM 60, 1, Article 4 (February 2013), 25 pages.
Version: Original manuscript
ISSN
0004-5411