Optimal testing for properties of distributions
Author(s): Acharya, Jayadev; Daskalakis, Konstantinos; Kamath, Gautam Chetan
Given samples from an unknown discrete distribution p, is it possible to distinguish whether p belongs to some class of distributions C versus p being far from every distribution in C? This fundamental question has received tremendous attention in statistics, focusing primarily on asymptotic analysis, as well as in information theory and theoretical computer science, where the emphasis has been on small sample size and computational complexity. Nevertheless, even for basic properties of discrete distributions such as monotonicity, independence, log-concavity, unimodality, and monotone hazard rate, the optimal sample complexity is unknown. We provide a general approach via which we obtain sample-optimal and computationally efficient testers for all these distribution families. At the core of our approach is an algorithm which solves the following problem: given samples from an unknown distribution p and a known distribution q, are p and q close in χ²-distance, or far in total variation distance? The optimality of our testers is established by providing matching lower bounds, up to constant factors. Finally, a necessary building block for our testers and an important byproduct of our work are the first known computationally efficient proper learners for discrete log-concave and monotone hazard rate distributions.
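To make the two distances in the core problem concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, q is assumed to have strictly positive entries, and the statistic shown is a standard χ²-type statistic whose expectation equals m times the χ²-distance when the per-symbol counts are independent Poisson(m·p_i) draws (the Poissonized sampling model is an assumption here; the abstract does not spell it out).

```python
def chi2_distance(p, q):
    """Chi-squared distance sum_i (p_i - q_i)^2 / q_i (assumes every q_i > 0)."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


def tv_distance(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def chi2_statistic(counts, q, m):
    """Hypothetical test statistic Z = sum_i ((N_i - m*q_i)^2 - N_i) / (m*q_i).

    Assumption: counts[i] = N_i is drawn independently from Poisson(m * p_i).
    Under that model E[(N_i - m*q_i)^2 - N_i] = m^2 * (p_i - q_i)^2,
    so E[Z] = m * chi2_distance(p, q).
    """
    return sum(((n - m * qi) ** 2 - n) / (m * qi) for n, qi in zip(counts, q))


# Example: a known uniform q over two symbols and a point-mass p.
q = [0.5, 0.5]
p = [1.0, 0.0]
print(chi2_distance(p, q))               # chi-squared distance between p and q
print(tv_distance(p, q))                 # total variation distance between p and q
print(chi2_statistic([10, 0], q, m=10))  # statistic on hand-picked counts
```

By the Cauchy-Schwarz inequality, total variation distance is at most half the square root of the χ²-distance, so a distribution that is far from q in total variation is also far in χ²-distance. That is why the two cases in the problem cannot overlap once the closeness and farness thresholds are set suitably. A tester built on such a statistic would compare Z with a threshold lying between the values expected in the two cases; the choice of threshold and sample size m is not given in the abstract and is left unspecified here.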
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Advances in Neural Information Processing Systems 28 (NIPS 2015)
Neural Information Processing Systems Foundation
Acharya, Jayadev, Constantinos Daskalakis, and Gautam Kamath. "Optimal Testing for Properties of Distributions." Advances in Neural Information Processing Systems 28 (NIPS 2015), Montreal, Canada, 7-12 December, 2015. NIPS 2015.
Final published version