Complexity of High-Dimensional Identity Testing with Coordinate Conditional Sampling
Author(s)
Blanca, Antonio; Chen, Zongchen; Stefankovic, Daniel; Vigoda, Eric
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
We study the identity testing problem for high-dimensional distributions. Given as input an explicit distribution μ, an ε > 0,
and access to sampling oracle(s) for a hidden distribution π, the goal in identity testing is to distinguish whether the two
distributions μ and π are identical or are at least ε-far apart. When there is only access to full samples from the hidden
distribution π, it is known that exponentially many samples (in the dimension) may be needed for identity testing, and hence
previous works have studied identity testing with additional access to various “conditional” sampling oracles. We consider a
significantly weaker conditional sampling oracle, which we call the Coordinate Oracle, and provide a computational and
statistical characterization of the identity testing problem in this new model.
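
As an illustration only, the sketch below shows one way such an oracle could look for a small Ising model. It assumes the Coordinate Oracle takes a full configuration x and a coordinate index i and returns a fresh sample of coordinate i conditioned on the values of all other coordinates in x; that reading, the function names, the 4-cycle graph, and the value of beta are all assumptions made for this example and are not taken from the record above.

```python
import math
import random

def ising_conditional_plus(adj, beta, x, i):
    """P(x_i = +1 | all other coordinates of x) for an Ising model
    with weight exp(beta * sum over edges of x_u * x_v), no external field."""
    s = sum(x[j] for j in adj[i])  # sum of the neighbouring spins
    return 1.0 / (1.0 + math.exp(-2.0 * beta * s))

def coordinate_oracle(adj, beta, x, i, rng):
    """Hypothetical Coordinate Oracle: resample coordinate i given the rest of x."""
    p_plus = ising_conditional_plus(adj, beta, x, i)
    return 1 if rng.random() < p_plus else -1

# Toy instance (assumed for illustration): antiferromagnetic Ising model on a 4-cycle.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
beta = -0.5                      # beta < 0 is the antiferromagnetic case
x = [1, 1, 1, 1]                 # conditioning configuration
rng = random.Random(0)
samples = [coordinate_oracle(adj, beta, x, 0, rng) for _ in range(1000)]
frac_plus = sum(1 for s in samples if s == 1) / len(samples)
```

With both neighbours of coordinate 0 set to +1 and beta negative, the oracle favours −1, so frac_plus should sit well below one half. A tester in this model only ever sees such single-coordinate answers from the hidden distribution, never a joint conditional over many coordinates.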
We prove that if an analytic property known as approximate tensorization of entropy holds for an n-dimensional visible
distribution μ, then there is an efficient identity testing algorithm for any hidden distribution π using Õ(n/ε) queries to
the Coordinate Oracle. Approximate tensorization of entropy is a pertinent condition as recent works have established
it for a large class of high-dimensional distributions. We also prove a computational phase transition: for a well-studied
class of n-dimensional distributions, specifically sparse antiferromagnetic Ising models over {+1, −1}^n, we show that in the
regime where approximate tensorization of entropy fails, there is no efficient identity testing algorithm unless RP = NP. We
complement our results with a matching Ω(n/ε) statistical lower bound for the sample complexity of identity testing in the
Coordinate Oracle model.
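
For readers unfamiliar with the term, the following is the standard way approximate tensorization of entropy is usually stated; it is included as background under that assumption and is not quoted from the record. The constant C and the notation Ent_i for the entropy taken with respect to the conditional distribution of coordinate i given all other coordinates are introduced here for illustration.

```latex
% Approximate tensorization of entropy with constant C (standard form, assumed):
% for every function f : \{+1,-1\}^n \to \mathbb{R}_{\ge 0},
\[
  \operatorname{Ent}_{\mu}(f) \;\le\; C \sum_{i=1}^{n} \mu\!\left[\operatorname{Ent}_{i}(f)\right],
  \qquad
  \operatorname{Ent}_{\mu}(f) \;=\; \mu\!\left[f \log f\right] - \mu[f]\,\log \mu[f].
\]
```

Product distributions satisfy this with C = 1, so the condition can be read as saying that the entropy under μ is controlled, up to the factor C, by the single-coordinate conditional entropies.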
Journal
ACM Transactions on Algorithms
Publisher
ACM
Citation
Antonio Blanca, Zongchen Chen, Daniel Štefankovič, and Eric Vigoda. 2024. Complexity of High-Dimensional Identity Testing with Coordinate Conditional Sampling. ACM Trans. Algorithms Just Accepted (August 2024).
Version: Final published version
ISSN
1549-6325