Optimal Algorithms for Testing Closeness of Discrete Distributions.

Siu-on Chan, Ilias Diakonikolas, Gregory Valiant, Paul Valiant

Research output: Working paper


We study the question of closeness testing for two discrete distributions. More precisely, given samples from two distributions p and q over an n -element set, we wish to distinguish whether p=q versus p is at least $\eps$-far from q , in either ℓ 1 or ℓ 2 distance. Batu et al. gave the first sub-linear time algorithms for these problems, which matched the lower bounds of Valiant up to a logarithmic factor in n , and a polynomial factor of $\eps.$
In this work, we present simple (and new) testers for both the ℓ 1 and ℓ 2 settings, with sample complexity that is information-theoretically optimal, to constant factors, both in the dependence on n , and the dependence on $\eps$; for the ℓ 1 testing problem we establish that the sample complexity is $\Theta(\max\{n^{2/3}/\eps^{4/3}, n^{1/2}/\eps^2 \}).$
Original languageEnglish
PublisherComputing Research Repository (CoRR)
Number of pages17
Publication statusPublished - 2013


Dive into the research topics of 'Optimal Algorithms for Testing Closeness of Discrete Distributions.'. Together they form a unique fingerprint.

Cite this