Near-Optimal Closeness Testing of Discrete Histogram Distributions

Ilias Diakonikolas, Daniel M. Kane, Vladimir Nikishkin

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We investigate the problem of testing the equivalence between two discrete histograms. A k-histogram over [n] is a probability distribution that is piecewise constant over some set of k intervals over [n]. Histograms have been extensively studied in computer science and statistics. Given a set of samples from two k-histogram distributions p, q over [n], we want to distinguish (with high probability) between the cases that p = q and ||p ? q||_1 >= epsilon. The main contribution of this paper is a new algorithm for this testing problem and a nearly matching information-theoretic lower bound. Specifically, the sample complexity of our algorithm matches our lower bound up to a logarithmic factor, improving on previous work by polynomial factors in the relevant parameters. Our algorithmic approach applies in a more general setting and yields improved sample upper bounds for testing closeness of other structured distributions as well.
Original languageEnglish
Title of host publication44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)
EditorsIoannis Chatzigiannakis, Piotr Indyk, Fabian Kuhn, Anca Muscholl
Place of PublicationDagstuhl, Germany
PublisherSchloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany
Pages1-15
Number of pages15
ISBN (Print)978-3-95977-041-5
DOIs
Publication statusPublished - 14 Jul 2017
EventICALP 2017: 44th International Colloquium on Automata, Languages, and Programming - Warsaw, Poland
Duration: 10 Jul 201714 Jul 2017
http://icalp17.mimuw.edu.pl/

Publication series

NameLeibniz International Proceedings in Informatics (LIPIcs)
PublisherSchloss Dagstuhl--Leibniz-Zentrum fuer Informatik
Volume80
ISSN (Print)1868-8969

Conference

ConferenceICALP 2017
Abbreviated titleICALP 2017
CountryPoland
CityWarsaw
Period10/07/1714/07/17
Internet address

Fingerprint

Dive into the research topics of 'Near-Optimal Closeness Testing of Discrete Histogram Distributions'. Together they form a unique fingerprint.

Cite this