Properly Learning Poisson Binomial Distributions in Almost Polynomial Time

Ilias Diakonikolas, D. Kane, Alistair Stewart

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We give an algorithm for properly learning Poisson binomial distributions. A Poisson binomial distribution (PBD) of order n is the discrete probability distribution of the sum of n mutually independent Bernoulli random variables. Given Õ(1/ϵ²) samples from an unknown PBD P, our algorithm runs in time (1/ϵ)^{O(log log(1/ϵ))} and outputs a hypothesis PBD that is ϵ-close to P in total variation distance. The sample complexity of our algorithm is known to be nearly optimal, up to logarithmic factors, as established in previous work [DDS12stoc]. However, the previously best known running time for properly learning PBDs [DDS12stoc, DKS15] was (1/ϵ)^{O(log(1/ϵ))}, and was essentially obtained by enumeration over an appropriate ϵ-cover. We remark that the running time of this cover-based approach cannot be improved, as any ϵ-cover for the space of PBDs has size (1/ϵ)^{Ω(log(1/ϵ))} [DKS15].

As one of our main contributions, we provide a novel structural characterization of PBDs, showing that any PBD P is ϵ-close to another PBD Q with O(log(1/ϵ)) distinct parameters. More precisely, we prove that, for all ϵ > 0, there exists an explicit collection M of (1/ϵ)^{O(log log(1/ϵ))} vectors of multiplicities such that, for any PBD P, there exists a PBD Q with O(log(1/ϵ)) distinct parameters whose multiplicities are given by some element of M, and Q is ϵ-close to P. Our proof combines tools from Fourier analysis and algebraic geometry.

Our approach to the proper learning problem is as follows: starting with an accurate non-proper hypothesis, we fit a PBD to this hypothesis. More specifically, we essentially start with the hypothesis computed by the computationally efficient non-proper learning algorithm in our recent work [DKS15]. Our aforementioned structural characterization allows us to reduce the corresponding fitting problem to a collection of (1/ϵ)^{O(log log(1/ϵ))} systems of low-degree polynomial inequalities. We show that each such system can be solved in time (1/ϵ)^{O(log log(1/ϵ))}, which yields the overall running time of our algorithm.
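The objects in the abstract can be illustrated concretely. The sketch below (not from the paper) computes the exact PMF of a PBD by convolving its Bernoulli components, and compares, in total variation distance, a PBD with many slightly different parameters against a nearby PBD with only a few distinct parameter values; the specific parameter vectors are hypothetical examples, chosen only to mimic the "few distinct parameters" phenomenon the structural result formalizes.

```python
import numpy as np

def pbd_pmf(ps):
    """Exact PMF of the sum of independent Bernoulli(p_i) variables,
    computed by iterated convolution (dynamic programming)."""
    pmf = np.array([1.0])
    for p in ps:
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf

def tv_distance(p, q):
    """Total variation distance between two PMFs on {0, ..., n}."""
    n = max(len(p), len(q))
    p = np.pad(p, (0, n - len(p)))
    q = np.pad(q, (0, n - len(q)))
    return 0.5 * np.abs(p - q).sum()

# Hypothetical example: a PBD with six distinct-looking parameters ...
ps = [0.30, 0.31, 0.29, 0.30, 0.70, 0.69]
# ... is close in TV distance to one using only two distinct values.
qs = [0.30, 0.30, 0.30, 0.30, 0.695, 0.695]
print(tv_distance(pbd_pmf(ps), pbd_pmf(qs)))  # small
```

This is only a toy illustration: the paper's result is an existence statement with explicit multiplicity vectors, not a rounding heuristic, but the TV comparison above is the notion of closeness the theorem uses.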
Original language: English
Title of host publication: Proceedings of the 29th Annual Conference on Learning Theory (COLT 2016)
Place of Publication: New York, USA
Publisher: PMLR
Pages: 850-878
Number of pages: 29
Publication status: Published - 26 Jun 2016
Event: 29th Annual Conference on Learning Theory - New York City, United States
Duration: 23 Jun 2016 - 26 Jun 2016
http://www.learningtheory.org/colt2016/

Publication series

Name: Proceedings of Machine Learning Research
Volume: 49
ISSN (Electronic): 2640-3498

Conference

Conference: 29th Annual Conference on Learning Theory
Abbreviated title: COLT 2016
Country/Territory: United States
City: New York City
Period: 23/06/16 - 26/06/16
