Abstract
Let $p$ be an unknown and arbitrary probability distribution over $[0,1)$. We consider the problem of \emph{density estimation}, in which a learning algorithm is given i.i.d. draws from $p$ and must (with high probability) output a hypothesis distribution that is close to $p$. The main contribution of this paper is a highly efficient density estimation algorithm for learning using a variable-width histogram, i.e., a hypothesis distribution with a piecewise constant probability density function. In more detail, for any $k$ and $\epsilon$, we give an algorithm that makes $\tilde{O}(k/\epsilon^2)$ draws from $p$, runs in $\tilde{O}(k/\epsilon^2)$ time, and outputs a hypothesis distribution $h$ that is piecewise constant with $\tilde{O}(k)$ pieces. With high probability the hypothesis $h$ satisfies $d_{\mathrm{TV}}(p,h) \le C \cdot \mathrm{opt}_k(p) + \epsilon$, where $d_{\mathrm{TV}}$ denotes the total variation distance (statistical distance), $C$ is a universal constant, and $\mathrm{opt}_k(p)$ is the smallest total variation distance between $p$ and any $k$-piecewise constant distribution. The sample size and running time of our algorithm are both optimal up to logarithmic factors. The ``approximation factor'' $C$ that is present in our result is inherent in the problem, as we prove that no algorithm with sample size bounded in terms of $k$ and $\epsilon$ can achieve $d_{\mathrm{TV}}(p,h) \le \mathrm{opt}_k(p) + \epsilon$ regardless of what kind of hypothesis distribution it uses.
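The abstract describes the hypothesis class but not the algorithm itself. As a minimal illustration of the object being learned, the sketch below fits a naive equal-mass variable-width histogram to i.i.d. draws. This is emphatically not the paper's near-optimal algorithm; the function name, the equal-mass binning rule, and the Beta test distribution are all assumptions made here for illustration only.

```python
import numpy as np

def equal_mass_histogram(samples, k):
    """Fit a k-piece variable-width histogram density to i.i.d. samples on [0, 1).

    Bin edges are placed at empirical quantiles, so each piece covers roughly
    the same probability mass: bins are narrow where the data is dense and
    wide where it is sparse. This is a toy baseline, not the paper's method.
    """
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    # Edges at empirical quantiles -> approximately equal mass per bin.
    edges = np.quantile(samples, np.linspace(0.0, 1.0, k + 1))
    edges[0], edges[-1] = 0.0, 1.0   # cover the whole domain [0, 1)
    edges = np.unique(edges)         # guard against duplicate quantiles
    counts, edges = np.histogram(samples, bins=edges)
    widths = np.diff(edges)
    density = counts / (n * widths)  # piecewise constant pdf values
    return edges, density

# Usage: estimate an arbitrary density on [0, 1) from 10,000 draws.
rng = np.random.default_rng(0)
draws = rng.beta(2, 5, size=10_000)          # illustrative distribution
edges, density = equal_mass_histogram(draws, k=10)
# The fitted pdf integrates to 1 by construction.
assert abs(np.sum(density * np.diff(edges)) - 1.0) < 1e-9
```

The equal-mass rule makes the histogram "variable-width" in the sense the abstract uses: the piece boundaries adapt to the data rather than being fixed on a uniform grid.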
Original language | English |
---|---|
Title of host publication | Advances in Neural Information Processing Systems 27 |
Editors | Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, K. Q. Weinberger |
Publisher | Curran Associates Inc. |
Pages | 1844-1852 |
Number of pages | 9 |
Publication status | Published - 2014 |