Ensemble Learning for Multi-Layer Networks

David Barber, Christopher M. Bishop

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Bayesian treatments of learning in neural networks are typically based either on local Gaussian approximations to a mode of the posterior weight distribution, or on Markov chain Monte Carlo simulations. A third approach, called ensemble learning, was introduced by Hinton and van Camp (1993). It aims to approximate the posterior distribution by minimizing the Kullback-Leibler divergence between the true posterior and a parametric approximating distribution. However, the derivation of a deterministic algorithm relied on the use of a Gaussian approximating distribution with a diagonal covariance matrix and so was unable to capture the posterior correlations between parameters. In this paper, we show how the ensemble learning approach can be extended to full-covariance Gaussian distributions while remaining computationally tractable. We also extend the framework to deal with hyperparameters, leading to a simple re-estimation procedure. Initial results from a standard benchmark problem are encouraging.
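For illustration only, the sketch below shows the kind of objective the abstract describes: a full-covariance Gaussian q(w) = N(mu, L L^T) over all the weights of a toy one-hidden-layer network, fitted by minimizing KL(q || posterior) up to the constant log evidence. It is not the paper's deterministic algorithm; it uses a single-sample reparameterisation gradient in JAX, and the network size, data, and fixed hyperparameters alpha (prior precision) and beta (noise precision) are assumptions made for the example.

```python
# Minimal sketch (not the authors' deterministic algorithm) of ensemble
# (variational) learning with a full-covariance Gaussian approximation.
import jax
import jax.numpy as jnp

key, k_data, k_mu = jax.random.split(jax.random.PRNGKey(0), 3)

# Toy regression data
x = jnp.linspace(-1.0, 1.0, 20)[:, None]
y = jnp.sin(3.0 * x) + 0.1 * jax.random.normal(k_data, x.shape)

H = 5                      # hidden units
D = 3 * H + 1              # input->hidden weights, hidden biases, hidden->output weights, output bias
alpha, beta = 1.0, 25.0    # prior precision and noise precision (held fixed here; the paper re-estimates them)

def predict(w, x):
    w1 = w[:H].reshape(1, H)
    b1 = w[H:2 * H]
    w2 = w[2 * H:3 * H].reshape(H, 1)
    b2 = w[3 * H]
    return jnp.tanh(x @ w1 + b1) @ w2 + b2

def log_joint(w):
    # log p(D | w) + log p(w), up to additive constants
    err = y - predict(w, x)
    return -0.5 * beta * jnp.sum(err ** 2) - 0.5 * alpha * jnp.sum(w ** 2)

def neg_elbo(params, eps):
    # KL(q || posterior) minus log evidence = E_q[-log p(D, w)] - H[q],
    # estimated here with a single reparameterised sample w = mu + L eps.
    mu, L_off, log_diag = params
    L = jnp.tril(L_off.reshape(D, D), k=-1) + jnp.diag(jnp.exp(log_diag))
    w = mu + L @ eps
    entropy = jnp.sum(log_diag)        # entropy of q up to an additive constant
    return -log_joint(w) - entropy

# Initialise q: small random mean, near-diagonal full covariance
params = (0.1 * jax.random.normal(k_mu, (D,)),
          jnp.zeros(D * D),
          jnp.log(0.1) * jnp.ones(D))

grad_fn = jax.jit(jax.grad(neg_elbo))
lr = 1e-3
for step in range(5000):
    key, sub = jax.random.split(key)
    eps = jax.random.normal(sub, (D,))
    g = grad_fn(params, eps)
    params = tuple(p - lr * gp for p, gp in zip(params, g))

print("final single-sample negative ELBO:",
      float(neg_elbo(params, jax.random.normal(key, (D,)))))
```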
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 10 (NIPS 1997)
Editors: M. I. Jordan, M. J. Kearns, S. A. Solla
Publisher: MIT Press
Pages: 395-401
Number of pages: 7
Publication status: Published - 1998
