Ensemble Learning for Multi-Layer Networks

David Barber, Christopher M. Bishop

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Bayesian treatments of learning in neural networks are typically based either on local Gaussian approximations to a mode of the posterior weight distribution, or on Markov chain Monte Carlo simulations. A third approach, called ensemble learning, was introduced by Hinton and van Camp (1993). It aims to approximate the posterior distribution by minimizing the Kullback-Leibler divergence between the true posterior and a parametric approximating distribution. However, the derivation of a deterministic algorithm relied on the use of a Gaussian approximating distribution with a diagonal covariance matrix and so was unable to capture the posterior correlations between parameters. In this paper, we show how the ensemble learning approach can be extended to full-covariance Gaussian distributions while remaining computationally tractable. We also extend the framework to deal with hyperparameters, leading to a simple re-estimation procedure. Initial results from a standard benchmark problem are encouraging.
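The limitation of a diagonal covariance matrix described in the abstract can be illustrated with a small numerical sketch. This is not the paper's algorithm: it assumes a toy two-dimensional Gaussian standing in for the posterior (`mu_p`, `cov_p` are hypothetical values), assumes the divergence is taken in the direction KL(q || p) that is usual in variational methods, and uses the closed-form KL divergence between Gaussians. The helper name `kl_gaussians` is likewise an illustration only.

```python
import numpy as np


def kl_gaussians(mu_q, cov_q, mu_p, cov_p):
    """Closed-form KL(q || p) between two multivariate Gaussians, in nats."""
    d = mu_q.shape[0]
    cov_p_inv = np.linalg.inv(cov_p)
    diff = mu_p - mu_q
    _, logdet_q = np.linalg.slogdet(cov_q)
    _, logdet_p = np.linalg.slogdet(cov_p)
    return 0.5 * (
        np.trace(cov_p_inv @ cov_q)
        + diff @ cov_p_inv @ diff
        - d
        + logdet_p
        - logdet_q
    )


# Hypothetical stand-in for a posterior with strongly correlated parameters.
mu_p = np.zeros(2)
cov_p = np.array([[1.0, 0.9],
                  [0.9, 1.0]])

# A full-covariance Gaussian q can match this toy target exactly.
kl_full = kl_gaussians(mu_p, cov_p, mu_p, cov_p)

# For a diagonal q with the same mean, KL(q || p) is minimized when each
# variance equals the reciprocal of the matching diagonal precision entry.
precision = np.linalg.inv(cov_p)
cov_diag = np.diag(1.0 / np.diag(precision))
kl_diag = kl_gaussians(mu_p, cov_diag, mu_p, cov_p)

print(f"KL with full covariance:     {kl_full:.4f}")
print(f"KL with diagonal covariance: {kl_diag:.4f}")
```

In this toy setting the diagonal approximation retains a strictly positive divergence because it cannot represent the off-diagonal correlation, whereas the full-covariance family can drive the divergence to zero. A real network posterior is not Gaussian, so neither family would fit it exactly.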
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 10 (NIPS 1997)
Editors: M.I. Jordan, M.J. Kearns, S.A. Solla
Publisher: MIT Press
Number of pages: 7
Publication status: Published - 1998

