Edinburgh Research Explorer

Expectation-Maximization Methods for Solving (PO)MDPs and Optimal Control Problems.

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Original language: English
Title of host publication: Bayesian Time Series Models
Editors: Silvia Chiappa, David Barber
Publisher: Cambridge University Press
Pages: 388-413
Number of pages: 26
ISBN (Electronic): 9780511984679
ISBN (Print): 9780521196765
Publication status: Published - 2011

Abstract

As this book demonstrates, the development of efficient probabilistic inference techniques has made considerable progress in recent years, in particular with respect to exploiting the structure (e.g., factored, hierarchical or relational) of discrete and continuous problem domains. In this chapter we show that these techniques can also be used to solve Markov decision processes (MDPs) or partially observable MDPs (POMDPs) when these are formulated in terms of a structured dynamic Bayesian network (DBN).

The problems of planning in stochastic environments and inference in state space models are closely related, in particular in view of the challenges they share: scaling to large state spaces spanned by multiple state variables, and realising planning (or inference) in continuous or mixed continuous-discrete state spaces. Both fields have developed techniques to address these problems. In the field of planning, these include work on factored Markov decision processes [5, 17, 9, 18], abstractions [10], and relational models of the environment [37]. On the inference side, recent advances show how structure can be exploited both for exact inference and for making efficient approximations. Examples are message-passing algorithms (loopy belief propagation, expectation propagation), variational approaches, approximate belief representations (particles, assumed density filtering, Boyen–Koller) and arithmetic compilation (see, e.g., [22, 23, 7]).

In view of these similarities, one may ask whether existing techniques for probabilistic inference can be translated directly to solving stochastic planning problems.

ID: 3607608