Edinburgh Research Explorer

Automatic Meeting Segmentation Using Dynamic Bayesian Networks

Research output: Contribution to journal › Article

Original language: English
Pages (from-to): 25-36
Journal: IEEE Transactions on Multimedia
Volume: 9
Issue number: 1
Publication status: Published - 1 Jan 2007

Abstract

Multiparty meetings are a ubiquitous feature of organizations, and their automatic analysis and structuring would bring considerable economic benefits. In this paper, we are concerned with the segmentation and structuring of meetings (recorded using multiple cameras and microphones) into sequences of group meeting actions such as monologue, discussion, and presentation. We outline four families of multimodal features, based on speaker turns, lexical transcription, prosody, and visual motion, that are extracted from the raw audio and video recordings. We relate these low-level features to more complex group behaviors using a modelling framework based on multistream dynamic Bayesian networks (DBNs). This yields an effective approach to the segmentation problem, with an action error rate of 12.2%, compared with 43% for an approach based on hidden Markov models. Moreover, the multistream DBN developed here leaves scope for many further improvements and extensions.
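
The abstract reports an action error rate but does not define it. The sketch below assumes the metric is computed like the word error rate in speech recognition: the number of substitutions, deletions, and insertions needed to turn the recognized action sequence into the reference sequence, divided by the number of reference actions. The function name `action_error_rate` and the example labels are illustrative and are not taken from the paper.

```python
from typing import Sequence


def action_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Edit-distance error rate between two action sequences, in percent.

    Assumption: the metric mirrors word error rate, i.e.
    (substitutions + deletions + insertions) / number of reference actions.
    """
    if not reference:
        raise ValueError("reference sequence must not be empty")

    # prev[j] is the edit distance between the reference prefix handled so far
    # and the first j hypothesis actions.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_action in enumerate(reference, start=1):
        curr = [i]  # deleting all i reference actions matches an empty hypothesis
        for j, hyp_action in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (ref_action != hyp_action)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(reference)


# Hypothetical meeting: the recognizer misses the first discussion segment.
reference = ["monologue", "discussion", "presentation", "discussion"]
hypothesis = ["monologue", "presentation", "discussion"]
print(action_error_rate(reference, hypothesis))  # one deletion out of four actions: 25.0
```

Under this assumed definition, only the order of action labels is scored. Errors in the timing of segment boundaries would need a separate measure.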
