Can time dependencies and ensemble classification improve content-free dialogue segmentation?

Jing Su, Saturnino Luz

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We present an extended study of content-free topic segmentation of conversational (meeting) data based on classification of vocalization events. In previous work, content-free topic segmentation achieved good accuracy using a modified naive Bayes classifier and vocalization horizon features. In this study, we attempted to improve on those results by incorporating time (sequential) dependency information into the topic boundary detection process through conditional random fields and ensemble classifiers. We expected that incorporating such information would help reduce the number of false positives generated by the naive Bayes method. To account for the under-detection bias of the segmentation task, we introduce a new performance metric in addition to the usual Pk and WindowDiff (WD) metrics. Although a boosting model performed fairly well using a simple base classifier and limited contextual features, the more elaborate methods still trailed the Bayesian method.
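For readers unfamiliar with the two standard metrics named in the abstract, the following is a minimal sketch of Pk and WindowDiff as they are commonly defined in the segmentation literature. It is not the authors' implementation, and it does not cover the new metric the paper introduces. The boundary encoding (a list of 0/1 flags, with 1 marking a topic boundary after that unit), the function names, and the toy data are assumptions made for illustration. The window size `k` is conventionally set to about half the mean reference segment length; here it is passed in explicitly.

```python
def pk(ref, hyp, k):
    """Pk sketch: fraction of windows of width k in which the reference and
    the hypothesis disagree on whether any boundary is present."""
    n_windows = len(ref) - k + 1
    errors = 0
    for i in range(n_windows):
        ref_has_boundary = any(ref[i:i + k])
        hyp_has_boundary = any(hyp[i:i + k])
        errors += ref_has_boundary != hyp_has_boundary
    return errors / n_windows


def window_diff(ref, hyp, k):
    """WindowDiff sketch: fraction of windows of width k in which the number
    of boundaries differs between reference and hypothesis."""
    n_windows = len(ref) - k + 1
    errors = 0
    for i in range(n_windows):
        errors += sum(ref[i:i + k]) != sum(hyp[i:i + k])
    return errors / n_windows


# Toy data (hypothetical): one true boundary, and a hypothesis that finds it
# but also emits a spurious boundary right next to it (a false positive).
ref = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
hyp = [0, 0, 1, 1, 0, 0, 0, 0, 0, 0]

print(pk(ref, hyp, k=3))           # 0.125
print(window_diff(ref, hyp, k=3))  # 0.375
```

In this toy case WindowDiff penalizes the spurious boundary more heavily than Pk does, because it compares boundary counts rather than mere presence within each window.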
Original language: English
Title of host publication: Cognitive Infocommunications (CogInfoCom), 2013 IEEE 4th International Conference on
Pages: 183-188
Number of pages: 6
ISBN (Electronic): 978-1-4799-1546-0
Publication status: Published - 23 Jan 2014
