Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure

Minwoo Jeong, Ivan Titov

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Documents often have inherently parallel structure: they may consist of a text and commentaries, or an abstract and a body, or parts presenting alternative views on the same problem. Revealing relations between the parts by jointly segmenting and predicting links between the segments would help to visualize such documents and construct friendlier user interfaces. To address this problem, we propose an unsupervised Bayesian model for joint discourse segmentation and alignment. We apply our method to the “English as a second language” podcast dataset where each episode is composed of two parallel parts: a story and an explanatory lecture. The predicted topical links uncover hidden relations between the stories and the lectures. In this domain, our method achieves competitive results, rivaling those of a previously proposed supervised technique.
Original language: English
Title of host publication: ACL 2010, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, July 11-16, 2010, Uppsala, Sweden, Short Papers
Publisher: Association for Computational Linguistics
Pages: 151-155
Number of pages: 5
Publication status: Published - 2010
