Inducing Document Plans for Concept-to-Text Generation

Ioannis Konstas, Mirella Lapata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In a language generation system, a content planner selects which elements must be included in the output text and the ordering between them. Recent empirical approaches perform content selection without any ordering and thus have no means to ensure that the output is coherent. In this paper we focus on the problem of generating text from a database and present a trainable end-to-end generation system that includes both content selection and ordering. Content plans are represented intuitively by a set of grammar rules that operate on the document level and are acquired automatically from training data. We develop two approaches: the first is inspired by Rhetorical Structure Theory and represents the document as a tree of discourse relations between database records; the second requires little linguistic sophistication and uses tree structures to represent global patterns of database record sequences within a document. Experimental evaluation on two domains yields considerable improvements over the state of the art for both approaches.
Original language: English
Title of host publication: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013, 18-21 October 2013, Grand Hyatt Seattle, Seattle, Washington, USA, A meeting of SIGDAT, a Special Interest Group of the ACL
Publisher: Association for Computational Linguistics
Pages: 1503-1514
Number of pages: 12
Publication status: Published - 2013
