Multiple aspect summarization using integer linear programming

Kristian Woodsend, Mirella Lapata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Multi-document summarization involves many aspects of content selection and surface realization. The summaries must be informative, succinct, grammatical, and obey stylistic writing conventions. We present a method where such individual aspects are learned separately from data (without any hand-engineering) but optimized jointly using an integer linear programme. The ILP framework allows us to combine the decisions of the expert learners and to select and rewrite source content through a mixture of objective setting, soft and hard constraints. Experimental results on the TAC-08 data set show that our model achieves state-of-the-art performance using ROUGE and significantly improves the informativeness of the summaries.
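
The joint optimization described in the abstract can be illustrated with a toy 0-1 integer linear programme. The sketch below is not the authors' model: the function name `summarize`, the two aspect scores per sentence, the weights, the length limits, and the penalty are all hypothetical, and the programme is solved by exhaustive enumeration rather than by an ILP solver. It only shows how per-aspect scores can enter one objective, with a hard length constraint and a soft one expressed through a penalized slack variable.

```python
from itertools import product


def summarize(aspect_scores, weights, lengths, max_len, soft_len, penalty):
    """Pick sentences by solving a tiny 0-1 ILP by exhaustive enumeration.

    maximize   sum_i (w . a_i) * x_i  -  penalty * slack
    subject to sum_i len_i * x_i <= max_len            (hard constraint)
               sum_i len_i * x_i - slack <= soft_len   (soft constraint)
               x_i in {0, 1}, slack >= 0

    All names and parameters are illustrative assumptions.
    """
    # Combine the per-aspect scores of each sentence into one objective coefficient.
    coeffs = [sum(w * a for w, a in zip(weights, scores)) for scores in aspect_scores]
    best_choice, best_value = None, float("-inf")
    for x in product((0, 1), repeat=len(coeffs)):
        total_len = sum(l * xi for l, xi in zip(lengths, x))
        if total_len > max_len:
            continue  # violates the hard length limit
        # The cheapest slack that satisfies the soft constraint.
        slack = max(0, total_len - soft_len)
        value = sum(c * xi for c, xi in zip(coeffs, x)) - penalty * slack
        if value > best_value:
            best_choice, best_value = x, value
    return best_choice, best_value


# Four candidate sentences, each scored on two hypothetical aspects.
scores = [(3, 1), (2, 2), (1, 3), (2, 0)]
choice, value = summarize(scores, weights=(2, 1), lengths=[12, 10, 8, 6],
                          max_len=25, soft_len=20, penalty=1)
print(choice, value)  # (1, 0, 1, 0) 12
```

Setting `penalty=0` switches the soft constraint off, and the same data then selects the last three sentences instead. Enumeration is only practical for a handful of variables; a realistic instance would be passed to a dedicated ILP solver.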

Original language: English
Title of host publication: EMNLP-CoNLL 2012 - 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Proceedings of the Conference
Pages: 233-243
Number of pages: 11
Publication status: Published - 1 Dec 2012
Event: 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012 - Jeju Island, Korea, Republic of
Duration: 12 Jul 2012 to 14 Jul 2012
http://emnlp-conll2012.unige.ch/

Conference

Conference: 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 12/07/12 to 14/07/12
