Abstract
Multi-document summarization involves many aspects of content selection and surface realization. The summaries must be informative, succinct, grammatical, and obey stylistic writing conventions. We present a method where such individual aspects are learned separately from data (without any hand-engineering) but optimized jointly using an integer linear programme. The ILP framework allows us to combine the decisions of the expert learners and to select and rewrite source content through a mixture of objective setting and soft and hard constraints. Experimental results on the TAC-08 data set show that our model achieves state-of-the-art performance as measured by ROUGE and significantly improves the informativeness of the summaries.
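
As a toy illustration of the kind of formulation the abstract describes, and not the paper's actual model, the sketch below selects sentences by maximizing a relevance objective under a hard length budget and a soft redundancy penalty. The scores, lengths, penalty, and budget are all hypothetical, and the 0/1 program is solved by brute-force enumeration rather than a real ILP solver, which only works at this tiny scale.

```python
from itertools import product

# Hypothetical toy data: (relevance score, length in words) per candidate sentence.
sentences = [
    (5.0, 12),
    (4.0, 10),
    (3.0, 6),
    (2.0, 5),
]
# Hypothetical soft constraint: selecting both sentences of a pair is allowed
# but costs the stated penalty in the objective.
redundancy = {(0, 1): 3.5}
MAX_WORDS = 22  # hypothetical hard constraint: summary length budget


def solve(sentences, redundancy, max_words):
    """Enumerate every 0/1 assignment and return the best feasible one."""
    best_value, best_choice = float("-inf"), None
    for choice in product((0, 1), repeat=len(sentences)):
        length = sum(x * s[1] for x, s in zip(choice, sentences))
        if length > max_words:
            continue  # hard constraint violated: assignment is infeasible
        value = sum(x * s[0] for x, s in zip(choice, sentences))
        # Soft constraint: subtract a penalty instead of forbidding the pair.
        value -= sum(p for (i, j), p in redundancy.items()
                     if choice[i] and choice[j])
        if value > best_value:
            best_value, best_choice = value, choice
    return best_value, best_choice


best_value, best_choice = solve(sentences, redundancy, MAX_WORDS)
print(best_choice, best_value)
```

The hard constraint removes assignments from consideration entirely, while the soft constraint only lowers their objective value, so a redundant pair can still be chosen when its combined relevance outweighs the penalty.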
Original language | English |
---|---|
Title of host publication | EMNLP-CoNLL 2012 - 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Proceedings of the Conference |
Pages | 233-243 |
Number of pages | 11 |
Publication status | Published - 1 Dec 2012 |
Event | 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012; Jeju Island, Korea, Republic of; Duration: 12 Jul 2012 → 14 Jul 2012; http://emnlp-conll2012.unige.ch/ |
Conference
Conference | 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012 |
---|---|
Country/Territory | Korea, Republic of |
City | Jeju Island |
Period | 12/07/12 → 14/07/12 |
Internet address | http://emnlp-conll2012.unige.ch/ |