Abstract
A core step in statistical data-to-text generation concerns learning correspondences between structured data representations (e.g., facts in a database) and associated texts. In this paper we aim to bootstrap generators from large-scale datasets where the data (e.g., DBPedia facts) and related texts (e.g., Wikipedia abstracts) are loosely aligned. We tackle this challenging task by introducing a special-purpose content selection mechanism. We use multi-instance learning to automatically discover correspondences between data and text pairs and show how these can be used to enhance the content signal while training an encoder-decoder architecture. Experimental results demonstrate that models trained with content-specific objectives improve upon a vanilla encoder-decoder which solely relies on soft attention. Our code and data are available at https://github.com/EdinburghNLP/wikigen.
Original language | English |
---|---|
Title of host publication | The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Place of Publication | New Orleans, Louisiana |
Publisher | Association for Computational Linguistics |
Pages | 1516-1527 |
Number of pages | 12 |
DOIs | |
Publication status | Published - 30 Jun 2018 |
Event | 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Hyatt Regency New Orleans Hotel, New Orleans, United States Duration: 1 Jun 2018 → 6 Jun 2018 http://naacl2018.org/ |
Conference
Conference | 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
---|---|
Abbreviated title | NAACL HLT 2018 |
Country/Territory | United States |
City | New Orleans |
Period | 1/06/18 → 6/06/18 |
Internet address | http://naacl2018.org/ |
Fingerprint
Dive into the research topics of 'Bootstrapping Generators from Noisy Data'. Together they form a unique fingerprint.
Projects
- TransModal: Translating from Multiple Modalities into Text (Finished)
  Lapata, M. (Principal Investigator)
  1/09/16 → 31/08/22
  Project: Research