Abstract
We present a novel method to extract parallel sentences from two monolingual corpora using neural machine translation. Our method translates the sentences in one corpus while constraining decoding with a prefix tree built on the other corpus. We argue that a neural machine translation system can by itself serve as a sentence similarity scorer and that, with a modified beam search, it efficiently approximates pairwise comparison. When benchmarked on the BUCC shared task, our method achieves results comparable to other submissions.
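The idea of decoding under a prefix-tree constraint can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the beam search is modified, so the version here simply restricts each beam expansion to tokens that continue some sentence in the trie. The names `build_trie`, `constrained_beam_search`, `toy_log_prob`, and `LEXICON` are hypothetical, and the toy lexicon scorer stands in for a real neural translation model.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

END = "</s>"  # end-of-sentence marker (an assumed convention)


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}


def build_trie(sentences: List[List[str]]) -> TrieNode:
    """Insert every tokenized target-side sentence, terminated by END."""
    root = TrieNode()
    for tokens in sentences:
        node = root
        for tok in tokens + [END]:
            node = node.children.setdefault(tok, TrieNode())
    return root


# (source tokens, target prefix, next token) -> log-probability
Scorer = Callable[[List[str], List[str], str], float]


def constrained_beam_search(
    source: List[str], trie: TrieNode, log_prob: Scorer, beam_size: int = 2
) -> Optional[Tuple[List[str], float]]:
    """Return the best-scoring sentence stored in the trie and its score."""
    beam = [(0.0, [], trie)]  # (cumulative log-prob, prefix, trie node)
    finished = []
    while beam:
        candidates = []
        for score, prefix, node in beam:
            # Only tokens that continue some stored sentence are allowed.
            for tok, child in node.children.items():
                new_score = score + log_prob(source, prefix, tok)
                if tok == END:
                    finished.append((new_score, prefix))
                else:
                    candidates.append((new_score, prefix + [tok], child))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    if not finished:
        return None
    best_score, best_tokens = max(finished, key=lambda f: f[0])
    return best_tokens, best_score


# Toy stand-in for an NMT model: a word-by-word lexicon (purely illustrative).
LEXICON = {"der": "the", "die": "the", "hund": "dog", "katze": "cat", "bellt": "barks"}


def toy_log_prob(source: List[str], prefix: List[str], token: str) -> float:
    expected = [LEXICON.get(w) for w in source] + [END]
    pos = len(prefix)
    hit = pos < len(expected) and expected[pos] == token
    return math.log(0.9 if hit else 0.05)


targets = [["the", "cat", "sleeps"], ["the", "dog", "barks"], ["a", "bird", "sings"]]
result = constrained_beam_search(["der", "hund", "bellt"], build_trie(targets), toy_log_prob)
print(result[0])  # ['the', 'dog', 'barks']
```

Because the beam only ever follows paths in the trie, every finished hypothesis is a sentence from the target-side corpus, and its cumulative log-probability can be read as a similarity score for the candidate pair. The beam size bounds how many candidates are scored per source sentence, which is where the saving over exhaustive pairwise comparison would come from in this sketch.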
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 1672–1678 |
| Number of pages | 7 |
| ISBN (Electronic) | 978-1-952148-25-5 |
| Publication status | Published - 10 Jul 2020 |
| Event | 2020 Annual Conference of the Association for Computational Linguistics, Hyatt Regency Seattle, Virtual conference, United States. Duration: 5 Jul 2020 → 10 Jul 2020. Conference number: 58. https://acl2020.org/ |
Conference
| Conference | 2020 Annual Conference of the Association for Computational Linguistics |
|---|---|
| Abbreviated title | ACL 2020 |
| Country/Territory | United States |
| City | Virtual conference |
| Period | 5 Jul 2020 → 10 Jul 2020 |
| Internet address | https://acl2020.org/ |
Paper title: Parallel Sentence Mining by Constrained Decoding