Abstract
We describe the University of Edinburgh's submissions to the WMT20 news translation shared task for the low-resource language pair English-Tamil and the mid-resource language pair English-Inuktitut. We use the neural machine translation transformer architecture for all submissions and explore a variety of techniques to improve translation quality and compensate for the lack of parallel training data. For the very low-resource English-Tamil pair, this involves exploring pretraining, using both language model objectives and translation on an unrelated high-resource language pair (German-English), as well as iterative backtranslation. For English-Inuktitut, we explore the use of multilingual systems, which, despite not being part of the primary submission, would have achieved the best results on the test set.
Original language | English |
---|---|
Title of host publication | Proceedings of the 5th Conference on Machine Translation |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 92-99 |
Number of pages | 8 |
ISBN (Print) | 978-1-948087-81-0 |
Publication status | Published - 19 Nov 2020 |
Event | Fifth Conference on Machine Translation, Online Conference. Duration: 19 Nov 2020 → 20 Nov 2020. http://www.statmt.org/wmt20/ |
Conference
Conference | Fifth Conference on Machine Translation |
---|---|
Abbreviated title | WMT 2020 |
City | Online Conference |
Period | 19/11/20 → 20/11/20 |
Internet address | http://www.statmt.org/wmt20/ |
Keywords
- machine translation
- WMT
- shared task
- Tamil
- Inuktitut