The University of Edinburgh’s English-Tamil and English-Inuktitut Submissions to the WMT20 News Translation Task

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We describe the University of Edinburgh's submissions to the WMT20 news translation shared task for the low-resource language pair English-Tamil and the mid-resource language pair English-Inuktitut. We use the neural machine translation transformer architecture for all submissions and explore a variety of techniques to improve translation quality and compensate for the lack of parallel training data. For the very low-resource English-Tamil, this involves exploring pretraining, using both language model objectives and translation using an unrelated high-resource language pair (German-English), and iterative backtranslation. For English-Inuktitut, we explore the use of multilingual systems, which, despite not being part of the primary submission, would have achieved the best results on the test set.
Original language: English
Title of host publication: Proceedings of the 5th Conference on Machine Translation
Publisher: Association for Computational Linguistics (ACL)
Pages: 92-99
Number of pages: 8
ISBN (Print): 978-1-948087-81-0
Publication status: Published - 19 Nov 2020
Event: Fifth Conference on Machine Translation - Online Conference
Duration: 19 Nov 2020 to 20 Nov 2020
http://www.statmt.org/wmt20/

Conference

Conference: Fifth Conference on Machine Translation
Abbreviated title: WMT 2020
City: Online Conference
Period: 19/11/20 to 20/11/20

Keywords

  • machine translation
  • WMT
  • shared task
  • Tamil
  • Inuktitut
