A Large-Scale Test Set for the Evaluation of Context-Aware Pronoun Translation in Neural Machine Translation

Mathias Müller, Annette Rios, Elena Voita, Rico Sennrich

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The translation of pronouns presents a special challenge to machine translation to this day, since it often requires context outside the current sentence. Recent work on models that have access to information across sentence boundaries has seen only moderate improvements in terms of automatic evaluation metrics such as BLEU. However, metrics that quantify the overall translation quality are ill-equipped to measure gains from additional context. We argue that a different kind of evaluation is needed to assess how well models translate intersentential phenomena such as pronouns. This paper therefore presents a test suite of contrastive translations focused specifically on the translation of pronouns. Furthermore, we perform experiments with several context-aware models. We show that, while gains in BLEU are moderate for those systems, they outperform baselines by a large margin in terms of accuracy on our contrastive test set. Our experiments also show the effectiveness of parameter tying for multi-encoder architectures.
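As a rough illustration of what accuracy on a contrastive test set means, the sketch below counts an example as correct only when a scoring function rates the reference translation strictly higher than every contrastive variant. This is a minimal sketch under stated assumptions, not the authors' implementation: the field names (`source`, `reference`, `contrastive`), the function `contrastive_accuracy`, and the toy scorer are all hypothetical. In practice the scorer would be the model's score for a translation given the source and its context, such as a log-probability.

```python
from typing import Callable, Dict, List, Sequence


def contrastive_accuracy(
    examples: Sequence[Dict[str, object]],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of examples where the reference outscores all contrastive variants.

    Each example is a dict with hypothetical keys:
      "source"      : source sentence (str)
      "reference"   : correct translation (str)
      "contrastive" : list of translations with a wrong pronoun (List[str])
    """
    if not examples:
        raise ValueError("need at least one example")
    correct = 0
    for ex in examples:
        ref_score = score(ex["source"], ex["reference"])
        # Correct only if the reference beats every contrastive translation.
        if all(ref_score > score(ex["source"], c) for c in ex["contrastive"]):
            correct += 1
    return correct / len(examples)


def toy_score(source: str, target: str) -> float:
    """Placeholder scorer (assumption): always prefers translations starting with 'sie'.

    It ignores the source entirely, mimicking a model with no access to the
    antecedent. A real setup would return a model score instead.
    """
    return 1.0 if target.lower().startswith("sie ") else 0.0


# Two illustrative examples (invented for this sketch).
EXAMPLES: List[Dict[str, object]] = [
    {
        "source": "It is beautiful.",  # antecedent in context: die Katze
        "reference": "Sie ist schön.",
        "contrastive": ["Er ist schön.", "Es ist schön."],
    },
    {
        "source": "It is old.",  # antecedent in context: der Baum
        "reference": "Er ist alt.",
        "contrastive": ["Sie ist alt.", "Es ist alt."],
    },
]

if __name__ == "__main__":
    print(contrastive_accuracy(EXAMPLES, toy_score))
```

Because the toy scorer always prefers "sie", it gets the first example right and the second wrong. Unlike BLEU, this measure isolates the pronoun choice: a scorer that cannot see the antecedent has no reliable way to rank the reference first.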
Original language: English
Title of host publication: EMNLP 2018 Third Conference on Machine Translation (WMT18)
Place of Publication: Brussels, Belgium
Publisher: Association for Computational Linguistics
Pages: 61-72
Number of pages: 12
Publication status: Published - Oct 2018
Event: EMNLP 2018 Third Conference on Machine Translation (WMT18) - Brussels, Belgium
Duration: 31 Oct 2018 to 1 Nov 2018
http://www.statmt.org/wmt18/

Workshop

Workshop: EMNLP 2018 Third Conference on Machine Translation (WMT18)
Abbreviated title: WMT18
Country/Territory: Belgium
City: Brussels
Period: 31/10/18 to 1/11/18
