Automatic Reference-Based Evaluation of Pronoun Translation Misses the Point

Liane Guillou, Christian Hardmeier

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We compare the performance of the APT and AutoPRF metrics for pronoun translation against a manually annotated dataset comprising human judgements as to the correctness of translations of the PROTEST test suite. Although there is some correlation with the human judgements, a range of issues limit the performance of the automated metrics. We therefore recommend the use of semiautomatic metrics and test suites in place of fully automatic metrics.
Original language: English
Title of host publication: 2018 Conference on Empirical Methods in Natural Language Processing
Place of Publication: Brussels, Belgium
Publisher: Association for Computational Linguistics
Pages: 4797-4802
Number of pages: 6
Publication status: Published - Nov 2018
Event: 2018 Conference on Empirical Methods in Natural Language Processing - Square Meeting Center, Brussels, Belgium
Duration: 31 Oct 2018 to 4 Nov 2018
http://emnlp2018.org/

Conference

Conference: 2018 Conference on Empirical Methods in Natural Language Processing
Abbreviated title: EMNLP 2018
Country: Belgium
City: Brussels
Period: 31/10/18 to 4/11/18