Exploring the Importance of Source Text in Automatic Post-Editing for Context-Aware Machine Translation

Chaojun Wang, Christian Hardmeier, Rico Sennrich

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Accurate translation requires document-level information, which is ignored by sentence-level machine translation. Recent work has demonstrated that document-level consistency can be improved with automatic post-editing (APE) using only target-language (TL) information. We study an extended APE model that additionally integrates source context. A human evaluation of fluency and adequacy in English–Russian translation reveals that the model with access to source context significantly outperforms monolingual APE in terms of adequacy, an effect largely ignored by automatic evaluation metrics. Our results show that TL-only modelling increases fluency without improving adequacy, demonstrating the need for conditioning on source text for automatic post-editing. They also highlight blind spots in automatic methods for targeted evaluation and demonstrate the need for human assessment to evaluate document-level translation quality reliably.
Original language: English
Title of host publication: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021)
Place of Publication: Online
Publisher: Linköping University Electronic Press, Sweden
Pages: 326-335
Number of pages: 10
ISBN (Electronic): 978-91-7929-614-8
Publication status: Published - 31 May 2021
Event: 23rd Nordic Conference on Computational Linguistics - Online, Reykjavik, Iceland
Duration: 31 May 2021 to 2 Jun 2021
https://nodalida2021.github.io/

Publication series

ISSN (Print): 1650-3686
ISSN (Electronic): 1650-3740

Conference

Conference: 23rd Nordic Conference on Computational Linguistics
Abbreviated title: NoDaLiDa 2021
Country/Territory: Iceland
City: Reykjavik
Period: 31/05/21 to 2/06/21