An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing

Marcin Junczys-Dowmunt, Roman Grundkiewicz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


In this work, we explore multiple neural architectures adapted for the task of automatic post-editing of machine translation output. We focus on neural end-to-end models that combine both inputs mt (raw MT output) and src (source language input) in a single neural architecture, modeling {mt, src} → pe directly. Apart from that, we investigate the influence of hard-attention models, which seem to be well-suited for monolingual tasks, as well as combinations of both ideas. We report results on data sets provided during the WMT-2016 shared task on automatic post-editing and demonstrate that dual-attention models that incorporate all available data in the APE scenario in a single model improve on the best shared-task system and on all results published since the shared task. Dual-attention models combined with hard attention remain competitive despite applying fewer changes to the input.
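The dual-attention idea described above attends separately over the encoded src sentence and the encoded mt output at each decoder step, then combines both context vectors with the decoder state. A minimal NumPy sketch of that combination step is shown below; all function and variable names here are illustrative, not the paper's actual implementation, and plain dot-product attention stands in for whatever attention variant the models use.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys):
    # dot-product attention: weight each encoder state by its
    # similarity to the decoder query, return the weighted sum
    weights = softmax(keys @ query)
    return weights @ keys

def dual_attention_step(dec_state, src_states, mt_states):
    # attend over the source sentence and the raw MT output
    # independently, then fuse both contexts with the decoder state
    ctx_src = attend(dec_state, src_states)
    ctx_mt = attend(dec_state, mt_states)
    return np.concatenate([dec_state, ctx_src, ctx_mt])

rng = np.random.default_rng(0)
d = 4
src_states = rng.normal(size=(5, d))  # 5 encoded src tokens
mt_states = rng.normal(size=(6, d))   # 6 encoded mt tokens
dec_state = rng.normal(size=d)        # current decoder state

fused = dual_attention_step(dec_state, src_states, mt_states)
print(fused.shape)  # (12,)
```

In a full model, the fused vector would feed the output projection that predicts the next post-edited (pe) token; this sketch only illustrates how the two inputs are consumed by a single architecture.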
Original language: English
Title of host publication: The 8th International Joint Conference on Natural Language Processing (IJCNLP 2017)
Publisher: Asian Federation of Natural Language Processing
Number of pages: 10
Publication status: Published - 1 Dec 2017
Event: The 8th International Joint Conference on Natural Language Processing - Taipei, Taiwan, Province of China
Duration: 27 Nov 2017 - 1 Dec 2017


Conference: The 8th International Joint Conference on Natural Language Processing
Abbreviated title: IJCNLP 2017
Country/Territory: Taiwan, Province of China
