An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing

Marcin Junczys-Dowmunt, Roman Grundkiewicz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this work, we explore multiple neural architectures adapted for the task of automatic post-editing of machine translation output. We focus on neural end-to-end models that combine both inputs mt (raw MT output) and src (source-language input) in a single neural architecture, modeling {mt, src} → pe directly. In addition, we investigate the influence of hard-attention models, which seem well-suited for monolingual tasks, as well as combinations of both ideas. We report results on the data sets provided during the WMT-2016 shared task on automatic post-editing and demonstrate that dual-attention models that incorporate all available data in the APE scenario in a single model improve on the best shared-task system and on all other results published after the shared task. Dual-attention models combined with hard attention remain competitive despite applying fewer changes to the input.
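The core dual-attention idea described above can be illustrated with a minimal sketch: at each decoder step, one attention mechanism attends over the encoded mt sequence and another over the encoded src sequence, and the two context vectors are combined before prediction. This is a simplified NumPy illustration assuming plain dot-product attention and concatenation of the contexts; the paper's actual parameterization and model details will differ.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, keys):
    # dot-product attention: one score per encoder time step,
    # context = attention-weighted sum of encoder states
    scores = keys @ query          # shape (T,)
    weights = softmax(scores)      # shape (T,)
    return weights @ keys          # context vector, shape (d,)

def dual_attention_context(dec_state, enc_mt, enc_src):
    # one attention per input stream, contexts concatenated
    c_mt = attend(dec_state, enc_mt)    # context over raw MT output
    c_src = attend(dec_state, enc_src)  # context over source sentence
    return np.concatenate([c_mt, c_src])

rng = np.random.default_rng(0)
d = 8
enc_mt = rng.normal(size=(5, d))   # encoded mt states (T_mt x d), illustrative
enc_src = rng.normal(size=(7, d))  # encoded src states (T_src x d), illustrative
dec_state = rng.normal(size=d)     # current decoder hidden state
ctx = dual_attention_context(dec_state, enc_mt, enc_src)  # shape (2*d,)
```

The concatenated context (here of size 2d) would then feed the decoder's output layer, letting a single model condition on both inputs.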
Original language: English
Title of host publication: The 8th International Joint Conference on Natural Language Processing (IJCNLP 2017)
Publisher: Asian Federation of Natural Language Processing
Pages: 120-129
Number of pages: 10
Volume: 1
Publication status: Published - 1 Dec 2017
Event: The 8th International Joint Conference on Natural Language Processing - Taipei, Taiwan, Province of China
Duration: 27 Nov 2017 - 1 Dec 2017
http://ijcnlp2017.org/site/page.aspx?pid=901&sid=1133&lang=en

Conference

Conference: The 8th International Joint Conference on Natural Language Processing
Abbreviated title: IJCNLP 2017
Country: Taiwan, Province of China
City: Taipei
Period: 27/11/17 - 1/12/17
Internet address: http://ijcnlp2017.org/site/page.aspx?pid=901&sid=1133&lang=en
