Abstract
Transformer-based neural machine translation (NMT) has recently achieved state-of-the-art performance on many machine translation tasks. However, recent work (Raganato and Tiedemann, 2018; Tang et al., 2018; Tran et al., 2018) has indicated that Transformer models may not learn syntactic structures as well as their recurrent neural network-based counterparts, particularly in low-resource cases. In this paper, we incorporate constituency parse information into a Transformer NMT model. We leverage linearized parses of the source training sentences in order to inject syntax into the Transformer architecture without modifying it. We introduce two methods: a multi-task machine translation and parsing model with a single encoder and decoder, and a mixed encoder model that learns to translate directly from parsed and unparsed source sentences. We evaluate our methods on low-resource translation from English into twenty target languages, showing consistent improvements for the multi-task technique of 1.3 BLEU on average across diverse target languages. We further evaluate the models on full-scale WMT tasks, finding that the multi-task model aids low- and medium-resource NMT but degrades high-resource English-German translation.
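To make the multi-task setup concrete, the sketch below shows one way a source-side constituency parse can be linearized into a token sequence and paired with the translation data for a shared encoder-decoder. This is a minimal illustration under assumed conventions, not the paper's actual preprocessing: the bracketing format, the `<2trg>`/`<2parse>` task tags, and the use of NLTK's `Tree` are assumptions made for the example.

```python
# Minimal sketch (assumptions, not the paper's exact pipeline) of linearizing
# a constituency parse and building multi-task training pairs for a single
# shared Transformer encoder and decoder.

from nltk.tree import Tree  # requires: pip install nltk


def linearize(node):
    """Flatten a constituency parse into a bracketed token string."""
    if isinstance(node, str):  # leaf: a plain word
        return node
    children = " ".join(linearize(child) for child in node)
    return f"( {node.label()} {children} )"


def make_multitask_pairs(source, target, parse):
    """Build two training examples from one parallel sentence:
    1. ordinary translation: source -> target translation
    2. auxiliary parsing task: source -> linearized source parse
    A task tag prepended to the source tells the shared model what to produce
    (the tag scheme here is illustrative only).
    """
    return [
        ("<2trg> " + source, target),
        ("<2parse> " + source, linearize(parse)),
    ]


if __name__ == "__main__":
    parse = Tree.fromstring("(S (NP (PRP I)) (VP (VBP like) (NP (NN tea))))")
    for src, tgt in make_multitask_pairs("I like tea", "Ich mag Tee", parse):
        print(src, "=>", tgt)
```

The mixed encoder variant described in the abstract could be simulated with the same linearization by training on both plain and parse-annotated source sentences mapped to the same target translation; the exact mixing scheme is described in the paper itself.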
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers) |
| Place of Publication | Florence, Italy |
| Publisher | Association for Computational Linguistics |
| Pages | 24-33 |
| Number of pages | 10 |
| DOIs | |
| Publication status | Published - 1 Aug 2019 |
| Event | ACL 2019 Fourth Conference on Machine Translation, Florence, Italy, 1 Aug 2019 → 2 Aug 2019 (http://www.statmt.org/wmt19/) |
Conference
| Conference | ACL 2019 Fourth Conference on Machine Translation |
|---|---|
| Abbreviated title | WMT19 |
| Country/Territory | Italy |
| City | Florence |
| Period | 1/08/19 → 2/08/19 |
| Internet address | http://www.statmt.org/wmt19/ |