Deep Architectures for Neural Machine Translation
Abstract
It has been shown that increasing model depth improves the quality of neural machine translation. However, different architectural variants to increase model depth have been proposed, and so far, there has been no thorough comparative study. In this work, we describe and evaluate several existing approaches to introduce depth in neural machine translation. Additionally, we explore novel architectural variants, including deep transition RNNs, and we vary how attention is used in the deep decoder. We introduce a novel "BiDeep" RNN architecture that combines deep transition RNNs and stacked RNNs. Our evaluation is carried out on the English to German WMT news translation dataset, using a single-GPU machine for both training and inference. We find that several of our proposed architectures improve upon existing approaches in terms of speed and translation quality. We obtain best improvements with a BiDeep RNN of combined depth 8, obtaining an average improvement of 1.5 BLEU over a strong shallow baseline. We release our code for ease of adoption.
Original language | English |
---|---|
Title of host publication | Proceedings of the Second Conference on Machine Translation, Volume 1: Research Papers |
Place of Publication | Copenhagen, Denmark |
Publisher | Association for Computational Linguistics |
Pages | 99-107 |
Number of pages | 9 |
Publication status | Published - 8 Sep 2017 |
Event | Second Conference on Machine Translation, Copenhagen, Denmark. Duration: 7 Sep 2017 → 8 Sep 2017. http://www.statmt.org/wmt17/ |
Conference
Conference | Second Conference on Machine Translation |
---|---|
Abbreviated title | WMT17 |
Country/Territory | Denmark |
City | Copenhagen |
Period | 7/09/17 → 8/09/17 |
Internet address | http://www.statmt.org/wmt17/ |
Projects
- 3 Finished
- SUMMA - Scalable Understanding of Multilingual Media
  Renals, S., Birch-Mayne, A. & Cohen, S.
  1/02/16 → 31/01/19
  Project: Research
- HimL: Health in my Language
  Haddow, B., Birch-Mayne, A. & Webber, B.
  1/02/15 → 31/01/18
  Project: Research
Profiles
- Alexandra Birch-Mayne
  - School of Informatics - Reader in Natural Language Processing
  - Institute of Language, Cognition and Computation
  - Language, Interaction and Robotics
  Person: Academic: Research Active (Research Assistant)