Improving Neural Machine Translation Models with Monolingual Data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Neural Machine Translation (NMT) has obtained state-of-the-art performance for several language pairs, while only using parallel data for training. Monolingual data plays an important role in boosting fluency for phrase-based statistical machine translation, and we investigate the use of monolingual data for NMT. In contrast to previous work, which integrates a separately trained RNN language model into an NMT architecture (Gülçehre et al., 2015), we note that encoder-decoder NMT architectures already have the capacity to learn the same information as a language model, and we explore strategies to include monolingual training data in the training process. Through our use of monolingual data, we obtain substantial improvements on the WMT 15 task for English↔German (+2.8–3.4 BLEU), and on the low-resourced IWSLT 14 task for Turkish→English (+2.1–3.4 BLEU), obtaining new state-of-the-art results. We also show that fine-tuning on in-domain monolingual and parallel data gives substantial improvements for the IWSLT 15 task for English→German.
Original language: English
Title of host publication: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
Place of Publication: Berlin, Germany
Publisher: Association for Computational Linguistics (ACL)
Pages: 86-96
Number of pages: 11
Volume: 1: Long Papers
ISBN (Print): 978-1-945626-00-5
Publication status: Published - 12 Aug 2016
Event: 54th Annual Meeting of the Association for Computational Linguistics - Berlin, Germany
Duration: 7 Aug 2016 to 12 Aug 2016
https://mirror.aclweb.org/acl2016/

Conference

Conference: 54th Annual Meeting of the Association for Computational Linguistics
Abbreviated title: ACL 2016
Country/Territory: Germany
City: Berlin
Period: 7/08/16 to 12/08/16
