Zero-Resource Neural Machine Translation with Monolingual Pivot Data

Anna Currey, Kenneth Heafield

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Zero-shot neural machine translation (NMT) is a framework that uses source-pivot and target-pivot parallel data to train a source-target NMT system. An extension to zero-shot NMT is zero-resource NMT, which generates pseudo-parallel corpora using a zero-shot system and further trains the zero-shot system on that data. In this paper, we expand on zero-resource NMT by incorporating monolingual data in the pivot language into training; since the pivot language is usually the highest-resource language of the three, we expect monolingual pivot-language data to be most abundant. We propose methods for generating pseudo-parallel corpora using pivot-language monolingual data and for leveraging the pseudo-parallel corpora to improve the zero-shot NMT system. We evaluate these methods for a high-resource language pair (German-Russian) using English as the pivot. We show that our proposed methods yield consistent improvements over strong zero-shot and zero-resource baselines and even catch up to pivot-based models in BLEU (while not requiring the two-pass inference that pivot models require).
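To make the pseudo-parallel construction concrete, the sketch below shows one way source-target pairs could be assembled from pivot-language monolingual text: translate each pivot sentence into both the source and the target language and pair the two outputs. The function name build_pseudo_parallel and the stand-in translators are illustrative assumptions, not the paper's implementation.

```python
from typing import Callable, Iterable, List, Tuple


def build_pseudo_parallel(
    pivot_sentences: Iterable[str],
    pivot_to_source: Callable[[str], str],
    pivot_to_target: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Turn pivot-language monolingual sentences into pseudo-parallel
    source-target pairs by translating each pivot sentence into both
    the source and the target language and pairing the two outputs."""
    pairs = []
    for sentence in pivot_sentences:
        source_side = pivot_to_source(sentence)  # e.g. English -> German
        target_side = pivot_to_target(sentence)  # e.g. English -> Russian
        pairs.append((source_side, target_side))
    return pairs


# Stand-in translators for illustration only; a real setup would call the
# trained pivot->source and pivot->target NMT models here.
if __name__ == "__main__":
    demo = build_pseudo_parallel(
        ["The cat sits on the mat."],
        pivot_to_source=lambda s: "<German translation of: %s>" % s,
        pivot_to_target=lambda s: "<Russian translation of: %s>" % s,
    )
    print(demo)
```

The resulting source-target pairs would then be mixed into the training data of the zero-shot source-target system, as the abstract describes.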
Original language: English
Title of host publication: Proceedings of the 3rd Workshop on Neural Generation and Translation (WNGT 2019)
Place of Publication: Hong Kong
Publisher: Association for Computational Linguistics (ACL)
Pages: 99–107
Number of pages: 9
ISBN (Print): 978-1-950737-83-3
Publication status: Published - 4 Nov 2019
Event: The 3rd Workshop on Neural Generation and Translation, at EMNLP-IJCNLP 2019 - Hong Kong, Hong Kong
Duration: 4 Nov 2019 – 4 Nov 2019
https://sites.google.com/view/wngt19/home

Workshop

Workshop: The 3rd Workshop on Neural Generation and Translation
Abbreviated title: WNGT 2019
Country/Territory: Hong Kong
City: Hong Kong
Period: 4/11/19 – 4/11/19
Internet address: https://sites.google.com/view/wngt19/home
