On the Impact of Various Types of Noise on Neural Machine Translation

Huda Khayrallah, Philipp Koehn

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We examine how various types of noise in the parallel training data impact the quality of neural machine translation systems. We create five types of artificial noise and analyze how they degrade performance in neural and statistical machine translation. We find that neural models are generally more harmed by noise than statistical models. For one especially egregious type of noise they learn to just copy the input sentence.
Original language: English
Title of host publication: Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
Place of Publication: Melbourne, Australia
Publisher: Association for Computational Linguistics
Pages: 74-83
Number of pages: 10
Publication status: Published - 20 Jul 2018
Event: 2nd Workshop on Neural Machine Translation and Generation - Melbourne, Australia
Duration: 15 Jul 2018 to 20 Jul 2018
https://sites.google.com/site/wnmt18/home
https://sites.google.com/site/wnmt18/

Conference

Conference: 2nd Workshop on Neural Machine Translation and Generation
Abbreviated title: WNMT 2018
Country: Australia
City: Melbourne
Period: 15/07/18 to 20/07/18
