Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation

Brian Thompson, Huda Khayrallah, Antonios Anastasopoulos, Arya D. McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, Philipp Koehn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component’s contribution to, and capacity for, domain adaptation. We find that freezing any single component during continued training has minimal impact on performance, and that performance is surprisingly good when a single component is adapted while holding the rest of the model fixed. We also find that continued training does not move the model very far from the out-of-domain model, compared to a sensitivity analysis metric, suggesting that the out-of-domain model can provide a good generic initialization for the new domain.
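As an illustration of the freezing idea described in the abstract, the sketch below disables gradients for one component of a generic PyTorch encoder-decoder model before continuing training on in-domain data. The toy model, module names, and training step are illustrative assumptions only; they are not the paper's actual NMT system or experimental setup.

```python
# Illustrative sketch (not the paper's code): freeze one component of a
# trained out-of-domain encoder-decoder model, then continue training on
# in-domain data so only the unfrozen parameters adapt.
import torch
import torch.nn as nn

class ToySeq2Seq(nn.Module):
    """Stand-in for a trained NMT model with the components analyzed in the paper."""
    def __init__(self, src_vocab=1000, tgt_vocab=1000, dim=64):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)   # source embedding space
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)   # target embedding space
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt_in):
        _, h = self.encoder(self.src_embed(src))
        dec_out, _ = self.decoder(self.tgt_embed(tgt_in), h)
        return self.out(dec_out)

def freeze(module: nn.Module):
    """Exclude a component from continued training by disabling its gradients."""
    for p in module.parameters():
        p.requires_grad = False

model = ToySeq2Seq()
# e.g. hold the decoder fixed and let the rest of the model adapt
freeze(model.decoder)

# Only unfrozen parameters are handed to the optimizer for continued training.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

# One continued-training step on a dummy in-domain batch (shapes are illustrative).
src = torch.randint(0, 1000, (8, 12))
tgt_in = torch.randint(0, 1000, (8, 10))
tgt_out = torch.randint(0, 1000, (8, 10))

logits = model(src, tgt_in)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, logits.size(-1)), tgt_out.reshape(-1)
)
loss.backward()
optimizer.step()
```

Freezing a different component (the encoder or either embedding matrix) only requires passing that module to freeze instead; the continued-training step itself is unchanged.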
Original language: English
Title of host publication: Proceedings of the Third Conference on Machine Translation: Research Papers
Place of publication: Brussels, Belgium
Publisher: Association for Computational Linguistics
Pages: 124-132
Number of pages: 9
Publication status: Published - 31 Oct 2018
Event: EMNLP 2018 Third Conference on Machine Translation (WMT18) - Brussels, Belgium
Duration: 31 Oct 2018 - 1 Nov 2018
http://www.statmt.org/wmt18/

Workshop

Workshop: EMNLP 2018 Third Conference on Machine Translation (WMT18)
Abbreviated title: WMT18
Country: Belgium
City: Brussels
Period: 31/10/18 - 1/11/18
Internet address: http://www.statmt.org/wmt18/
