Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation

Brian Thompson, Jeremy Gwinnup, Huda Khayrallah, Kevin Duh, Philipp Koehn

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Continued training is an effective method for domain adaptation in neural machine translation. However, in-domain gains from adaptation come at the expense of general-domain performance. In this work, we interpret the drop in general-domain performance as catastrophic forgetting of general-domain knowledge. To mitigate it, we adapt Elastic Weight Consolidation (EWC), a machine learning method for learning a new task without forgetting previous tasks. Our method retains the majority of general-domain performance lost in continued training without degrading in-domain performance, outperforming the previous state-of-the-art. We also explore the full range of general-domain performance available when some in-domain degradation is acceptable.
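For readers unfamiliar with EWC, the following is a minimal sketch of the penalty as it is commonly formulated: the in-domain loss plus a quadratic term that pulls each weight back toward its general-domain value, scaled by a diagonal Fisher information estimate. This record does not describe the paper's exact formulation, so the function names, the toy values, and the reading of the weight `lam` as the knob that trades in-domain gains against general-domain retention are all assumptions made for illustration.

```python
def ewc_penalty(params, anchor_params, fisher_diag, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - anchor_i)^2."""
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def ewc_loss(in_domain_loss, params, anchor_params, fisher_diag, lam):
    """In-domain loss plus the penalty for drifting from the general-domain weights."""
    return in_domain_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)


# Toy values, invented for illustration only (hypothetical, not from the paper).
general = [1.0, -2.0, 0.5]   # weights after general-domain training
adapted = [1.5, -2.0, 0.0]   # weights at some point during continued training
fisher = [4.0, 1.0, 0.0]     # diagonal Fisher estimates: importance of each weight

penalty = ewc_penalty(adapted, general, fisher, lam=2.0)
total = ewc_loss(3.0, adapted, general, fisher, lam=2.0)
```

In this toy case only the first weight is penalized: the second has not moved, and the third has a Fisher estimate of zero, so it is free to change. Under the stated assumption, raising `lam` would keep the model closer to the general-domain weights, and lowering it would allow more in-domain adaptation.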
Original language: English
Title of host publication: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)
Editors: Jill Burstein, Christy Doran, Thamar Solorio
Place of Publication: Minneapolis, Minnesota
Publisher: Association for Computational Linguistics
Pages: 2062–2068
Number of pages: 7
Volume: 1
Publication status: Published - 7 Jun 2019
Event: 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics - Minneapolis, United States
Duration: 2 Jun 2019 to 7 Jun 2019
https://naacl2019.org/

Conference

Conference: 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Abbreviated title: NAACL-HLT 2019
Country/Territory: United States
City: Minneapolis
Period: 2/06/19 to 7/06/19