Understanding Neural Machine Translation by Simplification: The Case of Encoder-free Models

Gongbo Tang, Rico Sennrich, Joakim Nivre

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

In this paper, we try to understand neural machine translation (NMT) by simplifying NMT architectures and training encoder-free NMT models. In an encoder-free model, the sums of word embeddings and positional embeddings represent the source. The decoder is a standard Transformer or recurrent neural network that attends directly to these embeddings via attention mechanisms. Experimental results show (1) that the attention mechanism in encoder-free models acts as a strong feature extractor, (2) that the word embeddings in encoder-free models are competitive with those in conventional models, (3) that non-contextualized source representations lead to a substantial performance drop, and (4) that encoder-free models have different effects on alignment quality for German→English and Chinese→English.
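To make the described architecture concrete, below is a minimal sketch in PyTorch (not the authors' released code): the source side is represented only by the sum of word embeddings and sinusoidal positional embeddings, and a standard Transformer decoder attends directly to these non-contextualized embeddings. The class name EncoderFreeTransformer, the hyperparameters, and the toy vocabulary sizes are illustrative assumptions.

import math
import torch
import torch.nn as nn

def sinusoidal_positions(max_len: int, d_model: int) -> torch.Tensor:
    # Standard sinusoidal positional encodings (Vaswani et al., 2017).
    pos = torch.arange(max_len).unsqueeze(1).float()
    div = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe

class EncoderFreeTransformer(nn.Module):
    # Hypothetical sketch of an encoder-free NMT model: no encoder layers,
    # the decoder's cross-attention reads raw source embeddings + positions.
    def __init__(self, src_vocab, tgt_vocab, d_model=512, nhead=8, num_layers=6, max_len=512):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        self.register_buffer("pos", sinusoidal_positions(max_len, d_model))
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # The "memory" is just word + positional embeddings: unlike a
        # conventional Transformer, no self-attention contextualizes the source.
        memory = self.src_embed(src_ids) + self.pos[: src_ids.size(1)]
        tgt = self.tgt_embed(tgt_ids) + self.pos[: tgt_ids.size(1)]
        t = tgt_ids.size(1)
        causal = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        hidden = self.decoder(tgt, memory, tgt_mask=causal)
        return self.out(hidden)

# Toy usage with assumed vocabulary sizes of 1000.
model = EncoderFreeTransformer(src_vocab=1000, tgt_vocab=1000)
logits = model(torch.randint(0, 1000, (2, 7)), torch.randint(0, 1000, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 1000])

The recurrent variant mentioned in the abstract follows the same idea, replacing the Transformer decoder with an RNN decoder whose attention reads the source embedding matrix.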
Original language: English
Title of host publication: Recent Advances in Natural Language Processing 2019
Subtitle of host publication: RANLP 2019: Natural Language Processing in a Deep Learning World
Publisher: INCOMA Ltd.
Pages: 1186-1193
Number of pages: 8
ISBN (Electronic): 978-954-452-056-4
ISBN (Print): 978-954-452-055-7
Publication status: Published - 30 Sept 2019
Event: Recent Advances in Natural Language Processing (RANLP 2019) - Cherno More Hotel, Varna, Bulgaria
Duration: 2 Sept 2019 - 4 Sept 2019
http://lml.bas.bg/ranlp2019/start.php

Publication series

Name: Natural Language Processing in a Deep Learning World
Publisher: INCOMA Ltd.
ISSN (Print): 1313-8502
ISSN (Electronic): 2603-2813

Conference

Conference: Recent Advances in Natural Language Processing (RANLP 2019)
Abbreviated title: RANLP 2019
Country/Territory: Bulgaria
City: Varna
Period: 2/09/19 - 4/09/19
