Graph Convolutional Encoders for Syntax-aware Neural Machine Translation

Joost Bastings, Ivan Titov, Wilker Aziz, Diego Marcheggiani, Khalil Sima'an

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present a simple and effective approach to incorporating syntactic structure into neural attention-based encoder-decoder models for machine translation. We rely on graph-convolutional networks (GCNs), a recent class of neural networks developed for modeling graph-structured data. Our GCNs use predicted syntactic dependency trees of source sentences to produce representations of words (i.e., hidden states of the encoder) that are sensitive to their syntactic neighborhoods. GCNs take word representations as input and produce word representations as output, so they can easily be incorporated as layers into standard encoders (e.g., on top of bidirectional RNNs or convolutional neural networks). We evaluate their effectiveness with English-German and English-Czech translation experiments for different types of encoders and observe substantial improvements over their syntax-agnostic versions in all the considered setups.
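To make the abstract's architectural idea concrete, the following is a minimal, hypothetical sketch of one syntactic GCN layer, written in PyTorch as an assumption (the authors' own implementation is not reproduced here). It keeps only the direction-specific weight matrices for incoming edges, outgoing edges, and self loops, and omits the paper's edge-label parameters and edge gating; the class name SyntacticGCNLayer and the adjacency convention are illustrative choices, not the authors' code.

import torch
import torch.nn as nn

class SyntacticGCNLayer(nn.Module):
    # Hypothetical sketch: one graph-convolutional layer over a dependency
    # tree. Each word's new representation sums transformed states of its
    # syntactic neighbours, with separate weights for incoming edges
    # (from heads), outgoing edges (from dependents), and a self loop.
    # Edge-label biases and edge gates from the paper are omitted.
    def __init__(self, dim):
        super().__init__()
        self.w_in = nn.Linear(dim, dim)    # messages along head -> dependent
        self.w_out = nn.Linear(dim, dim)   # messages along dependent -> head
        self.w_self = nn.Linear(dim, dim)  # self-loop transformation

    def forward(self, h, adj):
        # h:   (batch, seq_len, dim) word representations, e.g. BiRNN states
        # adj: (batch, seq_len, seq_len) float tensor; adj[b, i, j] = 1.0
        #      when word j is the syntactic head of word i
        from_heads = torch.bmm(adj, self.w_in(h))
        from_deps = torch.bmm(adj.transpose(1, 2), self.w_out(h))
        return torch.relu(from_heads + from_deps + self.w_self(h))

# Example with dummy inputs: because the layer maps (batch, seq_len, dim)
# to the same shape, it can be stacked on top of any encoder that produces
# per-word states, which is the property the abstract emphasises.
layer = SyntacticGCNLayer(512)
h = torch.randn(2, 10, 512)   # stand-in for encoder output
adj = torch.zeros(2, 10, 10)  # stand-in for dependency adjacency
h_syntax = layer(h, adj)      # same shape: (2, 10, 512)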
Original language: English
Title of host publication: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)
Publisher: Association for Computational Linguistics
Pages: 1957–1967
Number of pages: 11
Publication status: E-pub ahead of print - 11 Sep 2017
Event: EMNLP 2017: Conference on Empirical Methods in Natural Language Processing - Copenhagen, Denmark
Duration: 7 Sep 2017 – 11 Sep 2017
Internet address: http://emnlp2017.net/

Conference

Conference: EMNLP 2017: Conference on Empirical Methods in Natural Language Processing
Abbreviated title: EMNLP 2017
Country/Territory: Denmark
City: Copenhagen
Period: 7/09/17 – 11/09/17
