DCU-UvA Multimodal MT System Report

Iacer Calixto, Desmond Elliott, Stella Frank

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

We present a doubly-attentive multimodal machine translation model. Our model learns to attend to source-language words and to spatially preserving CONV5,4 visual features by means of two separate attention mechanisms in a neural machine translation model. In image description translation experiments (Task 1), we find an improvement of 2.3 Meteor points compared to initialising the hidden state of the decoder with only the FC7 features, and 2.9 Meteor points compared to a text-only neural machine translation baseline, confirming the usefulness of attending to the CONV5,4 features.
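The doubly-attentive decoding step described in the abstract can be sketched as follows. Each decoder time step attends independently over the source-word annotations and over the 14×14 grid of CONV5,4 spatial features, then conditions on both context vectors. This is a minimal illustrative sketch with additive (Bahdanau-style) scoring; the scoring function, dimensions, and all variable names here are assumptions for illustration, not the authors' exact implementation.

```python
import numpy as np

def soft_attention(query, keys, W_q, W_k, v):
    """Additive soft attention: score each key against the query,
    softmax-normalise, and return the weighted context vector."""
    scores = np.tanh(keys @ W_k + query @ W_q) @ v   # one score per location
    weights = np.exp(scores - scores.max())          # stable softmax
    weights /= weights.sum()
    context = weights @ keys                         # convex combination of keys
    return context, weights

rng = np.random.default_rng(0)
hid, txt_dim, img_dim, att = 8, 6, 512, 10

h = rng.standard_normal(hid)               # decoder hidden state at time t
src = rng.standard_normal((5, txt_dim))    # annotations for 5 source words
img = rng.standard_normal((196, img_dim))  # 14x14 = 196 CONV5,4 feature vectors

# Two SEPARATE attention mechanisms (independent parameters):
ctx_txt, a_txt = soft_attention(
    h, src,
    rng.standard_normal((hid, att)),
    rng.standard_normal((txt_dim, att)),
    rng.standard_normal(att))
ctx_img, a_img = soft_attention(
    h, img,
    rng.standard_normal((hid, att)),
    rng.standard_normal((img_dim, att)),
    rng.standard_normal(att))

# The decoder conditions on both contexts when predicting the next word.
fused = np.concatenate([ctx_txt, ctx_img])
```

Because the two mechanisms have separate parameters, the model can place its textual and visual attention on unrelated positions at the same time step, which is the key difference from initialising the decoder once with global FC7 features.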
Original language: English
Title of host publication: Proceedings of the First Conference on Machine Translation, WMT 2016, colocated with ACL 2016, August 11-12, Berlin, Germany
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 5
Publication status: Published - 12 Aug 2016
Event: First Conference on Machine Translation - Berlin, Germany
Duration: 11 Aug 2016 - 12 Aug 2016


Conference: First Conference on Machine Translation
Abbreviated title: WMT16


