Image Pivoting for Learning Multilingual Multimodal Representations
Abstract
In this paper we propose a model to learn multimodal multilingual representations for matching images and sentences in different languages, with the aim of advancing multilingual versions of image search and image understanding. Our model learns a common representation for images and their descriptions in two different languages (which need not be parallel) by considering the image as a pivot between two languages. We introduce a new pairwise ranking loss function which can handle both symmetric and asymmetric similarity between the two modalities. We evaluate our models on image-description ranking for German and English, and on semantic textual similarity of image descriptions in English. In both cases we achieve state-of-the-art performance.
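The loss described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the margin value, the cosine score used for the symmetric case, the order-violation penalty used for the asymmetric case, and every function name below are assumptions chosen for illustration. Matched caption-image pairs are assumed to share the same row index within a batch, and the two languages use separate batches, since the descriptions need not be parallel.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row of x to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def symmetric_scores(u, v):
    """Cosine similarity: S[i, j] = cos(u[i], v[j]); swapping u and v transposes S."""
    return l2_normalize(u) @ l2_normalize(v).T


def asymmetric_scores(u, v):
    """Assumed order-violation score: S[i, j] = -||max(0, v[j] - u[i])||^2.

    Swapping u and v generally changes the values, so the score is asymmetric.
    """
    diff = v[None, :, :] - u[:, None, :]  # shape (len(u), len(v), dim)
    return -(np.maximum(0.0, diff) ** 2).sum(axis=-1)


def pairwise_ranking_loss(scores, margin=0.1):
    """Max-margin ranking loss over a square score matrix.

    scores[i, i] is the score of a matched pair; every off-diagonal entry is
    a mismatched pair that should score at least `margin` lower.
    """
    pos = np.diag(scores)
    # Row direction: item i should prefer its own match over any other column j.
    cost_rows = np.maximum(0.0, margin - pos[:, None] + scores)
    # Column direction: item j should prefer its own match over any other row i.
    cost_cols = np.maximum(0.0, margin - pos[None, :] + scores)
    off_diagonal = 1.0 - np.eye(scores.shape[0])
    return float(((cost_rows + cost_cols) * off_diagonal).sum())


def pivot_loss(img_a, cap_a, img_b, cap_b, score_fn=symmetric_scores, margin=0.1):
    """Sum of caption-image ranking losses for two languages.

    (img_a, cap_a) and (img_b, cap_b) are separate batches, one per language;
    the shared image space acts as the pivot between the two languages.
    """
    loss_a = pairwise_ranking_loss(score_fn(cap_a, img_a), margin)
    loss_b = pairwise_ranking_loss(score_fn(cap_b, img_b), margin)
    return loss_a + loss_b
```

Passing `score_fn=asymmetric_scores` to `pivot_loss` switches the same ranking objective from the symmetric to the asymmetric score, which is one simple way a single loss can cover both cases.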
Original language | English |
---|---|
Title of host publication | Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing |
Place of Publication | Copenhagen, Denmark |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 2839-2845 |
Number of pages | 7 |
Publication status | Published - 11 Sep 2017 |
Event | EMNLP 2017: Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark. Duration: 7 Sep 2017 → 11 Sep 2017. http://emnlp2017.net/ |
Conference
Conference | EMNLP 2017: Conference on Empirical Methods in Natural Language Processing |
---|---|
Abbreviated title | EMNLP 2017 |
Country/Territory | Denmark |
City | Copenhagen |
Period | 7/09/17 → 11/09/17 |