Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Current approaches to learning vector representations of text that are compatible across languages usually require some amount of parallel text, aligned at the word, sentence, or at least document level. We hypothesize, however, that different natural languages share enough semantic structure that it should be possible, in principle, to learn compatible vector representations just by analyzing the monolingual distribution of words. In order to evaluate this hypothesis, we propose a scheme that maps word vectors trained on a source language to vectors semantically compatible with word vectors trained on a target language, using an adversarial autoencoder. We present preliminary qualitative results and discuss possible future developments of this technique, such as applications to cross-lingual sentence representations.
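The core adversarial idea in the abstract — transform source-language vectors so that a discriminator cannot tell them apart from target-language vectors — can be illustrated with a toy sketch. This is not the authors' implementation: the synthetic embeddings, the purely linear mapper, the logistic-regression discriminator, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20                      # toy setup: words per language, embedding dim
X = rng.normal(size=(n, d))         # "source language" embeddings (synthetic)
R = np.linalg.qr(rng.normal(size=(d, d)))[0]
Y = X @ R + 0.01 * rng.normal(size=(n, d))  # "target" space = rotated source + noise

W = np.eye(d)                       # linear mapper, initialised to identity
u, b = np.zeros(d), 0.0             # logistic-regression discriminator parameters
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
lr = 0.1

for step in range(200):
    Z = X @ W                                   # mapped source vectors
    # Discriminator step: label target vectors 1, mapped source vectors 0,
    # and take one gradient step on the cross-entropy loss.
    p_y, p_z = sigmoid(Y @ u + b), sigmoid(Z @ u + b)
    grad_u = (-(1 - p_y) @ Y + p_z @ Z) / n
    grad_b = -(1 - p_y).mean() + p_z.mean()
    u -= lr * grad_u
    b -= lr * grad_b
    # Mapper step: update W so that mapped source vectors are classified
    # as target vectors, i.e. the mapper tries to fool the discriminator.
    p_z = sigmoid(X @ W @ u + b)
    grad_W = -X.T @ ((1 - p_z)[:, None] * u[None, :]) / n
    W -= lr * grad_W

# Discriminator accuracy near 0.5 would mean the two distributions
# have become indistinguishable to this (weak) discriminator.
acc = ((sigmoid(X @ W @ u + b) < 0.5).mean()
       + (sigmoid(Y @ u + b) >= 0.5).mean()) / 2
```

The paper's actual model is an adversarial autoencoder rather than this bare linear GAN-style game, but the alternating discriminator/mapper updates convey the mechanism by which compatible representations could be learned without any parallel text.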
Original language: English
Title of host publication: Proceedings of the 1st Workshop on Representation Learning for NLP
Publisher: Association for Computational Linguistics
Pages: 121-126
Number of pages: 6
Publication status: Published - 11 Aug 2016
Event: 1st Workshop on Representation Learning for NLP - Berlin, Germany
Duration: 11 Aug 2016 - 11 Aug 2016
https://sites.google.com/site/repl4nlp2016/

Conference

Conference: 1st Workshop on Representation Learning for NLP
Abbreviated title: RepL4NLP 2016
Country: Germany
City: Berlin
Period: 11/08/16 - 11/08/16
