Cross-lingual Semantic Specialization via Lexical Relation Induction

Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints. However, this technique cannot be leveraged in many languages, because their structured external resources are typically incomplete or non-existent. To bridge this gap, we propose a novel method that transfers specialization from a resource-rich source language (English) to virtually any target language. Our specialization transfer comprises two crucial steps: 1) Inducing noisy constraints in the target language through automatic word translation; and 2) Filtering the noisy constraints via a state-of-the-art relation prediction model trained on the source language constraints. This allows us to specialize any set of distributional vectors in the target language with the refined constraints. We prove the effectiveness of our method through intrinsic word similarity evaluation in 8 languages, and with 3 downstream tasks in 5 languages: lexical simplification, dialog state tracking, and semantic textual similarity. The gains over the previous state-of-the-art specialization methods are substantial and consistent across languages. Our results also suggest that the transfer method is effective even for lexically distant source-target language pairs. Finally, as a by-product, our method produces lists of WordNet-style lexical relations in resource-poor languages.
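
The two steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy English-Italian dictionary, and the lookup-table scorer standing in for the trained relation prediction model are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A constraint is (word1, word2, relation), e.g. ("hot", "cold", "antonym").
Constraint = Tuple[str, str, str]


def translate_constraints(
    source_constraints: List[Constraint],
    dictionary: Dict[str, List[str]],
) -> List[Constraint]:
    """Step 1 (sketch): induce noisy target-language constraints by
    translating both words of every source constraint."""
    noisy: List[Constraint] = []
    for w1, w2, rel in source_constraints:
        for t1 in dictionary.get(w1, []):
            for t2 in dictionary.get(w2, []):
                if t1 != t2:  # skip degenerate pairs such as (x, x)
                    noisy.append((t1, t2, rel))
    return noisy


def filter_constraints(
    noisy: List[Constraint],
    score: Callable[[Constraint], float],
    threshold: float = 0.5,
) -> List[Constraint]:
    """Step 2 (sketch): keep only constraints that a relation scorer
    accepts. `score` is a placeholder for a trained model."""
    return [c for c in noisy if score(c) >= threshold]


# Hypothetical toy data; a real setup would use automatic word translation.
source = [("cheap", "inexpensive", "synonym"), ("hot", "cold", "antonym")]
dictionary = {
    "cheap": ["economico"],
    "inexpensive": ["conveniente", "economico"],
    "hot": ["caldo"],
    "cold": ["freddo", "raffreddore"],  # second sense: the illness
}

# Hypothetical scores standing in for the relation prediction model.
toy_scores = {
    ("economico", "conveniente", "synonym"): 0.9,
    ("caldo", "freddo", "antonym"): 0.9,
    ("caldo", "raffreddore", "antonym"): 0.1,  # wrong sense, should be dropped
}

noisy = translate_constraints(source, dictionary)
refined = filter_constraints(noisy, lambda c: toy_scores.get(c, 0.0))
```

In this toy run, translation yields three candidate constraints and the filter discards the one produced by the wrong sense of "cold". The refined list is what a specialization procedure would then consume.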
Original language: English
Title of host publication: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Editors: Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Place of Publication: Stroudsburg, PA, USA
Publisher: Association for Computational Linguistics
Pages: 2206-2217
Number of pages: 12
ISBN (Electronic): 978-1-950737-90-1
Publication status: Published - 1 Nov 2019
Event: 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing - Hong Kong, Hong Kong
Duration: 3 Nov 2019 to 7 Nov 2019
https://www.emnlp-ijcnlp2019.org/

Conference

Conference: 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing
Abbreviated title: EMNLP-IJCNLP 2019
Country/Territory: Hong Kong
City: Hong Kong
Period: 3/11/19 to 7/11/19
