Abstract / Description of output
The transfer or sharing of knowledge between languages is a potential solution to resource scarcity in NLP. However, the effectiveness of cross-lingual transfer can be challenged by variation in syntactic structures. Frameworks such as Universal Dependencies (UD) are designed to be cross-lingually consistent, but even in carefully designed resources, trees representing equivalent sentences may not always overlap. In this paper, we measure cross-lingual syntactic variation, or anisomorphism, in the UD treebank collection, considering both morphological and structural properties. We show that reducing the level of anisomorphism yields consistent gains in cross-lingual transfer tasks. We introduce a source language selection procedure that facilitates effective cross-lingual parser transfer, and propose a typologically driven method for syntactic tree processing which reduces anisomorphism. Our results show the effectiveness of this method for both machine translation and cross-lingual sentence similarity, demonstrating the importance of syntactic structure compatibility for boosting cross-lingual transfer in NLP.
Original language | English |
---|---|
Title of host publication | Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) |
Editors | Iryna Gurevych, Yusuke Miyao |
Place of Publication | Stroudsburg, PA, USA |
Publisher | Association for Computational Linguistics |
Pages | 1531-1542 |
Number of pages | 12 |
Volume | 1 |
ISBN (Electronic) | 978-1-948087-32-2 |
Publication status | Published - 1 Jul 2018 |
Event | 56th Annual Meeting of the Association for Computational Linguistics, Melbourne Convention and Exhibition Centre, Melbourne, Australia. Duration: 15 Jul 2018 → 20 Jul 2018. http://acl2018.org/ |
Conference
Conference | 56th Annual Meeting of the Association for Computational Linguistics |
---|---|
Abbreviated title | ACL 2018 |
Country/Territory | Australia |
City | Melbourne |
Period | 15/07/18 → 20/07/18 |