Zero-Shot Cross-Lingual Transfer is a Hard Baseline to Beat in German Fine-Grained Entity Typing

Sabine Weber, Mark Steedman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The training of NLP models often requires large amounts of labelled training data, which makes it difficult to expand existing models to new languages. While zero-shot cross-lingual transfer relies on multilingual word embeddings to apply a model trained on one language to another, Yarowsky and Ngai (2001) propose the method of annotation projection to generate training data without manual annotation. This method was successfully used for the tasks of named entity recognition and coarse-grained entity typing, but we show that it is outperformed by zero-shot cross-lingual transfer when applied to the similar task of fine-grained entity typing. In our study of fine-grained entity typing with the FIGER type ontology for German, we show that annotation projection amplifies the English model’s tendency to underpredict level 2 labels and is beaten by zero-shot cross-lingual transfer on three novel test sets.
Original language: English
Title of host publication: Proceedings of the Second Workshop on Insights from Negative Results in NLP
Editors: João Sedoc, Anna Rogers, Anna Rumshisky, Shabnam Tafreshi
Place of Publication: Stroudsburg, PA, United States
Publisher: Association for Computational Linguistics (ACL)
Pages: 42-48
Number of pages: 7
ISBN (Electronic): 978-1-954085-93-0
Publication status: Published - 10 Nov 2021
Event: Workshop on Insights from Negative Results in NLP 2021 - Online, Punta Cana, Dominican Republic
Duration: 10 Nov 2021 → 10 Nov 2021
https://insights-workshop.github.io/

Conference

Conference: Workshop on Insights from Negative Results in NLP 2021
Country/Territory: Dominican Republic
City: Punta Cana
Period: 10/11/21 → 10/11/21