LexFit: Lexical Fine-Tuning of Pretrained Language Models

Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Transformer-based language models (LMs) pretrained on large text collections implicitly store a wealth of lexical semantic knowledge, but it is non-trivial to extract that knowledge effectively from their parameters. Inspired by prior work on semantic specialization of static word embedding (WE) models, we show that it is possible to expose and enrich lexical knowledge from the LMs, that is, to specialize them to serve as effective and universal "decontextualized" word encoders even when fed input words "in isolation" (i.e., without any context). Their transformation into such word encoders is achieved through a simple and efficient lexical fine-tuning procedure (termed LexFit) based on dual-encoder network structures. Further, we show that LexFit can yield effective word encoders even with limited lexical supervision and, via cross-lingual transfer, in different languages without any readily available external knowledge. Our evaluation over four established, structurally different lexical-level tasks in 8 languages indicates the superiority of LexFit-based WEs over standard static WEs (e.g., fastText) and WEs from vanilla LMs. Other extensive experiments and ablation studies further profile the LexFit framework, and indicate best practices and performance variations across LexFit variants, languages, and lexical tasks, also directly questioning the usefulness of traditional WE models in the era of large neural models.
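The abstract's core idea, a dual-encoder setup that fine-tunes an LM so that words encoded "in isolation" land close together when they are lexically related, can be illustrated with a toy sketch. The vectors, the triplet-style margin loss, and the margin value below are illustrative assumptions for exposition, not the paper's actual training objective or hyperparameters:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def margin_loss(anchor, positive, negative, margin=0.5):
    """Triplet-style margin loss: push the similarity of a lexically
    related pair (anchor, positive) above that of an unrelated pair
    (anchor, negative) by at least `margin`. Zero loss means the
    encoder already separates the pairs sufficiently."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))

# Toy "decontextualized" encodings (hypothetical stand-ins for
# LM outputs when each word is fed without any context).
car  = [0.9, 0.1, 0.0]
auto = [0.8, 0.2, 0.1]   # near-synonym of "car"
fish = [0.0, 0.1, 0.9]   # unrelated word

loss = margin_loss(car, auto, fish)
```

In the actual LexFit procedure the two encoder towers share the pretrained LM's weights and are updated by backpropagating such a pairwise objective over lexical supervision (e.g., synonym pairs); this sketch only shows the scoring side that a gradient step would minimize.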
Original language: English
Title of host publication: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Place of Publication: Stroudsburg, PA, USA
Publisher: Association for Computational Linguistics
Pages: 5269-5283
Number of pages: 15
Volume: 1
ISBN (Electronic): 978-1-954085-52-7
Publication status: Published - 1 Aug 2021
Event: The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing - Bangkok, Thailand
Duration: 1 Aug 2021 - 6 Aug 2021
https://2021.aclweb.org/

Conference

Conference: The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing
Abbreviated title: ACL-IJCNLP 2021
Country/Territory: Thailand
City: Bangkok
Period: 1/08/21 - 6/08/21
