Language Models Learn POS First

Naomi Saphra, Adam Lopez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A glut of recent research shows that language models capture linguistic structure. Linzen et al. (2016) found that LSTM-based language models may encode syntactic information sufficient to favor verbs which match the number of their subject nouns. Liu et al. (2018) suggested that the high performance of LSTMs may depend on the linguistic structure of the input data, as performance on several artificial tasks was higher with natural language data than with artificial sequential data.
Such work answers the question of whether a model represents linguistic structure. But how and when are these structures acquired? Rather than treating the training process itself as a black box, we investigate how representations of linguistic structure are learned over time. In particular, we demonstrate that different aspects of linguistic structure are learned at different rates, with part-of-speech tagging acquired early and global topic information learned continuously.
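
The abstract describes tracking when different kinds of linguistic structure emerge during training. As an illustration only, the sketch below probes language-model hidden states saved at a few training checkpoints with a linear classifier for POS labels; the checkpoint steps, dimensions, and random stand-in data are placeholders, and this is not the paper's actual analysis method.

```python
# Hedged sketch: a linear diagnostic-classifier probe for POS information,
# evaluated at several training checkpoints. Shapes, checkpoint steps, and
# the random stand-in data are assumptions, not the authors' setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def pos_probe_accuracy(hidden_states, pos_tags):
    """Train a linear probe on token representations; return held-out accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        hidden_states, pos_tags, test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000)
    probe.fit(X_train, y_train)
    return probe.score(X_test, y_test)

# Stand-in data: 5000 tokens, 256-dim LM states, 12 coarse POS classes.
rng = np.random.default_rng(0)
states_by_checkpoint = {step: rng.normal(size=(5000, 256))
                        for step in (1000, 10000, 100000)}
tags = rng.integers(0, 12, size=5000)

# If POS is learned early, probe accuracy should rise quickly and then plateau,
# while probes for slower-acquired properties (e.g. topic) keep improving.
for step, states in states_by_checkpoint.items():
    print(f"checkpoint {step}: POS probe accuracy = {pos_probe_accuracy(states, tags):.3f}")
```

With real checkpoints, the per-step accuracies would be plotted against training time to compare acquisition rates across linguistic properties.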
Original language: English
Title of host publication: Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
Place of publication: Brussels, Belgium
Publisher: ACL Anthology
Pages: 328-330
Number of pages: 3
Publication status: Published - Nov 2018
Event: 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP - Brussels, Belgium
Duration: 1 Nov 2018 - 1 Nov 2018
https://blackboxnlp.github.io/

Conference

Conference: 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
Country/Territory: Belgium
City: Brussels
Period: 1/11/18 - 1/11/18
Internet address: https://blackboxnlp.github.io/
