Indicatements that character language models learn English morpho-syntactic units and regularities

Yova Kementchedjhieva, Adam Lopez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Character language models have access to surface morphological patterns, but it is not clear whether or how they learn abstract morphological regularities. We instrument a character language model with several probes, finding that it can develop a specific unit to identify word boundaries and, by extension, morpheme boundaries, which allows it to capture linguistic properties and regularities of these units. Our language model proves surprisingly good at identifying the selectional restrictions of English derivational morphemes, a task that requires both morphological and syntactic awareness. Thus we conclude that, when morphemes overlap extensively with the words of a language, a character language model can perform morphological abstraction.
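The abstract describes instrumenting a character language model with diagnostic probes that read the model's hidden states. As a rough illustration of the general probing recipe (not the authors' code), the sketch below trains a tiny PyTorch character-level LSTM on next-character prediction, freezes it, and then trains a separate linear probe on its hidden states to predict word boundaries. The toy corpus, model sizes, and training setup are all assumptions chosen to keep the example short and runnable.

```python
# Illustrative sketch only: this is NOT the authors' implementation.
# A tiny character-level LSTM language model is trained on next-character
# prediction; a separate linear "probe" is then trained on its frozen hidden
# states to predict word boundaries, testing whether the states encode them.
import torch
import torch.nn as nn

text = "the cat sat on the mat and the dog ran to the mat "
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
ids = torch.tensor([stoi[c] for c in text])

# Probe target: 1 if the character at position i is word-final
# (i.e. the next character is a space), else 0.
boundary = torch.tensor(
    [1.0 if i + 1 < len(text) and text[i + 1] == " " else 0.0
     for i in range(len(text))]
)

class CharLM(nn.Module):
    def __init__(self, vocab, emb=16, hid=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return self.out(h), h  # next-character logits, hidden states

lm = CharLM(len(chars))
opt = torch.optim.Adam(lm.parameters(), lr=1e-2)
x, y = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)
for _ in range(300):  # train the LM on next-character prediction
    logits, _ = lm(x)
    loss = nn.functional.cross_entropy(logits.squeeze(0), y.squeeze(0))
    opt.zero_grad(); loss.backward(); opt.step()

# Freeze the LM: the probe only sees hidden states, so any boundary signal
# it recovers must already be encoded in them.
with torch.no_grad():
    _, hidden = lm(x)

probe = nn.Linear(64, 1)
popt = torch.optim.Adam(probe.parameters(), lr=1e-2)
target = boundary[:-1].unsqueeze(0)  # align labels with the LM's input positions
for _ in range(300):
    pred = probe(hidden).squeeze(-1)
    ploss = nn.functional.binary_cross_entropy_with_logits(pred, target)
    popt.zero_grad(); ploss.backward(); popt.step()

acc = ((torch.sigmoid(probe(hidden).squeeze(-1)) > 0.5).float() == target).float().mean()
print(f"word-boundary probe accuracy on the toy corpus: {acc.item():.2f}")
```

The paper's actual probes are richer, e.g. locating a single hidden unit that tracks word boundaries and testing the selectional restrictions of derivational morphemes, but the underlying recipe is the same: freeze the language model and fit a small classifier on its internal representations.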
Original language: English
Title of host publication: Proceedings of the Workshop on Analyzing and Interpreting Neural Networks for NLP 2018
Place of Publication: Brussels, Belgium
Publisher: Association for Computational Linguistics
Pages: 145-153
Number of pages: 9
Publication status: Published - Nov 2018
Event: Analyzing and Interpreting Neural Networks for NLP (co-located with EMNLP 2018) - Brussels, Belgium
Duration: 1 Nov 2018 - 1 Nov 2018
https://blackboxnlp.github.io/

Workshop

Workshop: Analyzing and Interpreting Neural Networks for NLP
Country/Territory: Belgium
City: Brussels
Period: 1/11/18 - 1/11/18
Internet address: https://blackboxnlp.github.io/
