Abstract
When parsing morphologically-rich languages with neural models, it is beneficial to model input at the character level, and it has been claimed that this is because character-level models learn morphology. We test these claims by comparing character-level models to an oracle with access to explicit morphological analysis on twelve languages with varying morphological typologies. Our results highlight many strengths of character-level models, but also show that they are poor at disambiguating some words, particularly in the face of case syncretism. We then demonstrate that explicitly modeling morphological case improves our best model, showing that character-level models can benefit from targeted forms of explicit morphological modeling.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing |
| Place of Publication | Brussels, Belgium |
| Publisher | Association for Computational Linguistics |
| Pages | 2573-2583 |
| Number of pages | 11 |
| Publication status | Published - 1 Nov 2018 |
| Event | 2018 Conference on Empirical Methods in Natural Language Processing, Square Meeting Center, Brussels, Belgium. Duration: 31 Oct 2018 → 4 Nov 2018. http://emnlp2018.org/ |
Conference
| Conference | 2018 Conference on Empirical Methods in Natural Language Processing |
|---|---|
| Abbreviated title | EMNLP 2018 |
| Country/Territory | Belgium |
| City | Brussels |
| Period | 31/10/18 → 4/11/18 |
| Internet address | http://emnlp2018.org/ |