Abstract
This article introduces new instantiations of three French constituency treebanks in which certain syntactic phenomena that give rise to long-distance dependencies are represented with discontinuous constituents. The resulting trees are mildly context-sensitive structures and can be modeled with, for example, linear context-free rewriting systems (LCFRS). We show that such structures can be parsed efficiently by introducing a neural transition-based discontinuous parser that also performs morphological analysis and functional tagging. Our experiments show that the sparsity of these phenomena in French treebanks makes both learning and evaluating discontinuous structures difficult.
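To make the notion of a discontinuous constituent concrete, here is a minimal Python sketch. It is not taken from the article or its parser: the sentence, the `VPinf` span, and the helper names `blocks` and `fan_out` are all illustrative assumptions. A constituent is represented by the set of token positions it covers, and it is discontinuous when those positions form more than one contiguous block. The number of blocks corresponds to the fan-out an LCFRS would need to derive that constituent.

```python
def blocks(positions):
    """Group token positions into maximal contiguous runs."""
    runs = []
    for p in sorted(positions):
        if runs and p == runs[-1][-1] + 1:
            runs[-1].append(p)   # extends the current run
        else:
            runs.append([p])     # starts a new run
    return runs


def fan_out(positions):
    """Number of contiguous blocks covered by a constituent."""
    return len(blocks(positions))


# Illustrative sentence (an assumption, not an example from the treebanks).
tokens = ["Que", "veut", "-il", "manger", "?"]

# Hypothetical infinitival constituent: the extracted object "Que" (0)
# together with its governing verb "manger" (3).
vp_inf = {0, 3}

# The whole sentence is a single contiguous span.
sentence = set(range(len(tokens)))

is_discontinuous = fan_out(vp_inf) > 1
```

Under these assumptions, `fan_out(vp_inf)` is 2 because the constituent has a gap, while `fan_out(sentence)` is 1.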
Translated title of the contribution | Representation and parsing of syntactic discontinuities in French constituent treebanks |
---|---|
Original language | French |
Title of host publication | Actes de la 24e conférence sur le Traitement Automatique des Langues Naturelles |
Place of Publication | Orléans, France |
Publisher | Association pour le Traitement Automatique des Langues (ATALA) |
Pages | 77-92 |
Number of pages | 16 |
Publication status | Published - 1 Jun 2017 |
Event | Automatic Processing of Natural Languages 2017, Orleans, France. Duration: 26 Jun 2017 → 30 Jun 2017. http://taln2017.cnrs.fr/ |
Conference
Conference | Automatic Processing of Natural Languages 2017 |
---|---|
Abbreviated title | TALN 2017 |
Country/Territory | France |
City | Orleans |
Period | 26/06/17 → 30/06/17 |
Keywords / Materials (for Non-textual outputs)
- Discontinuous constituents
- parsing
- deep learning
- Constituants discontinus
- analyse syntaxique