Abstract
We introduce a novel transition system for discontinuous constituency parsing. Instead of storing subtrees in a stack – i.e. a data structure with linear-time sequential access – the proposed system uses a set of parsing items, with constant-time random access. This change makes it possible to construct any discontinuous constituency tree in exactly 4n − 2 transitions for a sentence of length n, whereas existing systems need a quadratic number of transitions to derive some structures. At each parsing step, the parser considers every item in the set to be combined with a focus item and to construct a new constituent in a bottom-up fashion. The parsing strategy is based on the assumption that most syntactic structures can be parsed incrementally and that the set – the memory of the parser – remains reasonably small on average. Moreover, we introduce a dynamic oracle for the new transition system, and present the first experiments in discontinuous constituency parsing using a dynamic oracle. Our parser obtains state-of-the-art results on three English and German discontinuous treebanks.
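As a rough illustration of the set-plus-focus idea described in the abstract, the following minimal sketch simulates a derivation over token positions. It is not the authors' implementation: the action names (`shift`, `combine`), the item representation (a frozenset of token positions) and the convention that every structural step is followed by exactly one labelling step are assumptions made for this example. Under those assumptions, n shifts and n − 1 combines give 2n − 1 structural steps, hence 4n − 2 steps in total.

```python
def run(structural_actions, labels):
    """Simulate a set-based derivation (hypothetical action names).

    Each structural action is followed by one labelling step, where
    a label of None stands for "no constituent is created here".
    """
    memory = set()        # unordered set of items, any member reachable directly
    focus = None          # the item currently being built
    next_token = 0
    constituents = []
    steps = 0
    for action, label in zip(structural_actions, labels):
        if action == "shift":
            # Store the current focus in the set; the next token becomes the focus.
            if focus is not None:
                memory.add(focus)
            focus = frozenset([next_token])
            next_token += 1
        else:
            # ("combine", item): merge any item of the set into the focus.
            _, other = action
            memory.remove(other)  # raises KeyError if the item is not in the set
            focus = focus | other
        steps += 1
        # Labelling step (assumed to follow every structural step).
        if label is not None:
            constituents.append((label, tuple(sorted(focus))))
        steps += 1
    return constituents, steps


# Four tokens (positions 0..3) with a discontinuous constituent X over {0, 2}.
n = 4
item = frozenset
actions = [
    "shift",                         # focus = {0}
    "shift",                         # set = {{0}}, focus = {1}
    "shift",                         # set = {{0}, {1}}, focus = {2}
    ("combine", item([0])),          # reaches {0} directly, skipping {1}
    ("combine", item([1])),          # focus = {0, 1, 2}
    "shift",                         # focus = {3}
    ("combine", item([0, 1, 2])),    # focus = {0, 1, 2, 3}
]
labels = [None, None, None, "X", None, None, "S"]

constituents, steps = run(actions, labels)
print(constituents, steps)
```

The fourth action is the point of the sketch: the item covering position 0 is combined with the focus even though the item for position 1 was stored more recently, which a stack would not allow without extra reordering transitions.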
Original language | English |
---|---|
Title of host publication | Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics |
Editors | Jill Burstein, Christy Doran, Thamar Solorio |
Place of Publication | Minneapolis, Minnesota |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 204–217 |
Number of pages | 14 |
Volume | 1 |
DOIs | |
Publication status | Published - 7 Jun 2019 |
Event | 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, United States. Duration: 2 Jun 2019 → 7 Jun 2019. https://naacl2019.org/ |
Conference
Conference | 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics |
---|---|
Abbreviated title | NAACL-HLT 2019 |
Country/Territory | United States |
City | Minneapolis |
Period | 2/06/19 → 7/06/19 |
Internet address |