Comparaison d’architectures neuronales pour l’analyse syntaxique en constituants

Maximin Coavoux, Benoit Crabbé

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

The article deals with lexicalized constituent parsing in a transition-based framework. Typical statistical approaches to this task rely on an unstructured representation of the lexicon: words are represented by discrete, unrelated symbols.
Instead, our proposal relies on dense vector representations (embeddings) that can encode similarity between symbols: words, part-of-speech tags, and phrase-structure symbols. The article studies and compares three increasingly complex
neural network architectures, each fed with symbol embeddings. The experiments suggest that the information carried by the embeddings is best exploited by a deep architecture with a non-linear hidden layer.
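The abstract's deepest variant, a transition classifier over concatenated symbol embeddings with a non-linear hidden layer, can be sketched as follows. This is a minimal illustration, not the paper's actual model: all sizes, names, and the choice of tanh and softmax are assumptions.

```python
import numpy as np

# Illustrative sketch of a feedforward transition scorer over symbol
# embeddings (words, POS tags, phrase-structure symbols). Hyperparameters
# and the exact architecture are hypothetical, not taken from the paper.
rng = np.random.default_rng(0)
n_words, n_tags, n_nt = 1000, 50, 30   # vocabulary sizes (hypothetical)
d = 16                                 # embedding dimension
context = 4                            # symbols read from the parser state
hidden = 32                            # non-linear hidden layer width
n_actions = 5                          # parser transitions (illustrative)

# Dense embedding tables: similar symbols can receive similar vectors,
# unlike the discrete, unrelated symbols of a classical lexicon.
E_word = 0.01 * rng.standard_normal((n_words, d))
E_tag  = 0.01 * rng.standard_normal((n_tags, d))
E_nt   = 0.01 * rng.standard_normal((n_nt, d))

W1 = 0.01 * rng.standard_normal((hidden, 3 * context * d))
b1 = np.zeros(hidden)
W2 = 0.01 * rng.standard_normal((n_actions, hidden))
b2 = np.zeros(n_actions)

def action_probs(word_ids, tag_ids, nt_ids):
    """Score parser transitions from one parsing configuration."""
    # Concatenate the embeddings of the context symbols into one input.
    x = np.concatenate([E_word[word_ids].ravel(),
                        E_tag[tag_ids].ravel(),
                        E_nt[nt_ids].ravel()])
    h = np.tanh(W1 @ x + b1)           # non-linear hidden layer
    logits = W2 @ h + b2
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

p = action_probs([1, 2, 3, 4], [5, 6, 7, 8], [0, 1, 2, 3])
```

The two shallower variants compared in the paper would correspond, roughly, to dropping the hidden layer and scoring transitions linearly from the concatenated embeddings.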
Original language: French
Title of host publication: 22nd Conference on Automatic Processing of Natural Languages (CAEN 2015)
Number of pages: 12
Publication status: Published - 2015
Event: 22nd Conference on Automatic Processing of Natural Languages - Caen, France
Duration: 22 Jun 2015 – 25 Jun 2015


Conference: 22nd Conference on Automatic Processing of Natural Languages
Abbreviated title: CAEN 2015
