We present a distributional approach to the problem of inducing parameters for unseen words in probabilistic parsers. Our KNN-based algorithm uses distributional similarity over an unlabelled corpus to match unseen words to the most similar seen words, and can induce parameters for those unseen words without retraining the parser. We apply this to domain adaptation for three different parsers that employ fine-grained syntactic categories, which allows us to focus on modifying the lexicon while leaving the structure of the parser itself intact. We demonstrate improvements in dependency recovery of 2%-6% on novel vocabulary in biomedical text.
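The general idea in the abstract, matching an unseen word to its nearest seen words and borrowing their lexical parameters, can be illustrated with a minimal sketch. This is not the authors' implementation: the sparse context-count vectors, the cosine similarity measure, the similarity-weighted averaging, the function names `cosine` and `induce_lexical_params`, the category labels, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def induce_lexical_params(word, vectors, lexicon, k=3):
    """Estimate P(category | word) for an unseen word.

    The estimate is a similarity-weighted average of the category
    distributions of the k most similar words already in the lexicon.
    (Hypothetical scheme; the paper's exact weighting is not given here.)
    """
    target = vectors[word]
    scored = [
        (cosine(target, vectors[seen]), seen)
        for seen in lexicon
        if seen in vectors and seen != word
    ]
    neighbours = sorted(scored, reverse=True)[:k]

    mixed = defaultdict(float)
    for similarity, seen in neighbours:
        if similarity <= 0.0:
            continue  # ignore neighbours with no shared contexts
        for category, prob in lexicon[seen].items():
            mixed[category] += similarity * prob

    total = sum(mixed.values())
    if total == 0.0:
        return {}
    return {category: score / total for category, score in mixed.items()}


# Toy context counts, standing in for statistics from an unlabelled corpus.
vectors = {
    "protein": {"the": 2, "binds": 3, "of": 1},
    "enzyme": {"the": 2, "binds": 2, "of": 1},
    "bind": {"to": 3, "can": 2},
    "kinase": {"the": 1, "binds": 2, "of": 1},  # unseen by the parser
}

# Toy lexicon: P(category | word) for words seen in parser training data.
lexicon = {
    "protein": {"N": 1.0},
    "enzyme": {"N": 0.9, "N/N": 0.1},
    "bind": {"VERB": 1.0},
}

induced = induce_lexical_params("kinase", vectors, lexicon, k=2)
```

Because only the lexicon entry for the new word is computed, the parser's model itself is left untouched, which mirrors the "without retraining" property described above.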
|Title of host publication||Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis (Louhi)|
|Publisher||Association for Computational Linguistics|
|Number of pages||11|
|Publication status||Published - 2015|
|Title||Parser Adaptation to the Biomedical Domain without Re-Training|
Steedman, M., Geib, C. & Petrick, R.
1/01/10 → 31/12/15