Corrected co-training (Pierce & Cardie, 2001) and the closely related co-testing (Muslea et al., 2000) are active learning methods that exploit redundant views to reduce the cost of manually creating labeled training data. We extend these methods to statistical parsers for natural language. Because creating complex parse structures by hand is significantly more time-consuming than selecting labels from a small set, it may be easier for a human annotator to correct the learner's partially accurate output than to generate the complex label from scratch. The goal of our work is to minimize the number of corrections the annotator must make. To reduce the human effort involved in correcting machine-parsed sentences, we propose a novel approach, which we call one-sided corrected co-training, and show that this method requires only a third as many manual annotation decisions as corrected co-training/co-testing to achieve the same improvement in performance.
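The selection-and-correction loop described in the abstract can be sketched in miniature. This is a hedged illustration, not the authors' implementation: the two views are stood in by toy "parsers" that memorize corrected examples, and the human annotator is a callback (`oracle`). The key idea shown is that sentences on which the two views disagree are selected, and only one view's output is handed to the human for correction ("one-sided"); the corrected parse then retrains both views. All names below (`ToyParser`, `oracle`, the flat-bracketing default guess) are illustrative assumptions.

```python
# Hedged sketch of one-sided corrected co-training, assuming two redundant
# views of the parsing task. The ToyParser stand-ins and the oracle callback
# are hypothetical; a real setting would use two statistical parsers and a
# human annotator correcting parser A's output.

from dataclasses import dataclass, field


@dataclass
class ToyParser:
    """Stand-in for one view: memorizes corrected examples, guesses otherwise."""
    name: str
    training_data: dict = field(default_factory=dict)

    def parse(self, sentence: str) -> str:
        # Return the stored parse if this sentence was seen in training,
        # otherwise fall back to a naive flat bracketing.
        if sentence in self.training_data:
            return self.training_data[sentence]
        return f"(S {sentence})"

    def train(self, sentence: str, parse: str) -> None:
        self.training_data[sentence] = parse


def one_sided_corrected_cotraining(parser_a, parser_b, unlabelled, oracle):
    """Select sentences where the two views disagree; the oracle (human)
    corrects only parser_a's output; both views train on the corrected
    parse. Returns the number of manual corrections made."""
    corrections = 0
    for sentence in unlabelled:
        pa = parser_a.parse(sentence)
        pb = parser_b.parse(sentence)
        if pa != pb:                     # disagreement => informative example
            gold = oracle(sentence, pa)  # human corrects view A's parse only
            corrections += 1
            parser_a.train(sentence, gold)
            parser_b.train(sentence, gold)
    return corrections
```

In this sketch the cost being minimized is the `corrections` counter: agreed-upon sentences cost the annotator nothing, and disagreements cost one correction rather than two, which is where the reported saving over symmetric corrected co-training/co-testing would come from.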
|Title of host publication||Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003)|
|Number of pages||8|
|Publication status||Published - 2003|
|Project||Wide coverage parsing and grammar induction using CCG (30/09/00 → 29/09/03, finished)|