Corrected Co-training for Statistical Parsers

Rebecca Hwa, Miles Osborne, Anoop Sarkar, Mark Steedman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Corrected co-training (Pierce & Cardie, 2001) and the closely related co-testing (Muslea et al., 2000) are active learning methods that exploit redundant views to reduce the cost of manually creating labeled training data. We extend these methods to statistical parsing algorithms for natural language. Because creating complex parse structures by hand is significantly more time-consuming than selecting labels from a small set, it may be easier for the human to correct the learner's partially accurate output than to generate the complex label from scratch. The goal of our work is to minimize the number of corrections that the annotator must make. To reduce the human effort in correcting machine-parsed sentences, we propose a novel approach, which we call one-sided corrected co-training, and show that this method requires only a third as many manual annotation decisions as corrected co-training/co-testing to achieve the same improvement in performance.
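The selection-and-correction loop the abstract describes can be sketched in miniature. The sketch below is illustrative only and not the paper's method: the two "views" are reduced to single feature columns, each learner is a majority-label lookup table standing in for a statistical parser, disagreement between views flags examples for the annotator, and the oracle function plays the role of the human who corrects the output. All names (`train`, `predict`, `corrected_cotrain`) are hypothetical.

```python
# Toy sketch of disagreement-driven corrected co-training.
# Real corrected co-training uses two full feature views and real
# learners (here: statistical parsers); this just shows the loop shape.

def train(examples):
    """Majority-label lookup per feature value: a stand-in for a learner."""
    counts = {}
    for feat, label in examples:
        counts.setdefault(feat, {}).setdefault(label, 0)
        counts[feat][label] += 1
    return {feat: max(labels, key=labels.get) for feat, labels in counts.items()}

def predict(model, feat):
    return model.get(feat)  # None for unseen features

def corrected_cotrain(seed, pool, oracle):
    """Scan the unlabeled pool; where the two views disagree, ask the
    oracle (the human annotator) for a correction and retrain both
    learners on it. Returns both models and the correction count."""
    labeled_a = [(v1, y) for (v1, v2), y in seed]
    labeled_b = [(v2, y) for (v1, v2), y in seed]
    corrections = 0
    for v1, v2 in pool:
        model_a, model_b = train(labeled_a), train(labeled_b)
        if predict(model_a, v1) != predict(model_b, v2):
            y = oracle(v1, v2)          # human supplies the corrected label
            corrections += 1
            labeled_a.append((v1, y))
            labeled_b.append((v2, y))
    return train(labeled_a), train(labeled_b), corrections
```

In this sketch every disagreement costs one annotation decision; the paper's one-sided variant further reduces that cost, needing only a third as many decisions as corrected co-training/co-testing for the same performance gain.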
Original language: English
Title of host publication: Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003)
Number of pages: 8
Publication status: Published - 2003


