We describe a method for predicting linguistic structure in a language for which only unlabeled data is available, using annotated data from one or more helper languages. Our approach is based on a model that locally mixes between supervised models from the helper languages. Parallel data is not used, so the technique can be applied even in domains where human-translated texts are unavailable. We obtain state-of-the-art performance on two structure prediction tasks: unsupervised part-of-speech tagging and unsupervised dependency parsing.
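To make the idea of locally mixing helper-language models concrete, here is a minimal sketch. It assumes that "locally" means each local distribution in the target-language model (for example, the distribution over the next tag given the current tag) is a convex combination of the corresponding distributions from the helper languages, with its own mixture weights. The abstract does not spell out these details, so the function name `mix_local_distribution`, the helper languages, the tag names, and all numbers below are hypothetical and chosen only for illustration. How the weights would be fit to unlabeled target-language data is not shown.

```python
from typing import Dict, List


def mix_local_distribution(
    helper_dists: List[Dict[str, float]],
    weights: List[float],
) -> Dict[str, float]:
    """Return a convex combination of helper-language distributions.

    Each entry of `helper_dists` is one helper language's probability
    distribution over the same local event (hypothetical example: the
    next tag given that the current tag is DET). `weights` holds one
    non-negative mixture weight per helper language, summing to 1.
    """
    if len(helper_dists) != len(weights):
        raise ValueError("need exactly one weight per helper distribution")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")

    # Collect every outcome seen in any helper distribution.
    outcomes = set().union(*helper_dists)

    # Outcomes missing from a helper distribution count as probability 0.
    return {
        outcome: sum(w * d.get(outcome, 0.0) for w, d in zip(weights, helper_dists))
        for outcome in sorted(outcomes)
    }


# Made-up supervised estimates of P(next tag | DET) from two helper languages.
helper_a = {"NOUN": 0.7, "ADJ": 0.3}
helper_b = {"NOUN": 0.9, "ADJ": 0.1}

# Each local context gets its own weights; these values are illustrative only.
mixed_after_det = mix_local_distribution([helper_a, helper_b], [0.5, 0.5])
print(mixed_after_det)
```

Because the weights are non-negative and sum to 1, the mixed result is itself a valid probability distribution whenever each helper distribution is, which is what lets the mixture stand in for a target-language parameter without any parallel text.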
|Title of host publication||Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing|
|Place of Publication||Edinburgh, Scotland, UK|
|Publisher||Association for Computational Linguistics|
|Number of pages||12|
|Publication status||Published - 1 Jul 2011|