A fully Bayesian approach to unsupervised part-of-speech tagging

Sharon Goldwater, Tom Griffiths

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Unsupervised learning of linguistic structure is a difficult problem. A common approach is to define a generative model and maximize the probability of the hidden structure given the observed data. Typically, this is done using maximum-likelihood estimation (MLE) of the model parameters. We show using part-of-speech tagging that a fully Bayesian approach can greatly improve performance. Rather than estimating a single set of parameters, the Bayesian approach integrates over all possible parameter values. This difference ensures that the learned structure will have high probability over a range of possible parameters, and permits the use of priors favoring the sparse distributions that are typical of natural language. Our model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE. We find improvements both when training from data alone, and using a tagging dictionary.
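
As a rough illustration of the contrast the abstract draws between a single parameter estimate and integrating over parameters, the sketch below compares an MLE estimate of a tag distribution with the posterior predictive probability obtained by integrating over a multinomial parameter. It assumes a symmetric Dirichlet prior, which the abstract does not specify; the function names, the toy counts, and the value of `beta` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def mle_prob(counts, outcome):
    """Maximum-likelihood estimate: relative frequency of the outcome."""
    total = sum(counts.values())
    return counts[outcome] / total


def dirichlet_predictive(counts, outcome, num_outcomes, beta):
    """Posterior predictive probability with the multinomial parameters
    integrated out under a symmetric Dirichlet(beta) prior (an assumption
    made for this sketch). A beta below 1 favors sparse distributions."""
    total = sum(counts.values())
    return (counts[outcome] + beta) / (total + num_outcomes * beta)


# Toy data (hypothetical): tags observed after some fixed context.
counts = Counter({"NOUN": 8, "VERB": 2})
tags = ["NOUN", "VERB", "ADJ"]

for tag in tags:
    mle = mle_prob(counts, tag)
    bayes = dirichlet_predictive(counts, tag, len(tags), beta=0.1)
    print(f"{tag}: MLE={mle:.4f}  integrated={bayes:.4f}")
```

In this toy setting the MLE assigns zero probability to the unseen tag, while the integrated estimate leaves it a small amount of mass and still concentrates most of the probability on the frequent tags.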
Original language: English
Title of host publication: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics
Place of Publication: Prague, Czech Republic
Publisher: Association for Computational Linguistics
Number of pages: 8
Publication status: Published - 1 Jun 2007

