Bayesian Inference for PCFGs via Markov Chain Monte Carlo

Mark Johnson, Thomas Griffiths, Sharon Goldwater

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

This paper presents two Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference of probabilistic context-free grammars (PCFGs) from terminal strings, providing an alternative to maximum-likelihood estimation using the Inside-Outside algorithm. We illustrate these methods by estimating a sparse grammar describing the morphology of the Bantu language Sesotho, demonstrating that with suitable priors Bayesian techniques can infer linguistic structure in situations where maximum-likelihood methods such as the Inside-Outside algorithm produce only a trivial grammar.
Original language: English
Title of host publication: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference
Place of Publication: Rochester, New York
Publisher: Association for Computational Linguistics
Pages: 139-146
Number of pages: 8
Publication status: Published - 1 Apr 2007
