Abstract / Description of output
This paper presents two Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference of probabilistic context-free grammars (PCFGs) from terminal strings, providing an alternative to maximum-likelihood estimation using the Inside-Outside algorithm. We illustrate these methods by estimating a sparse grammar describing the morphology of the Bantu language Sesotho, demonstrating that with suitable priors Bayesian techniques can infer linguistic structure in situations where maximum-likelihood methods such as the Inside-Outside algorithm produce only a trivial grammar.
Original language | English |
---|---|
Title of host publication | Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference |
Place of Publication | Rochester, New York |
Publisher | Association for Computational Linguistics |
Pages | 139-146 |
Number of pages | 8 |
Publication status | Published - 1 Apr 2007 |