Edinburgh Research Explorer

Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Original language: English
Title of host publication: ICML Workshop on Prior Knowledge for Text and Language
Number of pages: 6
Publication status: Published - 2008


Recent work in hierarchical priors for language modeling [MacKay and Peto, 1994, Teh, 2006, Goldwater et al., 2006] has shown significant advantages to Bayesian methods in NLP. But the issue of sparse conditioning contexts is ubiquitous in NLP, and these smoothing ideas can be applied more broadly to extend the reach of Bayesian modeling in natural language. For example, dependency graphs are one useful representation of higher-level syntactic structure. Specifically, dependency graphs encode relationships between words and their sentence-level, syntactic modifiers by representing each sentence in a corpus as a directed graph whose nodes are the part-of-speech-tagged words of that sentence.
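The dependency representation described above can be sketched in a few lines of code. This is an illustrative encoding of our own (the type and field names are not from the paper): each sentence is a directed graph whose nodes are POS-tagged words, with an edge from each word to its syntactic head.

```python
# Minimal sketch of a dependency tree: nodes are POS-tagged words,
# and heads[i] gives the index of token i's head (-1 marks the root).
from dataclasses import dataclass


@dataclass
class Token:
    word: str
    pos: str


@dataclass
class DependencyTree:
    tokens: list  # Token for each sentence position
    heads: list   # heads[i] = index of token i's head, or -1 for the root

    def dependents(self, i):
        """Indices of the tokens whose head is token i."""
        return [j for j, h in enumerate(self.heads) if h == i]


# "the dog barks": "barks" is the root, "dog" modifies "barks",
# and "the" modifies "dog".
sent = DependencyTree(
    tokens=[Token("the", "DT"), Token("dog", "NN"), Token("barks", "VBZ")],
    heads=[1, 2, -1],
)
```

Here `sent.dependents(2)` returns `[1]` ("dog" is the sole dependent of the root verb), and `sent.dependents(1)` returns `[0]`.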

In this paper, we describe two Bayesian models over dependency trees. First, we show that a classic generative dependency model can be substantially improved by (a) using a hierarchical Pitman-Yor process as a prior over the distribution over dependents of a word, and (b) sampling the hyperparameters of the prior. Remarkably, these changes alone yield a significant increase in parse accuracy over the standard model. Second, we present a Bayesian dependency parsing model in which latent state variables mediate the relationships between words and their dependents. The model clusters bilexical dependencies into states using a similar approach to that employed by Bayesian topic models when clustering words into topics. It discovers word clusters with a fine-grained syntactic character.


ID: 11154289