Hierarchical Bayesian Language Models for Conversational Speech Recognition

Songfang Huang, Steve Renals

Research output: Contribution to journal › Article › peer-review


Traditional n-gram language models are widely used in state-of-the-art large vocabulary speech recognition systems. This simple model suffers from limitations such as overfitting in maximum-likelihood estimation and the lack of rich contextual knowledge sources. In this paper, we exploit a hierarchical Bayesian interpretation for language modeling, based on a nonparametric prior called the Pitman--Yor process. This offers a principled approach to language model smoothing, embedding the power-law distribution characteristic of natural language. Experiments on the recognition of conversational speech in multiparty meetings demonstrate that by using hierarchical Bayesian language models, we are able to achieve significant reductions in perplexity and word error rate.
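The Pitman--Yor smoothing the abstract refers to can be sketched as a two-parameter discounted predictive probability, where seen-word counts are discounted and the reclaimed mass is redistributed via a back-off (base) distribution. The counts, table assignments, parameter values, and uniform base distribution below are illustrative assumptions for a single restaurant, not the paper's actual hierarchical setup:

```python
def pitman_yor_prob(word, counts, tables, d, theta, base_prob):
    """Predictive probability under a single Pitman-Yor restaurant.

    counts: dict word -> customer count c(w)
    tables: dict word -> number of tables t(w) serving that word
    d: discount parameter in [0, 1); theta: strength parameter > -d
    base_prob: callable giving the base (back-off) distribution G0(w)
    """
    c_total = sum(counts.values())   # total customers in the restaurant
    t_total = sum(tables.values())   # total tables in the restaurant
    c_w = counts.get(word, 0)
    t_w = tables.get(word, 0)
    # Discounted count for the seen word, plus back-off mass
    # (theta + d * t_total) spread according to the base distribution.
    return ((max(c_w - d * t_w, 0.0)
             + (theta + d * t_total) * base_prob(word))
            / (theta + c_total))

# Toy example (hypothetical data): vocabulary {a, b}, uniform base.
counts = {"a": 3, "b": 1}
tables = {"a": 1, "b": 1}
p_a = pitman_yor_prob("a", counts, tables, d=0.5, theta=1.0,
                      base_prob=lambda w: 0.5)
p_b = pitman_yor_prob("b", counts, tables, d=0.5, theta=1.0,
                      base_prob=lambda w: 0.5)
```

In the hierarchical model described by the paper, `base_prob` would itself be the Pitman--Yor predictive distribution of the shortened context (the back-off n-gram), applied recursively; setting `d = 0` recovers Dirichlet-process (additive-style) smoothing, while `d > 0` yields the power-law behavior the abstract mentions.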
Original language: English
Pages (from-to): 1941-1954
Number of pages: 14
Journal: IEEE Transactions on Audio, Speech and Language Processing
Issue number: 8
Publication status: Published - Nov 2010


  • AMI corpus
  • conversational speech recognition
  • hierarchical Bayesian model
  • language model (LM)
  • meetings
  • smoothing

