Sparse Forward-Backward Using Minimum Divergence Beams for Fast Training Of Conditional Random Fields

Chris Pal, C. Sutton, A. McCallum

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Hidden Markov models and linear-chain conditional random fields (CRFs) are applicable to many tasks in spoken language processing. In large state spaces, however, training can be expensive, because it often requires many iterations of forward-backward. Beam search is a standard heuristic for controlling complexity during Viterbi decoding, but during forward-backward, standard beam heuristics can be dangerous, as they can make training unstable. We introduce sparse forward-backward, a variational perspective on beam methods that uses an approximating mixture of Kronecker delta functions. This motivates a novel minimum-divergence beam criterion based on minimizing KL divergence between the respective marginal distributions. Our beam selection approach is not only more efficient for Viterbi decoding, but also more stable within sparse forward-backward training. For a standard text-to-speech problem, we reduce CRF training time fourfold - from over a day to six hours - with no loss in accuracy.
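To illustrate the beam criterion the abstract describes, here is a minimal sketch, not the authors' implementation. When the approximating family is a renormalized truncation of a marginal (a mixture of Kronecker deltas), the KL divergence from the truncated distribution to the full one reduces to the negative log of the retained probability mass, so the smallest beam meeting a divergence budget `epsilon` keeps the highest-probability states until their cumulative mass reaches `exp(-epsilon)`. The function name and interface are assumptions for illustration.

```python
import math

def min_divergence_beam(marginal, epsilon):
    """Return the smallest set of state indices whose truncated,
    renormalized distribution is within KL divergence `epsilon`
    of the full marginal.

    For a truncation with support S and retained mass Z = sum(p_i, i in S),
    KL(truncated || full) = -log(Z), so the budget is met once the
    cumulative mass of the retained states reaches exp(-epsilon).
    """
    threshold = math.exp(-epsilon)  # required retained probability mass
    # Consider states in order of decreasing probability.
    order = sorted(range(len(marginal)), key=lambda i: -marginal[i])
    beam, mass = [], 0.0
    for i in order:
        beam.append(i)
        mass += marginal[i]
        if mass >= threshold:
            break
    return sorted(beam)
```

With a peaked marginal and a loose budget, a single state suffices; a tighter budget forces a wider beam, which is the trade-off the paper exploits to keep forward-backward both sparse and stable.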
Original language: English
Title of host publication: Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 4
ISBN (Print): 1-4244-0469-X
Publication status: Published - 1 May 2006

Keywords

  • Viterbi decoding
  • hidden Markov models
  • natural languages
  • speech coding
  • Kronecker delta functions
  • conditional random fields
  • linear-chain conditional random fields
  • minimum divergence beams
  • sparse forward-backward
  • spoken language processing
  • text-to-speech problem
  • Computer science
  • Iterative decoding
  • Maximum likelihood decoding
  • Random variables
  • Speech synthesis
  • State-space methods
  • Transducers
  • Viterbi algorithm
