Speech Recognition Using Augmented Conditional Random Fields

Yasser Hifny, Steve Renals

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

Acoustic modeling based on hidden Markov models (HMMs) is employed by state-of-the-art stochastic speech recognition systems. Although HMMs are a natural choice to warp the time axis and model the temporal phenomena in the speech signal, their conditional independence properties limit their ability to model spectral phenomena well. In this paper, a new acoustic modeling paradigm based on augmented conditional random fields (ACRFs) is investigated and developed. This paradigm addresses some limitations of HMMs while maintaining many of the aspects which have made them successful. In particular, the acoustic modeling problem is reformulated in a data-driven, sparse, augmented space to increase discrimination. Acoustic context modeling is explicitly integrated to handle the sequential phenomena of the speech signal. We present an efficient framework for estimating these models that ensures scalability and generality. In the TIMIT phone recognition task, a phone error rate of 23.0% was recorded on the full test set, a significant improvement over comparable HMM-based systems.
Original language: English
Pages (from-to): 354-365
Number of pages: 12
Journal: IEEE Transactions on Audio, Speech and Language Processing
Issue number: 2
Publication status: Published - Feb 2009

Keywords / Materials (for Non-textual outputs)

  • Augmented conditional random fields (ACRFs)
  • augmented spaces
  • discriminative compression
  • hidden Markov models (HMMs)

