Multi-frame factorisation for long-span acoustic modelling

Liang Lu, Steve Renals

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Acoustic models based on Gaussian mixture models (GMMs) typically use short-span acoustic feature inputs. Owing to the conditional independence assumption of hidden Markov models, such inputs do not capture long-term temporal information in speech. In this paper, we present an implicit approach that approximates the joint distribution of long-span features by a product of factorized models, in contrast to deep neural networks (DNNs), which model feature correlations directly. The approach is applicable to a broad range of acoustic models. We present experiments using GMM and probabilistic linear discriminant analysis (PLDA) based models on Switchboard, observing consistent word error rate reductions.
Original language: English
Title of host publication: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing
Number of pages: 5
Publication status: Published - 2015
