Edinburgh Research Explorer

Regularized subspace Gaussian mixture models for cross-lingual speech recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 365-370
Number of pages: 6
ISBN (Electronic): 978-1-4673-0366-8
ISBN (Print): 978-1-4673-0365-1
DOIs
Publication status: Published - 2011

Abstract

We investigate cross-lingual acoustic modelling for low-resource languages using the subspace Gaussian mixture model (SGMM). We assume the presence of acoustic models trained on multiple source languages, and use the global subspace parameters from those models for improved modelling in a target language with limited amounts of transcribed speech. Experiments on the GlobalPhone corpus using Spanish, Portuguese, and Swedish as source languages and German as the target language (with 1 hour and 5 hours of transcribed audio) show that multilingually trained SGMM shared parameters result in lower word error rates (WERs) than those from a single source language. We also show that regularizing the estimation of the SGMM state vectors by penalizing their l1-norm helps to overcome numerical instabilities and leads to lower WER.
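The abstract's key technical idea is l1-penalized estimation of the SGMM state vectors. The paper's actual estimation maximizes an SGMM auxiliary function over per-state statistics; as a simplified illustration of the l1-regularization principle only, the sketch below (hypothetical function and variable names, a least-squares surrogate objective) estimates a vector v minimizing 0.5·||g − M v||² + λ·||v||₁ with the iterative soft-thresholding algorithm (ISTA), which drives small components of v exactly to zero and stabilizes the estimate:

```python
import numpy as np

def estimate_state_vector(M, g, lam=0.1, n_iter=200):
    """Illustrative l1-regularized estimate of a state vector v.

    Minimizes 0.5 * ||g - M v||^2 + lam * ||v||_1 via ISTA
    (gradient step on the quadratic term, then soft-thresholding).
    This is a toy surrogate, not the SGMM auxiliary function itself.
    """
    step = 1.0 / np.linalg.norm(M, 2) ** 2  # 1/L, L = Lipschitz constant of gradient
    v = np.zeros(M.shape[1])
    for _ in range(n_iter):
        grad = M.T @ (M @ v - g)             # gradient of the quadratic term
        z = v - step * grad                  # gradient descent step
        # soft-thresholding: the proximal operator of the l1 penalty
        v = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return v

# toy example: with M = I the solution is the soft-thresholded target
v = estimate_state_vector(np.eye(3), np.array([1.0, 0.05, -2.0]), lam=0.2)
```

The soft-thresholding step is what produces sparsity: components whose magnitude falls below step·λ are zeroed rather than merely shrunk, which mirrors the abstract's point that the l1 penalty counters numerical instability in poorly estimated directions.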

    Research areas

  • Gaussian processes, acoustic signal processing, natural language processing, speech recognition, German, GlobalPhone corpus, Portuguese, Spanish, Swedish, cross-lingual acoustic modelling, cross-lingual speech recognition, global subspace parameter, low resource language, regularized subspace Gaussian mixture model, word error rates, Acoustics, Data models, Estimation, Hidden Markov models, Speech recognition, Training data, Vectors


ID: 11804569