Multilingual training of deep neural networks

A. Ghoshal, P. Swietojanski, S. Renals

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


We investigate multilingual modeling in the context of a deep neural network (DNN) - hidden Markov model (HMM) hybrid, where the DNN outputs are used as the HMM state likelihoods. By viewing neural networks as a cascade of feature extractors followed by a logistic regression classifier, we hypothesise that the hidden layers, which act as feature extractors, will be transferable between languages. As a corollary, we propose that training the hidden layers on multiple languages makes them more suitable for such cross-lingual transfer. We experimentally confirm these hypotheses on the GlobalPhone corpus using seven languages from three different language families: Germanic, Romance, and Slavic. The experiments demonstrate substantial improvements over a monolingual DNN-HMM hybrid baseline, and hint at avenues of further exploration.
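The hybrid setup described above can be sketched in a few lines: shared hidden layers act as a feature extractor across languages, each language gets its own softmax (logistic-regression) output layer over its HMM states, and decoding uses scaled likelihoods p(x|s) ∝ p(s|x)/p(s). This is a minimal NumPy illustration, not the paper's implementation; the layer sizes, language names, and state counts are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    # Random untrained weights; sizes here are illustrative only.
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

FEAT_DIM, HID = 39, 64                      # e.g. MFCC features, hidden width (assumed)
N_STATES = {"german": 120, "polish": 100}   # hypothetical per-language HMM state counts

# Hidden layers shared across languages; one output head per language.
shared = [init_layer(FEAT_DIM, HID), init_layer(HID, HID)]
heads = {lang: init_layer(HID, n) for lang, n in N_STATES.items()}

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(x, lang):
    h = x
    for W, b in shared:              # shared feature-extracting hidden layers
        h = sigmoid(h @ W + b)
    W, b = heads[lang]               # language-specific logistic-regression layer
    return softmax(h @ W + b)        # state posteriors p(s | x)

def scaled_likelihoods(x, lang, priors):
    # Hybrid decoding divides posteriors by state priors: p(x|s) ∝ p(s|x) / p(s).
    return forward(x, lang) / priors

x = rng.normal(size=(1, FEAT_DIM))
priors = np.full(N_STATES["german"], 1.0 / N_STATES["german"])
post = forward(x, "german")
lik = scaled_likelihoods(x, "german", priors)
```

Cross-lingual transfer in this picture amounts to reusing `shared` for a new language and attaching (and training) only a fresh output head, which is why multilingually trained hidden layers are hypothesised to transfer better.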
Original language: English
Title of host publication: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 5
ISBN (Print): 978-1-4799-0356-6
Publication status: Published - 2013


  • acoustics
  • cross-lingual transfer
  • deep learning
  • deep neural network
  • feature extraction
  • Germanic language
  • GlobalPhone corpus
  • hidden layers
  • hidden Markov models
  • HMM state likelihoods
  • hybrid DNN-HMM
  • linguistics
  • logistic regression classifier
  • multilingual modeling
  • multilingual training
  • natural language processing
  • neural networks
  • pattern classification
  • regression analysis
  • Romance language
  • Slavic language
  • speech recognition
  • training
