Edinburgh Research Explorer

Transcription of multi-genre media archives using out-of-domain data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: Spoken Language Technology Workshop (SLT), 2012 IEEE
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 6
ISBN (Electronic): 978-1-4673-5124-9
ISBN (Print): 978-1-4673-5125-6
Publication status: Published - 2012


We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem hidden Markov models (HMMs), we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that MLAN substantially reduces word error rate (WER) compared with other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features and 8% over the best out-of-domain tandem features.
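
The relative reductions quoted in the abstract are ratios against a reference system's error rate, not differences in percentage points. A minimal sketch of that arithmetic follows. The function name and the example error rates are hypothetical, chosen only for illustration; the abstract does not state absolute WER figures.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction of a system over a baseline, as a fraction.

    Both arguments are word error rates in the same unit (e.g. percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - system_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical error rates, NOT taken from the paper.
    baseline = 40.0  # assumed baseline WER, in percent
    system = 34.0    # assumed WER of the improved system, in percent
    reduction = relative_wer_reduction(baseline, system)
    print(f"Relative WER reduction: {reduction:.0%}")
```

With these assumed values, an absolute drop of 6 points from a 40% baseline corresponds to a 15% relative reduction.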

Research areas

  • hidden Markov models, information retrieval systems, records management, speech recognition, MLAN, deep neural networks, in-domain tandem features, multigenre media archives transcription, multilevel adaptive networks, out-of-domain posterior features, relative WER reductions, speech recognition system, tandem HMM, Acoustics, Adaptation models, Neural networks, Speech, Training, Training data, cross-domain adaptation, media archives, tandem

ID: 11804699