Edinburgh Research Explorer

Transcription of multi-genre media archives using out-of-domain data

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Open Access permissions: Open

Documents

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6424244
Original language: English
Title of host publication: Spoken Language Technology Workshop (SLT), 2012 IEEE
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 324-329
Number of pages: 6
ISBN (Electronic): 978-1-4673-5124-9
ISBN (Print): 978-1-4673-5125-6
Publication status: Published - 2012

Abstract

We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that MLAN substantially reduces word error rate (WER) compared with other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features, and 8% over the best out-of-domain tandem features.
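
The abstract reports its gains as relative WER reductions, which can be misread as absolute percentage-point differences. The sketch below shows how such a figure is conventionally computed. The function name and the WER values are hypothetical, chosen only to make the arithmetic visible; they are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction of a system over a baseline, in percent.

    Both arguments are word error rates expressed in percent.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# Hypothetical figures for illustration only; NOT taken from the paper.
baseline = 40.0  # assumed baseline WER, in percent
improved = 34.0  # assumed WER of the improved system, in percent

reduction = relative_wer_reduction(baseline, improved)
print(f"{reduction:.1f}% relative WER reduction")
```

With these assumed numbers, a drop of 6 points in absolute WER corresponds to a 15% relative reduction, because the difference is divided by the baseline WER.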

    Research areas

  • hidden Markov models, information retrieval systems, records management, speech recognition, MLAN, deep neural networks, hidden Markov model, in-domain tandem features, multigenre media archives transcription, multilevel adaptive networks, out-of-domain posterior features, relative WER reductions, speech recognition system, tandem HMM, Acoustics, Adaptation models, Hidden Markov models, Neural networks, Speech, Training, Training data, cross-domain adaptation, media archives, tandem

ID: 11804699