Edinburgh Research Explorer

Untranscribed web audio for low resource speech recognition

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Original language: English
Title of host publication: Proceedings Interspeech 2019
Publisher: International Speech Communication Association
Pages: 226-230
Number of pages: 5
Publication status: Published - 19 Sep 2019
Event: Interspeech 2019 - Graz, Austria
Duration: 15 Sep 2019 → 19 Sep 2019
https://www.interspeech2019.org/

Publication series

Publisher: International Speech Communication Association
ISSN (Electronic): 1990-9772

Abstract

Speech recognition models are highly susceptible to mismatch in the acoustic and language domains between the training and the evaluation data. For low resource languages, it is difficult to obtain transcribed speech for target domains, while untranscribed data can be collected with minimal effort. Recently, a method applying lattice-free maximum mutual information (LF-MMI) to untranscribed data has been found to be effective for semi-supervised training. However, weaker initial models and domain mismatch can result in high deletion rates for the semi-supervised model. Therefore, we propose a method to force the base model to overgenerate possible transcriptions, relying on the ability of LF-MMI to deal with uncertainty.

On data from the IARPA MATERIAL programme, our new semi-supervised method outperforms the standard semi-supervised method, yielding significant gains when adapting for mismatched bandwidth and domain.
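
The deletion rate mentioned in the abstract is one component of word error rate: the fraction of reference words that have no counterpart in the recogniser output. As a rough illustration only, the sketch below counts substitutions, deletions and insertions from a minimum-edit-distance alignment of two word sequences. The function names `count_errors` and `deletion_rate` are hypothetical, and this is a generic Levenshtein alignment, not the scoring tool or training method used in the paper.

```python
def count_errors(ref, hyp):
    """Return (substitutions, deletions, insertions) for one minimum-edit
    alignment of the reference words `ref` against the hypothesis `hyp`."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = (total_edits, subs, dels, ins) for ref[:i] vs hyp[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)  # every hypothesis word inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, k)          # match, no edit
            else:
                diag = (t + 1, s + 1, d, k)  # substitution
            t, s, d, k = cost[i - 1][j]
            dele = (t + 1, s, d + 1, k)      # reference word missing from hyp
            t, s, d, k = cost[i][j - 1]
            ins = (t + 1, s, d, k + 1)       # extra word in hyp
            # Keep the candidate with the fewest total edits.
            cost[i][j] = min(diag, dele, ins, key=lambda c: c[0])
    _, subs, dels, inss = cost[n][m]
    return subs, dels, inss


def deletion_rate(ref, hyp):
    """Deletions divided by the number of reference words."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    _, dels, _ = count_errors(ref, hyp)
    return dels / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat".split()
    hypothesis = "the cat on mat".split()
    print(count_errors(reference, hypothesis))
    print(deletion_rate(reference, hypothesis))
```

A hypothesis that drops words, as in the example at the bottom of the block, raises the deletion count without adding substitutions or insertions, which is the failure pattern the abstract attributes to weak initial models and domain mismatch.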

Research areas

• speech recognition, semi-supervised training, domain adaptation, web data

ID: 99436262