Abstract / Description of output
Semi-supervised training (SST) has become an increasingly popular way of improving the performance of ASR acoustic models without the need for further transcribed speech data. However, the performance of the technique can be very sensitive to the quality of the initial ASR system. This paper undertakes a comprehensive study of the improvements gained with respect to variation in the initial systems, the quantity of untranscribed data used, and the learning schedules. We postulate that SST can be effective even when the initial model is poor because it enables utterance-level information to be propagated to the frame level, and hence hypothesise that the quality of the language model plays a much larger role than the quality of the acoustic model. In experiments on Tagalog data from the IARPA MATERIAL programme, we find that this is indeed the case, and show that with an appropriately chosen recipe it is possible to achieve relative WER reductions of over 50% from SST, even when the WER of the initial system is more than 80%.
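To make the "relative WER reduction" figure concrete, the following is a minimal sketch of the arithmetic. The function name `relative_wer_reduction` and the example numbers are assumptions chosen for illustration; they are not taken from the paper's results or code.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction of new_wer over baseline_wer, in percent.

    Both inputs are word error rates expressed in percent.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers, for illustration only (not results from the paper):
# an initial system at 80% WER that improves to 40% WER after SST.
reduction = relative_wer_reduction(80.0, 40.0)
print(f"{reduction:.1f}% relative WER reduction")
```

Under these assumed numbers, a drop of 40 absolute WER points corresponds to halving the error rate, which is why relative and absolute reductions should not be conflated when the initial WER is very high.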
Original language | English |
---|---|
Title of host publication | Proceedings of Interspeech 2021 |
Publisher | International Speech Communication Association |
Pages | 716-720 |
Number of pages | 5 |
Publication status | Published - 30 Aug 2021 |
Event | Interspeech 2021: The 22nd Annual Conference of the International Speech Communication Association, Brno, Czech Republic. Duration: 30 Aug 2021 → 3 Sept 2021. Conference number: 22. https://www.interspeech2021.org |
Publication series
ISSN (Print) | 1990-9772 |
---|---|
Conference
Conference | Interspeech 2021 |
---|---|
Country/Territory | Czech Republic |
City | Brno |
Period | 30/08/21 → 3/09/21 |
Keywords / Materials (for Non-textual outputs)
- speech recognition
- semi-supervised training
Fingerprint
Dive into the research topics of 'On the Learning Dynamics of Semi-Supervised Training for ASR'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Unmute: Opening Spoken Language Interaction to the Currently Unheard
Bell, P., Goldwater, S. & Renals, S.
1/12/20 → 30/11/23
Project: Research