This paper describes the CSTR submission to the Multilingual and Code-Switching ASR Challenges at Interspeech 2021. For the multilingual track of the challenge, we trained a multilingual CNN-TDNN acoustic model for Gujarati, Hindi, Marathi, Odia, Tamil and Telugu, and subsequently fine-tuned the model on monolingual training data. A language model built on a mixture of training and CommonCrawl data was used for decoding. We also demonstrate that data crawled from YouTube can be successfully used to improve the performance of the acoustic model with semi-supervised training. These models, together with confidence-based language identification, achieve an average WER of 18.1%, a 41% relative improvement over the provided multilingual baseline model. For the code-switching track of the challenge, we again trained a multilingual model, on Bengali and Hindi technical lectures, and employed a language model trained on CommonCrawl Bengali and Hindi data mixed with in-domain English data, using a novel transliteration method to generate pronunciations for the English terms. The final model improves by 18% and 34% relative over our multilingual baseline. Both our systems were among the top-ranked entries to the challenge.
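The abstract does not spell out how confidence-based language identification is performed. A minimal sketch of one common approach, with all names hypothetical: decode each utterance with every fine-tuned monolingual system and keep the hypothesis whose average per-word confidence is highest.

```python
def pick_by_confidence(hypotheses):
    """Select a language by decoder confidence (illustrative sketch).

    hypotheses: dict mapping a language code to a
    (transcript, per-word confidence list) pair, one entry per
    monolingual system that decoded the utterance.
    Returns (language, transcript) for the most confident system.
    """
    def avg(scores):
        # Average word confidence; empty hypotheses score zero.
        return sum(scores) / len(scores) if scores else 0.0

    best = max(hypotheses, key=lambda lang: avg(hypotheses[lang][1]))
    return best, hypotheses[best][0]


# Hypothetical per-language decodes of the same utterance:
hyps = {
    "hi": ("namaste duniya", [0.92, 0.85]),
    "ta": ("vanakkam ulagam", [0.41, 0.55]),
}
lang, text = pick_by_confidence(hyps)  # picks "hi"
```

In practice the confidences would come from the ASR decoder's lattice posteriors rather than being supplied by hand; this sketch only shows the selection rule.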
|Title of host publication||Proceedings of Interspeech 2021|
|Publisher||International Speech Communication Association|
|Number of pages||5|
|Publication status||Published - 30 Aug 2021|
|Event||Interspeech 2021: The 22nd Annual Conference of the International Speech Communication Association - Brno, Czech Republic|
Duration: 30 Aug 2021 → 3 Sep 2021
Conference number: 22
- low-resource speech recognition
- multilingual speech recognition
- code switching
|Title||The CSTR System for Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages|