Abstract / Description of output
We present a method for cross-lingual training of an ASR system using absolutely no transcribed training data from the target language, and with no phonetic knowledge of the language in question. Our approach uses a novel application of a decipherment algorithm, which operates given only unpaired speech and text data from the target language. We apply this decipherment to phone sequences generated by a universal phone recogniser trained on out-of-language speech corpora, which we follow with flat-start semi-supervised training to obtain an acoustic model for the new language. To the best of our knowledge, this is the first practical approach to zero-resource cross-lingual ASR which does not rely on any hand-crafted phonetic information. We carry out experiments on read speech from the GlobalPhone corpus, and show that it is possible to learn a decipherment model on just 20 minutes of data from the target language. When used to generate pseudo-labels for semi-supervised training, we obtain WERs that range from 32.5% to just 1.9% absolute worse than the equivalent fully supervised models trained on the same data.
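The abstract reports its results as absolute gaps in word error rate (WER). As a minimal sketch of how such a figure is conventionally obtained, the Python below computes WER as word-level edit distance divided by reference length, then takes the difference between two systems in percentage points. The function names `word_error_rate` and `absolute_wer_gap` and the toy sentences are hypothetical illustrations, not the authors' scoring setup.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def absolute_wer_gap(wer_a: float, wer_b: float) -> float:
    """Difference between two WERs (given as fractions), in percentage points."""
    return (wer_a - wer_b) * 100.0


# Toy example (assumed data): one substitution and one deletion in four words.
pseudo_label_wer = word_error_rate("a b c d", "a x c")
# Toy example (assumed data): a single substitution in four words.
supervised_wer = word_error_rate("a b c d", "a b c x")
gap = absolute_wer_gap(pseudo_label_wer, supervised_wer)
```

An absolute gap is a plain subtraction of the two rates, so "1.9% absolute worse" means 1.9 percentage points higher WER, not a 1.9% relative increase.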
Original language | English |
---|---|
Title of host publication | Proceedings of Interspeech 2022 |
Editors | Hanseok Ko, John H. L. Hansen |
Publisher | ISCA |
Pages | 2288-2292 |
Number of pages | 5 |
DOIs | |
Publication status | Published - 18 Sept 2022 |
Event | Interspeech 2022 - Incheon, Korea, Republic of; Duration: 18 Sept 2022 → 22 Sept 2022; Conference number: 23; https://interspeech2022.org/ |
Conference
Conference | Interspeech 2022 |
---|---|
Country/Territory | Korea, Republic of |
City | Incheon |
Period | 18/09/22 → 22/09/22 |
Internet address | https://interspeech2022.org/ |
Keywords / Materials (for Non-textual outputs)
- automatic speech recognition
- cross-lingual transfer
- decipherment
- semi-supervised training
Fingerprint
Dive into the research topics of 'Deciphering Speech: a Zero-Resource Approach to Cross-Lingual Transfer in ASR'. Together they form a unique fingerprint.
Projects
- Unmute: Opening Spoken Language Interaction to the Currently Unheard
  Bell, P., Goldwater, S. & Renals, S.
  1/12/20 → 30/11/23
  Project: Research