Abstract
Self-supervised learning (SSL) is used in deep learning to train on large datasets without the need for expensive labelling of the data. Recently, large Automatic Speech Recognition (ASR) models such as XLS-R have utilised SSL to train on over one hundred different languages simultaneously. However, deeper investigation shows that the bulk of the training data for XLS-R comes from a small number of languages. Biases learned through SSL have been shown to exist in multiple domains, but language bias in multilingual SSL ASR has not been thoroughly examined. In this paper, we utilise the Lottery Ticket Hypothesis (LTH) to identify language-specific subnetworks within XLS-R and test the performance of these subnetworks on a variety of languages. We show that, when fine-tuning, XLS-R bypasses traditional linguistic knowledge and builds only on weights learned from the languages with the largest data contribution to the pre-training data.
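For readers unfamiliar with the LTH procedure the abstract refers to, the sketch below illustrates the general idea of iterative magnitude pruning with weight rewinding, which is how lottery-ticket subnetworks are typically identified. This is a minimal PyTorch illustration, not the paper's actual setup: the `fine_tune_fn` training loop and the `prune_frac` and `rounds` values are hypothetical placeholders.

```python
import torch

def magnitude_mask(w: torch.Tensor, prune_frac: float) -> torch.Tensor:
    """Return a 0/1 mask that prunes the prune_frac lowest-magnitude entries of w."""
    k = max(1, int(w.numel() * prune_frac))
    threshold = w.abs().flatten().kthvalue(k).values
    return (w.abs() > threshold).float()

def find_subnetwork(model, fine_tune_fn, prune_frac=0.2, rounds=3):
    """LTH-style iterative magnitude pruning with weight rewinding (illustrative)."""
    # Snapshot the pre-trained weights so survivors can be rewound each round.
    init = {n: p.detach().clone() for n, p in model.named_parameters()}
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()}
    for r in range(1, rounds + 1):
        # Hypothetical caller-supplied loop: fine-tunes on one language while
        # keeping masked weights at zero after every optimiser step.
        fine_tune_fn(model, masks)
        cum_frac = 1 - (1 - prune_frac) ** r  # cumulative fraction pruned so far
        with torch.no_grad():
            for n, p in model.named_parameters():
                masks[n] = magnitude_mask(p, cum_frac)
                # Rewind surviving weights to their pre-trained values.
                p.copy_(init[n] * masks[n])
    return masks  # entries equal to 1 mark the language-specific subnetwork
```

Repeating this per fine-tuning language yields one mask per language, and the resulting subnetworks can then be compared or evaluated across languages, which is the kind of analysis the abstract describes.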
| Original language | English |
| --- | --- |
| Title of host publication | Proceedings of the 2024 IEEE Spoken Language Technology Workshop |
| Publisher | Institute of Electrical and Electronics Engineers |
| Pages | 1-6 |
| Number of pages | 6 |
| Publication status | Accepted/In press - 30 Aug 2024 |
| Event | IEEE Spoken Language Technology Workshop 2024, Banyan Tree Macau, Macau, China, 2 Dec 2024 → 5 Dec 2024, https://2024.ieeeslt.org |
Publication series
| Name | Proceedings of the IEEE Spoken Language Technology Workshop |
| --- | --- |
| Publisher | IEEE |
| ISSN (Print) | 2639-5479 |
Conference
| Conference | IEEE Spoken Language Technology Workshop 2024 |
| --- | --- |
| Abbreviated title | SLT 2024 |
| Country/Territory | China |
| City | Macau |
| Period | 2/12/24 → 5/12/24 |
| Internet address | https://2024.ieeeslt.org |
Keywords
- speech recognition
- self-supervised learning
- language bias
- language-specific subnetworks
- model pruning