Abstract
Given a large amount of unannotated speech in a low-resource language, can we classify the speech utterances by topic? We consider this question in the setting where a small amount of speech in the low-resource language is paired with text translations in a high-resource language. We develop an effective cross-lingual topic classifier by training on just 20 hours of translated speech, using a recent model for direct speech-to-text translation. While the translations are poor, they are still good enough to correctly classify the topic of 1-minute speech segments over 70% of the time—a 20% improvement over a majority-class baseline. Such a system could be useful for humanitarian applications like crisis response, where incoming speech in a foreign low-resource language must be quickly assessed for further action.
Original language | English |
---|---|
Title of host publication | ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | 8164-8168 |
Number of pages | 5 |
ISBN (Electronic) | 978-1-5090-6631-5 |
ISBN (Print) | 978-1-5090-6632-2 |
Publication status | Published - 14 May 2020 |
Event | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, Barcelona, Spain. Duration: 4 May 2020 → 8 May 2020. Conference number: 45 |
Publication series
Name | |
---|---|
Publisher | IEEE |
ISSN (Print) | 1520-6149 |
ISSN (Electronic) | 2379-190X |
Conference
Conference | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing |
---|---|
Abbreviated title | ICASSP 2020 |
Country/Territory | Spain |
City | Barcelona |
Period | 4/05/20 → 8/05/20 |
Keywords
- speech translation
- low-resource speech processing
- speech classification
- unwritten languages