Abstract
End-to-end Automatic Speech Recognition (E2E ASR) significantly simplifies the training process of an ASR model. Connectionist Temporal Classification (CTC) is one of the most popular methods for E2E ASR training. Implicitly, CTC has a unique topology which is very useful for sequence modelling. However, we find that by changing to another topology, we can make it even more effective. In this paper, we propose a new CTC-like method for E2E ASR training that modifies the topology of the original CTC so that the well-known abuse of the blank label in CTC can be resolved theoretically. Because we change the topology, a normalisation term becomes necessary, which makes the form of the final loss function similar to Maximum Mutual Information (MMI); we hence name our method MMI-CTC. In addition to maximising the posterior probability of the target sequence, the normalisation enables models to explicitly minimise the probability of competing hypotheses at the word sequence level. Our experimental results show that MMI-CTC is more efficient than CTC, and that the normalisation is essential for sequence training.
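As a rough illustration of why a changed topology calls for a normalisation term, the sketch below computes the standard CTC likelihood with the forward algorithm and then sums path probabilities over a restricted path set. It is not the MMI-CTC method itself: the restriction `no_double_blank`, the helper names, and the blank index are hypothetical choices made only for this example.

```python
import itertools
import math

BLANK = 0  # assumed convention: label index 0 is the blank


def collapse(path):
    """CTC collapse: merge repeated labels, then drop blanks."""
    out, prev = [], None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return tuple(out)


def ctc_likelihood(probs, target):
    """Forward algorithm for P(target | x) under the standard CTC topology.

    probs[t][k] is the per-frame posterior of label k at frame t.
    """
    ext = [BLANK]
    for label in target:
        ext += [label, BLANK]  # blank-interleaved target
    S = len(ext)
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, len(probs)):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)


def path_mass(probs, allowed):
    """Total probability of all frame-level paths accepted by `allowed`."""
    T, K = len(probs), len(probs[0])
    return sum(
        math.prod(probs[t][k] for t, k in enumerate(path))
        for path in itertools.product(range(K), repeat=T)
        if allowed(path)
    )


def no_double_blank(path):
    """Hypothetical restriction: forbid two consecutive blanks."""
    return all(not (a == BLANK and b == BLANK) for a, b in zip(path, path[1:]))


# Toy posteriors: 3 frames, 3 labels (blank, 1, 2); each row sums to 1.
probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.1, 0.8]]
target = [1, 2]

numerator = ctc_likelihood(probs, target)
unrestricted = path_mass(probs, lambda p: True)   # denominator of plain CTC
restricted = path_mass(probs, no_double_blank)    # denominator after restriction
```

With per-frame softmax outputs, the mass over all paths is exactly one, so plain CTC needs no explicit denominator. Once some paths are disallowed, the remaining mass falls below one, and a sequence-level denominator is needed for the hypotheses to form a proper distribution, which is the general shape of an MMI-style objective.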
Original language | English |
---|---|
Title of host publication | Proceedings of 2022 IEEE International Conference on Acoustics, Speech and Signal Processing |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 7792-7796 |
Number of pages | 5 |
ISBN (Electronic) | 978-1-6654-0540-9 |
ISBN (Print) | 978-1-6654-0541-6 |
DOIs | |
Publication status | Published - 27 Apr 2022 |
Event | 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) - Online, Singapore. Duration: 7 May 2022 → 27 May 2022. Conference number: 47. https://2022.ieeeicassp.org/index.php |
Publication series
Name | International Conference on Acoustics, Speech, and Signal Processing (ICASSP) |
---|---|
Publisher | IEEE |
ISSN (Print) | 1520-6149 |
ISSN (Electronic) | 2379-190X |
Conference
Conference | 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
---|---|
Abbreviated title | ICASSP 2022 |
Period | 7/05/22 → 27/05/22 |
Internet address | https://2022.ieeeicassp.org/index.php |
Keywords / Materials (for Non-textual outputs)
- ASR
- E2E ASR
- CTC
- MMI
- Sequence Training
Fingerprint
Dive into the research topics of 'Investigating Sequence-Level Normalisation for CTC-Like End-To-End ASR'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Consolidated Studentships - Bell 2020
  Non-EU industry, commerce and public corporations
  1/09/20 → 28/02/25
  Project: Research