Abstract
Measuring the performance of automatic speech recognition (ASR) systems requires manually transcribed data in order to compute the word error rate (WER), and producing such transcriptions is often time-consuming and expensive. In this paper, we propose a novel approach to estimating WER, called e-WER, which does not require a gold-standard transcription of the test set. Our e-WER framework uses a comprehensive set of features: ASR recognised text, character recognition results to complement the recognition output, and internal decoder features. We report results for two feature sets, black-box and glass-box, using 24 unseen Arabic broadcast programs. Our system achieves a WER root mean squared error (RMSE) of 16.9% across 1,400 sentences. The estimated overall WER (e-WER) was 25.3% for the three-hour test set, while the actual WER was 28.5%.
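For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of the conventional definitions: WER as word-level edit distance divided by reference length, and RMSE between actual and estimated per-sentence WER. It is not the e-WER estimator itself, which the abstract does not specify in detail. The function names `word_errors`, `wer`, and `rmse` are illustrative assumptions, not taken from the paper.

```python
import math


def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level Levenshtein distance (substitutions + deletions + insertions)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Conventional WER: edit errors divided by the number of reference words."""
    n_ref = len(reference.split())
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    return word_errors(reference, hypothesis) / n_ref


def rmse(actual: list[float], estimated: list[float]) -> float:
    """Root mean squared error between actual and estimated per-sentence WER."""
    if not actual or len(actual) != len(estimated):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual))
```

Computing `wer` requires the reference transcription, which is exactly the dependency that an estimator such as e-WER is meant to remove; `rmse` is how the abstract's sentence-level estimation error would typically be summarised.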
Original language | English |
---|---|
Title of host publication | Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) |
Place of Publication | Melbourne, Australia |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 20-24 |
Number of pages | 5 |
Publication status | Published - Jul 2018 |
Fingerprint
Dive into the research topics of 'Word Error Rate Estimation for Speech Recognition: e-WER'. Together they form a unique fingerprint.
Projects
- 1 Finished
- SUMMA - Scalable Understanding of Multilingual Media
  Renals, S., Birch-Mayne, A. & Cohen, S.
  1/02/16 → 31/01/19
  Project: Research