Multi-Reference Evaluation for Dialectal Speech Recognition System: A Study for Egyptian ASR

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Dialectal Arabic has no standard orthographic representation. This creates a challenge when evaluating an Automatic Speech Recognition (ASR) system for a dialect. Since the reference transcription text can vary widely from one user to another, we propose an innovative approach for evaluating dialectal speech recognition using Multi-References. For each recognized speech segment, we ask five different users to transcribe the speech. We combine the alignments for the multiple references, and use the combined alignment to report a modified version of Word Error Rate (WER). This approach favors accepting a recognized word if any of the references typed it in the same form. Our method proved to be more effective in capturing many correctly recognized words that have multiple acceptable spellings. The initial WER according to each of the five references individually ranged from 76.4% to 80.9%. When considering all references combined, the Multi-References MR-WER was found to be 53%.
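The idea of accepting a recognized word when any reference spells it the same way can be illustrated with a minimal sketch. This is not the paper's exact MR-WER formula, which the abstract does not spell out. It is a simplified stand-in that aligns the hypothesis to each reference by word-level edit distance and reports the share of hypothesis words that no reference accepts. The function names and the transliterated toy words are hypothetical and chosen only for illustration.

```python
from typing import List, Set


def matched_hyp_positions(ref: List[str], hyp: List[str]) -> Set[int]:
    """Hypothesis positions aligned to an identical reference word in one
    minimum-edit-distance (Levenshtein) alignment."""
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)

    # Trace one optimal path back from the end, collecting exact matches.
    matched: Set[int] = set()
    i, j = n, m
    while i > 0 and j > 0:
        if ref[i - 1] == hyp[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            matched.add(j - 1)          # correct word
            i, j = i - 1, j - 1
        elif dist[i][j] == dist[i - 1][j - 1] + 1:
            i, j = i - 1, j - 1         # substitution
        elif dist[i][j] == dist[i - 1][j] + 1:
            i -= 1                      # deletion (reference word missed)
        else:
            j -= 1                      # insertion (extra hypothesis word)
    return matched


def unaccepted_word_rate(refs: List[List[str]], hyp: List[str]) -> float:
    """Share of hypothesis words that no reference matches.
    Simplified illustration only, not the paper's MR-WER definition."""
    if not hyp:
        return 0.0
    accepted: Set[int] = set()
    for ref in refs:
        accepted |= matched_hyp_positions(ref, hyp)
    return 1.0 - len(accepted) / len(hyp)


# Toy example: two transcribers spell different words differently.
hyp = ["ana", "mesh", "3aref"]
refs = [["ana", "mish", "3aref"], ["ana", "mesh", "3arif"]]
single = unaccepted_word_rate([refs[0]], hyp)   # one reference only
combined = unaccepted_word_rate(refs, hyp)      # both references pooled
```

In this toy case each reference alone rejects one of the three hypothesis words, while the pooled references accept all of them, which mirrors the drop from single-reference WER to the combined score described in the abstract.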
Original language: English
Title of host publication: Proceedings of the Second Workshop on Arabic Natural Language Processing
Publisher: Association for Computational Linguistics
Number of pages: 9
ISBN (Print): 978-1-941643-58-7
Publication status: Published - 1 Aug 2015

