Segmenting Subtitles for Correcting ASR Segmentation Errors

David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuščáková, Elena Zotkina, Zhengping Jiang, Peter Bell, Kathleen McKeown

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation. In this work, we propose a model for correcting the acoustic segmentation of ASR models for low-resource languages to improve performance on downstream tasks. We propose the use of subtitles as a proxy dataset for correcting ASR acoustic segmentation, creating synthetic acoustic utterances by modeling common error modes. We train a neural tagging model for correcting ASR acoustic segmentation and show that it improves downstream performance on MT and audio-document cross-language information retrieval (CLIR).
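The abstract's idea of building synthetic training data from subtitles can be illustrated with a minimal sketch. The function below is a hypothetical simplification, not the authors' implementation: it concatenates subtitle sentences into a token stream, records the true sentence-final boundaries as token-level tags (the tagger's training target), and then re-segments the stream at random points to mimic acoustically placed utterance breaks.

```python
import random


def make_synthetic_utterances(sentences, seed=0, break_prob=0.2):
    """Simulate ASR acoustic segmentation errors on subtitle sentences.

    Returns the randomly re-segmented "acoustic" utterances together with
    the flat token stream and its gold boundary tags ("E" = last token of
    a true sentence, "I" = inside a sentence). This is an illustrative
    error model, not the one used in the paper.
    """
    rng = random.Random(seed)
    tokens, labels = [], []
    for sent in sentences:
        words = sent.split()
        for i, word in enumerate(words):
            tokens.append(word)
            labels.append("E" if i == len(words) - 1 else "I")
    # Re-segment at random positions: each token ends an "acoustic"
    # utterance with probability break_prob, ignoring true boundaries.
    utterances, current = [], []
    for token in tokens:
        current.append(token)
        if rng.random() < break_prob:
            utterances.append(" ".join(current))
            current = []
    if current:
        utterances.append(" ".join(current))
    return utterances, tokens, labels
```

A tagger trained on (tokens, labels) pairs like these can then relabel real ASR output, and the predicted "E" tags define corrected sentence-like units for the downstream MT or CLIR system.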
Original language: English
Title of host publication: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Publisher: Association for Computational Linguistics
Pages: 2842-2854
Number of pages: 13
ISBN (Print): 978-1-954085-02-2
Publication status: Published - 19 Apr 2021
Event: 16th Conference of the European Chapter of the Association for Computational Linguistics - Virtual Conference
Duration: 19 Apr 2021 - 23 Apr 2021
https://2021.eacl.org/

Conference

Conference: 16th Conference of the European Chapter of the Association for Computational Linguistics
Abbreviated title: EACL 2021
City: Virtual Conference
Period: 19/04/21 - 23/04/21
Internet address: https://2021.eacl.org/

