Punctuated Transcription of Multi-genre Broadcasts Using Acoustic and Lexical Approaches

Ondrej Klejch, Peter Bell, Steve Renals

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


In this paper we investigate the punctuated transcription of multi-genre broadcast media. We examine four systems: three are based on lexical features, and the fourth uses acoustic features by integrating punctuation into the speech recognition acoustic models. We also explore the combination of these component systems using voting and log-linear interpolation. We performed experiments on the English-language MGB Challenge data, which comprises about 1,600 hours of BBC television recordings. Our results indicate that a lexical system based on a neural machine translation approach is significantly better than the other systems, achieving an F-measure of 62.6% on reference text, with a relative degradation of 19% on ASR output. Our analysis of the results by punctuation mark indicates that longer context improves the prediction of question marks and that acoustic information improves the prediction of exclamation marks. Finally, we show that even though the systems are complementary, their straightforward combination does not yield a better F-measure.
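The abstract names voting and log-linear interpolation as the combination methods but gives no details. The following is a minimal sketch of the two general techniques, not the authors' implementation. It assumes that each system outputs, for every word boundary, either a single punctuation label (for voting) or a probability distribution over labels (for interpolation). The label set, the function names, the tie-breaking rule, and the weights are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical label set; "" stands for "no punctuation after this word".
LABELS = ["", ".", ",", "?", "!"]


def majority_vote(predictions):
    """Return the label predicted by the most systems.

    Ties are broken in favour of the system listed first (an assumption).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def log_linear_interpolation(distributions, weights):
    """Combine posteriors as p(y) proportional to prod_i p_i(y) ** w_i."""
    eps = 1e-12  # guards against log(0) for labels a system never predicts
    scores = {
        label: sum(
            w * math.log(dist.get(label, 0.0) + eps)
            for dist, w in zip(distributions, weights)
        )
        for label in LABELS
    }
    # Normalise in a numerically stable way (log-sum-exp).
    top = max(scores.values())
    norm = sum(math.exp(s - top) for s in scores.values())
    return {label: math.exp(s - top) / norm for label, s in scores.items()}


if __name__ == "__main__":
    # Two made-up systems scoring the same word boundary.
    system_a = {"": 0.10, ".": 0.60, ",": 0.20, "?": 0.05, "!": 0.05}
    system_b = {"": 0.20, ".": 0.30, ",": 0.40, "?": 0.05, "!": 0.05}
    combined = log_linear_interpolation([system_a, system_b], [0.5, 0.5])
    print(max(combined, key=combined.get))
    print(majority_vote([".", ",", "."]))
```

In a setup like this, the interpolation weights would normally be tuned on held-out data. Uniform weights are used here only to keep the example short.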
Original language: English
Title of host publication: 2016 IEEE Workshop on Spoken Language Technology
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 8
ISBN (Electronic): 978-1-5090-4903-5
Publication status: Published - 9 Feb 2017
Event: 2016 IEEE Workshop on Spoken Language Technology - San Diego, United States
Duration: 13 Dec 2016 to 16 Dec 2016


Conference: 2016 IEEE Workshop on Spoken Language Technology
Abbreviated title: SLT 2016
Country/Territory: United States
City: San Diego

