Automated Production of True-Cased Punctuated Subtitles for Weather and News Broadcasts

Joris Driesen, Alexandra Birch, Simon Grimsey, Saeid Safarfashandi, Juliet Gauthier, Matt Simpson, Steve Renals

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Providing subtitles for multimedia content is a costly process. Any system that automates even part of this process may therefore yield significant economic benefits for content providers. In this paper, we present an integrated system that automatically subtitles weather forecasts and news broadcasts. The system strings together a number of modules, each performing a single processing step in the pipeline. An ASR (Automatic Speech Recognition) module first converts raw audio into an uninterrupted stream of written words. A decision tree classifier then marks sentence boundaries in the resulting word sequence. Finally, an SMT (Statistical Machine Translation) module 'translates' the resulting sentences into punctuated, true-cased text. The system has been developed in close cooperation with Red Bee Media and will be deployed in their commercial production pipeline.
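
The three-stage pipeline described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the ASR stage is replaced by a fixed lowercase word list, and `toy_boundary` and `toy_restore` are hypothetical rule-based stand-ins for the decision tree classifier and the SMT module. All function names are assumptions introduced here.

```python
from typing import Callable, List

Words = List[str]


def segment_sentences(words: Words,
                      is_boundary: Callable[[Words, int], bool]) -> List[Words]:
    """Split a flat word stream into sentences after each predicted boundary."""
    sentences, current = [], []
    for i, word in enumerate(words):
        current.append(word)
        if is_boundary(words, i):
            sentences.append(current)
            current = []
    if current:  # flush the trailing sentence
        sentences.append(current)
    return sentences


def toy_boundary(words: Words, i: int) -> bool:
    # Hypothetical stand-in for the decision tree classifier:
    # predict a boundary after word i if the next word often starts a sentence.
    starters = {"tomorrow", "meanwhile"}
    return i + 1 < len(words) and words[i + 1] in starters


def toy_restore(sentence: Words) -> str:
    # Hypothetical stand-in for the SMT module: true-case known proper nouns,
    # capitalise the first word, and append a full stop.
    proper_nouns = {"scotland": "Scotland"}
    out = [proper_nouns.get(w, w) for w in sentence]
    out[0] = out[0][0].upper() + out[0][1:]
    return " ".join(out) + "."


def subtitle(words: Words,
             is_boundary: Callable[[Words, int], bool] = toy_boundary,
             restore: Callable[[Words], str] = toy_restore) -> List[str]:
    """Chain the stages: word stream -> sentences -> punctuated, true-cased text."""
    return [restore(s) for s in segment_sentences(words, is_boundary)]


# Assumed ASR output: an unpunctuated, lowercase word stream.
asr_output = "rain will spread across scotland tomorrow it will be dry".split()
for line in subtitle(asr_output):
    print(line)
```

Passing the boundary detector and the restoration step in as functions mirrors the modular design the abstract describes, where each stage performs one processing step and could be swapped for a trained model.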
Original language: English
Title of host publication: INTERSPEECH 2014 15th Annual Conference of the International Speech Communication Association
Publisher: International Speech Communication Association
Pages: 2146-2147
Number of pages: 2
Publication status: Published - 2014
