Edinburgh Research Explorer

Automated Production of True-Cased Punctuated Subtitles for Weather and News Broadcasts

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Related Edinburgh Organisations

Open Access permissions


Original languageEnglish
Title of host publicationINTERSPEECH 2014 15th Annual Conference of the International Speech Communication Association
PublisherInternational Speech Communication Association
Number of pages2
Publication statusPublished - 2014


Providing subtitling for multimedia content is a highly costly process. Any system aimed at automating at least part of this process may therefore yield significant economic benefits for content providers. In this paper, we present an integrated automatic system capable of automatically subtitling weather forecasts and news broadcasts. In this system, a number of different modules are stringed together, each performing a single processing step in the pipeline. An ASR (Automatic Speech Recognition) module first converts raw audio into an uninterrupted stream of written words. A decision tree classifier then marks sentence boundaries in the resulting word sequence. Finally, a SMT (Statistical Machine Translation) module `translates' the resulting sentences into punctuated true-cased text. The system has been developed in close cooperation with Red Bee Media and will be deployed in their commercial production pipeline.

Download statistics

No data available

ID: 20065558