Large Scale Speech-to-Text Translation with Out-of-Domain Corpora Using Better Context-Based Models and Domain Adaptation

Marcin Junczys-Dowmunt, Pawel Przybysz, Arleta Staszuk, Eun-Kyoung Kim, Jaewon Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we describe the process of building a large-scale speech-to-text translation pipeline. Two target domains, daily conversations and travel-related conversations between two agents, are examined for the English-German language pair (both directions). The SMT component is built from out-of-domain but freely available bilingual and monolingual data. We make use of most of the known available resources to examine the effects of unrestricted data and large-scale models. A naive baseline delivers solid results in terms of MT quality. Extending the baseline with context-based translation model features such as operation sequence models, higher-order class-based language models, and additional web-scale word-based language models leads to a system that significantly outperforms the baseline. Domain adaptation is performed by separately weighting the influence of the out-of-domain subcorpora. This is explored for both translation models and language models, yielding significant improvements in both cases. Automatic and manual evaluation results are provided for raw MT quality and ASR+MT quality.
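The domain adaptation described in the abstract weights out-of-domain subcorpora individually rather than pooling them. A minimal sketch of the language-model side of this idea, assuming linear interpolation of per-subcorpus models with weights tuned against an in-domain development set (the corpus names, unigram models, and grid search below are illustrative assumptions, not the paper's actual implementation):

```python
# Sketch: adapt a language model built from out-of-domain subcorpora
# by weighting each subcorpus separately. All data and model choices
# here are hypothetical placeholders.
import math
from collections import Counter

def unigram_model(corpus):
    """Maximum-likelihood unigram probabilities with a small floor."""
    counts = Counter(w for sent in corpus for w in sent.split())
    total = sum(counts.values())
    return lambda w: (counts[w] + 0.01) / (total + 0.01 * (len(counts) + 1))

def interpolated(models, weights):
    """Linear interpolation: p(w) = sum_i weight_i * p_i(w)."""
    return lambda w: sum(lam * m(w) for lam, m in zip(weights, models))

def perplexity(model, dev):
    words = [w for sent in dev for w in sent.split()]
    return math.exp(-sum(math.log(model(w)) for w in words) / len(words))

# Two hypothetical out-of-domain subcorpora and an in-domain dev set.
parliament = ["the committee agrees", "the committee votes"]
subtitles = ["see you tomorrow", "how are you"]
dev = ["how are you", "see you"]

models = [unigram_model(parliament), unigram_model(subtitles)]

# Grid-search the interpolation weights to minimise dev-set perplexity;
# the conversational subcorpus should end up with the larger weight.
best = min(
    ((lam, 1.0 - lam) for lam in [i / 10 for i in range(1, 10)]),
    key=lambda ws: perplexity(interpolated(models, ws), dev),
)
print(best)
```

The same weighting idea carries over to the translation-model side, where per-subcorpus phrase tables (or their features) are combined with separately tuned weights instead of a single concatenated table.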
Original language: English
Title of host publication: INTERSPEECH 2015 - 16th Annual Conference of the International Speech Communication Association
Place of publication: Dresden, Germany
Publisher: International Speech Communication Association
Pages: 2272-2276
Number of pages: 5
Publication status: Published - 2015
Event: Interspeech 2015 - Dresden, Germany
Duration: 6 Sep 2015 - 9 Sep 2015

Conference

Conference: Interspeech 2015
Country/Territory: Germany
City: Dresden
Period: 6/09/15 - 9/09/15
