Improving machine translation via triangulation and transliteration

Nadir Durrani, Philipp Koehn

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

In this paper we improve Urdu→Hindi-English machine translation through triangulation and transliteration. First, we build an Urdu→Hindi SMT system by inducing triangulated and transliterated phrase-tables from Urdu–English and Hindi–English phrase translation models. We then use it to translate the Urdu part of the Urdu-English parallel data into Hindi, thus creating artificial Hindi-English parallel data. Our phrase-translation strategies give an improvement of up to +3.35 BLEU points over a baseline Urdu→Hindi system. The synthesized data improve the Hindi→English system by +0.35 and the English→Hindi system by +1.0 BLEU points.
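
The triangulation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so this assumes the common pivot formulation, in which English is marginalized out: p(hindi | urdu) = sum over english of p(english | urdu) * p(hindi | english). The function name `triangulate`, the romanized phrases, and all probabilities are hypothetical toy values, not taken from the paper.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Induce a source-to-target phrase table through a pivot language.

    Assumed formulation (not confirmed by the abstract):
    p(tgt | src) = sum over pivot of p(pivot | src) * p(tgt | pivot).
    """
    table = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pivot.items():
        for pivot, p_pivot in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt in pivot_to_tgt.get(pivot, {}).items():
                table[src][tgt] += p_pivot * p_tgt
    return {src: dict(tgts) for src, tgts in table.items()}


# Toy phrase tables with made-up probabilities (illustration only).
urdu_to_english = {"kitab": {"book": 0.8, "volume": 0.2}}
english_to_hindi = {
    "book": {"pustak": 0.5, "kitaab": 0.5},
    "volume": {"khand": 1.0},
}

urdu_to_hindi = triangulate(urdu_to_english, english_to_hindi)
for hindi, prob in sorted(urdu_to_hindi["kitab"].items()):
    print(hindi, round(prob, 3))
```

A real system would operate on full phrase tables with several feature scores per phrase pair and would typically prune low-probability entries; the sketch keeps a single conditional probability per pair to show only the marginalization.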
Original language: English
Title of host publication: Proceedings of 17th Annual conference of the European Association for Machine Translation
Pages: 71-78
Number of pages: 8
Publication status: Published - 2014
