Voting on n-grams for machine translation system combination

Kenneth Heafield*, Alon Lavie

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

Abstract / Description of output

System combination exploits differences between machine translation systems to form a combined translation from several system outputs. Core to this process are features that reward n-gram matches between a candidate combination and each system output. Systems differ in performance at the n-gram level despite similar overall scores. We therefore advocate a new feature formulation: for each system and each small n, a feature counts n-gram matches between the system and candidate. We show post-evaluation improvement of 6.67 BLEU over the best system on NIST MT09 Arabic-English test data. Compared to a baseline system combination scheme from WMT 2009, we show improvement in the range of 1 BLEU point.
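The feature formulation described in the abstract (one feature per system and per small n, counting n-gram matches between that system's output and the candidate) can be illustrated with a minimal sketch. The names `ngrams` and `match_features`, whitespace tokenization, the `max_n` default, and the use of clipped exact-match counts are all assumptions made for illustration; the abstract does not specify how matches are defined, so this is not the authors' implementation.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def match_features(candidate, system_outputs, max_n=2):
    """Return {(system_index, n): match_count} for n = 1..max_n.

    Assumption: a "match" is an exact n-gram shared by the candidate and
    the system output, clipped to the smaller of the two counts.
    """
    cand_tokens = candidate.split()
    features = {}
    for sys_id, output in enumerate(system_outputs):
        sys_tokens = output.split()
        for n in range(1, max_n + 1):
            # Counter intersection keeps the minimum count of each n-gram.
            shared = ngrams(cand_tokens, n) & ngrams(sys_tokens, n)
            features[(sys_id, n)] = sum(shared.values())
    return features


# Illustrative inputs only; these sentences are not from the paper's data.
candidate = "the cat sat on the mat"
systems = ["the cat sat on a mat", "a cat is on the mat"]
features = match_features(candidate, systems)
```

Because each (system, n) pair gets its own feature, a tuned combination could in principle weight one system's unigrams and another system's bigrams differently, which is the motivation the abstract gives for the formulation.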

Original language: English
Publication status: Published - 2010
Event: 9th Biennial Conference of the Association for Machine Translation in the Americas, AMTA 2010 - Denver, CO, United States
Duration: 31 Oct 2010 to 4 Nov 2010

Conference

Conference: 9th Biennial Conference of the Association for Machine Translation in the Americas, AMTA 2010
Country/Territory: United States
City: Denver, CO
Period: 31/10/10 to 4/11/10
