Abstract
Automatically evaluating the word order of MT system output at the sentence level is challenging. At the sentence level, n-gram counts are sparse, which makes it difficult to measure word order quality effectively using lexicalized units. Recent approaches abstract away from lexicalization by assigning a score to the permutation representing how word positions in system output move around relative to a reference translation. Metrics over permutations exist (e.g., Kendall's tau or Spearman's rho) and have been shown to be useful in earlier work. However, none of the existing metrics over permutations groups word positions recursively into larger phrase-like blocks, which makes it difficult to account for long-distance reordering phenomena. In this paper we explore novel metrics computed over Permutation Forests (PEFs), packed charts of Permutation Trees (PETs), which are tree decompositions of a permutation into primitive ordering units. We empirically compare the PEFs metric against five known reordering metrics on WMT13 data for ten language pairs. The PEFs metric shows better correlation with human ranking than the other metrics on almost all language pairs. None of the other metrics exhibits as stable behavior across language pairs.
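To make the idea of scoring a permutation concrete, the following is a minimal sketch of the two baseline metrics named in the abstract, in their textbook forms. The function names and the normalizations are assumptions made for illustration; they may differ from the exact variants evaluated in the paper, and the sketch does not implement the PEFs metric itself. A permutation is assumed to be a list in which `perm[i]` is the reference position of the word found at position `i` in the system output.

```python
from itertools import combinations


def kendall_tau_score(perm):
    """Fraction of position pairs kept in reference order (1.0 = monotone).

    Assumed normalization: concordant pairs / all pairs, giving a value in [0, 1].
    """
    n = len(perm)
    if n < 2:
        return 1.0
    concordant = sum(1 for i, j in combinations(range(n), 2) if perm[i] < perm[j])
    return concordant / (n * (n - 1) / 2)


def spearman_rho_score(perm):
    """Spearman's rank correlation between output and reference positions.

    Textbook form 1 - 6 * sum(d^2) / (n * (n^2 - 1)), giving a value in [-1, 1].
    """
    n = len(perm)
    if n < 2:
        return 1.0
    d_squared = sum((p - i) ** 2 for i, p in enumerate(perm))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical example: two adjacent blocks swapped as whole units.
block_swap = [2, 3, 0, 1]
print(kendall_tau_score(block_swap), spearman_rho_score(block_swap))
```

Both functions treat every word position independently, so a single swap of two blocks is penalized once per crossing word pair. This is the kind of behavior that recursive grouping into phrase-like blocks is meant to address.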
Original language | English |
---|---|
Title of host publication | Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation |
Place of Publication | Doha, Qatar |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 138-147 |
Number of pages | 10 |
Publication status | Published - Oct 2014 |
Event | Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation - Doha, Qatar; Duration: 25 Oct 2014 → 25 Oct 2014; http://www.cs.ust.hk/~dekai/ssst8/ |
Conference
Conference | Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation |
---|---|
Abbreviated title | SSST-8 |
Country/Territory | Qatar |
City | Doha |
Period | 25/10/14 → 25/10/14 |