Abstract / Description of output
Translating between dissimilar languages requires an account of the use of divergent word orders when expressing the same semantic content. Reordering poses a serious problem for statistical machine translation systems and has generated a considerable body of research aimed at meeting its challenges. Direct evaluation of reordering requires automatic metrics that explicitly measure the quality of word order choices in translations. Current metrics, such as BLEU, only evaluate reordering indirectly. We analyse the ability of current metrics to capture reordering performance. We then introduce permutation distance metrics as a direct method for measuring word order similarity between translations and reference sentences. By correlating all metrics with a novel method for eliciting human judgements of reordering quality, we show that current metrics are largely influenced by lexical choice, and that they are not able to distinguish between different reordering scenarios. Also, we show that permutation distance metrics correlate very well with human judgements, and are impervious to lexical differences.
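
To make the idea of a permutation distance concrete, here is a minimal Python sketch. The abstract does not say which distances are used, so the two shown here (a normalised Kendall's tau distance and a normalised Hamming distance) are assumptions chosen because they are standard distances over permutations. The function names and the input convention are also hypothetical: `perm[i]` is taken to be the 0-based reference position of the word at position `i` in the translation, which presumes some word alignment is already available. Only positions enter the computation, which illustrates why such a measure is unaffected by lexical choice.

```python
from itertools import combinations


def kendall_tau_distance(perm):
    """Fraction of position pairs whose relative order is inverted.

    perm[i] is the (assumed) reference position of the i-th translated word.
    Returns 0.0 for identical order and 1.0 for fully reversed order.
    """
    n = len(perm)
    if n < 2:
        return 0.0
    # Count pairs (i, j) with i < j that appear in the opposite order.
    discordant = sum(
        1 for i, j in combinations(range(n), 2) if perm[i] > perm[j]
    )
    return discordant / (n * (n - 1) / 2)


def hamming_distance(perm):
    """Fraction of words that are not at their reference position."""
    n = len(perm)
    if n == 0:
        return 0.0
    return sum(1 for i, p in enumerate(perm) if p != i) / n


if __name__ == "__main__":
    swapped = [1, 0, 2, 3]  # first two words exchanged, rest in order
    print(kendall_tau_distance(swapped))  # 1 inverted pair out of 6
    print(hamming_distance(swapped))      # 2 of 4 words displaced
```

A similarity score, if one is wanted, can be obtained as one minus either distance; whether that matches the formulation in the paper is not stated in the abstract.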
Original language | English |
---|---|
Pages (from-to) | 15-26 |
Number of pages | 12 |
Journal | Machine Translation |
Volume | 24 |
Issue number | 1 |
Publication status | Published - 2010 |