Further meta-evaluation of machine translation

Chris Callison-Burch, Cameron Fordyce, Philipp Koehn, Christof Monz, Josh Schroeder

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper analyzes the translation quality of machine translation systems for 10 language pairs translating between Czech, English, French, German, Hungarian, and Spanish. We report the translation quality of over 30 diverse translation systems based on a large-scale manual evaluation involving hundreds of hours of effort. We use the human judgments of the systems to analyze automatic evaluation metrics for translation quality, and we report the strength of their correlation with human judgments at both the system level and the sentence level. We validate our manual evaluation methodology by measuring intra- and inter-annotator agreement and by collecting timing information.
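The abstract mentions two statistics central to this kind of meta-evaluation: the correlation between automatic metric scores and human judgments, and intra-/inter-annotator agreement. As a minimal illustrative sketch (not the paper's actual code), the snippet below computes Spearman's rank correlation at the system level and Cohen's kappa between two annotators, which are common choices for these tasks; all scores and judgments shown are invented for demonstration.

```python
# Illustrative sketch: system-level metric correlation and annotator agreement.
from collections import Counter
from scipy.stats import spearmanr

# Hypothetical system-level scores: human preference rates vs. an automatic metric.
human_scores  = [0.62, 0.58, 0.55, 0.49, 0.40]   # e.g. fraction of wins in manual ranking
metric_scores = [31.2, 29.8, 30.5, 26.1, 24.3]   # automatic metric score per system

rho, p_value = spearmanr(human_scores, metric_scores)
print(f"system-level Spearman rho = {rho:.3f} (p = {p_value:.3f})")

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement between two annotators, corrected for chance."""
    n = len(labels_a)
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical pairwise ranking judgments (better / worse / equal) from two annotators.
annotator_1 = [">", "<", "=", ">", ">", "<", "="]
annotator_2 = [">", "<", "<", ">", "=", "<", "="]
print(f"inter-annotator kappa = {cohen_kappa(annotator_1, annotator_2):.3f}")
```

Intra-annotator agreement can be computed with the same kappa function by passing one annotator's two passes over the same items.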
Original language: English
Title of host publication: Proceedings of the Third Workshop on Statistical Machine Translation (StatMT '08)
Place of publication: Stroudsburg, PA, USA
Publisher: Association for Computational Linguistics
Pages: 70-106
Number of pages: 37
ISBN (Print): 978-1-932432-09-1
Publication status: Published - 2008
