Simulating human judgment in machine translation evaluation campaigns

Philipp Koehn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


We present a Monte Carlo model to simulate human judgments in machine translation evaluation campaigns, such as WMT or IWSLT. We use the model to compare different ranking methods and to give guidance on the number of judgments that need to be collected to obtain sufficiently significant distinctions between systems.
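
The abstract does not spell out the model, so the following is only a minimal sketch of the general idea of Monte Carlo simulation of pairwise judgments, not the paper's actual method. It assumes each system has a hypothetical latent quality score, that each simulated judgment compares two randomly chosen systems under Gaussian noise, and that systems are ranked by win ratio. The names `simulate_campaign`, `recovery_rate`, `true_quality`, and `noise` are illustrative and do not come from the paper.

```python
import random


def simulate_campaign(true_quality, n_judgments, noise=1.0, seed=0):
    """Simulate noisy pairwise judgments and rank systems by win ratio.

    true_quality: dict mapping system name -> latent quality (assumed, hypothetical).
    noise: standard deviation of the Gaussian judgment noise (assumed).
    """
    rng = random.Random(seed)
    systems = list(true_quality)
    wins = {s: 0 for s in systems}
    comparisons = {s: 0 for s in systems}

    for _ in range(n_judgments):
        # Pick two distinct systems for one simulated pairwise judgment.
        a, b = rng.sample(systems, 2)
        # The simulated judge perceives each system's quality with noise.
        score_a = true_quality[a] + rng.gauss(0.0, noise)
        score_b = true_quality[b] + rng.gauss(0.0, noise)
        winner = a if score_a > score_b else b
        wins[winner] += 1
        comparisons[a] += 1
        comparisons[b] += 1

    # Rank by the fraction of comparisons each system won.
    ratio = {
        s: (wins[s] / comparisons[s]) if comparisons[s] else 0.0
        for s in systems
    }
    return sorted(systems, key=lambda s: ratio[s], reverse=True)


def recovery_rate(true_quality, n_judgments, trials=200, noise=1.0):
    """Fraction of simulated campaigns whose ranking matches the true order."""
    gold = sorted(true_quality, key=true_quality.get, reverse=True)
    hits = sum(
        simulate_campaign(true_quality, n_judgments, noise, seed=t) == gold
        for t in range(trials)
    )
    return hits / trials
```

Calling `recovery_rate` with increasing values of `n_judgments` for a fixed set of assumed quality scores gives a rough sense of how many judgments are needed before the simulated ranking reliably matches the true order.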
Original language: English
Title of host publication: 2012 International Workshop on Spoken Language Translation, IWSLT 2012, Hong Kong, December 6-7, 2012
Number of pages: 6
Publication status: Published - 2012
