Alternative Objective Functions for Training MT Evaluation Metrics

Milos Stanojevic, Khalil Sima'an

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

MT evaluation metrics are tested for correlation with human judgments at either the sentence level or the corpus level. Trained metrics ignore corpus-level judgments and are trained for high sentence-level correlation only. We show that training for only one objective (sentence or corpus level) can not only harm performance on the other objective but can also be suboptimal for the objective being optimized. To this end, we present a metric trained for corpus-level correlation and compare it empirically against a metric trained for sentence-level correlation, exemplifying how their performance may vary per language pair, type, and level of judgment. Subsequently, we propose a model trained to optimize both objectives simultaneously and show that it is far more stable than both models and, on average, outperforms them on both objectives.
Original language: English
Title of host publication: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Short Papers)
Place of Publication: Vancouver, Canada
Publisher: Association for Computational Linguistics (ACL)
Pages: 20-25
Number of pages: 6
Publication status: Published - 4 Aug 2017
Event: 55th annual meeting of the Association for Computational Linguistics (ACL) - Vancouver, Canada
Duration: 30 Jul 2017 to 4 Aug 2017
http://acl2017.org/

Conference

Conference: 55th annual meeting of the Association for Computational Linguistics (ACL)
Abbreviated title: ACL 2017
Country/Territory: Canada
City: Vancouver
Period: 30/07/17 to 4/08/17