ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT'20 Metrics Shared Task

Rachel Bawden, Biao Zhang, Andre Tättar, Matt Post

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


We describe parBLEU, parCHRF++, and parESIM, which augment baseline metrics with automatic paraphrases produced by PRISM (Thompson and Post, 2020a), a multilingual neural machine translation system. We build on recent work studying how to improve BLEU using diverse automatically paraphrased references (Bawden et al., 2020), extending the experiments to the multilingual setting of the WMT 2020 metrics shared task and to three base metrics. We compare the three metrics' capacity to exploit up to 100 additional synthetic references. We find that gains are possible when using additional, automatically paraphrased references, although they are not systematic. However, segment-level correlations, particularly into English, are improved for all three metrics, even with higher numbers of paraphrased references.
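The core mechanism the abstract describes, scoring a hypothesis against a pool of references rather than a single one, can be sketched as follows. This is a minimal illustrative implementation of multi-reference sentence-level BLEU, not the authors' ParBLEU code (which builds on standard BLEU tooling); the function name and the add-epsilon smoothing are choices made here for the sketch. The key point is that clipped n-gram counts are taken against the per-n-gram maximum over all references, so adding paraphrased references can only widen the pool of acceptable n-grams:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of a token list, as a Counter."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def multi_ref_bleu(hypothesis, references, max_n=4):
    """Sentence-level BLEU of one hypothesis against several references.

    For each n, hypothesis n-gram counts are clipped against the
    maximum count of that n-gram across all references, so extra
    (e.g. paraphrased) references can only increase the match pool.
    """
    hyp = hypothesis.split()
    refs = [r.split() for r in references]
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp, n)
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        # tiny epsilon avoids log(0) when no higher-order n-gram matches
        log_prec_sum += math.log((clipped + 1e-9) / total) / max_n
    # brevity penalty uses the reference length closest to the hypothesis
    ref_len = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
    bp = 1.0 if len(hyp) >= ref_len else math.exp(1 - ref_len / len(hyp))
    return 100 * bp * math.exp(log_prec_sum)
```

For example, a hypothesis that shares few 4-grams with the original reference can score much higher once a close paraphrase is added to the reference pool, which is the effect the paper measures at scale with up to 100 synthetic references.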
Original language: English
Title of host publication: Proceedings of the Fifth Conference on Machine Translation
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 8
ISBN (Print): 978-1-948087-81-0
Publication status: Published - 19 Nov 2020
Event: Fifth Conference on Machine Translation - Online Conference
Duration: 19 Nov 2020 - 20 Nov 2020


Conference: Fifth Conference on Machine Translation
Abbreviated title: WMT 2020
City: Online Conference


Keywords

  • machine translation
  • metrics
  • evaluation
  • shared task
  • paraphrasing


