Hallucinations in Large Multilingual Translation Models

Nuno M. Guerreiro*, Duarte M. Alves, Jonas Waldendorf, Barry Haddow, Alexandra Birch, Pierre Colombo, André F. T. Martins

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Hallucinated translations can severely undermine trust and raise safety issues when machine translation systems are deployed in the wild. Previous research on the topic focused on small bilingual models trained on high-resource languages, leaving a gap in our understanding of hallucinations in multilingual models across diverse translation scenarios. In this work, we fill this gap by conducting a comprehensive analysis—over 100 language pairs across various resource levels and going beyond English-centric directions—on both the M2M neural machine translation (NMT) models and GPT large language models (LLMs). Among several insights, we highlight that models struggle with hallucinations primarily in low-resource directions and when translating out of English, where, critically, they may reveal toxic patterns that can be traced back to the training data. We also find that LLMs produce hallucinations that are qualitatively different from those of NMT models. Finally, we show that hallucinations are hard to reverse by merely scaling models trained with the same data. However, employing more diverse models, trained on different data or with different procedures, as fallback systems can improve translation quality and virtually eliminate certain pathologies.
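The fallback idea described at the end of the abstract can be sketched roughly as follows. This is an illustrative assumption, not the paper's actual method: the `looks_degenerate` check below is a crude repeated-n-gram heuristic standing in for a real hallucination detector, and `primary`/`fallback` are toy stand-ins for the diverse MT systems the paper evaluates.

```python
def looks_degenerate(text: str, n: int = 3, max_repeats: int = 3) -> bool:
    """Flag outputs where some n-gram repeats excessively; oscillatory
    hallucinations often surface as heavy n-gram repetition. (Heuristic
    chosen for illustration only.)"""
    tokens = text.split()
    if len(tokens) < n:
        return False
    counts = {}
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        counts[gram] = counts.get(gram, 0) + 1
    return max(counts.values()) > max_repeats


def translate_with_fallback(source, primary, fallback):
    """Return the primary system's translation unless it is flagged as
    degenerate, in which case defer to the more diverse fallback system."""
    hypothesis = primary(source)
    if looks_degenerate(hypothesis):
        return fallback(source)
    return hypothesis


# Toy stand-ins for real MT systems (hypothetical, for demonstration).
primary = lambda s: "the the the the the the the the the the the the"
fallback = lambda s: "a clean translation of the source sentence"

print(translate_with_fallback("um exemplo", primary, fallback))
# → a clean translation of the source sentence
```

In practice the detector would be an external quality or hallucination signal rather than a surface heuristic, but the control flow, generating with one system and deferring to a different one when the output is pathological, is the same.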
Original language: English
Pages (from-to): 1500-1517
Number of pages: 18
Journal: Transactions of the Association for Computational Linguistics
Publication status: Published - 14 Dec 2023


