Discourse in Statistical Machine Translation: A Survey and a Case Study

Christian Hardmeier

Current approaches to statistical machine translation assume that sentences in a text are independent, ignoring the property of connectedness present in virtually all discourse. We provide an extensive overview of the literature about statistical machine translation that can be related to discourse phenomena and present a detailed investigation and discussion of existing research efforts on a particular discourse-related problem, the translation of anaphoric pronouns. Comparing different approaches to discourse in statistical machine translation allows us to identify fundamental problems and draw conclusions from an overarching perspective.
Original language: English
Number of pages: 28
Journal: Discours Revue de linguistique, psycholinguistique et informatique
Publication status: Published - 23 Dec 2012


  • statistical machine translation
  • discourse
  • pronominal anaphora
  • survey

