Examining the Relationship between Preordering and Word Order Freedom in Machine Translation

Joachim Daiber, Milos Stanojevic, Wilker Aziz, Khalil Sima'an

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

We study the relationship between word order freedom and preordering in statistical machine translation. To assess word order freedom, we first introduce a novel entropy measure which quantifies how difficult it is to predict word order given a source sentence and its syntactic analysis. We then address preordering for two target languages at the far ends of the word order freedom spectrum, German and Japanese, and argue that for languages with more word order freedom, attempting to predict a unique word order given source clues only is less justified. Subsequently, we examine lattices of n-best word order predictions as a unified representation for languages from across this broad spectrum and present an effective solution to a resulting technical issue, namely how to select a suitable source word order from the lattice during training. Our experiments show that lattices are crucial for good empirical performance for languages with freer word order (English–German) and can provide additional improvements for fixed word order languages (English–Japanese).
Original language: English
Title of host publication: Proceedings of the First Conference on Machine Translation, Volume 1: Research Papers
Place of Publication: Berlin, Germany
Publisher: Association for Computational Linguistics (ACL)
Pages: 118-130
Number of pages: 13
Publication status: Published - 12 Aug 2016
Event: First Conference on Machine Translation - Berlin, Germany
Duration: 11 Aug 2016 to 12 Aug 2016
http://www.statmt.org/wmt16/

Conference

Conference: First Conference on Machine Translation
Abbreviated title: WMT16
Country/Territory: Germany
City: Berlin
Period: 11/08/16 to 12/08/16
