We present an extension of phrase-based statistical machine translation models that enables the straightforward integration of additional annotation at the word level, be it linguistic markup or automatically generated word classes. In a number of experiments, we show that factored translation models lead to better translation performance, both in automatic scores and in grammatical coherence.
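To make the idea of word-level annotation concrete, the following is a minimal sketch of how a sentence might be represented as a sequence of factored tokens. It is an illustration only, not the paper's implementation: the three factors chosen (surface form, lemma, part-of-speech tag), the `|` separator, and the names `FactoredToken`, `parse_factored`, and `project` are all assumptions made for this example.

```python
from typing import List, NamedTuple


class FactoredToken(NamedTuple):
    """One word with its annotation factors (hypothetical choice of factors)."""
    surface: str
    lemma: str
    pos: str


def parse_factored(sentence: str, sep: str = "|") -> List[FactoredToken]:
    """Split a whitespace-tokenized sentence whose words carry sep-delimited factors."""
    tokens = []
    for item in sentence.split():
        parts = item.split(sep)
        if len(parts) != 3:
            raise ValueError(f"expected 3 factors in {item!r}, got {len(parts)}")
        tokens.append(FactoredToken(*parts))
    return tokens


def project(tokens: List[FactoredToken], factor: str) -> List[str]:
    """Extract a single factor (e.g. 'lemma') from every token in the sentence."""
    return [getattr(token, factor) for token in tokens]


# Assumed example input: surface|lemma|POS for each word.
example = "houses|house|NNS are|be|VBP big|big|JJ"
tokens = parse_factored(example)
lemmas = project(tokens, "lemma")
tags = project(tokens, "pos")
```

Under these assumptions, a model could consult the lemma or tag sequence in addition to the surface words, which is the kind of extra word-level information the abstract refers to.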
|Title of host publication||Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)|
|Place of Publication||Prague, Czech Republic|
|Publisher||Association for Computational Linguistics|
|Number of pages||9|
|Publication status||Published - 1 Jun 2007|