We describe an open-source toolkit for statistical machine translation (SMT) whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c) efficient data formats for translation models and language models. In addition to the SMT decoder, the toolkit includes a wide variety of tools for training, tuning, and applying the system to many translation tasks.
|Title of host publication|Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions|
|Place of Publication|Prague, Czech Republic|
|Publisher|Association for Computational Linguistics|
|Number of pages|4|
|Publication status|Published - 1 Jun 2007|