Abstract / Description of output
We present cdec, an open-source framework for decoding, aligning with, and training a number of statistical machine translation models, including word-based models, phrase-based models, and models based on synchronous context-free grammars. Using a single unified internal representation for translation forests, the decoder strictly separates model-specific translation logic from general rescoring, pruning, and inference algorithms. From this unified representation, the decoder can extract not only the 1-best or k-best translations, but also alignments to a reference, or the quantities necessary to drive discriminative training using gradient-based or gradient-free optimization techniques. Its efficient C++ implementation means that its memory use and runtime performance are significantly better than those of comparable decoders.
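The separation described above, between one shared forest representation and the inference algorithms that run over it, can be illustrated with a minimal sketch. This is not cdec's code or API: the `Edge` class, the `run` function, the toy forest, and its weights are all hypothetical, chosen only to show how a single bottom-up traversal can yield either the 1-best translation or a summed score of the kind used in training, depending on which pair of operations is plugged in.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Edge:
    """One hyperedge of a toy translation forest (hypothetical structure)."""
    head: str          # node that this edge derives
    tails: List[str]   # child nodes; empty for leaf edges
    weight: float      # model score of the edge, treated as a probability
    label: str         # target-side template; {0}, {1} refer to the children


# Nodes listed bottom-up, so children are always processed before parents.
NODES = ["X1", "X2", "S"]
EDGES = [
    Edge("X1", [], 0.6, "the house"),
    Edge("X1", [], 0.4, "the home"),
    Edge("X2", [], 0.7, "is small"),
    Edge("X2", [], 0.3, "is little"),
    Edge("S", ["X1", "X2"], 1.0, "{0} {1}"),
]


def run(nodes: List[str], edges: List[Edge],
        combine: Callable[[Edge, List[Any]], Any],
        merge: Callable[[Any, Any], Any]) -> Dict[str, Any]:
    """Generic bottom-up pass over the forest.

    `combine` scores one edge given its children's values; `merge` joins the
    values of competing edges that derive the same node.
    """
    value: Dict[str, Any] = {}
    for node in nodes:
        results = [combine(e, [value[t] for t in e.tails])
                   for e in edges if e.head == node]
        acc = results[0]
        for r in results[1:]:
            acc = merge(acc, r)
        value[node] = acc
    return value


def viterbi_combine(edge: Edge, children: List[Any]) -> Any:
    # Value is (score, string): multiply scores, fill the template.
    score = edge.weight
    for child_score, _ in children:
        score *= child_score
    return score, edge.label.format(*(text for _, text in children))


def inside_combine(edge: Edge, children: List[float]) -> float:
    # Value is a plain number: product of the edge and child scores.
    score = edge.weight
    for child_score in children:
        score *= child_score
    return score


# Same forest, two inference goals.
viterbi = run(NODES, EDGES, viterbi_combine, lambda a, b: max(a, b, key=lambda v: v[0]))
inside = run(NODES, EDGES, inside_combine, lambda a, b: a + b)

print(viterbi["S"][1])  # the house is small
```

The forest itself never changes between the two calls; only the `combine` and `merge` operations do. That is the sense in which model-specific logic (how the forest is built) can be kept apart from general inference (what is computed over it).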
Original language | English |
---|---|
Title of host publication | Proceedings of the ACL 2010 System Demonstrations |
Place of Publication | Uppsala, Sweden |
Publisher | Association for Computational Linguistics |
Pages | 7-12 |
Number of pages | 6 |
Publication status | Published - 1 Jul 2010 |