While CYK+ and Earley-style variants are popular algorithms for decoding unbinarized SCFGs, in particular for syntax-based Statistical Machine Translation, the algorithms rely on a so-called dot chart which suffers from a high memory consumption. We propose a recursive variant of the CYK+ algorithm that eliminates the dot chart, without incurring an increase in time complexity for SCFG decoding. In an evaluation on a string-to-tree SMT scenario, we empirically demonstrate substantial improvements in memory consumption and translation speed.
|Title of host publication||Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation|
|Place of Publication||Doha, Qatar|
|Publisher||Association for Computational Linguistics|
|Number of pages||9|
|Publication status||Published - 1 Oct 2014|