Language Model Rest Costs and Space-Efficient Storage

Kenneth Heafield, Philipp Koehn, Alon Lavie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Approximate search algorithms, such as cube pruning in syntactic machine translation, rely on the language model to estimate probabilities of sentence fragments. We contribute two changes that trade between accuracy of these estimates and memory, holding sentence-level scores constant. Common practice uses lower order entries in an N-gram model to score the first few words of a fragment; this violates assumptions made by common smoothing strategies, including Kneser-Ney. Instead, we use a unigram model to score the first word, a bigram for the second, etc. This improves search at the expense of memory. Conversely, we show how to save memory by collapsing probability and backoff into a single value without changing sentence-level scores, at the expense of less accurate estimates for sentence fragments. These changes can be stacked, achieving better estimates with unchanged memory usage. In order to interpret changes in search accuracy, we adjust the pop limit so that accuracy is unchanged and report the change in CPU time. In a German-English Moses system with target-side syntax, improved estimates yielded a 63% reduction in CPU time; for a Hiero-style version, the reduction is 21%. The compressed language model uses 26% less RAM while equivalent search quality takes 27% more CPU. Source code is released as part of KenLM.
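
The difference between the two ways of scoring the start of a fragment can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not KenLM's implementation: the function name `fragment_estimate`, the table layout, and every log-probability below are hypothetical, and backoff is left out by assuming each needed entry is present.

```python
# Toy sketch: estimate the score of a sentence fragment whose left context
# is still unknown. All tables and numbers are hypothetical (log10 values).

# Entries of the full-order (trigram) model, used by both approaches once
# enough context is available.
TRIGRAM = {
    ("the", "cat", "sat"): -0.25,
    ("cat", "sat", "down"): -0.5,
}

# Common practice: score the first words with the lower-order entries
# stored inside the same N-gram model.
BASELINE = {
    1: {("the",): -1.5},
    2: {("the", "cat"): -0.75},
    3: TRIGRAM,
}

# Alternative described in the abstract: score the first word with a
# separate unigram model, the second with a separate bigram model, etc.
LOWER_ORDER_MODELS = {
    1: {("the",): -1.0},
    2: {("the", "cat"): -0.5},
    3: TRIGRAM,
}


def fragment_estimate(words, tables_by_order):
    """Sum log-probabilities, using the order-k table for the k-th word
    until the maximum order is reached (assumes every entry exists)."""
    max_order = max(tables_by_order)
    total = 0.0
    for i in range(len(words)):
        order = min(i + 1, max_order)
        ngram = tuple(words[i + 1 - order : i + 1])
        total += tables_by_order[order][ngram]
    return total


fragment = ["the", "cat", "sat", "down"]
print(fragment_estimate(fragment, BASELINE))
print(fragment_estimate(fragment, LOWER_ORDER_MODELS))
```

Only the first two words are scored differently; from the third word onward both variants read the same trigram entries, which is why sentence-level scores are unaffected once full context is known. Keeping the extra lower-order tables is the memory cost the abstract refers to.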
Original language: English
Title of host publication: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012, July 12-14, 2012, Jeju Island, Korea
Publisher: Association for Computational Linguistics
Number of pages: 10
Publication status: Published - 2012

