A Space-Efficient Phrase Table Implementation Using Minimal Perfect Hash Functions

Marcin Junczys-Dowmunt

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

We describe the structure of a space-efficient phrase table for phrase-based statistical machine translation with the Moses decoder. The new phrase table can be used in memory or be partially mapped on disk. Compared to the standard Moses on-disk phrase table implementation, a size reduction by a factor of 6 is achieved.
The focus of this work lies on the source phrase index, which is implemented using minimal perfect hash functions. Two methods are discussed that reduce the memory consumption of a baseline implementation.
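
As a rough illustration of the idea behind such an index, the sketch below builds a minimal perfect hash function over a handful of source phrases with a generic hash-and-displace scheme. It is not taken from the chapter: the construction, the function names `build_mphf` and `lookup`, and the sample phrases are all assumptions made for the example, and the chapter's own index may differ.

```python
import hashlib


def _h(seed: int, key: str, n: int) -> int:
    """Deterministic seeded hash of `key` into the range [0, n)."""
    data = seed.to_bytes(4, "little") + key.encode("utf-8")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little") % n


def build_mphf(keys):
    """Return per-bucket seeds that map n distinct keys onto 0..n-1 (illustrative only)."""
    n = len(keys)
    if n == 0 or len(set(keys)) != n:
        raise ValueError("keys must be non-empty and distinct")

    # First level: distribute keys over n buckets with seed 0.
    buckets = [[] for _ in range(n)]
    for key in keys:
        buckets[_h(0, key, n)].append(key)

    seeds = [0] * n
    taken = [False] * n
    # Place the largest buckets first; they are the hardest to fit.
    for b in sorted(range(n), key=lambda i: -len(buckets[i])):
        bucket = buckets[b]
        if not bucket:
            break  # all remaining buckets are empty
        seed = 1
        while True:
            slots = [_h(seed, key, n) for key in bucket]
            distinct = len(set(slots)) == len(slots)
            if distinct and not any(taken[s] for s in slots):
                break
            seed += 1
        seeds[b] = seed
        for s in slots:
            taken[s] = True
    return seeds


def lookup(seeds, key: str) -> int:
    """Map a key to its slot; only meaningful for keys used at build time."""
    n = len(seeds)
    return _h(seeds[_h(0, key, n)], key, n)


# Hypothetical source phrases, chosen only for the example.
phrases = ["das haus", "ein haus", "das ist", "ist ein", "kleines haus"]
seeds = build_mphf(phrases)
slots = {p: lookup(seeds, p) for p in phrases}
```

Only the seed array has to be kept; the phrases themselves are not stored, which is where the space saving of this kind of index comes from. The flip side is that a phrase not seen at build time still maps to some slot, so a practical table needs an additional check to reject unknown keys.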
Original language: English
Title of host publication: Text, Speech and Dialogue
Subtitle of host publication: 15th International Conference, TSD 2012, Brno, Czech Republic, September 3-7, 2012. Proceedings
Editors: Petr Sojka, Ales Horák, Ivan Kopecek, Karel Pala
Place of Publication: Berlin, Heidelberg
Publisher: Springer
Pages: 320-327
Number of pages: 8
ISBN (Electronic): 978-3-642-32790-2
ISBN (Print): 978-3-642-32789-6
Publication status: Published - 2012

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Berlin Heidelberg
Volume: 7499
ISSN (Print): 0302-9743

Keywords / Materials (for Non-textual outputs)

  • statistical machine translation
  • compact phrase table
  • minimal perfect hash function
  • Moses
