Grouping Language Model Boundary Words to Speed K-Best Extraction from Hypergraphs

Kenneth Heafield, Philipp Koehn, Alon Lavie

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

We propose a new algorithm to approximately extract top-scoring hypotheses from a hypergraph when the score includes an N-gram language model. In the popular cube pruning algorithm, every hypothesis is annotated with boundary words and permitted to recombine only if all boundary words are equal. However, many hypotheses share some, but not all, boundary words. We use these common boundary words to group hypotheses and do so recursively, resulting in a tree of hypotheses. This tree forms the basis for our new search algorithm that iteratively refines groups of boundary words on demand. Machine translation experiments show our algorithm makes translation 1.50 to 3.51 times as fast as with cube pruning in common cases.
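To make the grouping idea concrete, below is a minimal Python sketch, not the authors' implementation: hypotheses that agree on a prefix of their boundary words are merged into a single tree node, and a node is split ("refined") only when the search needs to distinguish its members. The Hypothesis class and group function are hypothetical names for illustration, and words are revealed left to right here for simplicity, whereas the paper reveals boundary words from the outside in.

```python
from collections import defaultdict
from typing import List, Tuple

class Hypothesis:
    """A hypothetical stand-in for a decoder hypothesis: just its
    language-model boundary words and a model score."""
    def __init__(self, boundary: Tuple[str, ...], score: float):
        self.boundary = boundary
        self.score = score

def group(hyps: List[Hypothesis], depth: int = 0):
    """Recursively partition hypotheses by their boundary words.

    Hypotheses agreeing on the first `depth` boundary words share a
    node; each node splits on the next word, yielding a tree whose
    leaves hold fully specified boundaries. A search can score a node
    optimistically using only the words revealed so far and refine it
    (descend one level) on demand."""
    if all(depth >= len(h.boundary) for h in hyps):
        # Leaf: every boundary word is revealed; order by score.
        return sorted(hyps, key=lambda h: -h.score)
    buckets = defaultdict(list)
    for h in hyps:
        key = h.boundary[depth] if depth < len(h.boundary) else None
        buckets[key].append(h)
    return {word: group(members, depth + 1) for word, members in buckets.items()}

# Usage: two of three hypotheses share the boundary word "the",
# so they occupy one group until the search refines it.
hyps = [
    Hypothesis(("the", "cat"), -1.2),
    Hypothesis(("the", "dog"), -1.5),
    Hypothesis(("a", "cat"), -2.0),
]
tree = group(hyps)
print(list(tree.keys()))  # ['the', 'a']: one node per distinct first word
```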
Original language: English
Title of host publication: Human Language Technologies: Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings, June 9-14, 2013, Westin Peachtree Plaza Hotel, Atlanta, Georgia, USA
Pages: 958-968
Number of pages: 11
Publication status: Published - 2013
