Grouping language model boundarywords to speed K-best extraction from hypergraphs

Kenneth Heafield, Philipp Koehn, Alon Lavie

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract / Description of output

We propose a new algorithm to approximately extract top-scoring hypotheses from a hypergraph when the score includes an N-gram language model. In the popular cube pruning algorithm, every hypothesis is annotated with boundary words and permitted to recombine only if all boundary words are equal. However, many hypotheses share some, but not all, boundary words. We use these common boundary words to group hypotheses and do so recursively, resulting in a tree of hypotheses. This tree forms the basis for our new search algorithm that iteratively refines groups of boundary words on demand. Machine translation experiments show our algorithm makes translation 1.50 to 3.51 times as fast as with cube pruning in common cases.

Original languageEnglish
Title of host publicationProceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Subtitle of host publicationHuman Language Technologies, Proceedings of the Main Conference
PublisherAssociation for Computational Linguistics (ACL)
Pages958-968
Number of pages11
ISBN (Electronic)9781937284473
Publication statusPublished - 9 Jun 2013
Event2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013 - Atlanta, United States
Duration: 9 Jun 201314 Jun 2013

Conference

Conference2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013
Country/TerritoryUnited States
CityAtlanta
Period9/06/1314/06/13

Fingerprint

Dive into the research topics of 'Grouping language model boundarywords to speed K-best extraction from hypergraphs'. Together they form a unique fingerprint.

Cite this