Tree Structure Compression with RePair

Markus Lohrey, Sebastian Maneth, Roy Mennicke

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Larsson and Moffat's RePair algorithm is generalized from strings to trees. The new algorithm (TreeRePair) produces straight-line linear context-free tree (SLT) grammars which are smaller than those produced by previous grammar-based compressors such as BPLEX. Experiments show that a Huffman-based coding of the resulting grammars gives compression ratios comparable to the best known XML file compressors. Moreover, SLT grammars can be used as efficient memory representation of trees. Our investigations show that tree traversals over TreeRePair grammars are 14 times slower than over pointer structures and 5 times slower than over succinct trees, while memory consumption is only 1/43 and 1/6, respectively.
Original languageEnglish
Title of host publication2011 Data Compression Conference (DCC 2011), 29-31 March 2011, Snowbird, UT, USA
PublisherInstitute of Electrical and Electronics Engineers
Pages353-362
Number of pages10
ISBN (Print)978-1-61284-279-0
DOIs
Publication statusPublished - 2011

Fingerprint

Dive into the research topics of 'Tree Structure Compression with RePair'. Together they form a unique fingerprint.

Cite this