Path Queries on Compressed XML

Peter Buneman, Martin Grohe, Christoph Koch

Research output: Chapter in Book/Report/Conference proceedingConference contribution


Central to any XML query language is a path language such as XPath which operates on the tree structure of the XML document. We demonstrate in this paper that the tree structure can be effectively compressed and manipulated using techniques derived from symbolic model checking. Specifically, we show first that succinct representations of document tree structures based on sharing subtrees are highly effective. Second, we show that compressed structures can be queried directly and efficiently through a process of manipulating selections of nodes and partial decompression. We study both the theoretical and experimental properties of this technique and provide algorithms for querying our compressed instances using node-selecting path query languages such as XPath. We believe the ability to store and manipulate large portions of the structure of very large XML documents in main memory is crucial to the development of efficient, scalable native XML databases and query engines.
Original languageEnglish
Title of host publicationProceedings of the 29th Conference on Very Large Data Bases
Number of pages12
Publication statusPublished - 2003


Dive into the research topics of 'Path Queries on Compressed XML'. Together they form a unique fingerprint.

Cite this