Efficient Memory Representation of XML Documents

Giorgio Busatto, Markus Lohrey, Sebastian Maneth

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Implementations that load XML documents and give access to them via, e.g., the DOM, suffer from huge memory demands: the space needed to load an XML document is usually many times larger than the size of the document itself. A considerable amount of this memory is needed to store the tree structure of the XML document. Here a technique is presented that allows the tree structure of an XML document to be represented efficiently. The representation exploits the high regularity in XML documents by “compressing” their tree structure, that is, by detecting and removing repetitions of tree patterns. The functionality of basic tree operations, like traversal along edges, is preserved in the compressed representation. This makes it possible to execute queries (and in particular, bulk operations) directly, without prior decompression. For certain tasks, like validation against an XML type or checking equality of documents, the representation allows for provably more efficient algorithms than those running on conventional representations.
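As a rough illustration of what removing repeated tree structure can look like, the sketch below shares identical subtrees of the element structure, so that the tree is stored as a directed acyclic graph. This is only the simplest form of such compression and is not claimed to be the algorithm of the paper, which targets repetitions of tree patterns more generally. The function names `compress` and `tree_size` and the sample document are assumptions made for this example; only element tags are considered, and text and attributes are ignored.

```python
import xml.etree.ElementTree as ET


def compress(xml_text):
    """Share identical subtrees of the element structure (tags only).

    Returns (root_id, nodes), where nodes[i] = (tag, tuple of child ids).
    Hypothetical helper for illustration, not taken from the paper.
    """
    table = {}  # (tag, child ids) -> node id
    nodes = []  # node id -> (tag, child ids)

    def visit(elem):
        # Children are processed first, so equal subtrees get equal ids.
        key = (elem.tag, tuple(visit(child) for child in elem))
        if key not in table:
            table[key] = len(nodes)
            nodes.append(key)
        return table[key]

    root_id = visit(ET.fromstring(xml_text))
    return root_id, nodes


def tree_size(root_id, nodes):
    """Count the nodes of the uncompressed tree without expanding it."""
    sizes = []
    for _tag, children in nodes:  # children always have smaller ids
        sizes.append(1 + sum(sizes[c] for c in children))
    return sizes[root_id]


# A highly regular document: the same <book> structure repeated 1000 times.
doc = "<lib>" + "<book><title/><author/></book>" * 1000 + "</lib>"
root_id, nodes = compress(doc)
print(len(nodes), "shared nodes represent", tree_size(root_id, nodes), "tree nodes")
```

Traversal along edges still works on the shared structure, since the children of node `i` are the ids in `nodes[i][1]`. `tree_size` hints at why some computations can run on the compressed form directly: it visits each shared node once instead of each tree node once.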
Original language: English
Title of host publication: Database Programming Languages
Subtitle of host publication: 10th International Workshop, DBPL 2005, Trondheim, Norway, August 28-29, 2005, Revised Selected Papers
Editors: Gavin Bierman, Christoph Koch
Place of Publication: Berlin, Heidelberg
Publisher: Springer Berlin Heidelberg
Pages: 199-216
Number of pages: 18
ISBN (Electronic): 978-3-540-31445-5
ISBN (Print): 978-3-540-30951-2
Publication status: Published - 2005

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Berlin Heidelberg
Volume: 3774
ISSN (Print): 0302-9743
