XArch: Archiving Scientific and Reference Data

Heiko Müller, Peter Buneman, Ioannis Koltsidas

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


Database archiving is important for the retrieval of old versions of a database and for temporal queries over the history of data. We demonstrate XArch, a management system for maintaining, populating, and querying archives of hierarchical data. XArch is based on a nested merge approach that efficiently stores multiple versions of hierarchical data in a compact archive. By merging elements into one data structure, any specific version is retrievable from the archive in a single pass over the data and efficient tracking of object history is possible. XArch implements this approach and extends it in two important ways. First, in order to merge large hierarchical data sets, elements need to be sorted according to their key values. We developed an efficient algorithm for sorting hierarchical data in secondary storage and modified the nested merge algorithm accordingly. Second, we designed and implemented a declarative query language that enables one both to view data from particular versions and to track the history of objects. We demonstrate this using both molecular biology and demographic reference data as examples.
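The nested merge idea described in the abstract can be illustrated with a small in-memory sketch. This is not XArch's implementation: the `Node` class, the functions `merge`, `retrieve`, and `history`, and the toy data are all hypothetical names chosen for illustration. The sketch assumes each version is a nested dictionary whose dictionary keys act as the element keys, and it ignores the secondary-storage sorting and the query language mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One archived element, identified among its siblings by `key` (hypothetical structure)."""
    key: str
    versions: set = field(default_factory=set)    # versions in which this element exists
    values: dict = field(default_factory=dict)    # leaf value -> set of versions holding it
    children: dict = field(default_factory=dict)  # child key -> Node


def merge(archive, doc, version):
    """Merge one version of a keyed document into the archive, matching elements by key."""
    archive.versions.add(version)
    if isinstance(doc, dict):
        for key in sorted(doc):  # visit children in key order
            child = archive.children.setdefault(key, Node(key))
            merge(child, doc[key], version)
    else:
        # An unchanged value is stored once; only its version set grows.
        archive.values.setdefault(doc, set()).add(version)


def retrieve(archive, version):
    """Rebuild a single version in one traversal of the archive."""
    for value, versions in archive.values.items():
        if version in versions:
            return value
    return {
        key: retrieve(child, version)
        for key, child in archive.children.items()
        if version in child.versions
    }


def history(archive, path):
    """Return {version: value} for the leaf element reached by following `path`."""
    node = archive
    for key in path:
        node = node.children[key]
    result = {}
    for value, versions in node.values.items():
        for v in versions:
            result[v] = value
    return dict(sorted(result.items()))


# Toy data, invented for this sketch only.
v1 = {"country-a": {"capital": "Alpha", "population": 100}}
v2 = {"country-a": {"capital": "Alpha", "population": 120},
      "country-b": {"capital": "Beta"}}

root = Node("root")
merge(root, v1, 1)
merge(root, v2, 2)

snapshot_1 = retrieve(root, 1)
population_history = history(root, ["country-a", "population"])
```

Here `retrieve(root, 1)` reconstructs the first version without consulting any other copy of the data, and `history` reads an element's changes directly from its version annotations. The unchanged value `"Alpha"` is stored once with the version set `{1, 2}`, which is where the compactness of a merged archive comes from in this simplified setting.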
Original language: English
Title of host publication: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data
Place of Publication: New York, NY, USA
Number of pages: 4
ISBN (Print): 978-1-60558-102-6
Publication status: Published - 2008

Publication series

Name: SIGMOD '08


Keywords

  • archiving
  • hierarchical data
  • temporal queries
