Edinburgh Research Explorer

Partial Evaluation for Distributed XPath Query Processing and Beyond

Research output: Contribution to journalArticle

  • Gao Cong
  • Wenfei Fan
  • Anastasios Kementsietsidis
  • Jianzhong Li
  • Xianmin Liu

Related Edinburgh Organisations

Documents

http://doi.acm.org/10.1145/2389241.2389251
Original languageEnglish
Pages (from-to)32-75
Number of pages43
JournalACM Transactions on Database Systems
Volume37
Issue number4
DOIs
Publication statusPublished - 2012

Abstract

This article proposes algorithms for evaluating XPath queries over an XML tree that is partitioned horizontally and vertically, and is distributed across a number of sites. The key idea is based on partial evaluation: it is to send the whole query to each site that partially evaluates the query, in parallel, and sends the results as compact (Boolean) functions to a coordinator that combines these to obtain the result. This approach possesses the following performance guarantees. First, each site is visited at most twice for data-selecting XPath queries, and only once for Boolean XPath queries. Second, the network traffic is determined by the answer to the query, rather than the size of the tree. Third, the total computation is comparable to that of centralized algorithms on the tree stored in a single site, regardless of how the tree is fragmented and distributed. We also present a MapReduce algorithm for evaluating Boolean XPath queries, based on partial evaluation. In addition, we provide algorithms to evaluate XPath queries on very large XML trees, in a centralized setting. We show both analytically and empirically that our techniques are scalable with large trees and complex XPath queries. These results, we believe, illustrate the usefulness and potential of partial evaluation in distributed systems as well as centralized XML stores for evaluating XPath queries and beyond.

Download statistics

No data available

ID: 17627310