Partial Evaluation for Distributed XPath Query Processing and Beyond

Gao Cong, Wenfei Fan, Anastasios Kementsietsidis, Jianzhong Li, Xianmin Liu

Research output: Contribution to journalArticlepeer-review

Abstract / Description of output

This article proposes algorithms for evaluating XPath queries over an XML tree that is partitioned horizontally and vertically, and is distributed across a number of sites. The key idea is based on partial evaluation: it is to send the whole query to each site that partially evaluates the query, in parallel, and sends the results as compact (Boolean) functions to a coordinator that combines these to obtain the result. This approach possesses the following performance guarantees. First, each site is visited at most twice for data-selecting XPath queries, and only once for Boolean XPath queries. Second, the network traffic is determined by the answer to the query, rather than the size of the tree. Third, the total computation is comparable to that of centralized algorithms on the tree stored in a single site, regardless of how the tree is fragmented and distributed. We also present a MapReduce algorithm for evaluating Boolean XPath queries, based on partial evaluation. In addition, we provide algorithms to evaluate XPath queries on very large XML trees, in a centralized setting. We show both analytically and empirically that our techniques are scalable with large trees and complex XPath queries. These results, we believe, illustrate the usefulness and potential of partial evaluation in distributed systems as well as centralized XML stores for evaluating XPath queries and beyond.
Original languageEnglish
Pages (from-to)32-75
Number of pages43
JournalACM Transactions on Database Systems
Issue number4
Publication statusPublished - 2012


Dive into the research topics of 'Partial Evaluation for Distributed XPath Query Processing and Beyond'. Together they form a unique fingerprint.

Cite this