Edinburgh Research Explorer

Distributed query evaluation with performance guarantees

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Related Edinburgh Organisations

Documents

http://doi.acm.org/10.1145/1247480.1247537
Original languageEnglish
Title of host publicationProceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12-14, 2007
PublisherACM
Pages509-520
Number of pages12
ISBN (Print)978-1-59593-686-8
DOIs
Publication statusPublished - 11 Jun 2007

Abstract

Partial evaluation has recently proven an effective technique for evaluating Boolean XPath queries over a fragmented tree that is distributed over a number of sites. What left open is whether or not the technique is applicable to generic data-selecting XPath queries. In contrast to Boolean queries that return a single truth value, a generic XPath query returns a set of elements, and its evaluation introduces difficulties to avoiding excessive data shipping. This paper settles this question in positive by providing evaluation algorithms and optimizations for generic XPath queries in the same distributed and fragmented setting. These algorithms explore parallelism and retain the performance guarantees of their counterpart for Boolean queries, regardless of how the tree is fragmented and distributed. First, each site is visited at most three times, and down to at most twice when optimizations are in place. Second, the network traffic is determined by the final answer of the query, rather than the size of the tree, without incurring unnecessary data shipping. Third, the total computation is comparable to that of centralized algorithms on the tree stored in a single site. We show both analytically and experimentally that our algorithms and optimizations are scalable and efficient on large trees and complex XPath queries.

Download statistics

No data available

ID: 17664454