Projects per year
A query Q is said to be effectively bounded if for all datasets D, there exists a subset DQ of D such that Q(D) = Q(DQ), and the size of DQ and time for fetching DQ are independent of the size of D. The need for studying such queries is evident, since it allows us to compute Q(D) by accessing a bounded dataset DQ, regardless of how big D is. This paper investigates effectively bounded conjunctive queries (SPC) under an access schema A, which specifies indices and cardinality constraints commonly used. We provide characterizations (sufficient and necessary conditions) for determining whether an SPC query Q is effectively bounded under A. We study several problems for deciding whether Q is bounded, and if not, for identifying a minimum set of parameters of Q to instantiate and make Q bounded. We show that these problems range from quadratic-time to NP-complete, and develop efficient (heuristic) algorithms for them. We also provide an algorithm that, given an effectively bounded SPC query Q and an access schema A, generates a query plan for evaluating Q by accessing a bounded amount of data in any (possibly big) dataset. We experimentally verify that our algorithms substantially reduce the cost of query evaluation.