TY - JOUR
T1 - On the Data Complexity of Relative Information Completeness
AU - Cao, Yang
AU - Deng, Ting
AU - Fan, Wenfei
AU - Geerts, Floris
PY - 2014/9
Y1 - 2014/9
N2 - Databases in an enterprise are often partially closed: parts of their data must be contained in master data, which has complete information about the core business entities of the enterprise. With this comes the need for studying relative information completeness: a partially closed database is said to be complete for a query relative to master data if it has complete information to answer the query, i.e., extending the database by adding more tuples either does not change its answer to the query or makes it no longer partially closed w.r.t. the master data. This paper investigates three problems associated with relative information completeness. Given a query Q and a partially closed database D master data D-m, (1) the relative completeness problem is to decide whether D is complete for Q relative to D-m; (2) the minimal completeness problem is to determine whether D is a minimal database that is complete for Q relative to D-m; and (3) the bounded extension problem is to decide whether it suffices to extend D by adding at most K tuples, such that the extension makes a partially closed database that is complete for Q relative to D-m. While the combined complexity bounds of the relative completeness problem and the minimal completeness problem are already known, neither their data complexity nor the bounded extension problem has been studied. We establish upper and lower bounds of these problems for data complexity, all matching, for Q expressed in a variety of query languages. (C) 2014 Elsevier Ltd. All rights reserved.
AB - Databases in an enterprise are often partially closed: parts of their data must be contained in master data, which has complete information about the core business entities of the enterprise. With this comes the need for studying relative information completeness: a partially closed database is said to be complete for a query relative to master data if it has complete information to answer the query, i.e., extending the database by adding more tuples either does not change its answer to the query or makes it no longer partially closed w.r.t. the master data. This paper investigates three problems associated with relative information completeness. Given a query Q and a partially closed database D master data D-m, (1) the relative completeness problem is to decide whether D is complete for Q relative to D-m; (2) the minimal completeness problem is to determine whether D is a minimal database that is complete for Q relative to D-m; and (3) the bounded extension problem is to decide whether it suffices to extend D by adding at most K tuples, such that the extension makes a partially closed database that is complete for Q relative to D-m. While the combined complexity bounds of the relative completeness problem and the minimal completeness problem are already known, neither their data complexity nor the bounded extension problem has been studied. We establish upper and lower bounds of these problems for data complexity, all matching, for Q expressed in a variety of query languages. (C) 2014 Elsevier Ltd. All rights reserved.
KW - Incomplete information
KW - Relative completeness
KW - Master data management
KW - Partially closed databases
KW - Data complexity
U2 - 10.1016/j.is.2014.04.001
DO - 10.1016/j.is.2014.04.001
M3 - Article
VL - 45
SP - 18
EP - 34
JO - Information Systems
JF - Information Systems
SN - 0306-4379
ER -