Capturing Missing Tuples and Missing Values

Ting Deng, Wenfei Fan, Floris Geerts

Research output: Contribution to journalArticlepeer-review

Abstract

Databases in real life are often neither entirely closed-world nor entirely open-world. Indeed, databases in an enterprise are typically partially closed, in which a part of the data is constrained by master data that contains complete information about the enterprise in certain aspects. It has been shown that despite missing tuples, such a database may turn out to have complete information for answering a query. This paper studies partially closed databases from which both tuples and attribute values may be missing. We specify such a database in terms of conditional tables constrained by master data, referred to as c-instances. We first propose three models to characterize whether a c-instance T is complete for a query Q relative to master data. That is, depending on how missing values in T are instantiated, the answer to Q in T remains unchanged when new tuples are added. We then investigate three problems, to determine (a) whether a given c-instance is complete for a query Q, (b) whether there exists a c-instance that is complete for Q relative to master data available, and (c) whether a c-instance is a minimal-size database that is complete for Q. We establish matching lower and upper bounds on these problems for queries expressed in a variety of languages, in each of the three models for specifying relative completeness.
Original languageEnglish
Article number10
Number of pages47
JournalACM Transactions on Database Systems
Volume41
Issue number2
DOIs
Publication statusPublished - May 2016

Fingerprint

Dive into the research topics of 'Capturing Missing Tuples and Missing Values'. Together they form a unique fingerprint.

Cite this