Edinburgh Research Explorer

Improving Data Quality: Consistency and Accuracy

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Original languageEnglish
Title of host publicationProceedings of the 33rd International Conference on Very Large Data Bases, University of Vienna, Austria, September 23-27, 2007
Number of pages12
Publication statusPublished - 2007


This paper revisits the analysis of annotation propagation from source databases to views defined in terms of conjunctive (SPJ) queries. Given a source database D, an SPJ query Q, the view Q(D) and a tuple ΔV in the view, the view (resp. source) side-effect problem is to find a minimal set ΔD of tuples such that the deletion of ΔD from D results in the deletion of ΔV from Q(D) while minimizing the side effects on the view (resp. the source). A third problem, referred to as the annotation placement problem, is to find a single base tuple ΔD such that annotation in a field of ΔD propagates to ΔV while minimizing the propagation to other fields in the view Q(D). These are important for data provenance and the management of view updates. However important, these problems are unfortunately NP-hard for most subclasses of SPJ views [5].To make the annotation propagation analysis feasible in practice, we propose a key preserving condition on SPJ views, which requires that the projection fields of an SPJ view Q retain a key of each base relation involved in Q. While this condition is less restrictive than other proposals [11, 14], it often simplifies the annotation propagation analysis. Indeed, for key-preserving SPJ views the annotation placement problem coincides with the view side-effect problem, and the view and source side-effect problems become tractable. In addition we generalize the setting of [5] by allowing ΔV to be a group of tuples to be deleted, and investigate the insertion of tuples to the view. We show that group updates make the analysis harder: these problems become NP-hard for several subclasses of SPJ views. We also show that for SPJ views the source and view side-effect problems are NP-hard for single-tuple insertion, but are tractable for some subclasses of SPJ for group insertions, in the presence or in the absence of the key preservation condition.

Download statistics

No data available

ID: 17664509