Edinburgh Research Explorer

Determining the relative accuracy of attributes

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 22-27, 2013
Publisher: ACM
Pages: 565-576
Number of pages: 12
Publication status: Published - 2013

Abstract

The relative accuracy problem is to determine, given tuples t1 and t2 that refer to the same entity e, whether t1[A] is more accurate than t2[A], i.e., whether t1[A] is closer to the true value of the A attribute of e than t2[A]. This has been a longstanding issue for data quality, and is challenging when the true values of e are unknown. This paper proposes a model for determining relative accuracy. (1) We introduce a class of accuracy rules and an inference system with a chase procedure, to deduce relative accuracy. (2) We identify and study several fundamental problems for relative accuracy. Given a set Ie of tuples pertaining to the same entity e and a set of accuracy rules, these problems are to decide whether the chase process terminates, is Church-Rosser, and leads to a unique target tuple te composed of the most accurate values from Ie for all the attributes of e. (3) We propose a framework for inferring accurate values with user interaction. (4) We provide algorithms underlying the framework, to find the unique target tuple te whenever possible; when there is not enough information to decide a complete te, we compute top-k candidate targets based on a preference model. (5) Using real-life and synthetic data, we experimentally verify the effectiveness and efficiency of our method.
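
To make the idea of chasing accuracy rules to a target tuple concrete, the following is a minimal illustrative sketch in Python. It is not the paper's rule language, inference system, or algorithm. Everything in it is an assumption made for illustration: the function names `chase` and `target_tuple`, the encoding of a rule as a function, the two example rules, and the sample records. A rule here inspects an ordered pair of tuples (plus the attributes on which the first is already known to be at most as accurate as the second) and may derive one more such attribute. Conflict detection and the Church-Rosser check are omitted.

```python
from itertools import permutations

def chase(tuples, rules):
    """Apply rules until nothing new is derived.

    Returns {attr: set of (i, j)}, where (i, j) means tuples[i][attr]
    is at most as accurate as tuples[j][attr].
    """
    attrs = list(tuples[0])
    order = {a: set() for a in attrs}
    changed = True
    while changed:  # terminates: each set only grows and is bounded
        changed = False
        for i, j in permutations(range(len(tuples)), 2):
            for rule in rules:
                # Attributes already ordered for this pair (rule premises).
                known = {a for a in attrs if (i, j) in order[a]}
                attr = rule(tuples[i], tuples[j], known)
                if attr is not None and (i, j) not in order[attr]:
                    order[attr].add((i, j))
                    changed = True
        # Close each attribute's order under transitivity.
        for a in attrs:
            for i, j in list(order[a]):
                for k, l in list(order[a]):
                    if j == k and i != l and (i, l) not in order[a]:
                        order[a].add((i, l))
                        changed = True
    return order

def target_tuple(tuples, order):
    """Pick, per attribute, the value that dominates all others, else None."""
    n = len(tuples)
    target = {}
    for a in tuples[0]:
        best = {
            tuples[j][a]
            for j in range(n)
            if all(i == j or (i, j) in order[a] or tuples[i][a] == tuples[j][a]
                   for i in range(n))
        }
        target[a] = best.pop() if len(best) == 1 else None
    return target

# Hypothetical rules: a later "updated" value is more accurate, and the
# address is at least as accurate wherever "updated" is.
def newer_timestamp(t1, t2, known):
    return "updated" if t1["updated"] < t2["updated"] else None

def address_follows(t1, t2, known):
    return "address" if "updated" in known else None

# Hypothetical records referring to the same entity.
records = [
    {"updated": 2009, "address": "Old St", "phone": "111"},
    {"updated": 2012, "address": "New Rd", "phone": "222"},
]
result = target_tuple(records, chase(records, [newer_timestamp, address_follows]))
```

In this sketch the `phone` attribute stays `None` because no rule orders the two values; this corresponds to the case where there is not enough information to decide a complete target tuple, which the abstract says is handled by computing top-k candidate targets.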

ID: 17626700