Determining the relative accuracy of attributes

Yang Cao, Wenfei Fan, Wenyuan Yu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The relative accuracy problem is to determine, given tuples t1 and t2 that refer to the same entity e, whether t1[A] is more accurate than t2[A], i.e., whether t1[A] is closer to the true value of the A attribute of e than t2[A]. This has been a longstanding issue for data quality, and is challenging when the true values of e are unknown. This paper proposes a model for determining relative accuracy. (1) We introduce a class of accuracy rules and an inference system with a chase procedure, to deduce relative accuracy. (2) We identify and study several fundamental problems for relative accuracy. Given a set Ie of tuples pertaining to the same entity e and a set of accuracy rules, these problems are to decide whether the chase process terminates, is Church-Rosser, and leads to a unique target tuple te composed of the most accurate values from Ie for all the attributes of e. (3) We propose a framework for inferring accurate values with user interaction. (4) We provide algorithms underlying the framework, to find the unique target tuple te whenever possible; when there is not enough information to decide a complete te, we compute top-k candidate targets based on a preference model. (5) Using real-life and synthetic data, we experimentally verify the effectiveness and efficiency of our method.
Original language: English
Title of host publication: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 22-27, 2013
Publisher: ACM
Pages: 565-576
Number of pages: 12
Publication status: Published - 2013
