Quaid - a platform for improving data quality

Project Details

Key findings

This project is a follow-on to project EP/E029213/1
on data quality. It aims to develop a practical system for
improving data quality. The main outcome is SemanDaq, a working
data cleaning system based on conditional functional dependencies
and matching dependencies. It supports the following:
(1) Data quality rule discovery: automatically discovering
conditional dependencies as data quality rules from (possibly
dirty) data.
(2) Rule validation: automatically validating the rules discovered.
(3) Error detection: detecting errors and inconsistencies in the data.
(4) Data repairing: fixing the errors detected, with performance
guarantees on the quality of repairs.
(5) Entity resolution: identifying tuples from unreliable data sources
that refer to the same real-world entity, based on the semantics of
the data.
The system was demonstrated at VLDB 2008, and was well received.
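
To make item (3) concrete, the following is a minimal sketch of how violations of a conditional functional dependency could be detected. It is not SemanDaq's implementation: the function name cfd_violations, the "_" wildcard convention, and the sample rows are all assumptions made up for this illustration. A rule is read as "for tuples whose left-hand attributes match the pattern, the left-hand values determine the right-hand attribute", optionally with a required constant on the right-hand side.

```python
from collections import defaultdict

WILDCARD = "_"  # assumed convention: "_" in a pattern matches any value


def matches(value, pattern):
    """True if a cell value satisfies one pattern entry."""
    return pattern == WILDCARD or value == pattern


def cfd_violations(rows, lhs, rhs, lhs_pattern, rhs_pattern):
    """Return sorted indices of rows violating the rule (lhs -> rhs, pattern).

    Hypothetical helper for illustration only.
    """
    violations = set()
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        # The rule applies only to rows matching the left-hand pattern.
        if not all(matches(row[a], p) for a, p in zip(lhs, lhs_pattern)):
            continue
        # Single-tuple violation: a required right-hand constant is not met.
        if rhs_pattern != WILDCARD and row[rhs] != rhs_pattern:
            violations.add(i)
        groups[tuple(row[a] for a in lhs)].append(i)
    # Multi-tuple violation: equal left-hand values, differing right-hand values.
    for indices in groups.values():
        if len({rows[i][rhs] for i in indices}) > 1:
            violations.update(indices)
    return sorted(violations)


# Made-up sample data, not taken from the project.
rows = [
    {"country": "UK", "zip": "EH8 9AB", "street": "Crichton St", "city": "Edinburgh"},
    {"country": "UK", "zip": "EH8 9AB", "street": "Mayfield Rd", "city": "Edinburgh"},
    {"country": "UK", "zip": "W1B 1JA", "street": "Regent St", "city": "Edinburgh"},
    {"country": "US", "zip": "07974", "street": "Mountain Ave", "city": "Murray Hill"},
    {"country": "US", "zip": "07974", "street": "Main St", "city": "Murray Hill"},
]

# Variable rule: for UK tuples only, zip determines street.
variable_violations = cfd_violations(
    rows, ["country", "zip"], "street", ["UK", WILDCARD], WILDCARD
)

# Constant rule: UK tuples with zip "W1B 1JA" must have city "London".
constant_violations = cfd_violations(
    rows, ["country", "zip"], "city", ["UK", "W1B 1JA"], "London"
)

print(variable_violations)  # [0, 1]
print(constant_violations)  # [2]
```

The two US rows share a zip and differ on street, yet they are not reported, because the pattern restricts the first rule to UK tuples. That conditioning is what distinguishes a conditional functional dependency from an ordinary functional dependency.
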
Status: Finished
Effective start/end date: 1/10/09 to 30/09/10

Funding

  • EPSRC: £127,730.00
