ANMAT: Automatic Knowledge Discovery and Error Detection through Pattern Functional Dependencies

Abdulhakim Qahtan, Nan Tang, Mourad Ouzzani, Yang Cao, Michael Stonebraker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Knowledge discovery is critical to successful data analytics. We propose a new type of meta-knowledge, namely pattern functional dependencies (PFDs), which combine patterns (regex-like rules) with integrity constraints (ICs) to model dependencies between partial values (patterns) across different attributes in a table. PFDs go beyond classical functional dependencies and their extensions. For instance, in an employee table with the ID “F-9-107”, the “F” determines the financial department and the “9” determines the employee’s grade. A key application of PFDs is identifying erroneous data, that is, tuples that violate some PFD. In this demonstration, attendees will experience two features: PFD discovery, which automatically discovers PFDs from (dirty) data in different domains; and error detection with PFDs, where we show errors that PFDs detect but existing approaches cannot capture.
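The employee-ID example in the abstract can be illustrated with a small sketch. The code below is not ANMAT's algorithm; it assumes a hypothetical ID layout (department letter, grade digit, serial number), made-up rows, and a simple majority vote to decide which department a given ID prefix "should" map to. The names `rows`, `ID_PATTERN`, and `find_violations` are illustrative only.

```python
import re
from collections import defaultdict

# Hypothetical employee rows; the last one is deliberately inconsistent.
rows = [
    {"id": "F-9-107", "department": "Finance"},
    {"id": "F-7-221", "department": "Finance"},
    {"id": "H-9-015", "department": "Human Resources"},
    {"id": "F-8-330", "department": "Marketing"},
]

# Assumed ID layout: <department letter>-<grade digit>-<serial>.
ID_PATTERN = re.compile(r"^([A-Z])-(\d)-(\d+)$")


def find_violations(rows, pattern, group, target):
    """Flag rows whose `target` value disagrees with the majority value
    observed for the same captured fragment of the ID."""
    by_key = defaultdict(list)
    for row in rows:
        match = pattern.match(row["id"])
        if match:
            by_key[match.group(group)].append(row)

    violations = []
    for members in by_key.values():
        counts = defaultdict(int)
        for row in members:
            counts[row[target]] += 1
        # Assumption: the most frequent value is treated as the correct one.
        majority = max(counts, key=counts.get)
        violations.extend(r for r in members if r[target] != majority)
    return violations


# Check the dependency "first letter of id -> department".
violations = find_violations(rows, ID_PATTERN, group=1, target="department")
print([r["id"] for r in violations])
```

A classical functional dependency over the whole `id` value would not flag the inconsistent row, since every ID here is unique; the dependency only becomes visible once the ID is split into its pattern fragments.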
Original language: English
Title of host publication: Proceedings of the 2019 International Conference on Management of Data
Place of Publication: New York
Publisher: ACM
Pages: 1977-1980
Number of pages: 4
ISBN (Print): 978-1-4503-5643-5
Publication status: Published - 25 Jun 2019
Event: ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD 2019) - Amsterdam, Netherlands
Duration: 30 Jun 2019 to 5 Jul 2019
http://sigmod2019.org/

Conference

Conference: ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD 2019)
Abbreviated title: SIGMOD 2019
Country/Territory: Netherlands
City: Amsterdam
Period: 30/06/19 to 5/07/19

Keywords / Materials (for Non-textual outputs)

  • Data Cleaning
  • Pattern Functional Dependencies
  • Constrained Patterns
  • Error Detection
  • Knowledge Discovery
