ptype: probabilistic type inference

Taha Ceritli, Christopher K. I. Williams, James Geddes

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

Type inference refers to the task of inferring the data type of a given column of data. Current approaches often fail when the data contain missing entries and anomalies, which are common in real-world data sets. In this paper, we propose ptype, a robust probabilistic type inference method that allows us to detect such entries and infer data types. We further show that the proposed method outperforms existing methods.
Original language: English
Pages (from-to): 870-904
Number of pages: 35
Journal: Data Mining and Knowledge Discovery
Volume: 34
Issue number: 3
Early online date: 16 Mar 2020
DOIs
Publication status: Published - 31 May 2020

Keywords / Materials (for Non-textual outputs)

  • Type inference
  • Robustness
  • Probabilistic finite-state machine
