Robust Variational Autoencoders for Outlier Detection in Mixed-Type Data

Simão Eduardo, Alfredo Nazábal, Christopher K I Williams, Charles Sutton

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We focus on the problem of unsupervised cell outlier detection (OD) and repair in mixed-type tabular data. Traditional methods are concerned only with detecting which rows in the dataset are outliers. However, identifying which cells corrupt a specific row is an important problem in practice, and the very first step towards repairing them. We introduce the Robust Variational Autoencoder (RVAE), a deep generative model that learns the joint distribution of the clean data while identifying the outlier cells, allowing their imputation (repair). RVAE explicitly learns the probability of each cell being an outlier, balancing different likelihood models in the row outlier score, making the method suitable for OD in mixed-type datasets. We show experimentally that RVAE not only performs better than several state-of-the-art methods in cell OD and repair for tabular data, but is also robust against the initial hyper-parameter selection.
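As a rough illustration of how a per-cell outlier probability can put different likelihood models on a common scale, the sketch below assumes a two-component mixture per cell: a "clean" component and an "outlier" component, combined with a prior weight. The function names, the prior weight `alpha`, and the input log-likelihoods are hypothetical and are not taken from the abstract; in a model such as RVAE the clean log-likelihoods would come from a decoder, which is omitted here. This is a minimal sketch under those assumptions, not the authors' implementation.

```python
import numpy as np


def cell_clean_posterior(loglik_clean, loglik_outlier, alpha=0.95):
    """Posterior probability that each cell is clean.

    Assumes a two-component mixture per cell with prior weight `alpha`
    on the clean component (an illustrative choice, not from the source).
    Inputs are arrays of per-cell log-likelihoods, shape (rows, features).
    """
    a = np.log(alpha) + np.asarray(loglik_clean, dtype=float)
    b = np.log1p(-alpha) + np.asarray(loglik_outlier, dtype=float)
    # exp(a) / (exp(a) + exp(b)), computed stably in log space
    return np.exp(a - np.logaddexp(a, b))


def outlier_scores(pi, eps=1e-12):
    """Cell scores are -log(pi); a row score sums its cell scores.

    Because every pi lies in [0, 1], real-valued and categorical
    features contribute to the row score on the same scale.
    """
    cell = -np.log(np.clip(pi, eps, 1.0))
    return cell, cell.sum(axis=1)


# Hypothetical log-likelihoods for 2 rows x 2 features.
loglik_clean = np.array([[-1.0, -0.5], [-9.0, -0.5]])
loglik_outlier = np.array([[-3.0, -1.6], [-3.0, -1.6]])

pi = cell_clean_posterior(loglik_clean, loglik_outlier)
cell_score, row_score = outlier_scores(pi)
```

In this toy input, the first cell of the second row is far less likely under the clean component than under the outlier component, so it receives the largest cell score and raises that row's score.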
Original language: English
Title of host publication: Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
Publisher: PMLR
Pages: 4056-4066
Number of pages: 10
Publication status: Published - 28 Aug 2020
Event: 23rd International Conference on Artificial Intelligence and Statistics - Teatro Politeama, Online, Italy
Duration: 26 Aug 2020 → 28 Aug 2020
Conference number: 23
https://www.aistats.org/

Publication series

Name: Proceedings of Machine Learning Research
Publisher: PMLR
Volume: 108
ISSN (Electronic): 2640-3498

Conference

Conference: 23rd International Conference on Artificial Intelligence and Statistics
Abbreviated title: AISTATS 2020
Country/Territory: Italy
City: Online
Period: 26/08/20 → 28/08/20
Internet address
