Clustering with a Reject Option: Interactive Clustering as Bayesian Prior Elicitation

Akash Srivastava, James Zou, Charles Sutton

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

A good clustering can help a data analyst to explore and understand a data set, but what constitutes a good clustering may depend on domain-specific and application-specific criteria. These criteria can be difficult to formalize, even when it is easy for an analyst to know a good clustering when she sees one. We present a new approach to interactive clustering for data exploration, called \ciif, based on a particularly simple feedback mechanism, in which an analyst can choose to reject individual clusters and request new ones. The new clusters should be different from previously rejected clusters while still fitting the data well. We formalize this interaction in a novel Bayesian prior elicitation framework. In each iteration, the prior is adapted to account for all the previous feedback, and a new clustering is then produced from the posterior distribution. To achieve the computational efficiency necessary for an interactive setting, we propose an incremental optimization method over data minibatches using Lagrangian relaxation. Experiments demonstrate that \ciif can produce accurate and diverse clusterings.
Original language: English
Title of host publication: Proceedings of the 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016)
Pages: 16-20
Number of pages: 5
Publication status: Published - 27 Jul 2016
Event: 33rd International Conference on Machine Learning: ICML 2016 - New York, United States
Duration: 19 Jun 2016 - 24 Jun 2016
https://icml.cc/Conferences/2016/

Conference

Conference: 33rd International Conference on Machine Learning
Abbreviated title: ICML 2016
Country/Territory: United States
City: New York
Period: 19/06/16 - 24/06/16
