Abstract
Mining itemsets that are the most interesting under a statistical model of the underlying data is a commonly used and well-studied technique for exploratory data analysis, with the most recent interestingness models exhibiting state-of-the-art performance. Continuing this highly promising line of work, we propose the first, to the best of our knowledge, generative model over itemsets, in the form of a Bayesian network, and an associated novel measure of interestingness. Our model is able to efficiently infer interesting itemsets directly from the transaction database using structural EM, in which the E-step employs the greedy approximation to weighted set cover. Our approach is theoretically simple, straightforward to implement, and trivially parallelizable, and it retrieves itemsets whose quality is comparable to, if not better than, that of existing state-of-the-art algorithms, as we demonstrate on several real-world datasets.
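The abstract mentions that the E-step relies on the greedy approximation to weighted set cover. The following is a minimal, generic sketch of that greedy heuristic, not the authors' implementation. The function name `greedy_weighted_set_cover`, the example transaction, and the costs are all hypothetical. The abstract does not say how the paper derives itemset weights, so the costs here are made up for illustration.

```python
def greedy_weighted_set_cover(universe, candidates):
    """Greedy heuristic for weighted set cover.

    universe:   iterable of items to cover (e.g. one transaction).
    candidates: dict mapping a label to (set_of_items, positive_cost).
                Hypothetical input format chosen for this sketch.
    Returns the labels of the chosen sets, in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best_label, best_ratio = None, float("inf")
        for label, (items, cost) in candidates.items():
            gain = len(items & uncovered)  # newly covered items
            if gain == 0:
                continue
            ratio = cost / gain  # cost per newly covered item
            if ratio < best_ratio:
                best_label, best_ratio = label, ratio
        if best_label is None:
            raise ValueError("candidates cannot cover the universe")
        chosen.append(best_label)
        uncovered -= candidates[best_label][0]
    return chosen


# Illustrative data only: one transaction and candidate itemsets with
# made-up costs (lower cost = more attractive to include).
transaction = {"bread", "milk", "eggs", "jam"}
candidates = {
    "bread+milk": (frozenset({"bread", "milk"}), 1.0),
    "bread+jam": (frozenset({"bread", "jam"}), 2.5),
    "bread": (frozenset({"bread"}), 0.7),
    "milk": (frozenset({"milk"}), 0.7),
    "eggs": (frozenset({"eggs"}), 0.8),
    "jam": (frozenset({"jam"}), 0.9),
}

cover = greedy_weighted_set_cover(transaction, candidates)
print(cover)  # ['bread+milk', 'eggs', 'jam']
```

At each step the heuristic picks the candidate with the lowest cost per newly covered item, so the pair `bread+milk` (0.5 per item) is selected before any singleton. The sketch assumes every candidate is a subset of the transaction and does not enforce this.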
Original language | English |
---|---|
Title of host publication | The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD 2016) |
Place of Publication | Riva del Garda, Italy |
Publisher | Springer |
Pages | 410-425 |
Number of pages | 16 |
ISBN (Electronic) | 978-3-319-46227-1 |
ISBN (Print) | 978-3-319-46226-4 |
Publication status | Published - 4 Sept 2016 |
Event | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2016, Riva del Garda, Italy. Duration: 19 Sept 2016 → 23 Sept 2016. http://www.ecmlpkdd2016.org/ |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer, Cham |
Volume | 9852 |
ISSN (Print) | 0302-9743 |
Conference
Conference | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2016 |
---|---|
Abbreviated title | ECML-PKDD 2016 |
Country/Territory | Italy |
City | Riva del Garda |
Period | 19/09/16 → 23/09/16 |
Fingerprint
Dive into the research topics of 'A Bayesian Network Model for Interesting Itemsets'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Statistical Natural Language Processing Methods for Computer Program Source Code
  Sutton, C. (Principal Investigator)
  1/10/13 → 31/03/17
  Project: Research