Edinburgh Research Explorer

A Subsequence Interleaving Model for Sequential Pattern Mining

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open Access permissions: Open

Documents

http://dl.acm.org/citation.cfm?doid=2939672.2939787
Original language: English
Title of host publication: KDD '16 Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Place of Publication: San Francisco, United States
Publisher: ACM
Pages: 835-844
Number of pages: 10
ISBN (Electronic): 978-1-4503-4232-2
DOIs
Publication status: Published - 13 Aug 2016
Event: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - San Francisco, United States
Duration: 13 Aug 2016 to 17 Aug 2016
http://www.kdd.org/kdd2016/

Conference

Conference: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Abbreviated title: KDD2016
Country: United States
City: San Francisco
Period: 13/08/16 to 17/08/16
Abstract

Recent sequential pattern mining methods have used the minimum description length (MDL) principle to define an encoding scheme which describes an algorithm for mining the most compressing patterns in a database. We present a novel subsequence interleaving model based on a probabilistic model of the sequence database, which allows us to search for the most compressing set of patterns without designing a specific encoding scheme. Our proposed algorithm is able to efficiently mine the most relevant sequential patterns and rank them using an associated measure of interestingness. The efficient inference in our model is a direct result of our use of a structural expectation-maximization framework, in which the expectation-step takes the form of a submodular optimization problem subject to a coverage constraint. We show on both synthetic and real-world datasets that our model mines a set of sequential patterns with low spuriousness and redundancy, high interpretability and usefulness in real-world applications. Furthermore, we demonstrate that the quality of the patterns from our approach is comparable to, if not better than, existing state-of-the-art sequential pattern mining algorithms.
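
To give a feel for what optimising under a coverage constraint can look like, the following is a minimal sketch and not the paper's algorithm. It greedily explains a single sequence with gapped subsequence patterns, in the style of the textbook greedy heuristic for weighted set cover. The function names `find_embedding` and `greedy_cover`, the cost of `-log(probability)` per pattern, and the toy probabilities are all assumptions made for illustration; each probability is assumed to lie in (0, 1].

```python
import math


def find_embedding(pattern, sequence, free):
    """Return the positions of the leftmost embedding of `pattern` as a
    (possibly gapped) subsequence of `sequence`, using only positions in
    `free`. Return None if no such embedding exists."""
    positions = []
    start = 0
    for item in pattern:
        for i in range(start, len(sequence)):
            if i in free and sequence[i] == item:
                positions.append(i)
                start = i + 1
                break
        else:
            # No free position matches this item after the previous match.
            return None
    return positions


def greedy_cover(sequence, pattern_probs):
    """Greedily pick patterns until every position of `sequence` is covered.

    Each round selects the pattern with the lowest cost per newly covered
    position, where cost is the (assumed) negative log probability."""
    uncovered = set(range(len(sequence)))
    chosen = []
    while uncovered:
        best = None
        for pattern, prob in pattern_probs.items():
            embedding = find_embedding(pattern, sequence, uncovered)
            if embedding is None:
                continue
            ratio = -math.log(prob) / len(embedding)
            if best is None or ratio < best[0]:
                best = (ratio, pattern, embedding)
        if best is None:
            raise ValueError("sequence cannot be covered by the given patterns")
        _, pattern, embedding = best
        chosen.append(pattern)
        uncovered -= set(embedding)
    return chosen


# Toy input: singleton patterns are included so that a cover always exists.
toy_sequence = ("a", "b", "c", "a", "b")
toy_probs = {("a", "b"): 0.5, ("c",): 0.3, ("a",): 0.1, ("b",): 0.1}
cover = greedy_cover(toy_sequence, toy_probs)
```

On this toy input the pattern `("a", "b")` is selected twice before `("c",)` fills the remaining position, so the sequence is explained as an interleaving of three pattern occurrences. The sketch makes no claim about approximation guarantees or about how the published method scores or selects patterns.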

ID: 25565948