Efficient Utility-based Clustering over High Dimensional Partition Spaces

Silvia Liverani, Paul E. Anderson, Kieron D. Edwards, Andrew J. Millar, Jim Q. Smith

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

Because of the huge number of partitions of even a moderately sized dataset, even when Bayes factors have a closed form, in model-based clustering a comprehensive search for the highest scoring (MAP) partition is usually impossible. However, when each cluster in a partition has a signature and it is known that some signatures are of scientific interest whilst others are not, it is possible, within a Bayesian framework, to develop search algorithms which are guided by these cluster signatures. Such algorithms can be expected to find better partitions more quickly. In this paper we develop a framework within which these ideas can be formalized. We then briefly illustrate the efficacy of the proposed guided search on a microarray time course dataset where the clustering objective is to identify clusters of genes with different types of circadian expression profiles.
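To give a sense of why a comprehensive search is usually impossible, the number of partitions of n labelled items is the Bell number B(n), which grows faster than exponentially. The following is a minimal illustrative sketch, not taken from the paper; the function name `bell_number` is a hypothetical helper, and it only counts partitions rather than scoring them.

```python
def bell_number(n: int) -> int:
    """Count the partitions of a set of n labelled items (the Bell number B(n))."""
    row = [1]  # first row of the Bell triangle
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Each further entry adds the entry above-left to the entry on its left.
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    # The first entry of row n of the triangle is B(n).
    return row[0]


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, bell_number(n))
```

Even 20 items admit more than 5 × 10^13 partitions, so scoring every partition of a dataset with thousands of genes is out of reach, which is the motivation for a guided search.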

Original language: English
Pages (from-to): 539-572
Number of pages: 34
Journal: Bayesian Analysis
Volume: 4
Issue number: 3
Publication status: Published - 2009

Keywords / Materials (for Non-textual outputs)

  • TIME-SERIES
  • Transcriptomics
  • Circadian Rhythms
  • Biological Clocks
  • Arabidopsis thaliana
  • Bayesian inference
