Sequential Hierarchical Pattern Clustering

Bassam Farran*, Amirthalingam Ramanan, Mahesan Niranjan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Clustering is a widely used unsupervised data analysis technique in machine learning. However, a common requirement amongst many existing clustering methods is that all pairwise distances between patterns must be computed in advance. This makes them computationally expensive and makes it difficult for them to cope with the large-scale data used in several applications, such as bioinformatics. In this paper we propose a novel sequential hierarchical clustering technique that initially builds a hierarchical tree from a small fraction of the entire data, while the remaining data is processed sequentially and the tree is adapted constructively. Preliminary results using this approach show that the quality of the clusters obtained does not degrade while the computational needs are reduced.
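
The abstract does not spell out how the tree is adapted, so the following is only a minimal illustrative sketch of the general idea (seed a tree from a small sample, then insert the remaining patterns one at a time). It is not the authors' algorithm. The names `Node`, `build_tree`, `insert` and `count_leaves` are hypothetical, and the choices of centroid linkage for the seed tree and nearest-centroid descent for insertion are assumptions made for illustration.

```python
import math
import random


class Node:
    """Binary tree node: a leaf holds one pattern, an internal node a running centroid."""

    def __init__(self, centroid, count=1, left=None, right=None):
        self.centroid = list(centroid)
        self.count = count  # number of patterns stored beneath this node
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_tree(points):
    """Agglomerate a small, non-empty seed sample (assumed: centroid linkage).

    Pairwise distances are computed only within the seed sample.
    """
    nodes = [Node(p) for p in points]
    while len(nodes) > 1:
        i, j = min(
            ((i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))),
            key=lambda ij: dist(nodes[ij[0]].centroid, nodes[ij[1]].centroid),
        )
        a, b = nodes[i], nodes[j]
        n = a.count + b.count
        c = [(x * a.count + y * b.count) / n for x, y in zip(a.centroid, b.centroid)]
        nodes = [nd for k, nd in enumerate(nodes) if k not in (i, j)]
        nodes.append(Node(c, n, a, b))
    return nodes[0]


def insert(root, point):
    """Add one pattern: descend to the nearest leaf, then split that leaf in two."""
    node = root
    while not node.is_leaf():
        # Update the running centroid of every internal node on the path.
        node.centroid = [
            (c * node.count + x) / (node.count + 1)
            for c, x in zip(node.centroid, point)
        ]
        node.count += 1
        node = min((node.left, node.right), key=lambda ch: dist(ch.centroid, point))
    # The reached leaf becomes an internal node with the old and new patterns as children.
    old = Node(node.centroid, node.count)
    node.left, node.right = old, Node(point)
    node.centroid = [(c + x) / 2 for c, x in zip(old.centroid, point)]
    node.count = 2


def count_leaves(root):
    """Count leaves iteratively (avoids deep recursion on unbalanced trees)."""
    stack, total = [root], 0
    while stack:
        nd = stack.pop()
        if nd.is_leaf():
            total += 1
        else:
            stack.extend((nd.left, nd.right))
    return total


# Toy data: two well-separated 2-D blobs (synthetic, for illustration only).
random.seed(0)
blob_a = [[random.gauss(0, 0.5), random.gauss(0, 0.5)] for _ in range(100)]
blob_b = [[random.gauss(10, 0.5), random.gauss(10, 0.5)] for _ in range(100)]

tree = build_tree(blob_a[:10] + blob_b[:10])   # seed tree from 10% of the data
for p in blob_a[10:] + blob_b[10:]:            # remaining patterns, one at a time
    insert(tree, p)

print(tree.count, count_leaves(tree), tree.left.count, tree.right.count)
```

In this sketch each inserted pattern costs one distance comparison per tree level rather than a comparison against every other pattern, which is the kind of saving the abstract alludes to; whether cluster quality is preserved depends on the actual adaptation rule, which this example does not reproduce.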

Original language: English
Title of host publication: PATTERN RECOGNITION IN BIOINFORMATICS, PROCEEDINGS
Editors: Kadirkamanathan, G Sanguinetti, M Girolami, M Niranjan, J Noirel
Publisher: Springer-Verlag Berlin Heidelberg
Pages: 79-88
Number of pages: 10
ISBN (Print): 978-3-642-04030-6
Publication status: Published - 2009
Event: 4th International Conference Pattern Recognition in Bioinformatics - Sheffield, United Kingdom
Duration: 7 Sep 2009 to 9 Sep 2009

Publication series

Name: LECTURE NOTES IN BIOINFORMATICS
Publisher: SPRINGER-VERLAG BERLIN
Volume: 5780
ISSN (Print): 0302-9743

Conference

Conference: 4th International Conference Pattern Recognition in Bioinformatics
Country: United Kingdom
Period: 7/09/09 to 9/09/09

Keywords

  • On-line clustering
  • Hierarchical clustering
  • Large scale data
  • Gene expression
  • Neural networks
