Hierarchical Clustering: O(1)-Approximation for Well-Clustered Graphs

Bogdan-Adrian Manghiuc, He Sun

Research output: Chapter in Book/Report/Conference proceedingConference contribution


Hierarchical clustering studies a recursive partition of a data set into clusters of successively smaller size, and is a fundamental problem in data analysis. In this work we study the cost function for hierarchical clustering introduced by Dasgupta, and present two polynomial-time approximation algorithms: Our first result is an O(1)-approximation algorithm for graphs of high conductance. Our simple construction bypasses complicated recursive routines of finding sparse cuts known in the literature. Our second and main result is an O(1)-approximation algorithm for a wide family of graphs that exhibit a well defined structure of clusters. This result generalises the previous state-of-the art, which holds only for graphs generated from stochastic models. The significance of our work is demonstrated by the empirical analysis on both synthetic and real-world data sets, on which our presented algorithm outperforms the previously proposed algorithm for graphs with a well-defined cluster structure.
Original languageEnglish
Title of host publicationAdvances in Neural Information Processing Systems
EditorsM. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, J. Wortman Vaughan
PublisherCurran Associates Inc
Number of pages12
Publication statusPublished - 6 Dec 2021
EventThirty-fifth Conference on Neural Information Processing Systems - Virtual
Duration: 6 Dec 202114 Dec 2021

Publication series

NameAdvances in Neural Information Processing Systems
ISSN (Print)1049-5258


ConferenceThirty-fifth Conference on Neural Information Processing Systems
Abbreviated titleNeurIPS 2021
Internet address


Dive into the research topics of 'Hierarchical Clustering: O(1)-Approximation for Well-Clustered Graphs'. Together they form a unique fingerprint.

Cite this