Ensemble Clustering for Result Diversification

Dong Nguyen, Djoerd Hiemstra

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

This paper describes the participation of the University of
Twente in the Web track of TREC 2012. Our baseline approach
uses the Mirex toolkit, an open-source tool that sequentially
scans all the documents. For result diversification,
we experimented with improving the quality of clusters
through ensemble clustering. We combined clusters obtained
by different clustering methods (such as LDA and
K-means) and clusters obtained by using different types of
data (such as document text and anchor text). Our two-layer
ensemble run performed better than the LDA-based diversification
run and also better than a non-diversification run.
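
The abstract does not say how the individual clusterings were combined, so the following is only a minimal sketch of one common ensemble technique, co-association voting, and not the authors' method. The function name `consensus_clusters`, the `threshold` parameter, and the toy labelings are hypothetical and chosen for illustration.

```python
from itertools import combinations


def consensus_clusters(labelings, threshold=0.5):
    """Combine several clusterings of the same items by co-association voting.

    Hypothetical helper: two items are linked when more than `threshold`
    of the base clusterings put them in the same cluster; the consensus
    clusters are the connected components of those links.
    """
    n = len(labelings[0])
    if any(len(labels) != n for labels in labelings):
        raise ValueError("all labelings must cover the same items")

    parent = list(range(n))  # union-find forest over item indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        # Number of base clusterings that group items i and j together.
        votes = sum(labels[i] == labels[j] for labels in labelings)
        if votes / len(labelings) > threshold:
            parent[find(j)] = find(i)

    # Relabel roots as consecutive integers in order of first appearance.
    root_to_id = {}
    return [root_to_id.setdefault(find(i), len(root_to_id)) for i in range(n)]


# Toy input (made up): cluster labels for five documents from three base runs,
# e.g. LDA on document text, K-means on document text, K-means on anchor text.
lda_text = [0, 0, 1, 1, 2]
kmeans_text = [1, 1, 0, 0, 0]
kmeans_anchor = [0, 0, 0, 1, 1]

print(consensus_clusters([lda_text, kmeans_text, kmeans_anchor]))
# [0, 0, 1, 1, 1]
```

Linking by connected components is a deliberately simple choice: documents 2 and 4 end up together through document 3 even though only one base run grouped them directly. A stricter `threshold` reduces such chaining.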
Original language: English
Title of host publication: Proceedings of The Twenty-First Text REtrieval Conference, TREC 2012, Gaithersburg, Maryland, USA, November 6-9, 2012
Number of pages: 4
Publication status: Published - 2012
