Edinburgh Research Explorer

On the Complexity of Query Result Diversification

Research output: Contribution to journalArticle

Original languageEnglish
Pages (from-to)577-588
Number of pages12
JournalProceedings of the VLDB Endowment (PVLDB)
Volume6
Issue number8
Publication statusPublished - 2013

Abstract

Query result diversification is a bi-criteria optimization problem for ranking query results. Given a database D, a query Q and a positive integer k, it is to find a set of k tuples from Q(D) such that the tuples are as relevant as possible
to the query, and at the same time, as diverse as possible to each other. Subsets of Q(D) are ranked by an objective function defined in terms of relevance and diversity. Query result diversification has found a variety of applications in databases, information retrieval and operations research. This paper studies the complexity of result diversification for relational queries. We identify three problems in connection with query result diversification, to determine whether there exists a set of k tuples that is ranked above a bound
with respect to relevance and diversity, to assess the rank of a given k-element set, and to count how many k-element sets are ranked above a given bound. We study these problems for a variety of query languages and for three objective functions. We establish the upper and lower bounds of these problems, all matching, for both combined complexity and data complexity. We also investigate several special settings of these problems, identifying tractable cases.

Download statistics

No data available

ID: 17626012