Query preserving graph compression

Wenfei Fan, Jianzhong Li, Xin Wang, Yinghui Wu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

It is common to find graphs with millions of nodes and billions of edges in, e.g., social networks. Queries on such graphs are often prohibitively expensive. These motivate us to propose query preserving graph compression, to compress graphs relative to a class Λ of queries of users' choice. We compute a small Gr from a graph G such that (a) for any query Q Ε Λ Q, Q(G) = Q'(Gr), where Q' Ε Λ can be efficiently computed from Q; and (b) any algorithm for computing Q(G) can be directly applied to evaluating Q' on Gr as is. That is, while we cannot lower the complexity of evaluating graph queries, we reduce data graphs while preserving the answers to all the queries in Λ. To verify the effectiveness of this approach, (1) we develop compression strategies for two classes of queries: reachability and graph pattern queries via (bounded) simulation. We show that graphs can be efficiently compressed via a reachability equivalence relation and graph bisimulation, respectively, while reserving query answers. (2) We provide techniques for maintaining compressed graph Gr in response to changes ΔG to the original graph G. We show that the incremental maintenance problems are unbounded for the two lasses of queries, i.e., their costs are not a function of the size of ΔG and changes in Gr. Nevertheless, we develop incremental algorithms that depend only on ΔG and Gr, independent of G, i.e., we do not have to decompress Gr to propagate the changes. (3) Using real-life data, we experimentally verify that our compression techniques could reduce graphs in average by 95% for reachability and 57% for graph pattern matching, and that our incremental maintenance algorithms are efficient.

Original languageEnglish
Title of host publicationProceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Place of PublicationNew York, NY, USA
PublisherACM
Pages157-168
Number of pages12
ISBN (Print)978-1-4503-1247-9
DOIs
Publication statusPublished - 2012

Publication series

NameSIGMOD '12
PublisherACM

Keywords / Materials (for Non-textual outputs)

  • graph compression
  • reachability queries
  • pattern queries

Fingerprint

Dive into the research topics of 'Query preserving graph compression'. Together they form a unique fingerprint.

Cite this