Projects per year
Graph pattern matching is finding all matches in a data graph for a given pattern graph and is often defined in terms of subgraph isomorphism, an NP-complete problem. To lower its complexity, various extensions of graph simulation have been considered instead. These extensions allow graph pattern matching to be conducted in cubic time. However, they fall short of capturing the topology of data graphs, that is, graphs may have a structure drastically different from pattern graphs they match, and the matches found are often too large to understand and analyze. To rectify these problems, this article proposes a notion of strong simulation, a revision of graph simulation for graph pattern matching. (1) We identify a set of criteria for preserving the topology of graphs matched. We show that strong simulation preserves the topology of data graphs and finds a bounded number of matches. (2) We show that strong simulation retains the same complexity as earlier extensions of graph simulation by providing a cubic-time algorithm for computing strong simulation. (3) We present the locality property of strong simulation which allows us to develop an effective distributed algorithm to conduct graph pattern matching on distributed graphs. (4) We experimentally verify the effectiveness and efficiency of these algorithms using both real-life and synthetic data.