TY - GEN
T1 - Making pattern queries bounded in big graphs
AU - Cao, Yang
AU - Fan, Wenfei
AU - Huai, Jinpeng
AU - Huang, Ruizhe
PY - 2015/4/1
Y1 - 2015/4/1
N2 - It is cost-prohibitive to find matches Q(G) of a pattern query Q in a big graph G. We approach this by fetching a small subgraph GQ of G such that Q(GQ) = Q(G). We show that many practical patterns are effectively bounded under access constraints A commonly found in real life, such that GQ can be identified in time determined by Q and A only, independent of the size |G| of G. This holds no matter whether pattern queries are localized (e.g., via subgraph isomorphism) or non-localized (graph simulation). We provide algorithms to decide whether a pattern Q is effectively bounded, and if so, to generate a query plan that computes Q(G) by accessing GQ, in time independent of |G|. When Q is not effectively bounded, we give an algorithm to extend access constraints and make Q bounded in G. Using real-life data, we experimentally verify the effectiveness of the approach, e.g., about 60% of queries are effectively bounded for subgraph isomorphism, and for such queries our approach outperforms the conventional methods by 4 orders of magnitude.
AB - It is cost-prohibitive to find matches Q(G) of a pattern query Q in a big graph G. We approach this by fetching a small subgraph GQ of G such that Q(GQ) = Q(G). We show that many practical patterns are effectively bounded under access constraints A commonly found in real life, such that GQ can be identified in time determined by Q and A only, independent of the size |G| of G. This holds no matter whether pattern queries are localized (e.g., via subgraph isomorphism) or non-localized (graph simulation). We provide algorithms to decide whether a pattern Q is effectively bounded, and if so, to generate a query plan that computes Q(G) by accessing GQ, in time independent of |G|. When Q is not effectively bounded, we give an algorithm to extend access constraints and make Q bounded in G. Using real-life data, we experimentally verify the effectiveness of the approach, e.g., about 60% of queries are effectively bounded for subgraph isomorphism, and for such queries our approach outperforms the conventional methods by 4 orders of magnitude.
KW - graph theory
KW - query processing
KW - access constraints
KW - big graph
KW - pattern queries
KW - query plan generation
KW - subgraph isomorphism
KW - Adaptation models
KW - Approximation algorithms
KW - Awards activities
KW - Computational modeling
KW - Indexes
KW - Motion pictures
KW - Pattern matching
U2 - 10.1109/ICDE.2015.7113281
DO - 10.1109/ICDE.2015.7113281
M3 - Conference contribution
SP - 161
EP - 172
BT - Data Engineering (ICDE), 2015 IEEE 31st International Conference on
PB - Institute of Electrical and Electronics Engineers (IEEE)
ER -