TY - GEN
T1 - Graph pattern based RDF data compression
AU - Pan, Jeff Z.
AU - Pérez, José Manuel Gómez
AU - Ren, Yuan
AU - Wu, Honghan
AU - Wang, Haofen
AU - Zhu, Man
PY - 2015/2/21
Y1 - 2015/2/21
N2 - The growing volume of RDF documents and their inter-linking raise a challenge on the storage and transferring of such documents. One solution to this problem is to reduce the size of RDF documents via compression. Existing approaches either apply well-known generic compression technologies but seldom exploit the graph structure of RDF documents. Or, they focus on minimized compact serialisations leaving the graph nature inexplicit, which leads obstacles for further applying higher level compression techniques. In this paper we propose graph pattern based technologies, which on the one hand can reduce the numbers of triples in RDF documents and on the other hand can serialise RDF graph in a data pattern based way, which can deal with syntactic redundancies which are not eliminable to existing techniques. Evaluation on real world datasets shows that our approach can substantially reduce the size of RDF documents by complementing the abilities of existing approaches. Furthermore, the evaluation results on rule mining operations show the potentials of the proposed serialisation format in supporting efficient data access.
AB - The growing volume of RDF documents and their inter-linking raise a challenge on the storage and transferring of such documents. One solution to this problem is to reduce the size of RDF documents via compression. Existing approaches either apply well-known generic compression technologies but seldom exploit the graph structure of RDF documents. Or, they focus on minimized compact serialisations leaving the graph nature inexplicit, which leads obstacles for further applying higher level compression techniques. In this paper we propose graph pattern based technologies, which on the one hand can reduce the numbers of triples in RDF documents and on the other hand can serialise RDF graph in a data pattern based way, which can deal with syntactic redundancies which are not eliminable to existing techniques. Evaluation on real world datasets shows that our approach can substantially reduce the size of RDF documents by complementing the abilities of existing approaches. Furthermore, the evaluation results on rule mining operations show the potentials of the proposed serialisation format in supporting efficient data access.
UR - http://www.scopus.com/inward/record.url?scp=84928920380&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-15615-6_18
DO - 10.1007/978-3-319-15615-6_18
M3 - Conference contribution
AN - SCOPUS:84928920380
VL - 8943
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 239
EP - 256
BT - Semantic Technology - 4th Joint International Conference, JIST 2014, Revised Selected Papers
PB - Springer
T2 - 4th Joint International Conference on Semantic Technology, JIST 2014
Y2 - 9 November 2014 through 11 November 2014
ER -