Predicate Invention Based RDF Data Compression

Man Zhu, Weixin Wu, Jeff Z. Pan, Jingyu Han, Pengfei Huang, Qian Liu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

RDF is a data representation format for schema-free structured information that is gaining momentum in the context of the Semantic Web and the life sciences. With the continuing proliferation of structured data, RDF compression is becoming increasingly important. In this study, we introduce a novel lossless compression technique for RDF datasets (triples), called PIC (Predicate Invention based Compression). By generating informative predicates and constructing an effective mapping to the original predicates, PIC needs to store only a dramatically reduced number of triples with the newly created predicates, and can restore the original triples efficiently using the mapping. These predicates are automatically generated by a decomposable forward-backward procedure, which consequently supports very fast parallel bit computation. As a semantic compression method for structured data, PIC not only reduces syntactic verbosity and data redundancy but also draws on the semantics in the RDF datasets. Experiments on various datasets show competitive results in terms of compression ratio.
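
The idea of storing fewer triples under invented predicates, plus a mapping that restores the originals, can be illustrated with a toy sketch. This is not the PIC algorithm from the paper: the grouping rule (subjects with an identical set of predicate-object pairs share one invented predicate), the function names `compress` and `decompress`, the `inv:` naming scheme, and the sample data are all assumptions made for illustration. The forward-backward procedure and the bit-level computation mentioned in the abstract are not modeled.

```python
from collections import defaultdict


def compress(triples):
    """Toy scheme (an assumption, not the paper's method): give every distinct
    set of (predicate, object) pairs one invented predicate, and store a single
    (subject, invented predicate) fact per subject."""
    by_subject = defaultdict(set)
    for s, p, o in triples:
        by_subject[s].add((p, o))

    name_of = {}     # frozenset of (predicate, object) pairs -> invented predicate
    compressed = []  # one (subject, invented predicate) fact per subject
    for s in sorted(by_subject):
        pairs = frozenset(by_subject[s])
        if pairs not in name_of:
            name_of[pairs] = f"inv:{len(name_of)}"  # hypothetical naming scheme
        compressed.append((s, name_of[pairs]))

    # Mapping from each invented predicate back to the original pairs.
    mapping = {name: pairs for pairs, name in name_of.items()}
    return compressed, mapping


def decompress(compressed, mapping):
    """Expand every invented predicate back into the original triples."""
    return {(s, p, o) for s, name in compressed for p, o in mapping[name]}


# Made-up sample data: three subjects share the same two facts.
triples = [
    ("alice", "rdf:type", "Person"), ("alice", "worksAt", "AcmeCorp"),
    ("bob", "rdf:type", "Person"), ("bob", "worksAt", "AcmeCorp"),
    ("carol", "rdf:type", "Person"), ("carol", "worksAt", "AcmeCorp"),
    ("dave", "rdf:type", "Person"),
]

compressed, mapping = compress(triples)
restored = decompress(compressed, mapping)
assert restored == set(triples)  # the round trip is lossless
```

In this toy setting the seven original triples become four stored facts plus a mapping with two entries. The saving grows with the number of subjects that share a pattern, which is the intuition behind storing triples under newly created predicates.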
Original language: English
Title of host publication: Semantic Technology
Subtitle of host publication: 8th Joint International Conference, JIST 2018, Awaji, Japan, November 26–28, 2018, Proceedings
Editors: Ryutaro Ichise, Freddy Lecue, Takahiro Kawamura, Dongyan Zhao, Stephen Muggleton, Kouji Kozaki
Place of Publication: Cham
Publisher: Springer International Publishing
Pages: 153-161
Number of pages: 9
ISBN (Electronic): 978-3-030-04284-4
ISBN (Print): 978-3-030-04283-7
Publication status: Published - 14 Nov 2018
Event: The 8th Joint International Semantic Technology Conference - Awaji City, Japan
Duration: 26 Nov 2018 to 28 Nov 2018
http://jist2018.knowledge-graph.jp/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer, Cham
Volume: 11341
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: The 8th Joint International Semantic Technology Conference
Abbreviated title: JIST 2018
Country: Japan
City: Awaji City
Period: 26/11/18 to 28/11/18
