Distributed Stream Consistency Checking

Shen Gao, Daniele Dell'Aglio, Jeff Z. Pan, Abraham Bernstein

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Dealing with noisy data is one of the big issues in stream processing. While noise has been widely studied in settings where streams have simple schemas, e.g. time series, few solutions have focused on streams characterized by complex data structures. This paper studies how to check consistency over large amounts of complex streams. Our proposed methods exploit reasoning to assess whether portions of the streams are compliant with a reference conceptual model. To achieve scalability, our methods run on state-of-the-art distributed stream processing platforms, e.g. Apache Storm or Twitter Heron. Our first method computes the closure of Negative Inclusions (NIs) for DL-Lite ontologies and registers the NIs as queries. The second method compiles the ontology into a processing pipeline to evenly distribute the workload. Experiments compare the two methods and show that the second one improves the throughput by up to 139% with the LUBM ontology and 330% with the NPD ontology.
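
As a rough illustration of the first method, the following minimal sketch computes a closure of negative inclusions and checks one window of stream data against it. It is not the paper's implementation: it covers only atomic concepts (DL-Lite also has role-based inclusions, which are omitted here), it runs on a single machine rather than on Storm or Heron, and the function names, concept names, and window contents are all hypothetical.

```python
from itertools import product


def ni_closure(positive, negative):
    """Derive all pairs of disjoint atomic concepts.

    positive: set of (sub, sup) pairs, read as "sub is subsumed by sup".
    negative: set of (a, b) pairs, read as "a is disjoint from b".
    Returns a set of frozensets, each naming concepts that must not co-occur.
    Assumption: atomic concepts only, no roles.
    """
    concepts = {c for pair in positive | negative for c in pair}
    # subs[c] collects every concept subsumed by c (including c itself).
    subs = {c: {c} for c in concepts}
    changed = True
    while changed:  # iterate to a fixed point to get transitivity
        changed = False
        for sub, sup in positive:
            new = subs[sub] - subs[sup]
            if new:
                subs[sup] |= new
                changed = True
    closure = set()
    for a, b in negative:
        # Every subconcept of a is disjoint from every subconcept of b.
        for x, y in product(subs[a], subs[b]):
            closure.add(frozenset((x, y)))
    return closure


def inconsistent_individuals(window, closure):
    """Return individuals whose asserted types violate some derived NI.

    window: iterable of (individual, concept) type assertions.
    """
    types = {}
    for individual, concept in window:
        types.setdefault(individual, set()).add(concept)
    return {ind for ind, cs in types.items()
            if any(ni <= cs for ni in closure)}


# Hypothetical toy ontology and stream window.
positive = {("GraduateStudent", "Student"), ("GraduateCourse", "Course")}
negative = {("Student", "Course")}
closure = ni_closure(positive, negative)
window = [("a", "GraduateStudent"), ("a", "GraduateCourse"), ("b", "Student")]
violations = inconsistent_individuals(window, closure)
```

In this sketch each derived NI plays the role of a registered query: an individual is flagged as soon as its types in the window contain both sides of some NI.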
Original language: English
Title of host publication: Web Engineering
Subtitle of host publication: 18th International Conference, ICWE 2018, Cáceres, Spain, June 5-8, 2018, Proceedings
Editors: Tommi Mikkonen, Ralf Klamma, Juan Hernández
Place of Publication: Cham
Publisher: Springer International Publishing
Pages: 387-403
Number of pages: 17
ISBN (Electronic): 978-3-319-91662-0
ISBN (Print): 978-3-319-91661-3
Publication status: Published - 20 May 2018
Event: 18th International Conference on Web Engineering, Cáceres, Spain
Duration: 5 Jun 2018 to 8 Jun 2018
https://icwe2018.webengineering.org/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer, Cham
Volume: 10845
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Web Engineering
Abbreviated title: ICWE 2018
Country: Spain
City: Cáceres
Period: 5/06/18 to 8/06/18
