Approximate Computing for Stream Analytics

Do Le Quoc, Ruichuan Chen, Pramod Bhatotia, Christof Fetzer, Volker Hilt, Thorsten Strufe

Research output: Chapter in Book/Report/Conference proceeding › Chapter (peer-reviewed) › peer-review

Abstract

Approximate computing has become a promising mechanism to trade off accuracy for efficiency. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset. Thus, approximate computing, based on the chosen sample size, can make a systematic trade-off between the output accuracy and computation efficiency. Unfortunately, the state-of-the-art systems for approximate computing primarily target batch analytics, where the input data remains unchanged during the course of computation. Thus, they are not well-suited for stream analytics. This motivated the design of STREAMAPPROX, a stream analytics system for approximate computing. To realize this idea, an online stratified reservoir sampling algorithm is designed to produce approximate output with rigorous error bounds. Importantly, the proposed algorithm is generic and can be applied to two prominent types of stream processing systems: (1) batched stream processing such as Apache Spark Streaming, and (2) pipelined stream processing such as Apache Flink.
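The abstract names online stratified reservoir sampling without giving its details, so the following is only a minimal sketch of the general technique, not the STREAMAPPROX implementation. It assumes each stream item carries a stratum key (for example, its sub-stream of origin), keeps a fixed-size reservoir per stratum using the classic reservoir sampling scheme (Algorithm R), and scales each stratum's sample by the number of items seen to estimate a sum. The class name `StratifiedReservoir`, its methods, and the `capacity` parameter are hypothetical; the error-bound computation mentioned in the abstract is omitted.

```python
import random
from collections import defaultdict


class StratifiedReservoir:
    """Hypothetical sketch: one fixed-size reservoir per stratum."""

    def __init__(self, capacity, seed=None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity            # max sampled items per stratum
        self.rng = random.Random(seed)
        self.samples = defaultdict(list)    # stratum -> sampled values
        self.counts = defaultdict(int)      # stratum -> items seen so far

    def add(self, stratum, value):
        """Offer one stream item to the reservoir of its stratum."""
        self.counts[stratum] += 1
        seen = self.counts[stratum]
        reservoir = self.samples[stratum]
        if len(reservoir) < self.capacity:
            # Reservoir not yet full: keep every item.
            reservoir.append(value)
        else:
            # Algorithm R: the new item replaces a random slot with
            # probability capacity / seen.
            j = self.rng.randrange(seen)
            if j < self.capacity:
                reservoir[j] = value

    def estimate_sum(self):
        """Estimate the sum over all items seen, weighting each stratum."""
        total = 0.0
        for stratum, reservoir in self.samples.items():
            # Each sampled item stands in for (seen / sampled) items.
            weight = self.counts[stratum] / len(reservoir)
            total += weight * sum(reservoir)
        return total
```

Because every stratum keeps its own reservoir, a low-rate sub-stream is not crowded out of the sample by a high-rate one, which is the usual motivation for stratifying. A smaller `capacity` lowers the cost per window and widens the expected error.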
Original language: English
Title of host publication: Encyclopedia of Big Data Technologies
Editors: Sherif Sakr, Albert Zomaya
Publisher: Springer-Verlag
Chapter: A
Pages: 90-97
Number of pages: 8
Edition: 1
ISBN (Electronic): 978-3-319-77525-8
ISBN (Print): 3319775243, 978-3319775241, 978-3-319-77526-5
Publication status: Published - 1 Mar 2019
