Slider: Incremental Sliding Window Analytics

Pramod Bhatotia, Umut A. Acar, Flavio P. Junqueira, Rodrigo Rodrigues

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Sliding window analytics is often used in distributed data-parallel computing for analyzing large streams of continuously arriving data. When pairs of consecutive windows overlap, there is a potential to update the output incrementally, more efficiently than recomputing from scratch. However, in most systems, realizing this potential requires programmers to explicitly manage the intermediate state for overlapping windows, and devise an application-specific algorithm to incrementally update the output.

In this paper, we present self-adjusting contraction trees, a set of data structures and algorithms for transparently updating the output of a sliding window computation as the window moves, while reusing, to the extent possible, results from prior computations. Self-adjusting contraction trees structure sub-computations of a data-parallel computation in the form of a shallow (logarithmic depth) balanced data dependence graph, through which input changes are efficiently propagated in asymptotically sub-linear time.

We implemented self-adjusting contraction trees in a system called Slider. The design of Slider incorporates several novel techniques, most notably: (i) a set of self balancing trees tuned for different variants of sliding window computation (append-only, fixed-width, or variable-width slides); (ii) a split processing mode, where a background pre-processing stage leverages the predictability of input changes to pave the way for a more efficient foreground processing when the window slides; and (iii) an extension of the data structures to handle multiple-job workflows such as data-flow query processing. We evaluated Slider using a variety of applications and real-world case studies. Our results show significant performance gains without requiring any changes to the existing application code used for non-incremental data processing.
Original languageEnglish
Title of host publicationProceedings of the 15th International Middleware Conference
Place of PublicationNew York, NY, USA
PublisherACM
Pages61-72
Number of pages12
ISBN (Print)978-1-4503-2785-5
DOIs
Publication statusPublished - Dec 2014

Publication series

NameMiddleware '14
PublisherACM

Fingerprint Dive into the research topics of 'Slider: Incremental Sliding Window Analytics'. Together they form a unique fingerprint.

Cite this