The Promises of Hybrid Hexagonal/Classical Tiling for GPU

Tobias Grosser, Sven Verdoolaege, Albert Cohen, P. Sadayappan

Research output: Working paper

Abstract

Time-tiling is necessary for efficient execution of iterative stencil computations. But the usual hyper-rectangular tiles cannot be used because of positive/negative dependence distances along the stencil's spatial dimensions. Several prior efforts have addressed this issue. However, known techniques trade enhanced data reuse for other causes of inefficiency, such as unbalanced parallelism, redundant computations, or increased control flow overhead incompatible with efficient GPU execution. We explore a new path to maximize the effectivness of time-tiling on iterative stencil computations. Our approach is particularly well suited for GPUs. It does not require any redundant computations, it favors coalesced global-memory access and data reuse in shared-memory/cache, avoids thread divergence, and extracts a high degree of parallelism. We introduce hybrid hexagonal tiling, combining hexagonal tile shapes along the time (sequential) dimension and one spatial dimension, with classical tiling for other spatial dimensions. An hexagonal tile shape simultaneously enable parallel tile execution and reuse along the time dimension. Experimental results demonstrate significant performance improvements over existing stencil compilers.
Original languageEnglish
PublisherINRIA
Publication statusPublished - 27 Jul 2013

Fingerprint

Dive into the research topics of 'The Promises of Hybrid Hexagonal/Classical Tiling for GPU'. Together they form a unique fingerprint.

Cite this