Edinburgh Research Explorer

Tiling Optimizations for Stencil Computations Using Rewrite Rules in Lift

Research output: Contribution to journal › Article

Open Access permissions

Open

Documents

https://dl.acm.org/doi/abs/10.1145/3368858
Original language: English
Article number: 52
Pages (from-to): 52:2-52:25
Number of pages: 25
Journal: ACM Transactions on Architecture and Code Optimization
Volume: 16
Issue number: 4
Early online date: 26 Dec 2019
DOI: 10.1145/3368858
Publication status: Published - 31 Jan 2020

Abstract

Stencil computations are a widely used type of algorithm, found in applications from physical simulations to machine learning. Stencils are embarrassingly parallel and are therefore a natural fit for modern hardware such as Graphics Processing Units (GPUs). Although stencil computations have been extensively studied, optimizing them for increasingly diverse hardware remains challenging. Domain-specific languages (DSLs) have raised the level of programming abstraction and offer good performance; however, this approach places the burden on DSL implementers to write almost full-fledged parallelizing compilers and optimizers.

Lift has recently emerged as a promising approach to achieving performance portability by using a small set of reusable parallel primitives that DSL or library writers can build on. Lift’s key novelty lies in its encoding of optimizations as a system of extensible rewrite rules that are used to explore the optimization space.
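To make the idea of an optimization encoded as a rewrite rule concrete, the following minimal Python sketch applies map fusion, a classic semantics-preserving rewrite, to a tiny expression tree. Everything here is an illustrative assumption: the tuple-based expression encoding and the names `evaluate`, `fuse_maps`, and `count_maps` are hypothetical and are not Lift's actual representation or API.

```python
# Hypothetical mini-language: ("input", values) or ("map", function, sub_expression).
# This is NOT Lift's IR; it only illustrates a semantics-preserving rewrite.

def evaluate(expr):
    """Interpret an expression tree and return a list of results."""
    tag = expr[0]
    if tag == "input":
        return list(expr[1])
    if tag == "map":
        _, f, inner = expr
        return [f(x) for x in evaluate(inner)]
    raise ValueError(f"unknown expression tag: {tag}")

def fuse_maps(expr):
    """Rewrite rule: map(f, map(g, e)) becomes map(f . g, e), applied bottom-up."""
    if expr[0] != "map":
        return expr
    _, f, inner = expr
    inner = fuse_maps(inner)
    if inner[0] == "map":
        _, g, innermost = inner
        # Bind f and g as defaults so the fused lambda captures the right functions.
        return ("map", lambda x, f=f, g=g: f(g(x)), innermost)
    return ("map", f, inner)

def count_maps(expr):
    """Count nested map nodes, i.e. the number of traversals over the data."""
    return 0 if expr[0] != "map" else 1 + count_maps(expr[2])

program = ("map", lambda x: x + 1, ("map", lambda x: x * 2, ("input", [1, 2, 3])))
rewritten = fuse_maps(program)

# The rewrite changes the structure (fewer traversals) but not the result.
assert evaluate(program) == evaluate(rewritten)
assert count_maps(program) == 2 and count_maps(rewritten) == 1
```

Because each rule preserves the program's meaning, a compiler can apply rules in different orders and combinations and then compare the performance of the resulting variants, which is the sense in which rewrite rules span an optimization space.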

This article demonstrates how complex multi-dimensional stencil code and optimizations are expressed using compositions of simple 1D Lift primitives and rewrite rules. We introduce two optimizations that provide high performance for stencils in particular: classical overlapped tiling for multi-dimensional stencils and 2.5D tiling specifically for 3D stencils. We provide an in-depth analysis of how the tiling optimizations affect stencils of different shapes and sizes across different applications. Our experimental results show that our approach outperforms existing compiler approaches and hand-tuned codes.
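As a rough illustration of overlapped tiling in one dimension, the sketch below builds a 3-point stencil from a sliding-window helper and then computes the same result tile by tile, with neighbouring tiles sharing `size - step` elements. The helper names (`slide`, `pad_clamp`, `join`, `stencil`, `tiled_stencil`) are plain-Python stand-ins chosen for this example; they mimic the flavour of 1D primitives but are assumptions, not Lift's API, and the sketch further assumes that the tiles cover the input exactly.

```python
def slide(size, step, xs):
    """Return every window of length `size`, advancing `step` elements each time."""
    return [xs[i:i + size] for i in range(0, len(xs) - size + 1, step)]

def pad_clamp(left, right, xs):
    """Extend a list at both ends by repeating its boundary values."""
    return [xs[0]] * left + list(xs) + [xs[-1]] * right

def join(xss):
    """Flatten one level of nesting."""
    return [x for xs in xss for x in xs]

def stencil(xs, size=3, step=1):
    """Untiled stencil: sum each neighbourhood produced by slide."""
    return [sum(window) for window in slide(size, step, xs)]

def tiled_stencil(xs, tile_size, size=3, step=1):
    """Overlapped tiling: adjacent tiles share (size - step) elements."""
    tile_step = tile_size - (size - step)
    if tile_size < size or (tile_size - size) % step != 0:
        raise ValueError("tile_size must hold a whole number of windows")
    if (len(xs) - tile_size) % tile_step != 0:
        raise ValueError("tiles must cover the input exactly (assumption of this sketch)")
    tiles = slide(tile_size, tile_step, xs)
    # Each tile is processed independently, then the partial results are concatenated.
    return join([stencil(tile, size, step) for tile in tiles])

data = list(range(10))
padded = pad_clamp(1, 1, data)          # 12 elements after boundary handling
untiled = stencil(padded)
tiled = tiled_stencil(padded, tile_size=7)

assert tiled == untiled
```

The overlap is what lets each tile be processed without reading its neighbours: on a GPU, a tile of this kind would typically be staged in fast local memory before the inner stencil runs, at the cost of loading the shared boundary elements more than once.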

Research areas

• performance portability, GPU computing, stencil, Lift, code generation

ID: 133713277