Exploiting GPU Hardware Saturation for Fast Compiler Optimization

Alberto Magni, Christophe Dubach, Michael O'Boyle

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Graphics Processing Units (GPUs) are efficient devices capable of delivering high performance for general purpose computation. Realizing their full performance potential often requires extensive compiler tuning. This process is particularly expensive since it has to be repeated for each target program and platform.

In this paper we study the utilization of GPU hardware resources across multiple input sizes and compiler options. In this context, we introduce the notion of hardware saturation. Saturation is reached when an application is executed with a number of threads large enough to fully utilize the available hardware resources. We give experimental evidence of hardware saturation and describe its properties using 16 OpenCL kernels on 3 GPUs from Nvidia and AMD. We show that input sizes that saturate the GPU show performance stability across compiler transformations.

Using the thread-coarsening transformation as an example, we show that compiler settings maintain their relative performance across input sizes within the saturation region. Leveraging these hardware and software properties, we propose a technique to identify the input size at the lower bound of the saturation zone, which we call the Minimum Saturation Point (MSP). By performing iterative compilation on the MSP input size, we obtain results that apply effectively to much larger input problems, reducing the overhead of tuning by an order of magnitude on average.
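
The idea of a saturation lower bound can be illustrated with a minimal sketch. The abstract does not say how the MSP is detected, so everything here is an assumption made for illustration: the function name `minimum_saturation_point`, the plateau criterion (throughput within a tolerance of the observed peak for that size and every larger one), the 5% tolerance, and the sample measurements.

```python
def minimum_saturation_point(sizes, throughputs, tolerance=0.05):
    """Return the smallest input size from which throughput stays near its peak.

    Hypothetical criterion (not taken from the paper): a size is saturated
    if its throughput, and that of every larger measured size, is at least
    (1 - tolerance) times the highest throughput observed.
    """
    if len(sizes) != len(throughputs) or not sizes:
        raise ValueError("sizes and throughputs must be non-empty and equal in length")

    pairs = sorted(zip(sizes, throughputs))
    threshold = (1.0 - tolerance) * max(t for _, t in pairs)

    msp = None
    # Walk from the largest size downwards and stop at the first size
    # that falls below the plateau.
    for size, throughput in reversed(pairs):
        if throughput < threshold:
            break
        msp = size
    return msp


# Made-up measurements: input size -> throughput in arbitrary units.
sizes = [1024, 2048, 4096, 8192, 16384]
throughputs = [10.0, 19.0, 36.0, 39.0, 40.0]
msp = minimum_saturation_point(sizes, throughputs)
```

Under these assumptions, compiler settings such as the thread-coarsening factor would then be searched only at the returned size, rather than at the full problem size.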

Original language: English
Title of host publication: Proceedings of Workshop on General Purpose Processing Using GPUs
Place of Publication: New York, NY, USA
Publisher: ACM
Pages: 99-106
Number of pages: 8
ISBN (Print): 978-1-4503-2766-4
Publication status: Published - 2014

Keywords

  • GPGPU, OpenCL, iterative compilation, optimization
