Abstract
Executing multiple OpenCL kernels concurrently on the same GPU is a promising method for improving hardware utilisation and system performance. The scheduling scheme significantly affects the resulting performance because it determines which kernels run together on the same GPU. Existing approaches use either the execution time or the relative speedup of kernels as a guide to group them and map them to the device. However, these simple methods come at the cost of suboptimal performance.
In this paper, we propose a graph-based algorithm that schedules co-running kernels in pairs to optimise system performance. Target workloads are represented by a graph in which vertices stand for distinct kernels, and an edge between two vertices indicates that co-executing the corresponding two kernels delivers better performance than running them one after another. Edges are weighted to indicate the performance gain from co-execution. Our algorithm finds the maximum weighted matching of the graph. By maximising the accumulated weights, our algorithm improves performance significantly compared to other approaches.
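The pairing idea described in the abstract can be illustrated with a minimal sketch. It assumes a small workload and uses an exhaustive, memoised search over kernel subsets to find a maximum weighted matching; the abstract does not say which matching algorithm the paper uses, so this is only a stand-in. The function name `max_weight_pairing`, the `gains` dictionary, and the integer weights (co-run gain in percent) are hypothetical and chosen purely for illustration.

```python
from functools import lru_cache


def max_weight_pairing(n, gains):
    """Return (total_gain, pairs) for a maximum weighted matching.

    n     -- number of kernels, indexed 0..n-1 (hypothetical workload)
    gains -- dict mapping frozenset({i, j}) to the gain of co-running
             kernels i and j; a missing pair means there is no edge,
             i.e. co-running does not beat running one after another
    """

    @lru_cache(maxsize=None)
    def best(mask):
        # Bit i of mask is set while kernel i is still unscheduled.
        if mask == 0:
            return 0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unscheduled kernel
        rest = mask & ~(1 << i)
        # Option 1: kernel i runs on its own and contributes no gain.
        total, pairs = best(rest)
        # Option 2: kernel i co-runs with some unscheduled neighbour j.
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                w = gains.get(frozenset((i, j)))
                if w is not None:
                    sub_total, sub_pairs = best(rest & ~(1 << j))
                    if w + sub_total > total:
                        total = w + sub_total
                        pairs = ((i, j),) + sub_pairs
        return total, pairs

    return best((1 << n) - 1)


# Four kernels on a path 0-1-2-3; weights are illustrative, not measured.
example_gains = {
    frozenset((0, 1)): 40,
    frozenset((1, 2)): 50,
    frozenset((2, 3)): 40,
}
total, pairs = max_weight_pairing(4, example_gains)
print(total, pairs)  # prints: 80 ((0, 1), (2, 3))
```

In this toy graph, greedily taking the heaviest edge (1, 2) first leaves kernels 0 and 3 unpaired for a total gain of 50, whereas maximising the accumulated weight selects (0, 1) and (2, 3) for a total of 80. The exhaustive search is exponential in the number of kernels, so larger workloads would need a polynomial-time matching algorithm.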
Original language | English |
---|---|
Title of host publication | Proceedings of the 11th Workshop on General Purpose GPUs |
Place of Publication | New York, NY, USA |
Publisher | ACM |
Pages | 40-49 |
Number of pages | 10 |
ISBN (Print) | 978-1-4503-5647-3 |
Publication status | Published - 24 Feb 2018 |
Event | 11th Workshop on General Purpose GPUs, Vienna, Austria. Duration: 24 Feb 2018 → 28 Feb 2018. https://ppopp18.sigplan.org/track/GPGPU-2018-papers |
Publication series
Name | GPGPU-11 |
---|---|
Publisher | ACM |
Workshop
Workshop | 11th Workshop on General Purpose GPUs |
---|---|
Abbreviated title | GPGPU-11 |
Country/Territory | Austria |
City | Vienna |
Period | 24/02/18 → 28/02/18 |
Keywords
- Concurrent Kernels, GPGPU, Scheduling