Abstract
Heterogeneous multi-core platforms are increasingly prevalent due to their perceived superior performance over homogeneous systems. The best performance, however, can only be achieved if tasks are accurately mapped to the right processors. OpenCL programs can be partitioned to take advantage of all the available processors in a system. However, finding the best partitioning for any heterogeneous system is difficult and depends on the hardware and software implementation.
We propose a portable partitioning scheme for OpenCL programs on heterogeneous CPU-GPU systems. We develop a purely static approach based on predictive modelling and program features. When evaluated over a suite of 47 benchmarks, our model achieves a speedup of 1.57 over a state-of-the-art dynamic run-time approach, a speedup of 3.02 over a purely multi-core approach and 1.55 over the performance achieved by using just the GPU.
We propose a portable partitioning scheme for OpenCL programs on heterogeneous CPU-GPU systems. We develop a purely static approach based on predictive modelling and program features. When evaluated over a suite of 47 benchmarks, our model achieves a speedup of 1.57 over a state-of-the-art dynamic run-time approach, a speedup of 3.02 over a purely multi-core approach and 1.55 over the performance achieved by using just the GPU.
Original language | English |
---|---|
Title of host publication | Compiler Construction |
Subtitle of host publication | - 20th International Conference, CC 2011, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2011, Saarbrücken, Germany, March 26-April 3, 2011. Proceedings |
Publisher | Springer |
Pages | 286-305 |
Number of pages | 20 |
ISBN (Electronic) | 978-3-642-19861-8 |
ISBN (Print) | 978-3-642-19860-1 |
DOIs | |
Publication status | Published - 2011 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer Berlin Heidelberg |
Volume | 6601 |
ISSN (Print) | 0302-9743 |