Edinburgh Research Explorer

Merge or Separate? Multi-job Scheduling for OpenCL Kernels on CPU/GPU Platforms

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Access status

Open

Documents

http://dl.acm.org/citation.cfm?doid=3038228.3038235
Original language: English
Title of host publication: Workshop about general purpose processing using GPUs (GPGPU-10)
Subtitle of host publication: Held in cooperation with 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'17)
Publisher: ACM
Pages: 22-31
Number of pages: 10
ISBN (Print): 978-1-4503-4915-4
DOIs
State: Published - 5 Feb 2017

Abstract

Computer systems are increasingly heterogeneous, with nodes consisting of CPUs and GPU accelerators. As such systems become mainstream, they move away from specialized high-performance single-application platforms to a more general setting with multiple, concurrent application jobs. Determining how jobs should best be scheduled dynamically to heterogeneous devices is non-trivial. In certain cases, performance is maximized if jobs are allocated to a single device; in others, sharing is preferable. In this paper, we present a runtime framework which schedules multi-user OpenCL tasks to their most suitable device in a CPU/GPU system. We use a machine learning-based predictive model at runtime to detect whether to merge OpenCL kernels or schedule them separately to the most appropriate devices, without the need for ahead-of-time profiling. We evaluate our approach over a wide range of workloads on two separate platforms. We consistently show significant performance and turn-around time improvements over the state-of-the-art across programs, workloads, and platforms.
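
The merge-or-separate decision described in the abstract can be pictured with a small sketch. This is an illustration only, not the paper's framework or model: the feature vectors, the names `schedule_pair`, `toy_should_merge` and `toy_pick_device`, and the thresholds are all hypothetical stand-ins for a trained predictor.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]  # hypothetical per-kernel feature vector


def schedule_pair(
    a: Tuple[str, Features],
    b: Tuple[str, Features],
    should_merge: Callable[[Features, Features], bool],
    pick_device: Callable[[Features], str],
) -> Dict[str, List[str]]:
    """Map device name -> kernel names for two pending kernels."""
    (name_a, feat_a), (name_b, feat_b) = a, b
    if should_merge(feat_a, feat_b):
        # Merged: both kernels are dispatched together to a single device,
        # chosen here from the averaged features (a placeholder choice).
        combined = [(x + y) / 2 for x, y in zip(feat_a, feat_b)]
        return {pick_device(combined): [name_a, name_b]}
    # Separate: each kernel goes to the device predicted for it alone.
    plan: Dict[str, List[str]] = {}
    for name, feat in (a, b):
        plan.setdefault(pick_device(feat), []).append(name)
    return plan


# Toy stand-ins for trained models (assumptions, not the paper's model).
def toy_should_merge(fa: Features, fb: Features) -> bool:
    # Feature 0 = hypothetical work size; merge when both kernels are small.
    return fa[0] + fb[0] < 1000


def toy_pick_device(f: Features) -> str:
    # Feature 1 = hypothetical compute-to-memory ratio; high favours the GPU.
    return "GPU" if f[1] >= 1.0 else "CPU"
```

Under these assumptions, two small kernels would be merged onto one device, while a large kernel paired with a small one would be split across the CPU and the GPU. In a real runtime, the two callables would be replaced by a model trained on measured workloads.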


ID: 30927366