Edinburgh Research Explorer

Lazy Allocation and Transfer Fusion Optimization for GPU-Based Heterogeneous Systems

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Original language: English
Title of host publication: 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)
Place of Publication: Cambridge, UK
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 311-315
Number of pages: 5
ISBN (Electronic): 978-1-5386-4975-6
ISBN (Print): 978-1-5386-4976-3
DOIs
Publication status: E-pub ahead of print - 7 Jun 2018
Event: 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP 2018) - Cambridge, United Kingdom
Duration: 21 Mar 2018 to 23 Aug 2018
http://www.pdp2018.org/

Conference

Conference: 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP 2018)
Abbreviated title: PDP 2018
Country: United Kingdom
City: Cambridge
Period: 21/03/18 to 23/08/18
Internet address

Abstract

We present two memory optimization techniques that improve the efficiency of data transfer over the PCIe bus in GPU-based heterogeneous systems: lazy allocation and transfer fusion optimization. Both are based on merging data transfers so that less overhead is incurred, thereby increasing transfer throughput and making accelerator usage profitable for smaller operand sizes as well. We provide the design and a prototype implementation of the two techniques in CUDA. Microbenchmarking results show that significant speedups can be achieved, especially for small and medium-sized operands. We also prove that our transfer fusion optimization algorithm is optimal.
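
The idea of merging transfers to amortize per-transfer overhead can be illustrated with a minimal host-side sketch. It is not the paper's algorithm or its CUDA prototype. It assumes a simple linear cost model (a fixed overhead per transfer plus payload size divided by bandwidth), and the constants `LATENCY_S` and `BANDWIDTH_BPS` as well as the helper names `transfer_time` and `fuse` are hypothetical values and names chosen for illustration only.

```python
# Assumed cost-model parameters (illustrative, not measured and not from the paper).
LATENCY_S = 10e-6        # fixed overhead per transfer, in seconds
BANDWIDTH_BPS = 12e9     # sustained bus bandwidth, in bytes per second


def transfer_time(nbytes: int) -> float:
    """Modelled time of one copy: fixed overhead plus payload time."""
    return LATENCY_S + nbytes / BANDWIDTH_BPS


def fuse(operands: list[bytes]) -> tuple[bytes, list[tuple[int, int]]]:
    """Pack operands back to back into one staging buffer.

    Returns the buffer and an (offset, length) entry per operand, so that
    each operand can still be located after a single merged transfer.
    """
    staging = bytearray()
    layout = []
    for op in operands:
        layout.append((len(staging), len(op)))
        staging += op
    return bytes(staging), layout


# Eight small operands of 4 KiB each.
operands = [bytes([i]) * 4096 for i in range(8)]

# One transfer per operand pays the fixed overhead eight times.
separate = sum(transfer_time(len(op)) for op in operands)

# One fused transfer pays the fixed overhead once.
staging, layout = fuse(operands)
fused = transfer_time(len(staging))

# Every operand is recoverable from the fused buffer via its offset.
assert all(staging[o:o + n] == op for (o, n), op in zip(layout, operands))

print(f"separate: {separate * 1e6:.1f} us, fused: {fused * 1e6:.1f} us")
```

Under this model the payload time is identical in both cases, so the saving is exactly the fixed overhead of the transfers that were eliminated. That is why the benefit is largest when operands are small relative to the per-transfer overhead.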

Research areas

  • graphics processing units, parallel architectures, peripheral interfaces, data transfer, transfer fusion optimization algorithm, lazy allocation, GPU, heterogeneous systems, memory optimization techniques, PCIe bus, CUDA, Resource management, Arrays, Data transfer, Graphics processing units, Optimization, Merging, Kernel, adaptive message fusion, lazy memory allocation, memory transfer optimization

ID: 74744575