Evaluating and Optimizing OpenCL Kernels for High Performance Computing with FPGAs

Hamid Reza Zohouri, Naoya Maruyama, Aaron Smith, Motohiko Matsuda, Satoshi Matsuoka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We evaluate the power and performance of the Rodinia benchmark suite using the Altera SDK for OpenCL targeting a Stratix V FPGA against a modern CPU and GPU. We study multiple OpenCL kernels per benchmark, ranging from direct ports of the original GPU implementations to loop-pipelined kernels specifically optimized for FPGAs. Based on our results, we find that even though OpenCL is functionally portable across devices, direct ports of GPU-optimized code do not perform well compared to kernels optimized with FPGA-specific techniques such as sliding windows. However, by exploiting FPGA-specific optimizations, it is possible to achieve up to 3.4x better power efficiency using an Altera Stratix V FPGA in comparison to an NVIDIA K20c GPU, and better run time and power efficiency in comparison to CPU. We also present preliminary results for Arria 10, which, due to hardened FPUs, exhibits noticeably better performance compared to Stratix V in floating-point-intensive benchmarks.
Original language: English
Title of host publication: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
Place of Publication: Piscataway, NJ, USA
Publisher: Institute of Electrical and Electronics Engineers
Pages: 409-420
Number of pages: 12
ISBN (Electronic): 978-1-4673-8815-3
ISBN (Print): 978-1-4673-8816-0
Publication status: Published - 16 Mar 2017
Event: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis 2016 - Salt Lake City, United States
Duration: 13 Nov 2016 - 18 Nov 2016
http://sc16.supercomputing.org/

Publication series

Name: SC '16
Publisher: IEEE Press
ISSN (Electronic): 2167-4337

Conference

Conference: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis 2016
Abbreviated title: SC 2016
Country/Territory: United States
City: Salt Lake City
Period: 13/11/16 - 18/11/16
