Abstract
The use of high-level intermediate representations (IRs) promises the generation of fast code from a high-level description, improving the productivity of developers while achieving performance traditionally only reached with low-level programming approaches.
High-level IRs come in two flavors: 1) domain-specific IRs designed only for a specific application area; or 2) generic high-level IRs that can be used to generate high-performance code across many domains. Developing generic IRs is more challenging but offers the advantage of reusing a common compiler infrastructure across various applications.
In this paper, we extend a generic high-level IR to enable efficient computation with sparse data structures. Crucially, we encode the sparse representation using reusable dense building blocks already present in the high-level IR. We use a form of dependent types to model sparse matrices in CSR format, explicitly expressing the relationship between multiple dense arrays that separately store the lengths of the rows, the column indices, and the non-zero values of the matrix.
We achieve high performance compared to sparse low-level library code using our extended generic high-level code generator. On an Nvidia GPU, we outperform the highly tuned Nvidia cuSPARSE implementation of SpMV (sparse matrix-vector multiplication) across 28 sparse matrices of varying sparsity by 1.7× on average.
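To make the CSR encoding described in the abstract concrete, the following is a minimal, purely illustrative Scala sketch of SpMV over a matrix stored as three dense arrays (row lengths, column indices, non-zero values). It is not the paper's IR or its generated GPU code, and the names CsrMatrix, rowLengths, columns, values and spmv are hypothetical.

```scala
// Illustrative only: CSR modelled as three dense arrays, mirroring the
// abstract's description (row lengths, column indices, non-zero values).
object CsrSpmvSketch {
  // rowLengths(i)   = number of non-zeros in row i
  // columns, values = non-zeros of all rows, stored row by row
  case class CsrMatrix(rowLengths: Array[Int],
                       columns: Array[Int],
                       values: Array[Float])

  // y = A * x, computed row by row over the dense building blocks
  def spmv(a: CsrMatrix, x: Array[Float]): Array[Float] = {
    val y = new Array[Float](a.rowLengths.length)
    var nnz = 0 // running offset into columns/values
    for (row <- a.rowLengths.indices) {
      var acc = 0.0f
      for (_ <- 0 until a.rowLengths(row)) {
        acc += a.values(nnz) * x(a.columns(nnz))
        nnz += 1
      }
      y(row) = acc
    }
    y
  }
}
```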
Original language | English |
---|---|
Title of host publication | CC 2020: Proceedings of the 29th International Conference on Compiler Construction |
Publisher | ACM Association for Computing Machinery |
Pages | 85-95 |
Number of pages | 11 |
ISBN (Print) | 9781450371209 |
DOIs | |
Publication status | Published - 22 Feb 2020 |
Event | ACM SIGPLAN 2020 International Conference on Compiler Construction, San Diego, United States. Duration: 22 Feb 2020 → 23 Feb 2020. Conference number: 29. https://conf.researchr.org/home/CC-2020 |
Conference
Conference | ACM SIGPLAN 2020 International Conference on Compiler Construction |
---|---|
Abbreviated title | CC 2020 |
Country/Territory | United States |
City | San Diego |
Period | 22/02/20 → 23/02/20 |
Internet address | https://conf.researchr.org/home/CC-2020 |
Keywords
- Software and its engineering
- Parallel programming languages
- Compilers
- Sparse Matrix
- Code Generation
- Dependent Types
Datasets
- Generating fast sparse matrix vector multiplication from a high level generic functional IR
  Pizzuti, F. (Creator), Steuwer, M. (Creator) & Dubach, C. (Creator), Dryad, 19 Mar 2020
  DOI: 10.5061/dryad.wstqjq2gs, https://datadryad.org/stash/dataset/doi:10.5061/dryad.wstqjq2gs
  Dataset