Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation

Peter M. W. Knijnenburg, Toru Kisuki, Michael F. P. O'Boyle

Research output: Contribution to journal, Article, peer-review


Loop tiling and unrolling are two important program transformations to exploit locality and expose instruction-level parallelism, respectively. However, these transformations are not independent and each can adversely affect the goal of the other. Furthermore, the best combination will vary dramatically from one processor to the next. In this paper, we therefore address the problem of how to select tile sizes and unroll factors simultaneously. We approach this problem in an architecturally adaptive manner by means of iterative compilation, where we generate many versions of a program and decide upon the best by actually executing them and measuring their execution time. We evaluate several iterative strategies based on genetic algorithms, random sampling and simulated annealing. We compare the levels of optimization obtained by iterative compilation to several well-known static techniques and show that we outperform each of them on a range of benchmarks across a variety of architectures. Finally, we show how to quantitatively trade off the number of profiles needed against the level of optimization that can be reached. In this way, we can reach high levels of optimization within 50 iterations.
Original language: English
Pages (from-to): 43-67
Number of pages: 25
Journal: Journal of Supercomputing
Issue number: 1
Publication status: Published - Jan 2003
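
The search loop that the abstract describes (generate candidate versions, run each, keep the fastest) can be illustrated with a minimal random-sampling sketch. This is not the authors' implementation: the names `random_search`, `measure`, and `toy_cost` are hypothetical, and `toy_cost` is a synthetic stand-in for what would really be compiling a program with a given tile size and unroll factor and timing its execution. The default budget of 50 only mirrors the iteration count mentioned in the abstract.

```python
import random
from typing import Callable, Iterable, Tuple

Config = Tuple[int, int]  # (tile_size, unroll_factor)


def random_search(
    measure: Callable[[Config], float],
    tile_sizes: Iterable[int],
    unroll_factors: Iterable[int],
    budget: int = 50,
    seed: int = 0,
) -> Tuple[Config, float]:
    """Sample up to `budget` distinct configurations and keep the cheapest.

    `measure` is assumed to return a cost (for example, a measured
    execution time) for one (tile_size, unroll_factor) pair.
    """
    rng = random.Random(seed)
    # The two parameters are searched jointly, not one after the other.
    space = [(t, u) for t in tile_sizes for u in unroll_factors]
    candidates = rng.sample(space, min(budget, len(space)))

    best_cfg, best_cost = candidates[0], float("inf")
    for cfg in candidates:
        cost = measure(cfg)
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost


def toy_cost(cfg: Config) -> float:
    """Synthetic cost model (an assumption, not measured data).

    The last term couples the two parameters, loosely imitating the
    interaction between tiling and unrolling.
    """
    tile, unroll = cfg
    return (tile - 32) ** 2 + 10 * (unroll - 4) ** 2 + 0.5 * abs(tile - 8 * unroll)


if __name__ == "__main__":
    cfg, cost = random_search(toy_cost, range(4, 101, 4), range(1, 9))
    print("best configuration found:", cfg, "cost:", cost)
```

In a real setting, `measure` would invoke a compiler with the chosen transformation parameters and time the resulting binary on the target machine; a genetic algorithm or simulated annealing could replace the uniform sampling step without changing the surrounding loop.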

