Mapping parallelism to multi-cores: a machine learning based approach

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


The efficient mapping of program parallelism to multi-core processors is highly dependent on the underlying architecture. This paper proposes a portable and automatic compiler-based approach to mapping such parallelism using machine learning. It develops two predictors, one data-sensitive and one data-insensitive, to select the best mapping for parallel programs. The predictors determine the number of threads and the scheduling policy for any given program using a model learnt off-line. Using low-cost profiling runs, they predict the mapping for a new, unseen program across multiple input data sets. We evaluate our approach by selecting parallelism mapping configurations for OpenMP programs on two representative but different multi-core platforms (the Intel Xeon and the Cell processors). The performance of our technique is stable across programs and architectures. On average, it delivers above 96% of the maximum available performance on both platforms. It achieves, on average, a 37% (up to 17.5 times) performance improvement over the OpenMP runtime default scheme on the Cell platform. Compared to two recent prediction models, our predictors achieve better performance with a significantly lower profiling cost.
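To make the two predicted quantities concrete, the sketch below shows where a thread count and a scheduling policy appear in an OpenMP loop. It is an illustrative example only, not code from the paper: the function `sum_squares` and its parameters `threads` and `chunk` are hypothetical names, and the `dynamic` policy is fixed here purely for demonstration.

```c
/* Illustrative sketch (not from the paper): one OpenMP loop with an
 * explicit "mapping", i.e. a thread count plus a scheduling policy.
 * sum_squares, threads and chunk are hypothetical names. */
long long sum_squares(long long n, int threads, int chunk)
{
    long long total = 0;

    /* num_threads(...) sets how many threads run the loop;
     * schedule(dynamic, chunk) hands out iterations in blocks of
     * `chunk` to whichever thread is free. The reduction clause
     * combines the per-thread partial sums into `total`. */
    #pragma omp parallel for num_threads(threads) schedule(dynamic, chunk) reduction(+:total)
    for (long long i = 0; i < n; ++i) {
        total += i * i;
    }
    return total;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC), a call such as `sum_squares(1000, 4, 16)` runs the loop on four threads with dynamic scheduling in chunks of 16. Without OpenMP the pragma is ignored and the loop runs serially with the same result. A predictor of the kind described above would supply values for these two settings instead of leaving them to the runtime default.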
Original language: English
Title of host publication: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Place of Publication: New York, NY, USA
Number of pages: 10
ISBN (Print): 978-1-60558-397-6
Publication status: Published - 2009


  • artificial neural networks, compiler optimization, machine learning, performance modeling, support vector machine

