Abstract
Machine learning (ML) is increasingly seen as a viable approach for building compiler optimization heuristics, but many ML methods cannot replicate even the simplest of the data flow analyses that are critical to making good optimization decisions. We posit that if ML cannot do that, then it is insufficiently able to reason about programs. We formulate data flow analyses as supervised learning tasks and introduce a large open dataset of programs and their corresponding labels from several analyses. We use this dataset to benchmark ML methods and show that they struggle on these fundamental program reasoning tasks. We propose ProGraML (Program Graphs for Machine Learning), a language-independent, portable representation of program semantics. ProGraML overcomes the limitations of prior work and yields improved performance on downstream optimization tasks.
Original language | English |
---|---|
Title of host publication | Proceedings of the 38th International Conference on Machine Learning |
Publisher | PMLR |
Pages | 2244-2253 |
Number of pages | 10 |
Publication status | Published - 18 Jul 2021 |
Event | Thirty-eighth International Conference on Machine Learning, Online. Duration: 18 Jul 2021 → 24 Jul 2021. https://icml.cc/ |
Publication series
Name | Proceedings of Machine Learning Research |
---|---|
Volume | 139 |
ISSN (Electronic) | 2640-3498 |
Conference
Conference | Thirty-eighth International Conference on Machine Learning |
---|---|
Abbreviated title | ICML 2021 |
Period | 18/07/21 → 24/07/21 |
Internet address |