ComPy-Learn: A toolbox for exploring machine learning representations for compilers

Alexander Brauckmann, Andrés Goens, Jeronimo Castrillon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Deep learning methods have been shown not only to improve software performance in compiler heuristics, but also, for example, to improve security in vulnerability prediction and to boost developer productivity in software engineering tools. A key to the success of such methods across these use cases is the expressiveness of the representation used to abstract from the program code. Recent work has shown that different representations each have unique advantages in terms of performance. However, determining the best-performing one for a given task is often not obvious and requires empirical evaluation. Therefore, we present ComPy-Learn, a toolbox for conveniently defining, extracting, and exploring representations of program code. With syntax-level language information from the Clang compiler frontend and low-level information from the LLVM compiler backend, the tool supports the construction of linear and graph representations and enables an efficient search for the best-performing representation and model for tasks on program code.
Original language: English
Title of host publication: Proceedings of the 2020 Forum on specification & Design Languages (FDL)
Number of pages: 4
ISBN (Electronic): 978-1-7281-8928-4
ISBN (Print): 978-1-7281-8928-4
Publication status: Published - 3 Nov 2020
Event: 2020 Forum on specification & Design Languages - Kiel, Germany
Duration: 15 Sept 2020 to 17 Sept 2020

Publication series

Name: Forum on Specification, Verification and Design Languages, FDL
ISSN (Print): 1636-9874


Conference: 2020 Forum on specification & Design Languages
Abbreviated title: FDL 2020

Keywords / Materials (for Non-textual outputs)

  • Compilers
  • Clang
  • LLVM
  • Machine Learning
  • Code Representations

