A Case Study on Machine Learning for Synthesizing Benchmarks

Andrés Goens, Alexander Brauckmann, Sebastian Ertel, Chris Cummins, Hugh Leather, Jeronimo Castrillon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Good benchmarks are hard to find because they require a substantial effort to keep them representative for the constantly changing challenges of a particular field. Synthetic benchmarks are a common approach to deal with this, and methods from machine learning are natural candidates for synthetic benchmark generation. In this paper we investigate the usefulness of machine learning in the prominent CLgen benchmark generator. We re-evaluate CLgen by comparing the benchmarks generated by the model with the raw data used to train it. This re-evaluation indicates that, for the use case considered, machine learning did not yield additional benefit over a simpler method using the raw data. We investigate the reasons for this and provide further insights into the challenges the problem could pose for potential future generators.
Original language: English
Title of host publication: Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages
Editors: Tim Mattson, Abdullah Muzahid, Armando Solar-Lezama
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery, Inc
Number of pages: 9
ISBN (Print): 9781450367196
Publication status: Published - 22 Jun 2019
Event: 40th ACM SIGPLAN Conference on Programming Language Design and Implementation - Phoenix, United States
Duration: 24 Jun 2019 to 26 Jun 2019


Conference: 40th ACM SIGPLAN Conference on Programming Language Design and Implementation
Abbreviated title: PLDI 2019
Country/Territory: United States

Keywords / Materials (for Non-textual outputs)

  • Machine Learning
  • Benchmarking
  • Synthetic program generation
  • CLgen
  • Generative models

