ManySStuBs4J Dataset

  • Rafael Karampatsis (Creator)



The ManySStuBs4J corpus contains simple statement bugs mined from open-source Java projects hosted on GitHub. The dataset comes in two variations: one mined from 100 Java Maven projects and one mined from the top 1000 Java projects. A project's popularity is determined by computing the sum of the z-scores of its forks and watchers. See "README.txt" for further details.
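
As a rough illustration of the popularity measure described above, the sketch below sums the z-scores of each project's fork and watcher counts. It is not the dataset author's code: the function name `popularity_scores`, the input layout, and the use of the population standard deviation are all assumptions made for this example.

```python
from statistics import mean, pstdev


def popularity_scores(projects):
    """Return {name: z(forks) + z(watchers)} for each project.

    `projects` maps a project name to a (forks, watchers) tuple.
    This layout is hypothetical, chosen only for illustration.
    """
    forks = [f for f, _ in projects.values()]
    watchers = [w for _, w in projects.values()]

    # Assumption: population standard deviation; the source does not
    # say whether the population or sample formula was used.
    fork_mean, fork_sd = mean(forks), pstdev(forks)
    watch_mean, watch_sd = mean(watchers), pstdev(watchers)

    def z(value, mu, sd):
        # Guard against zero spread (all projects share the same count).
        return 0.0 if sd == 0 else (value - mu) / sd

    return {
        name: z(f, fork_mean, fork_sd) + z(w, watch_mean, watch_sd)
        for name, (f, w) in projects.items()
    }


if __name__ == "__main__":
    # Made-up counts, not taken from the dataset.
    sample = {"a": (10, 100), "b": (20, 200), "c": (30, 300)}
    ranked = sorted(popularity_scores(sample).items(), key=lambda kv: -kv[1])
    for name, score in ranked:
        print(name, round(score, 3))
```

Standardising each count before summing keeps forks and watchers on a comparable scale, so neither dominates the ranking merely because its raw numbers are larger.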

Data Citation

Karampatsis, Rafael-Michael. (2019). ManySStuBs4J Dataset, [dataset]. University of Edinburgh. College of Science & Engineering. School of Informatics. Institute for Language, Cognition and Computation (ILCC).
Date made available: 30 Sept 2019
Publisher: Edinburgh DataShare
