Parameter-Free Probabilistic API Mining across GitHub

Jaroslav Fowkes, Charles Sutton

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Existing API mining algorithms can be difficult to use as they require expensive parameter tuning and the returned set of API calls can be large, highly redundant and difficult to understand. To address this, we present PAM (Probabilistic API Miner), a near parameter-free probabilistic algorithm for mining the most interesting API call patterns. We show that PAM significantly outperforms both MAPO and UPMiner, achieving 69% test-set precision, at retrieving relevant API call sequences from GitHub. Moreover, we focus on libraries for which the developers have explicitly provided code examples, yielding over 300,000 LOC of hand-written API example code from the 967 client projects in the data set. This evaluation suggests that the hand-written examples actually have limited coverage of real API usages.
Original language: English
Title of host publication: FSE 2016: ACM SIGSOFT International Symposium on the Foundations of Software Engineering
Place of Publication: Seattle, United States
Publisher: ACM
Pages: 254-265
Number of pages: 12
ISBN (Electronic): 978-1-4503-4218-6
Publication status: Published - 1 Nov 2016
Event: 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering - Seattle, United States
Duration: 13 Nov 2016 to 18 Nov 2016
http://www.cs.ucdavis.edu/fse2016/

Conference

Conference: 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering
Abbreviated title: FSE 2016
Country/Territory: United States
City: Seattle
Period: 13/11/16 to 18/11/16
