Learning Natural Coding Conventions

Miltiadis Allamanis, Earl T. Barr, Christian Bird, Charles Sutton

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

Every programmer has a characteristic style, ranging from preferences about identifier naming to preferences about object relationships and design patterns. Coding conventions define a consistent syntactic style, fostering readability and hence maintainability. When collaborating, programmers strive to obey a project’s coding conventions. However, one third of reviews of changes contain feedback about coding conventions, indicating that programmers do not always follow them and that project members care deeply about adherence. Unfortunately, programmers are often unaware of coding conventions because inferring them requires a global view, one that aggregates the many local decisions programmers make and identifies emergent consensus on style. We present NATURALIZE, a framework that learns the style of a codebase, and suggests revisions to improve stylistic consistency. NATURALIZE builds on recent work in applying statistical natural language processing to source code. We apply NATURALIZE to suggest natural identifier names and formatting conventions. We present four tools focused on ensuring natural code during development and release management, including code review. NATURALIZE achieves 94 % accuracy in its top suggestions for identifier names. We used NATURALIZE to generate 18 patches for 5 open source projects: 14 were accepted.
Original language: English
Title of host publication: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
Place of Publication: New York, NY, USA
Number of pages: 13
ISBN (Print): 978-1-4503-3056-5
Publication status: Published - 11 Nov 2014
Event: 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong
Duration: 16 Nov 2014 - 22 Nov 2014


Symposium: 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
Abbreviated title: FSE 2014
Country/Territory: Hong Kong

Keywords / Materials (for Non-textual outputs)

  • Coding conventions, naturalness of software

