TY - GEN
T1 - Identifying casualty changes in software patches
AU - Sejfia, Adriana
AU - Zhao, Yixue
AU - Medvidović, Nenad
PY - 2021/8/20
Y1 - 2021/8/20
N2 - Noise in software patches impacts their understanding, analysis, and use for tasks such as change prediction. Although several approaches have been developed to identify noise in patches, this issue has persisted. An analysis of a dataset of security patches for the Tomcat web server, which we further expanded with security patches from five additional systems, uncovered several kinds of previously unreported noise which we call nonessential casualty changes. These are changes that themselves do not alter the logic of the program but are necessitated by other changes made in the patch. In this paper, we provide a comprehensive taxonomy of casualty changes. We then develop CasCADe, an automated technique for automatically identifying casualty changes. We evaluate CasCADe with several publicly available datasets of patches and tools that focus on them. Our results show that CasCADe is highly accurate, that the kinds of noise it identifies occur relatively commonly in patches, and that removing this noise improves upon the evaluation results of a previously published change-based approach.
AB - Noise in software patches impacts their understanding, analysis, and use for tasks such as change prediction. Although several approaches have been developed to identify noise in patches, this issue has persisted. An analysis of a dataset of security patches for the Tomcat web server, which we further expanded with security patches from five additional systems, uncovered several kinds of previously unreported noise which we call nonessential casualty changes. These are changes that themselves do not alter the logic of the program but are necessitated by other changes made in the patch. In this paper, we provide a comprehensive taxonomy of casualty changes. We then develop CasCADe, an automated technique for automatically identifying casualty changes. We evaluate CasCADe with several publicly available datasets of patches and tools that focus on them. Our results show that CasCADe is highly accurate, that the kinds of noise it identifies occur relatively commonly in patches, and that removing this noise improves upon the evaluation results of a previously published change-based approach.
KW - Change-based Analysis
KW - Noise in Patches
KW - Software Patches
UR - http://www.scopus.com/inward/record.url?scp=85116231218&partnerID=8YFLogxK
U2 - 10.1145/3468264.3468624
DO - 10.1145/3468264.3468624
M3 - Conference contribution
AN - SCOPUS:85116231218
T3 - ESEC/FSE 2021 - Proceedings of the 29th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
SP - 304
EP - 315
BT - ESEC/FSE 2021 - Proceedings of the 29th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
A2 - Spinellis, Diomidis
PB - ACM
T2 - 29th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2021
Y2 - 23 August 2021 through 28 August 2021
ER -