
Characterizing and identifying reverted commits

Abstract

In practice, a popular but coarse-grained way to recover from a problematic commit is to revert it (i.e., to undo the change). However, reverted commits can cause problems for software development, such as impeding development progress and making maintenance more difficult. To mitigate these issues, we set out to explore a central question: can we characterize and identify which commits will be reverted? In this paper, we characterize commits using 27 commit features and build a model to identify commits that will be reverted. We first identify reverted commits by analyzing commit messages and comparing the changed content, and we extract 27 commit features grouped into three dimensions: change, developer, and message. We then build an identification model (e.g., a random forest classifier) on the extracted features. To evaluate the effectiveness of the proposed model, we perform an empirical study on ten open source projects comprising a total of 125,241 commits. Our experimental results show that the model outperforms two baselines in terms of AUC-ROC and cost-effectiveness (i.e., the percentage of reverted commits detected when inspecting 20% of the total changed LOC). Averaged across the ten studied projects, our model achieves an AUC-ROC of 0.756 and a cost-effectiveness of 0.746, improving over both baselines by statistically significant and substantial margins. In addition, we find that the “developer” dimension is the most discriminative of the three feature dimensions for identifying reverted commits, although using all three dimensions together yields better performance.
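
For readers who want to see the shape of this pipeline, the following Python sketch (assuming scikit-learn; it is an illustration, not the authors' implementation) walks through the three steps the abstract describes: labeling reverted commits from their commit messages, training a random forest on the commit features, and computing AUC-ROC plus cost-effectiveness at a 20% changed-LOC inspection budget. The file commit_features.csv and its column names are hypothetical placeholders.

# Minimal sketch of the pipeline in the abstract; commit_features.csv and the
# column names ("reverted", "churn", plus the 27 feature columns) are
# hypothetical placeholders, not the authors' artifacts.
import re
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Step 1 (labeling): commits whose SHA appears in a later
# "This reverts commit <sha>" message are labeled as reverted.
REVERT_PATTERN = re.compile(r"this reverts commit ([0-9a-f]{7,40})", re.IGNORECASE)

def reverted_shas(messages):
    """Collect the SHAs named in revert messages (messages: iterable of str)."""
    shas = set()
    for msg in messages:
        match = REVERT_PATTERN.search(msg)
        if match:
            shas.add(match.group(1))
    return shas

# Step 3 (effort-aware evaluation): fraction of reverted commits found when
# inspecting commits in descending risk order until 20% of changed LOC is spent.
def cost_effectiveness(y_true, y_score, changed_loc, budget=0.2):
    order = np.argsort(-np.asarray(y_score))
    y_sorted = np.asarray(y_true)[order]
    loc_sorted = np.asarray(changed_loc)[order]
    inspected = np.cumsum(loc_sorted) <= budget * loc_sorted.sum()
    return y_sorted[inspected].sum() / max(y_sorted.sum(), 1)

# Step 2 (identification model): a random forest over the 27 commit features.
data = pd.read_csv("commit_features.csv")          # hypothetical feature table
X = data.drop(columns=["reverted", "churn"])       # 27 change/developer/message features
y = data["reverted"]                               # label produced in Step 1
X_tr, X_te, y_tr, y_te, _, loc_te = train_test_split(
    X, y, data["churn"], test_size=0.3, random_state=0, stratify=y)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
risk = model.predict_proba(X_te)[:, 1]

print("AUC-ROC:", roc_auc_score(y_te, risk))
print("Cost-effectiveness@20% LOC:", cost_effectiveness(y_te, risk, loc_te))

The cost_effectiveness helper mirrors the effort-aware measure in the abstract: commits are ranked by predicted risk, and the score is the fraction of actually reverted commits caught within the first 20% of total changed LOC.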

Acknowledgment

This research was partially supported by the National Key Research and Development Program of China (2018YFB1003904), NSFC Program (No. 61602403) and China Postdoctoral Science Foundation (No. 2017M621931).

Author information

Corresponding author

Correspondence to Xin Xia.

Additional information

Communicated by: Massimiliano Di Penta

About this article

Cite this article

Yan, M., Xia, X., Lo, D. et al. Characterizing and identifying reverted commits. Empir Software Eng 24, 2171–2208 (2019). https://doi.org/10.1007/s10664-019-09688-8

