Abstract
Two similar code segments, or clones, form a clone pair within a software system. The changes made to the clones over time form a clone evolution history. Late propagation is a specific pattern of clone evolution: one clone in the pair is modified, making the pair inconsistent, and the code segments are re-synchronized in a later revision. Existing work has established late propagation as a clone evolution pattern and suggested that the pattern is related to a high number of faults. In this chapter, we replicate and extend the work by Barbour et al. [1] by examining the characteristics of late propagation in 10 long-lived open-source software systems using the iClones clone detection tool. We identify eight types of late propagation and investigate their fault-proneness. Our results confirm that late propagation is the more harmful clone evolution pattern and that some specific cases of late propagation are more harmful than others. We trained machine learning models using 18 clone-evolution-related features to predict the evolution of late propagation and achieved high precision, in the range 0.91–0.94, and AUC in the range 0.87–0.91.
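To make the pattern concrete, the following is a minimal sketch of how a late propagation episode could be spotted in a clone pair's history. It is an illustration only, not the detection procedure used in the chapter. It assumes the history has already been reduced to one boolean per revision saying whether the two clones are consistent; the function name `find_late_propagations` and that input encoding are hypothetical.

```python
from typing import List, Tuple


def find_late_propagations(in_sync: List[bool]) -> List[Tuple[int, int]]:
    """Return (diverge_index, resync_index) for each late propagation episode.

    in_sync[i] is True when the two clones of the pair are consistent
    after revision i (hypothetical encoding, assumed for this sketch).
    """
    episodes = []
    diverged_at = None  # revision at which the pair last became inconsistent
    for i in range(1, len(in_sync)):
        if in_sync[i - 1] and not in_sync[i]:
            # One clone changed without the other: the pair diverges.
            diverged_at = i
        elif not in_sync[i - 1] and in_sync[i] and diverged_at is not None:
            # The change is propagated later: the pair is re-synchronized.
            episodes.append((diverged_at, i))
            diverged_at = None
    return episodes


# The pair diverges at revision 2 and is re-synchronized at revision 4.
# The divergence at revision 6 is never resolved, so it is not counted.
history = [True, True, False, False, True, True, False]
episodes = find_late_propagations(history)
```

A divergence only counts as late propagation once the re-synchronizing revision is observed, which is why the trailing inconsistent revision in the example yields no episode.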
Notes
- 3. There is no \(H_{01}\) because RQ1 is exploratory.
References
L. Barbour, F. Khomh, Y. Zou, Late propagation in software clones, in 2011 27th IEEE International Conference on Software Maintenance (ICSM) (IEEE, 2011), pp. 273–282
M. Kim, V. Sazawal, D. Notkin, G. Murphy, An empirical study of code clone genealogies, in Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ser. ESEC/FSE-13 (ACM, New York, NY, USA, 2005), pp. 187–196
L. Aversano, L. Cerulo, M. Di Penta, How clones are maintained: an empirical study, in 11th European Conference on Software Maintenance and Reengineering (2007), pp. 81–90
S. Thummalapenta, L. Cerulo, L. Aversano, M. Di Penta, An empirical study on the maintenance of source code clones. Empir. Softw. Eng. 15, 1–34 (2010)
L. Barbour, L. An, F. Khomh, Y. Zou, S. Wang, An investigation of the fault-proneness of clone evolutionary patterns. Softw. Qual. J. 26(4), 1187–1222 (2018)
N. Göde, R. Koschke, Incremental clone detection, in 13th European Conference on Software Maintenance and Reengineering (IEEE, 2009), pp. 219–228
J. Svajlenko, C.K. Roy, Evaluating modern clone detection tools, in 2014 IEEE International Conference on Software Maintenance and Evolution (IEEE, 2014), pp. 321–330
N. Göde, Evolution of type-1 clones, in Proceedings of the 9th International Working Conference on Source Code Analysis and Manipulation (IEEE Computer Society, 2009), pp. 77–86
J. Krinke, A study of consistent and inconsistent changes to code clones, in Working Conference on Reverse Engineering (2007), pp. 170–178
C.C.S., whatthepatch: Python’s third-party patch parsing library. Online. Accessed 17 Aug 2020
J. Śliwerski, T. Zimmermann, A. Zeller, When do changes induce fixes? ACM Sigsoft Softw. Eng. Notes 30(4), 1–5 (2005)
M. Fischer, M. Pinzger, H. Gall, Populating a release history database from version control and bug tracking systems, in International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings (IEEE, 2003), pp. 23–32
D. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, 4th ed. (Chapman & Hall, 2007)
T. Chen, C. Guestrin, Xgboost: a scalable tree boosting system, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016), pp. 785–794
Z. Chen, F. Jiang, Y. Cheng, X. Gu, W. Liu, J. Peng, XGBoost classifier for DDoS attack detection and analysis in SDN-based cloud, in IEEE International Conference on Big Data and Smart Computing (bigcomp) (IEEE, 2018), pp. 251–256
S.S. Dhaliwal, A.-A. Nahid, R. Abbas, Effective intrusion detection system using xgboost. Information 9(7), 149 (2018)
M. Abidi, M.S. Rahman, M. Openja, F. Khomh, Are multi-language design smells fault-prone? An empirical study
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Ehsan, O., Barbour, L., Khomh, F., Zou, Y. (2021). Is Late Propagation a Harmful Code Clone Evolutionary Pattern? An Empirical Study. In: Inoue, K., Roy, C.K. (eds) Code Clone Analysis. Springer, Singapore. https://doi.org/10.1007/978-981-16-1927-4_11
DOI: https://doi.org/10.1007/978-981-16-1927-4_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1926-7
Online ISBN: 978-981-16-1927-4
eBook Packages: Computer Science, Computer Science (R0)