Is Late Propagation a Harmful Code Clone Evolutionary Pattern? An Empirical Study

Abstract

Two similar code segments, or clones, form a clone pair within a software system. The changes made to the clones over time create a clone evolution history. Late propagation is a specific pattern of clone evolution in which one clone of the pair is modified, making the pair inconsistent, and the two code segments are re-synchronized in a later revision. Existing work has established late propagation as a clone evolution pattern and suggested that the pattern is related to a high number of faults. In this chapter, we replicate and extend the work of Barbour et al. [1] by examining the characteristics of late propagation in 10 long-lived open-source software systems using the iClones clone detection tool. We identify eight types of late propagation and investigate their fault-proneness. Our results confirm that late propagation is the more harmful clone evolution pattern and that some specific cases of late propagation are more harmful than others. We trained machine learning models using 18 clone-evolution-related features to predict the evolution of late propagation and achieved high precision, in the range 0.91–0.94, and AUC in the range 0.87–0.91.
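The late propagation pattern described in the abstract can be illustrated with a minimal sketch. It assumes a clone pair's history is already available as one boolean per revision (True when the two clones are consistent); the function name `find_late_propagations` and this input representation are hypothetical and are not taken from the chapter, which builds its histories with iClones.

```python
from typing import List, Tuple


def find_late_propagations(consistent: List[bool]) -> List[Tuple[int, int]]:
    """Return (diverge_revision, resync_revision) index pairs.

    An episode starts when a consistent pair becomes inconsistent and
    ends at the first later revision where the pair is consistent again.
    Divergences that are never re-synchronized are not reported.
    """
    episodes: List[Tuple[int, int]] = []
    diverged_at = None  # revision index where the current divergence began
    for i in range(1, len(consistent)):
        if consistent[i - 1] and not consistent[i]:
            diverged_at = i  # one clone changed without the other
        elif not consistent[i - 1] and consistent[i] and diverged_at is not None:
            episodes.append((diverged_at, i))  # the change was propagated late
            diverged_at = None
    return episodes


# Hypothetical history: consistent, consistent, diverged, diverged,
# re-synchronized, consistent, diverged again (still unresolved).
history = [True, True, False, False, True, True, False]
episodes = find_late_propagations(history)
```

Here `episodes` holds a single pair, `(2, 4)`: the clones diverge at revision 2 and are re-synchronized at revision 4, while the unresolved divergence at revision 6 is ignored. Distinguishing the eight types of late propagation studied in the chapter would additionally require knowing which clone changed at each step, which this sketch does not model.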

Notes

  1. https://ghtorrent.org/gcloud.html

  2. https://github.com/artem-solovev/gloc

  3. There is no \(H_{01}\) because RQ1 is exploratory.

  4. https://github.com/qecelab/latepropagation

References

  1. L. Barbour, F. Khomh, Y. Zou, Late propagation in software clones, in 2011 27th IEEE International Conference on Software Maintenance (ICSM) (IEEE, 2011), pp. 273–282

  2. M. Kim, V. Sazawal, D. Notkin, G. Murphy, An empirical study of code clone genealogies, in Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ser. ESEC/FSE-13 (ACM, New York, NY, USA, 2005), pp. 187–196

  3. L. Aversano, L. Cerulo, M. Di Penta, How clones are maintained: an empirical study, in 11th European Conference on Software Maintenance and Reengineering (2007), pp. 81–90

  4. S. Thummalapenta, L. Cerulo, L. Aversano, M. Di Penta, An empirical study on the maintenance of source code clones. Empir. Softw. Eng. 15, 1–34 (2010)

  5. L. Barbour, L. An, F. Khomh, Y. Zou, S. Wang, An investigation of the fault-proneness of clone evolutionary patterns. Softw. Qual. J. 26(4), 1187–1222 (2018)

  6. N. Göde, R. Koschke, Incremental clone detection, in 13th European Conference on Software Maintenance and Reengineering (IEEE, 2009), pp. 219–228

  7. J. Svajlenko, C.K. Roy, Evaluating modern clone detection tools, in 2014 IEEE International Conference on Software Maintenance and Evolution (IEEE, 2014), pp. 321–330

  8. N. Göde, Evolution of type-1 clones, in Proceedings of the 9th International Working Conference on Source Code Analysis and Manipulation (IEEE Computer Society, 2009), pp. 77–86

  9. J. Krinke, A study of consistent and inconsistent changes to code clones, in Working Conference on Reverse Engineering (2007), pp. 170–178

  10. C.C.S., whatthepatch: Python's third-party patch parsing library. Online. Accessed 17 Aug 2020

  11. J. Śliwerski, T. Zimmermann, A. Zeller, When do changes induce fixes? ACM Sigsoft Softw. Eng. Notes 30(4), 1–5 (2005)

  12. M. Fischer, M. Pinzger, H. Gall, Populating a release history database from version control and bug tracking systems, in International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings (IEEE, 2003), pp. 23–32

  13. D. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, 4th ed. (Chapman & Hall, 2007)

  14. T. Chen, C. Guestrin, XGBoost: a scalable tree boosting system, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016), pp. 785–794

  15. Z. Chen, F. Jiang, Y. Cheng, X. Gu, W. Liu, J. Peng, XGBoost classifier for DDoS attack detection and analysis in SDN-based cloud, in IEEE International Conference on Big Data and Smart Computing (BigComp) (IEEE, 2018), pp. 251–256

  16. S.S. Dhaliwal, A.-A. Nahid, R. Abbas, Effective intrusion detection system using XGBoost. Information 9(7), 149 (2018)

  17. M. Abidi, M.S. Rahman, M. Openja, F. Khomh, Are multi-language design smells fault-prone? An empirical study

Author information

Corresponding author

Correspondence to Osama Ehsan.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Ehsan, O., Barbour, L., Khomh, F., Zou, Y. (2021). Is Late Propagation a Harmful Code Clone Evolutionary Pattern? An Empirical Study. In: Inoue, K., Roy, C.K. (eds) Code Clone Analysis. Springer, Singapore. https://doi.org/10.1007/978-981-16-1927-4_11

  • DOI: https://doi.org/10.1007/978-981-16-1927-4_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-1926-7

  • Online ISBN: 978-981-16-1927-4

  • eBook Packages: Computer Science (R0)