Measuring the effect of clone refactoring on the size of unit test cases in object-oriented software: an empirical study

  • Mourad Badri
  • Linda Badri
  • Oussama Hachemane
  • Alexandre Ouellet
Original Paper

Abstract

This paper empirically measures the effect of clone refactoring on the size of unit test cases in object-oriented software. We investigated research questions related to: (1) the impact of clone refactoring on source code attributes (particularly size, complexity and coupling) that are related to the testability of classes, (2) the impact of clone refactoring on the size of unit test cases, (3) the correlations between the variations observed after clone refactoring in source code attributes and in the size of unit test cases, and (4) which variations in source code attributes after clone refactoring are most associated with the size of unit test cases. We used different metrics to quantify the considered source code attributes and the size of unit test cases. To investigate the research questions and develop predictive and explanatory models, we used various data analysis and modeling techniques, in particular linear regression analysis and five machine learning algorithms (C4.5, KNN, Naïve Bayes, Random Forest and Support Vector Machine). We conducted an empirical study using data collected from two open-source Java software systems (ANT and ARCHIVA) that have been clone refactored. Overall, the contributions of the paper can be summarized as follows: (1) the results revealed a strong, positive correlation between code clone refactoring and the reduction in the size of unit test cases, (2) we showed that code quality attributes related to the testability of classes improve significantly when clones are refactored, (3) we observed that the size of unit test cases can be significantly reduced when clone refactoring is applied, and (4) complexity and size measures are more often associated with the variations in the size of unit test cases than coupling measures.
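To make the analysis concrete, the following is a minimal, illustrative sketch (not the authors' actual pipeline) of how per-class metric deltas measured before and after clone refactoring could be related to the change in test-case size, using correlation analysis, linear regression and the five classifier families named above. The data are hypothetical, the feature names are assumptions, and scikit-learn's DecisionTreeClassifier is used only as a stand-in for C4.5, which scikit-learn does not implement.

```python
# Illustrative sketch (hypothetical data, not the study's pipeline): relate
# per-class metric deltas (after - before clone refactoring) to the change
# in unit test case size, using correlation, linear regression and the five
# classifier families mentioned in the abstract.
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier        # stand-in for C4.5
from sklearn.neighbors import KNeighborsClassifier     # KNN
from sklearn.naive_bayes import GaussianNB             # Naive Bayes
from sklearn.ensemble import RandomForestClassifier    # Random Forest
from sklearn.svm import SVC                            # Support Vector Machine
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical dataset: one row per refactored class; the three columns are
# the change in size, complexity and coupling metrics. y_size is the change
# in test-case size (e.g. lines of test code); y_cls flags whether the test
# code shrank.
n = 120
X = rng.normal(size=(n, 3))            # delta size, delta complexity, delta coupling
y_size = 4.0 * X[:, 0] + 2.5 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(scale=1.0, size=n)
y_cls = (y_size < 0).astype(int)       # 1 = test-case size reduced

# (1) Correlation between each metric delta and the test-size delta.
for name, col in zip(["size", "complexity", "coupling"], X.T):
    rho, p = spearmanr(col, y_size)
    print(f"{name:10s}  rho={rho:+.2f}  p={p:.3f}")

# (2) Linear regression as an explanatory model of the test-size variation.
reg = LinearRegression().fit(X, y_size)
print("R^2 =", round(reg.score(X, y_size), 2), "coefficients =", reg.coef_.round(2))

# (3) The five classifier families, evaluated with 10-fold cross-validation.
models = {
    "C4.5 (decision tree stand-in)": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y_cls, cv=10).mean()
    print(f"{name:30s} accuracy={acc:.2f}")
```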

Keywords

Object-oriented software · Clone refactoring · Source code attributes · Unit test cases · Metrics · Relationships · Linear regression · Machine learning algorithms

Notes

Acknowledgements

This work was partially supported by an NSERC (Natural Sciences and Engineering Research Council of Canada) grant.


Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Software Engineering Research Laboratory, Department of Mathematics and Computer Science, University of Quebec, Trois-Rivières, Canada