On the applicability of search-based algorithms for software change prediction

  • Original article
International Journal of System Assurance Engineering and Management

Abstract

Numerous research studies have claimed that search-based algorithms can be used effectively in various software engineering domains. An important task in software organizations is to efficiently identify the change-prone classes of a software system, as this is crucial for planning efficient resource utilization and for taking precautionary design measures as early as possible in the software product lifecycle. Doing so helps ensure the development of good-quality software products at lower cost. The current study evaluates the capability of search-based algorithms for developing prediction models that identify the change-prone classes of a software system. Although previous literature has evaluated statistical and machine learning algorithms in this domain, the suitability of search-based algorithms still needs extensive investigation. Furthermore, the study compares the performance of search-based classifiers with that of statistical and machine learning classifiers by empirically validating the results on fourteen open-source data sets. The results indicate that search-based algorithms perform comparably to, and in some cases better than, the other evaluated categories of algorithms.

Author information

Corresponding author

Correspondence to Megha Khanna.

About this article

Cite this article

Malhotra, R., Khanna, M. On the applicability of search-based algorithms for software change prediction. Int J Syst Assur Eng Manag 14, 55–73 (2023). https://doi.org/10.1007/s13198-021-01099-7

  • DOI: https://doi.org/10.1007/s13198-021-01099-7
