Soft Computing, Volume 22, Issue 6, pp 1959–1980

Multi-objective cross-version defect prediction

  • Swapnil Shukla
  • T. Radhakrishnan
  • K. Muthukumaran
  • Lalita Bhanu Murthy Neti
Methodologies and Application


Defect prediction models help software project teams spot defect-prone source files in software systems. Teams can then prioritize these predicted defect-prone files for rigorous quality assurance (QA) activities, minimizing post-release defects so that quality software can be delivered. Cross-version defect prediction builds a prediction model from the previous version of a software project to predict defects in the current version. This is more practical than the other two ways of building models, i.e., cross-project prediction and cross-validation prediction, because the previous version of the same software project will have a similar parameter distribution among its files. In this paper, we formulate the cross-version defect prediction problem as a multi-objective optimization problem with two objective functions: (a) maximizing recall by minimizing misclassification cost and (b) maximizing recall by minimizing the cost of QA activities on defect-prone files. The two multi-objective defect prediction models are compared with four traditional machine learning algorithms, namely logistic regression, naïve Bayes, decision tree and random forest. We used 11 projects from the PROMISE repository, comprising a total of 41 versions. Our findings show that multi-objective logistic regression is more cost-effective than single-objective algorithms.
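
The trade-off in formulation (b) can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic file data, the use of lines of code as a stand-in for QA cost, the 0.5 decision threshold, and the random sampling of candidate weight vectors (in place of an evolutionary search) are all assumptions made for illustration. The sketch scores candidate logistic regression models on a previous version by recall and cost, keeps the non-dominated ones, and re-evaluates them on the current version.

```python
import random

def predict(weights, features):
    """Logistic regression decision: flag a file when sigmoid(z) >= 0.5, i.e. z >= 0."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return z >= 0.0

def objectives(weights, files):
    """Return (recall, cost) over files given as (features, loc, is_defective)."""
    flagged = [(loc, bad) for feats, loc, bad in files if predict(weights, feats)]
    total_defective = sum(1 for _, _, bad in files if bad)
    hits = sum(1 for _, bad in flagged if bad)
    recall = hits / total_defective if total_defective else 0.0
    cost = sum(loc for loc, _ in flagged)  # assumed proxy: LOC that QA must inspect
    return recall, cost

def dominates(a, b):
    """a dominates b if recall is no lower, cost no higher, and one is strictly better."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def make_version(rng, n_files):
    """Hypothetical synthetic release: defect probability grows with complexity."""
    files = []
    for _ in range(n_files):
        loc = rng.randint(20, 2000)
        complexity = rng.uniform(0.0, 1.0)
        bad = rng.random() < 0.1 + 0.5 * complexity
        files.append(((loc / 2000.0, complexity), loc, bad))
    return files

rng = random.Random(0)
previous_version = make_version(rng, 200)  # used to select models
current_version = make_version(rng, 200)   # used to evaluate them

# Random candidates stand in for the population of a multi-objective optimizer.
candidates = [[rng.uniform(-3.0, 3.0) for _ in range(3)] for _ in range(300)]
scored = [(objectives(w, previous_version), w) for w in candidates]
front_points = pareto_front([s for s, _ in scored])
front = [(s, w) for s, w in scored if s in front_points]

for (train_recall, train_cost), w in sorted(front):
    test_recall, test_cost = objectives(w, current_version)
    print(f"previous: recall={train_recall:.2f} cost={train_cost}  "
          f"current: recall={test_recall:.2f} cost={test_cost}")
```

Each printed row is one point on the recall/cost trade-off, so a team could pick the model whose inspection cost fits its QA budget rather than accept a single fixed classifier.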


Cross-version defect prediction; Multi-objective optimization; Search-based software engineering; Misclassification cost; Cost-effectiveness


Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Swapnil Shukla (1)
  • T. Radhakrishnan (1)
  • K. Muthukumaran (1)
  • Lalita Bhanu Murthy Neti (1)
  1. Department of Computer Science and Information Systems, BITS Pilani Hyderabad Campus, Shameerpet, Hyderabad, India
