Abstract
Software organizations want to predict faults in their systems before deployment, because doing so improves delivered quality and reduces maintenance effort. Many software metrics and statistical models have been developed for this purpose, and the resulting approach is known as defect prediction: the process of identifying defects in a software program prior to its deployment. In recent times, a class of learners called evolutionary computation (EC) techniques has emerged. These techniques apply the Darwinian principle of ‘survival of the fittest’. This study empirically assesses and compares the performance of 16 EC techniques in evaluating the relationship between object-oriented metrics and defect prediction. The developed models are validated using 7 data sets obtained from open source software systems developed by the Software Foundation. An investigation of their predictive capabilities and comparative performance found that a majority of the EC techniques were highly effective, and DTG (a hybridized algorithm) was the best performing technique. These results indicate that EC techniques are effective and can benefit testers working on defect prediction.
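To make the ‘survival of the fittest’ idea concrete, the following is a minimal sketch of a generic genetic algorithm that evolves a linear defect-prediction rule over two object-oriented metrics. It is not one of the 16 techniques assessed in the study and it is not DTG. The data are synthetic, and every name and parameter here (`make_data`, `evolve`, the WMC and CBO weights, population size, mutation scale) is an assumption made for illustration only.

```python
import random

def make_data(rng, n=200):
    """Synthetic (wmc, cbo, defective) rows; invented for illustration, not from the study."""
    rows = []
    for _ in range(n):
        wmc = rng.uniform(0, 50)
        cbo = rng.uniform(0, 20)
        # Hypothetical ground truth: complex, highly coupled classes tend to be defective.
        defective = 1 if 0.6 * wmc + 1.5 * cbo + rng.gauss(0, 5) > 30 else 0
        rows.append((wmc, cbo, defective))
    return rows

def predict(ind, wmc, cbo):
    """An individual is (w1, w2, threshold); predict 1 (defective) above the threshold."""
    w1, w2, threshold = ind
    return 1 if w1 * wmc + w2 * cbo > threshold else 0

def fitness(ind, rows):
    """Fitness is plain classification accuracy on the given rows."""
    correct = sum(predict(ind, wmc, cbo) == label for wmc, cbo, label in rows)
    return correct / len(rows)

def evolve(rows, rng, pop_size=30, generations=40, mut_sigma=0.3):
    """Evolve a rule; returns the best individual and the best fitness per generation."""
    pop = [(rng.uniform(0, 2), rng.uniform(0, 2), rng.uniform(0, 60))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=lambda ind: fitness(ind, rows), reverse=True)
        history.append(fitness(ranked[0], rows))
        # Selection: only the fitter half survives unchanged (this also gives elitism).
        survivors = ranked[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = tuple(rng.choice(pair) for pair in zip(a, b))
            # Mutation: small Gaussian perturbation of every gene.
            child = tuple(gene + rng.gauss(0, mut_sigma) for gene in child)
            children.append(child)
        pop = survivors + children
    best = max(pop, key=lambda ind: fitness(ind, rows))
    history.append(fitness(best, rows))
    return best, history

if __name__ == "__main__":
    rng = random.Random(0)
    data = make_data(rng)
    best, history = evolve(data, rng)
    print("best rule:", best)
    print("accuracy, first vs. last generation:", history[0], history[-1])
```

Because the fitter half of each generation is carried over unchanged, the best accuracy recorded in `history` can never decrease from one generation to the next. A real study would also evaluate the evolved rule on held-out data rather than on the rows used for fitness.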
Change history
08 September 2018
Unfortunately, the Acknowledgment section was missing from the original article. It is given below.
Cite this article
Khari, M., Kumar, P. Evolutionary Computation-Based Techniques Over Multiple Data Sets: An Empirical Assessment. Arab J Sci Eng 43, 3875–3885 (2018). https://doi.org/10.1007/s13369-017-2653-5
DOI: https://doi.org/10.1007/s13369-017-2653-5