Software defect prediction ensemble learning algorithm based on 2-step sparrow optimizing extreme learning machine

Cluster Computing

Abstract

Software defect prediction is a crucial activity within the software development life cycle. Accurately identifying defective modules saves developers time and cost. The extreme learning machine (ELM) trains rapidly and learns robustly, and numerous researchers have applied it to software defect prediction. However, the ELM, a single-hidden-layer feedforward neural network, suffers from random parameter selection and limited generalization ability. To enhance its predictive performance in software defect prediction, most researchers use swarm intelligence optimization algorithms to optimize the ELM, but these methods can become trapped in local optima. This paper introduces a new sparrow search algorithm (2SSSA) built upon the original sparrow search algorithm. To improve the original algorithm’s ability to escape local extrema, 2SSSA employs pinhole imaging reverse learning and somersault foraging strategies. Its optimization performance and convergence speed are assessed on 8 randomly selected benchmark functions and 8 CEC2017 functions. Ensemble learning, a prominent research focus in software defect prediction, is known to significantly enhance prediction performance and model generalization. Accordingly, the ELM optimized by 2SSSA serves as the base predictor in a bagging ensemble. We propose the resulting ensemble algorithm for software defect prediction, denoted 2SSEBA, which employs the 2-step optimization sparrow algorithm (2SSSA) to optimize extreme learning machines. In an evaluation on 25 publicly available software defect prediction datasets using 5 commonly employed metrics, 2SSEBA significantly outperforms five other advanced prediction algorithms. This conclusion is supported by both Friedman ranking and Holm’s post-hoc test.
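To make the two building blocks named in the abstract concrete, the following is a minimal sketch of an ELM used as the base learner of a bagging ensemble. It is not the authors' 2SSEBA implementation: the 2SSSA optimization step is omitted, so the input weights and biases are simply drawn at random, and the names `ELM`, `bagging_elm`, `bagging_predict`, the sigmoid activation, and all default values are assumptions chosen for illustration.

```python
import numpy as np


class ELM:
    """Minimal single-hidden-layer ELM for binary labels in {0, 1} (illustrative)."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the randomly projected inputs.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Input weights and biases are random and never trained; in the paper
        # these are the parameters an optimizer such as 2SSSA would tune.
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = self._hidden(X)
        # Output weights are the least-squares solution via the pseudoinverse.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        # Threshold the regression output at 0.5 to obtain class labels.
        return (self._hidden(X) @ self.beta >= 0.5).astype(int)


def bagging_elm(X, y, n_estimators=11, n_hidden=20, seed=0):
    """Train one ELM per bootstrap resample of the training set."""
    rng = np.random.default_rng(seed)
    models = []
    for i in range(n_estimators):
        idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
        models.append(ELM(n_hidden, seed=seed + i + 1).fit(X[idx], y[idx]))
    return models


def bagging_predict(models, X):
    """Combine the base learners by majority vote."""
    votes = np.stack([m.predict(X) for m in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```

An odd number of base learners is used so that the majority vote cannot tie. Replacing the random draw of `W` and `b` with values found by a search procedure is the point at which an optimizer like the one described above would plug in.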

Data availability

Data will be made available on reasonable request. The datasets are available at http://promise.site.uottawa.ca/SERepository/datasets-page.html.

Acknowledgements

This work is supported by the National Key Research and Development Program of China (2022YFB3105105). We would like to thank the editor and anonymous reviewers for their valuable comments and suggestions to improve the paper.

Funding

This work was funded by the National Key Research and Development Program of China (Grant No. 2022YFB3105105).

Author information

Contributions

Yu Tang: Writing-original draft, Editing and visualization. Qi Dai: Methodology and review. Mengyuan Yang: Data curation and visualization. Lifang Chen: Review and conceptualization. Ye Du: Resources, Supervision and review.

Corresponding author

Correspondence to Ye Du.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tang, Y., Dai, Q., Yang, M. et al. Software defect prediction ensemble learning algorithm based on 2-step sparrow optimizing extreme learning machine. Cluster Comput (2024). https://doi.org/10.1007/s10586-024-04446-y

  • DOI: https://doi.org/10.1007/s10586-024-04446-y
