
A Feature Selection-Based K-NN Model for Fast Software Defect Prediction

  • Conference paper
Computational Science and Its Applications – ICCSA 2022 Workshops (ICCSA 2022)

Abstract

Software Defect Prediction (SDP) is a technique for predicting software defects during the software development life cycle. Many SDP methods have been proposed, but their performance varies across datasets, which makes them unreliable when applied to an unseen software project. A hybrid technique that combines feature selection with machine learning can be more effective, because it draws on the strengths of several methods to achieve better prediction accuracy on a given dataset than an individual classifier. The major issues with individual ML-based models for SDP are long detection time, vulnerability of the software project, and high dimensionality of the feature parameters. Therefore, this study proposes a hybrid model that uses a feature selection-enabled Extreme Gradient Boost (XGB) classifier to address these challenges. The cleaned NASA MDP datasets were used to implement the proposed model, and performance metrics including F-score, accuracy, and the Matthews correlation coefficient (MCC) were used to evaluate it. Compared with state-of-the-art methods that do not use feature selection, the proposed model performs better on the metrics used, and the results show that it outperformed all the other prediction techniques considered.
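The three metrics named in the abstract can all be computed from a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions, not the authors' evaluation code; the function names and the ten-module label vectors are hypothetical and chosen only for illustration, with 1 marking a defective module.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels where 1 = defective."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of modules classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def f_score(tp, fp, tn, fn):
    """F1: harmonic mean of precision and recall (ignores TN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical data: ten modules, two truly defective, one of them missed.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
print("accuracy:", round(accuracy(*counts), 3))
print("F-score: ", round(f_score(*counts), 3))
print("MCC:     ", round(mcc(*counts), 3))
```

On this toy data, accuracy is 0.9 while F-score and MCC are both about 0.667. Defect datasets are typically imbalanced, so accuracy alone can look strong even when half of the defective modules are missed, which is one reason to report all three metrics together.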

A. E. Adeniyi and M. K. Abiodun—Landmark University SDG 4 (Quality Education)

M. K. Abiodun—Landmark University SDG 16 (Peace and Justice, Strong Institution)

A. E. Adeniyi—Landmark University SDG 11 (Sustainable Cities and Communities)



Author information

Corresponding author

Correspondence to Joseph Bamidele Awotunde.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Awotunde, J.B., Misra, S., Adeniyi, A.E., Abiodun, M.K., Kaushik, M., Lawrence, M.O. (2022). A Feature Selection-Based K-NN Model for Fast Software Defect Prediction. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Garau, C. (eds) Computational Science and Its Applications – ICCSA 2022 Workshops. ICCSA 2022. Lecture Notes in Computer Science, vol 13380. Springer, Cham. https://doi.org/10.1007/978-3-031-10542-5_4


  • DOI: https://doi.org/10.1007/978-3-031-10542-5_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10541-8

  • Online ISBN: 978-3-031-10542-5

  • eBook Packages: Computer Science, Computer Science (R0)
