Abstract
Software fault prediction (SFP) techniques identify faults at the early stages of the software development life cycle (SDLC). Traditional machine learning techniques are used for SFP more often than deep learning methods, even though deep learning can produce more accurate results. Deep learning has delivered strong results in domains such as computer vision, natural language processing, and speech recognition. In this study, we use three deep learning methods, namely long short-term memory (LSTM), bidirectional LSTM (BILSTM), and radial basis function network (RBFN), to predict software faults, and we compare our results with existing models to show that ours are more accurate. We run our experiments on datasets based on Chidamber and Kemerer (CK) metrics and evaluate the models with k-fold cross-validation. We conclude that LSTM and BILSTM achieve better predictive performance, whereas RBFN produces its results faster. Our proposed models provide software developers with a more accurate and efficient SFP mechanism.
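The k-fold cross-validation mentioned above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names (`k_fold_indices`, `cross_validate`, `fit_majority`, `predict_majority`) are hypothetical, the default of 10 folds is an assumption, and the majority-class baseline only stands in for the LSTM, BILSTM, and RBFN models, which the abstract does not specify in detail.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample each.
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds


def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Return the accuracy on each held-out fold for a fit/predict pair."""
    scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        # Train on k-1 folds, then score on the single held-out fold.
        model = fit([features[i] for i in train], [labels[i] for i in train])
        preds = predict(model, [features[i] for i in held_out])
        correct = sum(p == labels[i] for p, i in zip(preds, held_out))
        scores.append(correct / len(held_out))
    return scores


# Hypothetical stand-in model (an assumption, not one of the paper's models):
# it always predicts the most frequent label seen during training.
def fit_majority(train_x, train_y):
    return max(set(train_y), key=train_y.count)


def predict_majority(model, test_x):
    return [model] * len(test_x)
```

Calling `cross_validate(rows, labels, fit_majority, predict_majority, k=10)` on a table of CK metric rows and faulty/non-faulty labels would return one accuracy per fold, and their mean is the usual summary figure. Any real model can be substituted by passing its own `fit` and `predict` callables.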
Data availability
All datasets used in this paper are publicly available.
Author information
Contributions
Iqra Batool and Tamim Ahmed Khan have contributed equally to the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Batool, I., Khan, T.A. Software fault prediction using deep learning techniques. Software Qual J 31, 1241–1280 (2023). https://doi.org/10.1007/s11219-023-09642-4
DOI: https://doi.org/10.1007/s11219-023-09642-4