Deep neural network based hybrid approach for software defect prediction using software metrics

Cluster Computing

Abstract

Various techniques, including data mining and machine learning, have been developed for the early prediction of software defects. Early prediction nevertheless remains a challenging task, and it can be improved by achieving a higher defect classification rate. To address this issue, we introduce a hybrid approach that combines a genetic algorithm (GA) for feature optimization with a deep neural network (DNN) for classification. The approach incorporates an improved GA that uses a new technique for chromosome design and fitness function computation. The DNN is also enhanced with an adaptive auto-encoder, which provides a better representation of the selected software features. Case studies demonstrate the improved efficiency that the optimization technique brings to the hybrid approach. An experimental study of software defect prediction is carried out on the PROMISE dataset using MATLAB, applying the proposed method for classification and defect prediction. A comparative study shows that the proposed approach outperforms other techniques, achieving 97.82% accuracy on the KC1 dataset, 97.59% on CM1, 97.96% on PC3 and 98.00% on PC4.
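The general idea of GA-driven feature selection feeding a classifier can be illustrated with a minimal Python sketch. It is not the authors' improved GA or their MATLAB implementation. The binary chromosome (one bit per metric), the fitness function (validation accuracy minus a small penalty on the number of selected features), the truncation selection with elitism, and every function and parameter name are assumptions made for illustration. A simple nearest-centroid classifier stands in for the DNN with adaptive auto-encoder, and the data are synthetic rather than drawn from PROMISE.

```python
import random

def make_data(rng, n=200, n_features=8, informative=(0, 3)):
    """Synthetic stand-in for a defect dataset; only `informative` columns carry signal."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = "defective" module (hypothetical labels)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.5 * label  # shift the informative metrics for defective modules
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_val, y_val, mask):
    """Validation accuracy of a nearest-centroid classifier on the features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_val, y_val):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        if min(dist, key=dist.get) == t:
            correct += 1
    return correct / len(y_val)

def select_features(X_train, y_train, X_val, y_val, rng, pop_size=20,
                    generations=15, mutation_rate=0.1, penalty=0.05):
    """Evolve binary feature masks; returns the best mask and its fitness."""
    n = len(X_train[0])

    def fitness(mask):
        acc = centroid_accuracy(X_train, y_train, X_val, y_val, mask)
        return acc - penalty * sum(mask) / n  # prefer smaller feature subsets

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)        # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]         # bit-flip mutation
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, fitness(best)

rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_data(rng)
X_train, y_train, X_val, y_val = X[:140], y[:140], X[140:], y[140:]
best_mask, best_fitness = select_features(X_train, y_train, X_val, y_val, rng)
print("selected features:", [j for j, bit in enumerate(best_mask) if bit])
print("fitness:", round(best_fitness, 3))
```

In a fuller pipeline, the selected columns would be passed to the downstream classifier, and accuracy would be reported on a held-out test split rather than on the validation split used for fitness.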



Author information

Corresponding author

Correspondence to C. Manjula.

About this article

Cite this article

Manjula, C., Florence, L. Deep neural network based hybrid approach for software defect prediction using software metrics. Cluster Comput 22 (Suppl 4), 9847–9863 (2019). https://doi.org/10.1007/s10586-018-1696-z

  • DOI: https://doi.org/10.1007/s10586-018-1696-z
