Defect count prediction via metric-based convolutional neural network

  • Original Article
  • Published in Neural Computing and Applications

Abstract

With the increasing complexity and volume of software, the number of defects in software modules is also increasing, which affects software quality and timely, within-budget delivery. To improve software quality and allocate resources in time, defects should be detected in the initial phases of the software development life cycle. However, existing defect prediction methodologies based on high-dimensional and limited data focus only on predicting whether a module is defective. In contrast, the number of defects present in a software module has not been explored so far, especially using deep neural networks, and whether deep learning can enhance the performance of defect count prediction remains uninvestigated. To fill this gap, we propose an improved convolutional neural network model, called the metrics-based convolutional neural network (MB-CNN), which combines the advantages of appropriate metrics and an improved CNN by introducing dropout for regularization between the convolutional and dense layers. The proposed method predicts the defect count present in a software module in homogeneous scenarios, namely within-version and cross-version prediction. The experimental results show that, on average across fourteen real-world defect datasets, the proposed approach improves on Li's CNN architecture by 31% in within-version prediction and 28% in cross-version prediction. Moreover, the Friedman ranking test and the Wilcoxon nonparametric test confirm the usefulness of the proposed approach over ten benchmark learning algorithms for predicting defect count.
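As a rough illustration of the architecture the abstract describes (a 1D CNN over a module's metric vector, with dropout inserted between the convolutional and dense layers, regressing a non-negative defect count), the following Keras sketch can be considered. The 20-metric input and all layer sizes are invented for illustration; this is not the authors' exact configuration.

```python
# Hypothetical MB-CNN-style sketch: layer widths, kernel size, and the
# 20-metric input shape are illustrative assumptions, not the paper's setup.
import numpy as np
from tensorflow.keras import layers, models


def build_mb_cnn(n_metrics: int = 20) -> models.Model:
    model = models.Sequential([
        layers.Input(shape=(n_metrics, 1)),           # one module's metric vector, single channel
        layers.Conv1D(32, kernel_size=3, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),
        layers.Dropout(0.5),                          # dropout between convolutions and dense layer
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="relu"),           # ReLU keeps the predicted count non-negative
    ])
    # Defect count prediction is a regression task, so a squared-error loss fits.
    model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
    return model


model = build_mb_cnn()
preds = model.predict(np.zeros((2, 20, 1)), verbose=0)
print(preds.shape)
```

Dropout placed between the flattened convolutional features and the dense head is the regularization choice the abstract highlights; everything else here is standard Keras boilerplate.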


Notes

  1. https://keras.io/.

  2. https://scikit-learn.org/stable/index.html.
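The abstract's comparison against ten benchmark learners rests on the Friedman ranking test and the Wilcoxon nonparametric test. A minimal sketch of how such a comparison could be run with `scipy.stats`; the per-dataset error values below are invented purely for illustration:

```python
# Hypothetical illustration of the nonparametric model comparison:
# Friedman test across all models, then a pairwise Wilcoxon signed-rank
# test between two of them. The MAE values are made up.
from scipy.stats import friedmanchisquare, wilcoxon

# One error value per dataset (6 illustrative datasets) for each model.
mb_cnn = [0.31, 0.28, 0.35, 0.30, 0.27, 0.33]
cnn    = [0.45, 0.41, 0.48, 0.44, 0.40, 0.46]
rf     = [0.39, 0.37, 0.42, 0.38, 0.36, 0.41]

# Friedman: do the models' per-dataset ranks differ significantly?
stat, p = friedmanchisquare(mb_cnn, cnn, rf)
print(f"Friedman chi2={stat:.3f}, p={p:.4f}")

# Wilcoxon signed-rank: paired comparison of two models over datasets.
w_stat, w_p = wilcoxon(mb_cnn, cnn)
print(f"Wilcoxon (MB-CNN vs CNN): W={w_stat:.1f}, p={w_p:.4f}")
```

A small p-value in the Friedman test indicates the models' average ranks differ across datasets; the pairwise Wilcoxon test then checks individual model pairs.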

References

  1. Rathore SS, Kumar S (2017) Towards an ensemble based system for predicting the number of software faults. Expert Syst Appl 82:357–382. https://doi.org/10.1016/j.eswa.2017.04.014

  2. Rathore SS, Kumar S (2018) An approach for the prediction of number of software faults based on the dynamic selection of learning techniques. IEEE Trans Reliab 68:216–236. https://doi.org/10.1109/TR.2018.2864206

  3. Rathore SS, Kumar S (2017) An empirical study of some software fault prediction techniques for the number of faults prediction. Soft Comput 21:7417–7434. https://doi.org/10.1007/s00500-016-2284-x

  4. Malhotra R (2016) An empirical framework for defect prediction using machine learning techniques with Android software. Appl Soft Comput 49:1034–1050. https://doi.org/10.1016/j.asoc.2016.04.032

  5. Ryu D, Baik J (2016) Effective multi-objective naïve Bayes learning for cross-project defect prediction. Appl Soft Comput 49:1062–1077. https://doi.org/10.1016/j.asoc.2016.04.009

  6. Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324

  7. Chen T, Guestrin C (2016) XGBoost. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, NY, USA, pp 785–794

  8. Hosseini S, Turhan B, Gunarathna D (2019) A systematic literature review and meta-analysis on cross project defect prediction. IEEE Trans Softw Eng 45:111–147. https://doi.org/10.1109/TSE.2017.2770124

  9. Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86:2278–2324. https://doi.org/10.1109/5.726791

  10. Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18:1527–1554. https://doi.org/10.1162/neco.2006.18.7.1527

  11. Srivastava N, Hinton G, Krizhevsky A et al (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15:1929–1958

  12. Levi G, Hassner T (2015) Age and gender classification using convolutional neural networks. In: 2015 IEEE conference on computer vision and pattern recognition workshops (CVPRW). IEEE, pp 34–42

  13. Li J, He P, Zhu J, Lyu MR (2017) Software defect prediction via convolutional neural network. In: 2017 IEEE international conference on software quality, reliability and security (QRS). IEEE, pp 318–328

  14. Menzies T, Turhan B, Bener A, et al (2008) Implications of ceiling effects in defect predictors. In: Proceedings of the 4th international workshop on predictor models in software engineering—PROMISE’08. ACM Press, New York, USA, p 47

  15. Arar ÖF, Ayan K (2017) A feature dependent Naive Bayes approach and its application to the software defect prediction problem. Appl Soft Comput 59:197–209. https://doi.org/10.1016/j.asoc.2017.05.043

  16. Xia X, Lo D, Pan SJ et al (2016) HYDRA: massively compositional model for cross-project defect prediction. IEEE Trans Softw Eng 42:977–998. https://doi.org/10.1109/TSE.2016.2543218

  17. McCabe TJ (1976) A complexity measure. IEEE Trans Softw Eng 2:308–320. https://doi.org/10.1109/TSE.1976.233837

  18. Chidamber SR, Kemerer CF (1994) A metrics suite for object oriented design. IEEE Trans Softw Eng 20:476–493. https://doi.org/10.1109/32.295895

  19. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge, MA

  20. Cui Z, Du L, Wang P et al (2019) Malicious code detection based on CNNs and multi-objective algorithm. J Parallel Distrib Comput 129:50–58. https://doi.org/10.1016/j.jpdc.2019.03.010

  21. Abdel-Hamid O, Mohamed A, Jiang H, Penn G (2012) Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition. In: 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 4277–4280

  22. Zhang X, Zhao J, LeCun Y (2015) Character-level convolutional networks for text classification. In: Cortes C, Lawrence ND, Lee DD et al (eds) Advances in neural information processing systems, vol 28. Curran Associates, Inc., Red Hook, pp 649–657

  23. Nair V, Hinton G (2010) Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th international conference on machine learning, pp 807–814

  24. Nagi J, Ducatelle F, Di Caro GA et al (2011) Max-pooling convolutional neural networks for vision-based hand gesture recognition. In: 2011 IEEE international conference on signal and image processing applications (ICSIPA). IEEE, pp 342–347

  25. Hinton GE, Srivastava N, Krizhevsky A et al (2012) Improving neural networks by preventing co-adaptation of feature detectors. arXiv Prepr arXiv12070580

  26. Dahl GE, Sainath TN, Hinton GE (2013) Improving deep neural networks for LVCSR using rectified linear units and dropout. In: 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, pp 8609–8613

  27. Hinton G, Srivastava N, Swersky K (2012) Neural networks for machine learning, lecture 6a: overview of mini-batch gradient descent. Coursera lecture notes

  28. Riedmiller M, Braun H (1993) A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In: IEEE international conference on neural networks. IEEE, pp 586–591

  29. Turabieh H, Mafarja M, Li X (2019) Iterated feature selection algorithms with layered recurrent neural network for software fault prediction. Expert Syst Appl 122:27–42. https://doi.org/10.1016/j.eswa.2018.12.033

  30. Nam J, Kim S (2015) Heterogeneous defect prediction. In: Proceedings of the 2015 10th joint meeting on foundations of software engineering—ESEC/FSE 2015. ACM Press, New York, USA, pp 508–519

  31. Liu H, Setiono R (1995) Chi2: feature selection and discretization of numeric attributes. In: Proceedings of the 7th IEEE international conference on tools with artificial intelligence. IEEE Computer Society Press, pp 388–391

  32. Shukla AK, Singh P, Vardhan M (2018) A hybrid gene selection method for microarray recognition. Biocybern Biomed Eng 38:975–991. https://doi.org/10.1016/j.bbe.2018.08.004

  33. tera-PROMISE: welcome to one of the largest repositories of SE research data. http://openscience.us/repo/. Accessed 30 Nov 2017

  34. Sanner MF, Jolla L (1999) Python: a programming language for software integration and development. J Mol Graph Model 17:57–61

  35. Chollet F et al (2015) Keras: deep learning library for Theano and TensorFlow. https://keras.io

  36. scikit-learn: machine learning in Python—scikit-learn 0.19.1 documentation. http://scikit-learn.org/stable/. Accessed 19 Apr 2018

  37. Rathore SS, Kumar S (2017) Linear and non-linear heterogeneous ensemble methods to predict the number of faults in software systems. Knowl Based Syst 119:232–256. https://doi.org/10.1016/j.knosys.2016.12.017

  38. Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th international joint conference on artificial intelligence (IJCAI), pp 1137–1145

  39. Hull D (1993) Using statistical testing in the evaluation of retrieval experiments. In: Proceedings of the 16th annual international ACM SIGIR conference on research and development in information retrieval—SIGIR’93. ACM Press, New York, New York, USA, pp 329–338

  40. Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32:675–701

  41. Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 5:65–70

  42. Hochberg Y (1988) A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75:800–802

  43. Frey BB (2018) Holm’s sequential Bonferroni procedure. In: The SAGE encyclopedia of educational research, measurement, and evaluation. SAGE Publications, Inc., 2455 Teller Road, Thousand Oaks, California 91320, pp 1–8

  44. Welcome to imbalanced-learn documentation! Imbalanced-learn 0.3.0 documentation. http://contrib.scikit-learn.org/imbalanced-learn/stable/index.html. Accessed 5 Apr 2018

  45. Menzies T, Greenwald J, Frank A (2007) Data mining static code attributes to learn defect predictors. IEEE Trans Softw Eng 33:2–13. https://doi.org/10.1109/TSE.2007.256941

  46. Harrison R, Counsell SJ, Nithi RV (1998) An evaluation of the MOOD set of object-oriented software metrics. IEEE Trans Softw Eng 24:491–496. https://doi.org/10.1109/32.689404

  47. Halstead MH (1977) Elements of software science. Elsevier Science Inc., New York

  48. Moser R, Pedrycz W, Succi G (2008) A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In: Proceedings of the 13th international conference on Software engineering—ICSE’08. ACM Press, New York, New York, USA, p 181

  49. Nagappan N, Ball T (2007) Using software dependencies and churn metrics to predict field failures: an empirical case study. In: First international symposium on empirical software engineering and measurement (ESEM 2007). IEEE, pp 364–373

  50. Hassan AE (2009) Predicting faults using the complexity of code changes. In: 2009 IEEE 31st international conference on software engineering. IEEE, pp 78–88

  51. Wei H, Hu C, Chen S et al (2019) Establishing a software defect prediction model via effective dimension reduction. Inf Sci (Ny) 477:399–409. https://doi.org/10.1016/j.ins.2018.10.056

  52. Panichella A, Oliveto R, De Lucia A (2014) Cross-project defect prediction models: L’Union fait la force. In: 2014 software evolution week—IEEE conference on software maintenance, reengineering, and reverse engineering (CSMR-WCRE). IEEE, pp 164–173

  53. Zimmermann T, Nagappan N, Gall H et al (2009) Cross-project defect prediction: a large scale experiment on data vs. domain vs. process. In: Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering (ESEC/FSE'09). ACM Press, New York, USA, p 91

  54. Nam J, Pan SJ, Kim S (2013) Transfer defect learning. In: 2013 35th international conference on software engineering (ICSE). IEEE, pp 382–391

  55. Singh P, Verma S (2015) Cross project software fault prediction at design phase. Int J Comput Inf Eng 9:800–805

  56. Herbold S, Trautsch A, Grabowski J (2019) Correction of “A comparative study to benchmark cross-project defect prediction approaches.” IEEE Trans Softw Eng 45:632–636. https://doi.org/10.1109/TSE.2018.2790413

  57. Akiyama F (1971) An example of software system debugging. In: IFIP congress (1). pp 353–359

  58. Huda S, Liu K, Abdelrazek M et al (2018) An ensemble oversampling model for class imbalance problem in software defect prediction. IEEE Access 6:24184–24195. https://doi.org/10.1109/ACCESS.2018.2817572

  59. Wang T, Li W, Shi H, Liu Z (2011) Software defect prediction based on classifiers ensemble. J Inf Comput Sci 8:4241–4254

  60. Singh P, Pal NR, Verma S, Vyas OP (2017) Fuzzy rule-based approach for software fault prediction. IEEE Trans Syst Man Cybern Syst 47:826–837. https://doi.org/10.1109/TSMC.2016.2521840

  61. Nevendra M, Singh P (2018) Multistage preprocessing approach for software defect data prediction. In: Communications in computer and information science, pp 505–515

  62. Jing X-Y, Wu F, Dong X, Xu B (2017) An improved SDA based defect prediction framework for both within-project and cross-project class-imbalance problems. IEEE Trans Softw Eng 43:321–339. https://doi.org/10.1109/TSE.2016.2597849

  63. Mnih A, Hinton GE (2009) A scalable hierarchical distributed language model. Adv Neural Inf Process Syst 21:1081–1088

  64. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25:1097–1105

  65. Salakhutdinov R, Hinton G (2009) Semantic hashing. Int J Approx Reason 50:969–978. https://doi.org/10.1016/j.ijar.2008.11.006

  66. Mohamed A, Dahl GE, Hinton G (2012) Acoustic modeling using deep belief networks. IEEE Trans Audio Speech Lang Process 20:14–22. https://doi.org/10.1109/TASL.2011.2109382

  67. Zhao L, Shang Z, Zhao L et al (2019) Siamese dense neural network for software defect prediction with small data. IEEE Access 7:7663–7677. https://doi.org/10.1109/ACCESS.2018.2889061

  68. Zhao L, Shang Z, Zhao L et al (2019) Software defect prediction via cost-sensitive Siamese parallel fully-connected neural networks. Neurocomputing 352:64–74. https://doi.org/10.1016/j.neucom.2019.03.076

  69. Yang X, Lo D, Xia X et al (2015) Deep learning for just-in-time defect prediction. In: 2015 IEEE international conference on software quality, reliability and security. IEEE, pp 17–26

  70. Viet Phan A, Le Nguyen M, Thu Bui L (2017) Convolutional neural networks over control flow graphs for software defect prediction. In: 2017 IEEE 29th international conference on tools with artificial intelligence (ICTAI). IEEE, pp 45–52

Author information

Corresponding author

Correspondence to Pradeep Singh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nevendra, M., Singh, P. Defect count prediction via metric-based convolutional neural network. Neural Comput & Applic 33, 15319–15344 (2021). https://doi.org/10.1007/s00521-021-06158-5

