Optimized Defect Prediction Model Using Statistical Process Control and Correlation-Based Feature Selection Method

  • J. Nanditha
  • K. N. Sruthi
  • Sreeja Ashok
  • M. V. Judy
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 384)


Defects are flaws introduced during the software development process that cause the software to behave unexpectedly and produce erroneous output. Detecting these defects is essential to ensuring the quality of a software product. Defect prediction models act as quality indicators that help detect defective components in the early phases of the software development cycle. These models lead to reduced rework effort, more stable products, and improved customer satisfaction. It is hard to identify, from a large number of variables, the high-risk components that contribute most to defects; feature selection is therefore an important aspect of defect analysis. Here we propose a defect prediction model that controls the quality of software products using statistical process control. The key contributors used to build the prediction model are derived using correlation- and ANOVA-based feature selection methods. The proposed model is evaluated on a benchmark dataset, and the results are promising when compared with standard classification models.
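The abstract does not spell out how feature selection and statistical process control are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It ranks features by their Pearson correlation with the defect label, computes a one-way ANOVA F statistic as a second relevance score, and then derives 3-sigma control limits from non-defective modules to flag out-of-control ones. The metric names (`complexity`, `comment_ratio`), the toy values, and the 0.5 correlation threshold are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def anova_f(xs, labels):
    """One-way ANOVA F statistic of a feature across the label groups."""
    groups = {}
    for x, lab in zip(xs, labels):
        groups.setdefault(lab, []).append(x)
    grand = mean(xs)
    k, n = len(groups), len(xs)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_features(columns, labels, min_abs_r=0.5):
    """Keep features whose |correlation| with the label meets the threshold."""
    return [name for name, xs in columns.items()
            if abs(pearson(xs, labels)) >= min_abs_r]


def control_limits(baseline, k=3.0):
    """Lower limit, centre line, and upper limit at k standard deviations."""
    centre = mean(baseline)
    sigma = pstdev(baseline)
    return centre - k * sigma, centre, centre + k * sigma


# Hypothetical toy data: one row per module, label 1 = defective.
columns = {
    "complexity": [2, 3, 2, 4, 3, 12, 15, 11],
    "comment_ratio": [0.3, 0.1, 0.4, 0.2, 0.3, 0.2, 0.3, 0.25],
}
labels = [0, 0, 0, 0, 0, 1, 1, 1]

selected = select_features(columns, labels)
f_scores = {name: anova_f(xs, labels) for name, xs in columns.items()}

# Control limits are estimated from the non-defective modules only (an assumption).
baseline = [x for x, y in zip(columns["complexity"], labels) if y == 0]
lcl, centre, ucl = control_limits(baseline)
flagged = [i for i, x in enumerate(columns["complexity"]) if x < lcl or x > ucl]

print("selected features:", selected)
print("ANOVA F scores:", f_scores)
print("modules outside control limits:", flagged)
```

With only two label groups, the F statistic is a monotone function of the absolute correlation, so the two scores rank features identically here; they can differ when the label has more than two classes.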


Feature selection, ANOVA, Correlation, Control charts, Defect analysis, Prediction models





Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • J. Nanditha (1)
  • K. N. Sruthi (1)
  • Sreeja Ashok (1)
  • M. V. Judy (1)
  1. Department of Computer Science & I.T., Amrita School of Arts & Sciences, Amrita Vishwa Vidyapeetham, Kochi, India
