Cluster Computing, Volume 20, Issue 3, pp 2267–2281

A parallel framework for software defect detection and metric selection on cloud computing

  • Md Mohsin Ali
  • Shamsul Huda (email author)
  • Jemal Abawajy
  • Sultan Alyahya
  • Hmood Al-Dossari
  • John Yearwood


With the continued growth of the Internet of Things (IoT) and its convergence with the cloud, numerous interoperable software systems are being developed for the cloud. There is therefore a growing demand to maintain better software quality in the cloud for improved service. This is all the more crucial as the cloud environment is moving quickly towards a hybrid model, a combination of public and private cloud models. Given the high volume of software available as a service (SaaS) in the cloud, identifying non-standard software and measuring its quality is an urgent issue. Manually testing software and determining its quality is very expensive and, to some extent, impossible. An automated software defect detection model that is capable of measuring the relative quality of software and identifying its faulty components can both significantly reduce software development effort and improve the cloud service. In this paper, we propose a software defect detection model that can be used to identify faulty components in big software metric data. The novelty of our proposed approach is that it can identify significant metrics using a combination of different filter and wrapper techniques. One of the important contributions of the proposed approach is that we designed and evaluated a parallel framework of a hybrid software defect predictor in order to deal with big software metric data in a computationally efficient way for cloud environments. Two different hybrids have been developed using Fisher and Maximum Relevance (MR) filters with an Artificial Neural Network (ANN) based wrapper in the parallel framework. The evaluations are performed with real defect-prone software datasets for all parallel versions. Experimental results show that the proposed parallel hybrid framework achieves a significant computational speedup on a computer cluster, with higher defect prediction accuracy and a smaller number of software metrics, compared to the independent filter or wrapper approaches.
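As a rough illustration of the filter stage mentioned in the abstract, the sketch below ranks software metrics by a Fisher score. It is not the authors' implementation. It assumes one common two-class form of the score, (difference of class means)² / (sum of class variances), and the names `fisher_scores` and `rank_metrics`, as well as the toy data, are hypothetical. Each metric is scored independently of the others, which is why a filter of this kind lends itself to being split across workers.

```python
from statistics import mean, pvariance


def fisher_scores(rows, labels):
    """Return one Fisher score per metric (column) for a two-class problem.

    rows   : list of modules, each a list of metric values
    labels : 0 for a clean module, 1 for a faulty module
    Assumed form: (mean_faulty - mean_clean)^2 / (var_faulty + var_clean).
    """
    scores = []
    for j in range(len(rows[0])):
        clean = [r[j] for r, y in zip(rows, labels) if y == 0]
        faulty = [r[j] for r, y in zip(rows, labels) if y == 1]
        between = (mean(faulty) - mean(clean)) ** 2
        within = pvariance(faulty) + pvariance(clean)
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class spread: perfectly separating or uninformative.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def rank_metrics(rows, labels):
    """Return metric indices ordered from most to least discriminative."""
    scores = fisher_scores(rows, labels)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Toy data (hypothetical): metric 0 separates the classes, metric 1 does not.
rows = [[1.0, 5.0], [2.0, 6.0], [9.0, 5.0], [10.0, 6.0]]
labels = [0, 0, 1, 1]
ranking = rank_metrics(rows, labels)
```

In a hybrid scheme of the kind the abstract describes, a ranking like this would only pre-order or pre-prune the metrics; a wrapper (an ANN in the paper) would then judge candidate subsets by prediction accuracy.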



The authors would like to extend their sincere appreciation to the Deanship of Scientific Research at King Saud University for its participation in funding this research group (RGP-1436-039).



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Md Mohsin Ali (1)
  • Shamsul Huda (2, email author)
  • Jemal Abawajy (2)
  • Sultan Alyahya (3)
  • Hmood Al-Dossari (3)
  • John Yearwood (2)

  1. The Australian National University, Canberra, Australia
  2. Deakin University, Melbourne, Australia
  3. King Saud University, Riyadh, Saudi Arabia
