Techniques Used for the Prediction of Number of Faults

  • Santosh Singh Rathore
  • Sandeep Kumar
Part of the SpringerBriefs in Computer Science book series (BRIEFSCOMPUTER)


Prediction of the number of faults is the process of estimating how many faults are likely to occur in each software module [41]. A software module can be a class in object-oriented software, a file in traditional software, or any other independent component that bundles related code together.


  1. Abdi, H.: Partial least square regression (PLS regression). Encycl. Res. Methods Soc. Sci. 6(4), 792–795 (2003)
  2. Afzal, W., Torkar, R., Feldt, R.: Prediction of fault count data using genetic programming. In: Proceedings of IEEE International Multitopic Conference, INMIC, pp. 349–356 (2008)
  3. Aljamaan, H., Elish, M.O., et al.: An empirical study of bagging and boosting ensembles for identifying faulty classes in object-oriented software. In: CIDM 2009, IEEE Symposium on Computational Intelligence and Data Mining, pp. 187–194 (2009)
  4. Arar, O.F., Ayan, K.: Software defect prediction using cost-sensitive neural network. Appl. Soft Comput. 33, 263–277 (2015)
  5. Bal, P., Kumar, S.: Extreme learning machine based linear homogeneous ensemble for software fault prediction. In: Proceedings of 13th International Conference on Software Technologies (ICSOFT 2018), pp. 69–78 (2018a)
  6. Bal, P., Kumar, S.: Cross project software defect prediction using extreme learning machine: an ensemble based study. In: Proceedings of 13th International Conference on Software Technologies (ICSOFT 2018), pp. 320–327 (2018b)
  7. Basak, D., Pal, S., Patranabis, D.C.: Support vector regression. Neural Inf. Process. Lett. Rev. 11(10), 203–224 (2007)
  8. Bell, R.M., Ostrand, T.J., Weyuker, E.J.: Looking for bugs in all the right places. In: Proceedings of the 2006 International Symposium on Software Testing and Analysis, ACM, pp. 61–72 (2006)
  9. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)
  10. Conte, S.D., Dunsmore, H.E., Shen, V.Y.: Software Engineering Metrics and Models. Benjamin-Cummings Publishing Co., Inc (1986)
  11. Dietterich, T.G.: Ensemble methods in machine learning. In: International Workshop on Multiple Classifier Systems. Springer, Berlin, Heidelberg, pp. 1–15 (2000)
  12. Elish, M.O., Aljamaan, H., Ahmad, I.: Three empirical studies on predicting software maintainability using ensemble methods. Soft Comput. 19(9), 1–14 (2015)
  13. Fagundes, R.A., Souza, R.M., Cysneiros, F.J.: Zero-inflated prediction model in software-fault data. IET Softw. 10(1), 1–9 (2016)
  14. Freund, Y.: Boosting a weak learning algorithm by majority. In: Proceedings of COLT, vol. 90, pp. 202–216 (1990)
  15. Gao, K., Khoshgoftaar, T.M.: A comprehensive empirical study of count models for software fault prediction. IEEE Trans. Reliab. 56(2), 223–236 (2007)
  16. Gardner, W., Mulvey, E.P., Shaw, E.C.: Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models. Psychol. Bull. 118(3), 392 (1995)
  17. Girard, D.A.: Asymptotic optimality of the fast randomized versions of GCV and CL in ridge regression and regularization. Ann. Stat. 19(4), 1950–1963 (1991)
  18. Graves, T.L., Karr, A.F., Marron, J.S., Siy, H.: Predicting fault incidence using software change history. IEEE Trans. Softw. Eng. 26(7), 653–661 (2000)
  19. Gyimothy, T., Ferenc, R., Siket, I.: Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Trans. Softw. Eng. 31(10), 897–910 (2005)
  20. Hall, T., Beecham, S., Bowes, D., Gray, D., Counsell, S.: A systematic literature review on fault prediction performance in software engineering. IEEE Trans. Softw. Eng. 38(6), 1276–1304 (2012)
  21. Hedeker, D., Gibbons, R.D.: A random-effects ordinal regression model for multilevel analysis. Biometrics, 933–944 (1994)
  22. Hilbe, J.M.: Negative Binomial Regression, 2nd edn. Jet Propulsion Laboratory, California Institute of Technology and Arizona State University (2012)
  23. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20(8), 832–844 (1998)
  24. Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12(1), 55–67 (1970)
  25. Hosmer Jr, D.W., Lemeshow, S., Sturdivant, R.X.: Applied Logistic Regression, vol. 398. Wiley (2013)
  26. Janes, A., Scotto, M., Pedrycz, W., Russo, B., Stefanovic, M., Succi, G.: Identification of defect-prone classes in telecommunication software systems using design metrics. Inf. Sci. 176(24), 3711–3734 (2006)
  27. Jiang, Y., Cukic, B., Ma, Y.: Techniques for evaluating fault prediction models. Empir. Softw. Eng. 13(5), 561–595 (2008)
  28. Jolliffe, I.T.: A note on the use of principal components in regression. Appl. Stat. 31(3), 300–303 (1982)
  29. Kleinbaum, D.G., Klein, M.: Logistic Regression: A Self-learning Text. Springer Science & Business Media (2010)
  30. Kutner, M.H., Nachtsheim, C., Neter, J.: Applied Linear Regression Models. McGraw-Hill/Irwin (2004)
  31. Lambert, D.: Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34(1), 1–14 (1992)
  32. LeBlanc, M., Tibshirani, R.: Combining estimates in regression and classification. J. Am. Stat. Assoc. 91(436), 1641–1650 (1996)
  33. Li, W., Feng, J., Jiang, T.: IsoLasso: a LASSO regression approach to RNA-Seq based transcriptome assembly. In: International Conference on Research in Computational Molecular Biology. Springer, Berlin, Heidelberg, pp. 168–188 (2011)
  34. Liu, R.X., Kuang, J., Gong, Q., Hou, X.L.: Principal component regression analysis with SPSS. Comput. Methods Programs Biomed. 71(2), 141–147 (2003)
  35. Mauša, G., Bogunović, N., Grbac, T.G., Bašić, B.D.: Rotation forest in software defect prediction. In: Proceedings of 4th Workshop on Software Quality Analysis, Monitoring, Improvement, and Applications SQAMIA, p. 35 (2015)
  36. Mendes-Moreira, J., Soares, C., Jorge, A.M., Sousa, J.F.D.: Ensemble approaches for regression: A survey. ACM Comput. Surv. (CSUR) 45(1), 10 (2012)
  37. Merz, C.J.: Classification and regression by combining models. PhD thesis, University of California Irvine (1998)
  38. Mısırlı, A.T., Bener, A.B., Turhan, B.: An industrial case study of classifier ensembles for locating software defects. Softw. Qual. J. 19(3), 515–536 (2011)
  39. Mousavi, R., Eftekhari, M.: A new ensemble learning methodology based on hybridization of classifier ensemble selection approaches. Appl. Soft Comput. 37, 652–666 (2015)
  40. Ostrand, T.J., Weyuker, E.J., Bell, R.M.: Where the bugs are. ACM SIGSOFT Softw. Eng. Notes 29(4), 86–96 (2004)
  41. Ostrand, T.J., Weyuker, E.J., Bell, R.M.: Predicting the location and number of faults in large software systems. IEEE Trans. Softw. Eng. 31(4), 340–355 (2005)
  42. Ostrand, T.J., Weyuker, E.J., Bell, R.M.: Looking for bugs in all the right places. In: Proceedings of the International Symposium on Software Testing and Analysis, pp. 61–72 (2006)
  43. Pai, G.J., Dugan, J.B.: Empirical analysis of software fault content and fault proneness using Bayesian methods. IEEE Trans. Softw. Eng. 33(10), 675–686 (2007)
  44. Perrone, M.P., Cooper, L.N.: When networks disagree: ensemble methods for hybrid neural networks (No. TR-61). Brown Univ Providence RI Inst for Brain and Neural Systems (1992)
  45. Rathore, S.S., Kumar, S.: Predicting number of faults in software system using genetic programming. Procedia Comput. Sci. 62, 303–311 (2015)
  46. Rathore, S.S., Kumar, S.: An empirical study of some software fault prediction techniques for the number of faults prediction. Soft Comput. 21(24), 7417–7434 (2017a)
  47. Rathore, S.S., Kumar, S.: Linear and non-linear heterogeneous ensemble methods to predict the number of faults in software systems. Knowl. Based Syst. 119, 232–256 (2017b)
  48. Rodriguez, J.J., Kuncheva, L.I., Alonso, C.J.: Rotation forest: a new classifier ensemble method. IEEE Trans. Pattern Anal. Mach. Intell. 28(10), 1619–1630 (2006)
  49. Siers, M.J., Islam, M.Z.: Software defect prediction using a cost sensitive decision forest and voting, and a potential solution to the class imbalance problem. Inf. Syst. 51, 62–71 (2015)
  50. Khoshgoftaar, T.M., Geleyn, E., Nguyen, L.: Empirical case studies of combining software quality classification models. In: Proceedings of 3rd International Conference on Quality Software, pp. 40–49 (2003)
  51. Khoshgoftaar, T.M., Gao, K.: Count models for software quality estimation. IEEE Trans. Reliab. 56(2), 212–222 (2007)
  52. Theil, H.: A rank-invariant method of linear and polynomial regression analysis. Henri Theil’s Contributions to Economics and Econometrics, pp. 345–381. Springer, Dordrecht (1992)
  53. Twala, B.: Predicting software faults in large space systems using machine learning techniques. Def. Sci. J. 61(4), 306–316 (2011)
  54. Ver Hoef, J.M., Boveng, P.L.: Quasi-Poisson vs. negative binomial regression: how should we model overdispersed count data? Ecology 88(11), 2766–2772 (2007)
  55. Veryard, R.: The Economics of Information Systems and Software. Butterworth-Heinemann (2014)
  56. Wang, T., Li, W., Shi, H., Liu, Z.: Software defect prediction based on classifiers ensemble. J. Inf. Comput. Sci. 8(16), 4241–4254 (2011)
  57. Wolpert, D.H.: Stacked generalization. Neural Netw. 5(2), 241–259 (1992)
  58. Ye, X., Bunescu, R., Liu, C.: Mapping bug reports to relevant files: a ranking model, a fine-grained benchmark, and feature evaluation. IEEE Trans. Softw. Eng. 42(4), 379–402 (2016)
  59. Yu, L.: Using negative binomial regression analysis to predict software faults: a study of apache ant. Int. J. Inf. Technol. Comput. Sci. 4(8), 63–70 (2012)
  60. Zheng, J.: Cost-sensitive boosting neural networks for software defect prediction. Expert. Syst. Appl. 37(6), 4537–4543 (2010)

Copyright information

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, ABV-Indian Institute of Information Technology and Management Gwalior, Gwalior, India
  2. Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, India
