Skip to main content

The Stability of Threshold Values for Software Metrics in Software Defect Prediction

  • Conference paper
  • First Online:
Model and Data Engineering (MEDI 2017)

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 10563))

Included in the following conference series:

Abstract

Software metrics measure the complexity and quality in many empirical case studies. Recent studies have shown that threshold values can be detected for some metrics and used to predict defect-prone system modules. The goal of this paper is to empirically validate the stability of threshold values. Our aim is to analyze a wider set of software metrics than it has been previously reported and to perform the analysis in the context of different levels of data imbalance. We replicate the case study of deriving thresholds for software metrics using a statistical model based on logistic regression. Furthermore, we analyze threshold stability in the context of varying level of data imbalance. The methodology is validated using a great number of subsequent releases of open source projects. We revealed that threshold values of some metrics could be used to effectively predict defect-prone modules. Moreover, threshold values of some metrics may be influenced by the level of data imbalance. The results of this case study give a valuable insight into the importance of software metrics and the presented methodology may also be used by software quality assurance practitioners.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Chidamber, S.R., Kemerer, C.F.: A metrics suite for object oriented design. IEEE Trans. Softw. Eng. 20(6), 476–493 (1994)

    Article  Google Scholar 

  2. Briand, L.C., Daly, J.W., Wust, J.K.: A unified framework for coupling measurement in object-oriented systems. IEEE Trans. Softw. Eng. 25(1), 91–121 (1999)

    Article  Google Scholar 

  3. Basili, V.R., Briand, L.C., Melo, W.L.: A validation of object-oriented design metrics as quality indicators. IEEE Trans. Softw. Eng. 22(10), 751–761 (1996)

    Article  Google Scholar 

  4. Radjenović, D., Heričko, M., Torkar, R., Živković, A.: Software fault prediction metrics: a systematic literature review. Inf. Softw. Technol. 55(8), 1397–1418 (2013)

    Article  Google Scholar 

  5. Arisholm, E., Briand, L.C., Johannessen, E.B.: A systematic and comprehensive investigation of methods to build and evaluate fault prediction models. J. Syst. Softw. 83(1), 2–17 (2010)

    Article  Google Scholar 

  6. Shatnawi, R., Li, W.: The effectiveness of software metrics in identifying error-prone classes in post-release software evolution process. J. Syst. Softw. 81(11), 1868–1882 (2008)

    Article  Google Scholar 

  7. Shatnawi, R.: A quantitative investigation of the acceptable risk levels of object-oriented metrics in open-source systems. IEEE Trans. Softw. Eng. 36(2), 216–225 (2010)

    Article  Google Scholar 

  8. Galinac Grbac, T., Huljenić, D.: On the probability distribution of faults in complex software systems. Inf. Softw. Technol. 58, 250–258 (2015)

    Article  Google Scholar 

  9. Hall, T., Beecham, S., Bowes, D., Gray, D., Counsell, S.: A systematic literature review on fault prediction performance in software engineering. IEEE Trans. Softw. Eng. 38(6), 1276–1304 (2012)

    Article  Google Scholar 

  10. He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9), 1263–1284 (2009)

    Article  Google Scholar 

  11. Galinac Grbac, T., Runeson, P., Huljenić, D.: A second replicated quantitative analysis of fault distributions in complex software systems. IEEE Trans. Softw. Eng. 39(4), 462–476 (2013)

    Article  Google Scholar 

  12. Mauša, G., Galinac Grbac, T.: Co-evolutionary multi-population genetic programming for classification in software defect prediction: an empirical case study. Appl. Soft Comput. 55, 331–351 (2017)

    Article  Google Scholar 

  13. Graning, L., Jin, Y., Sendhoff, B.: Generalization improvement in multi-objective learning. In: The 2006 IEEE International Joint Conference on Neural Network Proceedings, pp. 4839–4846 (2006)

    Google Scholar 

  14. Eiben, A.E., Smith, J.E.: Introduction to Evolutionary Computing. Springer, Heidelberg (2003)

    Book  MATH  Google Scholar 

  15. Martin, W.N., Lienig, J., Cohoon, J.P.: Island (migration) models: evolutionary algorithms based on punctuated equilibria. Handb. Evol. Comput. 6, 1–15 (1997)

    Google Scholar 

  16. Arar, O.F., Ayan, K.: Deriving thresholds of software metrics to predict faults on open source software. Expert Syst. Appl. 61(1), 106–121 (2016)

    Article  Google Scholar 

  17. Shatnawi, R.: Deriving metrics thresholds using log transformation. J. Softw.: Evol. Process 27(2), 95–113 (2015). JSME-14-0025.R2

    Google Scholar 

  18. Mauša, G., Galinac Grbac, T., Dalbelo Bašić, B.: A systemathic data collection procedure for software defect prediction. Comput. Sci. Inf. Syst. 13(1), 173–197 (2016)

    Article  Google Scholar 

  19. Oliveira, P., Valente, M.T., Lima, F.P.: Extracting relative thresholds for source code metrics. In: Proceedings of CSMR-WCRE, pp. 254–263 (2014)

    Google Scholar 

  20. Weiss, G.M.: Mining with rarity: a unifying framework. SIGKDD Explor. Newsl. 6(1), 7–19 (2004)

    Article  Google Scholar 

  21. Andersson, C., Runeson, P.: A replicated quantitative analysis of fault distributions in complex software systems. IEEE Trans. Softw. Eng. 33(5), 273–286 (2007)

    Article  Google Scholar 

  22. Fenton, N.E., Ohlsson, N.: Quantitative analysis of faults and failures in a complex software system. IEEE Trans. Softw. Eng. 26(8), 797–814 (2000)

    Article  Google Scholar 

  23. Bhowan, U., Johnston, M., Zhang, M., Yao, X.: Evolving diverse ensembles using genetic programming for classification with unbalanced data. IEEE Trans. Evol. Comput. 17(3), 368–386 (2013)

    Article  Google Scholar 

  24. Mauša, G., Galinac Grbac, T., Dalbelo Bašić, B.: Software defect prediction with bug-code analyzer - a data collection tool demo. In: Proceedings of SoftCOM 2014 (2014)

    Google Scholar 

  25. Mauša, G., Perković, P., Galinac Grbac, T., Štajduhar, I.: Techniques for bug-code linking. In: Proceedings of SQAMIA 2014, pp. 47–55 (2014)

    Google Scholar 

  26. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd edn. Springer, Heidelberg (2009)

    Book  MATH  Google Scholar 

  27. Zimmermann, T., Nagappan, N.: Predicting defects using network analysis on dependency graphs. In: Proceedings of the 30th International Conference on Software Engineering. ICSE 2008, pp. 531–540. ACM, New York (2008)

    Google Scholar 

  28. Bender, R.: Quantitative risk assessment in epidemiological studies investigating threshold effects. Biometrical J. 41(3), 305–319 (1999)

    Article  MATH  Google Scholar 

  29. Bhowan, U., Johnston, M., Zhang, M., Yao, X.: Reusing genetic programming for ensemble selection in classification of unbalanced data. IEEE Trans. Evol. Comput. 18, 893–908 (2013)

    Article  Google Scholar 

Download references

Acknowledgments

This work is supported in part by Croatian Science Foundation’s funding of the project UIP-2014-09-7945 and by the University of Rijeka Research Grant 13.09.2.2.16.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Goran Mauša .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Mauša, G., Grbac, T.G. (2017). The Stability of Threshold Values for Software Metrics in Software Defect Prediction. In: Ouhammou, Y., Ivanovic, M., Abelló, A., Bellatreche, L. (eds) Model and Data Engineering. MEDI 2017. Lecture Notes in Computer Science(), vol 10563. Springer, Cham. https://doi.org/10.1007/978-3-319-66854-3_7

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-66854-3_7

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66853-6

  • Online ISBN: 978-3-319-66854-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics