The Stability of Threshold Values for Software Metrics in Software Defect Prediction

Mauša, Goran; Grbac, Tihana Galinac

doi:10.1007/978-3-319-66854-3_7

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10563)

Included in the following conference series:

International Conference on Model and Data Engineering

Abstract

Software metrics are used to measure complexity and quality in many empirical case studies. Recent studies have shown that threshold values can be detected for some metrics and used to predict defect-prone system modules. The goal of this paper is to empirically validate the stability of these threshold values. We aim to analyze a wider set of software metrics than has previously been reported and to perform the analysis under different levels of data imbalance. We replicate the case study of deriving thresholds for software metrics using a statistical model based on logistic regression, and we then analyze threshold stability as the level of data imbalance varies. The methodology is validated on a large number of subsequent releases of open source projects. We found that threshold values of some metrics can be used to effectively predict defect-prone modules, and that threshold values of some metrics may be influenced by the level of data imbalance. The results of this case study give valuable insight into the importance of software metrics, and the presented methodology may also be used by software quality assurance practitioners.
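
The abstract does not give the threshold formula itself, so the following is only a minimal sketch of one common way to turn a univariate logistic regression into a metric threshold. It assumes the "value of an acceptable risk level" form, threshold = (ln(p0 / (1 - p0)) - b0) / b1, where b0 and b1 are the fitted intercept and slope and p0 is a chosen risk level. The function names, the toy data, and the choice of p0 as the observed defect rate are illustrative assumptions, not details taken from the paper.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_univariate_logistic(x, y, iterations=50, tol=1e-12):
    """Fit P(defect | x) = sigmoid(b0 + b1 * x) by Newton-Raphson.

    x: metric values per module, y: 1 if the module is defective, else 0.
    Assumes the two classes overlap (data is not perfectly separable).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # gradient w.r.t. intercept
            g1 += (yi - p) * xi       # gradient w.r.t. slope
            h00 += w                  # entries of the (negated) Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: solve the 2x2 system H * d = g
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def risk_threshold(b0, b1, p0):
    """Metric value at which the fitted defect probability equals p0."""
    return (math.log(p0 / (1.0 - p0)) - b0) / b1

# Hypothetical toy data: one metric value per module and its defect label.
metric_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
defective = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1]

b0, b1 = fit_univariate_logistic(metric_values, defective)

# Assumption: use the observed defect rate as the risk level p0. This ties
# the threshold to the level of data imbalance in the dataset.
p0 = sum(defective) / len(defective)
threshold = risk_threshold(b0, b1, p0)
print(f"b0={b0:.3f}, b1={b1:.3f}, p0={p0:.3f}, threshold={threshold:.2f}")
```

Modules whose metric value exceeds `threshold` would be flagged as defect-prone under this sketch. Because p0 enters the formula directly, recomputing the threshold on datasets with different defect rates is one simple way to see how imbalance can shift it.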

Acknowledgments

This work is supported in part by the Croatian Science Foundation's funding of the project UIP-2014-09-7945 and by the University of Rijeka Research Grant 13.09.2.2.16.

Author information

Authors and Affiliations

Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000, Rijeka, Croatia
Goran Mauša & Tihana Galinac Grbac

Corresponding author

Correspondence to Goran Mauša.

Editor information

Editors and Affiliations

ISAE-ENSMA, Chasseneuil, France
Yassine Ouhammou
University of Novi Sad, Novi Sad, Serbia
Mirjana Ivanovic
UPC-Barcelona Tech, Barcelona, Spain
Alberto Abelló
ISAE-ENSMA, Chasseneuil, France
Ladjel Bellatreche

About this paper

Cite this paper

Mauša, G., Grbac, T.G. (2017). The Stability of Threshold Values for Software Metrics in Software Defect Prediction. In: Ouhammou, Y., Ivanovic, M., Abelló, A., Bellatreche, L. (eds) Model and Data Engineering. MEDI 2017. Lecture Notes in Computer Science, vol 10563. Springer, Cham. https://doi.org/10.1007/978-3-319-66854-3_7

DOI: https://doi.org/10.1007/978-3-319-66854-3_7
Published: 06 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66853-6
Online ISBN: 978-3-319-66854-3
eBook Packages: Computer Science, Computer Science (R0)
