Empirical validation of object-oriented metrics for predicting fault proneness models

Singh, Yogesh; Kaur, Arvinder; Malhotra, Ruchika

doi:10.1007/s11219-009-9079-6

Published: 01 July 2009

Software Quality Journal, volume 18, pages 3–35 (2010)

Yogesh Singh¹,
Arvinder Kaur¹ &
Ruchika Malhotra¹

Abstract

Empirical validation of software metrics used to predict software quality attributes is important to ensure their practical relevance in software organizations. The aim of this work is to find the relation between object-oriented (OO) metrics and fault proneness at different severity levels of faults. For this purpose, prediction models have been developed using regression and machine learning methods. We evaluate and compare the performance of these methods to find which performs better at each severity level, and we empirically validate the OO metrics proposed by Chidamber and Kemerer. The results of the empirical study are based on a public domain NASA data set. The performance of the prediction models was evaluated using Receiver Operating Characteristic (ROC) analysis. The results show that the area under the ROC curve is lower for models predicting high severity faults than for models predicting medium and low severity faults. However, the number of faults in the classes correctly classified by the models for high severity faults is not low. The study also shows that machine learning methods perform better than the logistic regression method at all fault severity levels. Based on these results, it is reasonable to claim that models targeted at different severity levels of faults could help in planning and executing testing by focusing resources on the fault-prone parts of the design and code that are likely to cause serious failures.
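
As a rough illustration of the ROC-based comparison described in the abstract, the sketch below computes the area under the curve (AUC) from its pairwise-ranking interpretation: the probability that a randomly chosen faulty class is scored higher than a randomly chosen non-faulty one (Hanley & McNeil, 1982, in the references). The function name `roc_auc`, the labels, and the two sets of model scores are hypothetical. None of them come from the NASA data set or from the study, and the authors' actual procedure may differ.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (faulty, non-faulty) pairs in which the
    faulty class (label 1) has the higher score; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels: 1 = class had a fault at the severity of interest.
faulty = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted fault-proneness scores from two imaginary models.
model_a = [0.9, 0.2, 0.45, 0.4, 0.1, 0.8, 0.3, 0.5]
model_b = [0.6, 0.5, 0.4, 0.7, 0.2, 0.9, 0.3, 0.1]

auc_a = roc_auc(faulty, model_a)
auc_b = roc_auc(faulty, model_b)
print(f"model A AUC = {auc_a:.3f}")
print(f"model B AUC = {auc_b:.3f}")
```

Running the same function once per severity level, with labels defined by faults of that severity, would give one AUC per model and level, which is the kind of comparison the abstract reports.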

References

Afzal, W. (2007). Metrics in software test planning and test design processes. Ph.D. Dissertation.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2005). Software reuse metrics for object-oriented systems. In Proceedings of the Third ACIS Int’l Conference On Software Engineering Research, Management and Applications (SERA ‘05), 48–55.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2006a). Empirical study of object-oriented metrics. Journal of Object Technology, 5(8), 149–173.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2006b). Investigating the effect of coupling metrics on fault proneness in object-oriented systems. Software Quality Professional, 8(4), 4–16.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2007). Application of artificial neural network for predicting fault proneness models. International conference on information systems, technology and management (ICISTM 2007), March 12–13, New Delhi, India.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2009). Empirical analysis for investigating the effect of object-oriented metrics on fault proneness: A replicated case study. Software Process: Improvement and Practice, 16(1), 39–62. doi:10.1002/spip.389s.
Barnett, V., & Price, T. (1995). Outliers in statistical data. London: Wiley.
Basili, V., Briand, L., & Melo, W. (1996). A validation of object-oriented design metrics as quality indicators. IEEE Transactions on Software Engineering, 22(10), 751–761. doi:10.1109/32.544352.
Bazman’s Testing Pages. (2006). A website containing articles and white papers on software testing. http://members.tripod.com/~bazman/classification.html?button7=Classification+of+Errors+by+Severity, November 2006.
Belsley, D., Kuh, E., & Welsch, R. (1980). Regression diagnostics: Identifying influential data and sources of collinearity. New York: Wiley.
Bieman, J., & Kang, B. (1995). Cohesion and reuse in an object-oriented system. In Proceedings of the ACM Symposium on Software Reusability (SSR’94), 259–262.
Binkley, A., & Schach, S. (1998). Validation of the coupling dependency metric as a risk predictor. In Proceedings of the International Conference on Software Engineering, 452–455.
Briand, L., Daly, W., & Wust, J. (1998). Unified framework for cohesion measurement in object-oriented systems. Empirical Software Engineering, 3(1), 65–117. doi:10.1023/A:1009783721306.
Briand, L., Daly, W., & Wust, J. (1999). A unified framework for coupling measurement in object-oriented systems. IEEE Transactions on Software Engineering, 25(1), 91–121. doi:10.1109/32.748920.
Briand, L., Daly, W., & Wust, J. (2000). Exploring the relationships between design measures and software quality. Journal of Systems and Software, 51(3), 245–273. doi:10.1016/S0164-1212(99)00102-8.
Briand, L., Wüst, J., & Lounis, H. (2001). Replicated case studies for investigating quality factors in object-oriented designs. Empirical Software Engineering: An International Journal, 6(1), 11–58.
Cartwright, M., & Shepperd, M. (1999). An empirical investigation of an object-oriented software system. IEEE Transactions on Software Engineering, 26(8), 786–796. doi:10.1109/32.879814.
Chidamber, S., Darcy, D., & Kemerer, C. (1998). Managerial use of metrics for object-oriented software: An exploratory analysis. IEEE Transactions on Software Engineering, 24(8), 629–639. doi:10.1109/32.707698.
Chidamber, S., & Kemerer, C. (1991). Towards a metrics suite for object oriented design. In Proceedings of the Conference on Object-Oriented Programming: Systems, Languages and Applications (OOPSLA’91). SIGPLAN Notices, 26(11), 197–211.
Chidamber, S., & Kemerer, C. (1994). A metrics suite for object-oriented design. IEEE Transactions on Software Engineering, 20(6), 476–493. doi:10.1109/32.295895.
Dreiseitl, S., & Ohno-Machado, L. (2002). Logistic regression and artificial neural network classification models: A methodology review. Journal of Biomedical Informatics, 35, 352–359. doi:10.1016/S1532-0464(03)00034-0.
Duman, E. (2006). Comparison of decision tree algorithms in identifying bank customers who are likely to buy credit cards. Seventh international Baltic conference on databases and information systems, Kaunas, Lithuania, July 3–6, 2006.
Eftekhar, B., Mohammad, K., Ardebili, H., Ghodsi, M., & Ketabchi, E. (2005). Comparison of artificial neural network and logistic regression models for prediction of mortality in head trauma based on initial clinical data. BMC Medical Informatics and Decision Making, 5(3), 3. doi:10.1186/1472-6947-5-3.
El Emam, K., Benlarbi, S., Goel, N., & Rai, S. (1999). A validation of object-oriented metrics. Technical report ERB-1063, NRC.
El Emam, K., Benlarbi, S., Goel, N., & Rai, S. (2001). The confounding effect of class size on the validity of object-oriented metrics. IEEE Transactions on Software Engineering, 27(7), 630–650. doi:10.1109/32.935855.
Fenton, N., & Neil, M. (1999). A critique of software defect prediction models. IEEE Transactions on Software Engineering, 25(3), 1–15.
Gyimothy, T., Ferenc, R., & Siket, I. (2005). Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Transactions on Software Engineering, 31(10), 897–910. doi:10.1109/TSE.2005.112.
Hair, J., Anderson, R., Tatham, R., & Black, W. (2006). Multivariate data analysis. London: Pearson Education.
Han, J., & Kamber, M. (2001). Data mining: Concepts and techniques. India: Harcourt India Private Limited.
Hanley, J., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic ROC curve. Radiology, 143, 29–36.
Harrison, R., Counsell, S. J., & Nithi, R. V. (1998). An evaluation of MOOD set of object-oriented software metrics. IEEE Transactions on Software Engineering, 24(6), 491–496. doi:10.1109/32.689404.
Henderson-Sellers, B. (1996). Object-oriented metrics, measures of complexity. Englewood Cliffs, NJ: Prentice Hall.
Hitz, M., & Montazeri, B. (1995). Measuring coupling and cohesion in object-oriented systems. In Proceedings of the International Symposium on Applied Corporate Computing, Monterrey, Mexico.
Hopkins, W. G. (2003). A new view of statistics. Sport Science. http://www.sportsci.org/resource/stats/.
Horch, J. (2003). Practical guide to software quality management (2nd ed.). London: Artech House.
Hosmer, D., & Lemeshow, S. (1989). Applied logistic regression. New York: Wiley.
IEEE Std. 1044-1993. (1994). IEEE standard classification for software anomalies.
Khoshgoftaar, T. M., Allen, E. D., Hudepohl, J. P., & Aud, S. J. (1997). Application of neural networks to software quality modeling of a very large telecommunications system. IEEE Transactions on Neural Networks, 8(4), 902–909. doi:10.1109/72.595888.
Khoshgoftaar, T., Geleyn, E., Nguyen, L., & Bullard, L. (2002). Cost-sensitive boosting in software quality modeling. In Proceedings of 7th IEEE International Symposium on High Assurance Systems Engineering, 51–60.
Laird, L., & Brennan, M. (2006). Software measurement and estimation: A practical approach. NJ: Wiley.
Lake, A., & Cook, C. (1994). Use of factor analysis to develop OOP software complexity metrics. In Proceedings of the 6th Annual Oregon Workshop on Software Metrics, Silver Falls, Oregon.
Lee, Y., Liang, B., Wu, S., & Wang, F. (1995). Measuring the coupling and cohesion of an object-oriented program based on information flow. In Proceedings of the International Conference on Software Quality, Maribor, Slovenia.
Li, W., & Henry, S. (1993). Object-oriented metrics that predict maintainability. Journal of Systems and Software, 23(2), 111–122. doi:10.1016/0164-1212(93)90077-B.
Lorenz, M., & Kidd, J. (1994). Object-oriented software metrics. Englewood Cliffs, NJ: Prentice-Hall.
Lovin, C., & Yaptangco, T. (2006). Best practices: Measuring the success of enterprise testing. Dell Power Solutions. pp. 101–103.
Marini, F., Bucci, R., Magri, A. L., & Magri, A. D. (2008). Artificial neural networks in chemometrics: History, examples and perspectives. Microchemical Journal, 88(2), 178–185. doi:10.1016/j.microc.2007.11.008.
Menzies, T., Greenwald, J., & Frank, A. (2007). Data mining static code attributes to learn defect predictors. IEEE Transactions on Software Engineering, 32(11), 1–12.
NASA. (2004). Metrics data repository. http://www.mdp.ivv.nasa.gov.
Olague, H., Etzkorn, L., Gholston, S., & Quattlebaum, S. (2007). Empirical validation of three software metrics suites to predict fault-proneness of object-oriented classes developed using highly iterative or agile software development processes. IEEE Transactions on Software Engineering, 33(8), 402–419. doi:10.1109/TSE.2007.1015.
Pai, G. (2007). Empirical analysis of software fault content and fault proneness using Bayesian methods. IEEE Transactions on Software Engineering, 33(10), 675–686. doi:10.1109/TSE.2007.70722.
Phadke, A., & Allen, E. (2005). Predicting risky modules in open-source software for high-performance computing. In Proceedings of Second International Workshop on Software Engineering for High Performance Computing System Applications, 60–64.
Porter, A., & Selby, R. (1990). Empirically guided software development using metric-based classification trees. IEEE Software, 7(2), 46–54. doi:10.1109/52.50773.
Promise. http://promisedata.org/repository/.
Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society. Series B (Methodological), 36, 111–147.
Tang, M. H., Kao, M. H., & Chen, M. H. (1999). An empirical study on object-oriented metrics. In Proceedings of Metrics, 242–249.
Tegarden, D., Sheetz, S., & Monarchi, D. (1995). A software complexity model of object-oriented systems. Decision Support Systems, 13(3–4), 241–262. doi:10.1016/0167-9236(93)E0045-F.
Tian, J. (2005). Software quality engineering: Testing, quality assurance, and quantifiable improvement. NJ: Wiley.
Yu, P., Systa, T., & Muller, H. (2002). Predicting fault-proneness using OO metrics: An industrial case study. In Proceedings of Sixth European Conference on Software Maintenance and Reengineering, Budapest, Hungary, 99–107.
Zhou, Y., & Leung, H. (2006). Empirical analysis of object-oriented design metrics for predicting high severity faults. IEEE Transactions on Software Engineering, 32(10), 771–784. doi:10.1109/TSE.2006.102.

Author information

Authors and Affiliations

University School of Information Technology, GGS Indraprastha University, Delhi, 110403, India
Yogesh Singh, Arvinder Kaur & Ruchika Malhotra

Corresponding author

Correspondence to Ruchika Malhotra.

About this article

Cite this article

Singh, Y., Kaur, A. & Malhotra, R. Empirical validation of object-oriented metrics for predicting fault proneness models. Software Qual J 18, 3–35 (2010). https://doi.org/10.1007/s11219-009-9079-6

Published: 01 July 2009
Issue Date: March 2010
DOI: https://doi.org/10.1007/s11219-009-9079-6
