Empirical validation of software metrics used to predict software quality attributes is important to ensure their practical relevance in software organizations. The aim of this work is to examine the relationship between object-oriented (OO) metrics and fault proneness at different severity levels of faults. For this purpose, prediction models have been developed using regression and machine learning methods. We evaluate and compare the performance of these methods to determine which performs better at each severity level of faults, and empirically validate the OO metrics proposed by Chidamber and Kemerer. The empirical study is based on a public domain NASA data set, and the performance of the predicted models was evaluated using Receiver Operating Characteristic (ROC) analysis. The results show that the area under the ROC curve of models predicted using high severity faults is lower than that of models predicted with respect to medium and low severity faults. However, the number of faults in the classes correctly classified by models predicted with respect to high severity faults is not low. The study also shows that the machine learning methods outperform the logistic regression method at all severities of faults. Based on these results, it is reasonable to claim that models targeted at different severity levels of faults could help in planning and executing testing by focusing resources on fault-prone parts of the design and code that are likely to cause serious failures.
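The evaluation workflow described above (fit a fault-proneness classifier on OO metric values, then score it by the area under the ROC curve) can be sketched as follows. This is an illustrative outline only, not the paper's actual pipeline: the data here is synthetic rather than the NASA data set, and the four feature columns merely stand in for Chidamber–Kemerer metrics such as WMC, CBO, RFC, and LCOM.

```python
# Illustrative sketch: evaluating a fault-proneness model with ROC analysis.
# Synthetic data; columns stand in for CK metrics (WMC, CBO, RFC, LCOM).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 400
X = rng.normal(size=(n, 4))  # one row of metric values per class

# Assume classes with higher complexity/coupling are more fault-prone.
logits = 1.5 * X[:, 0] + 1.0 * X[:, 1] - 1.0
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)

# Area under the ROC curve: 0.5 = chance, 1.0 = perfect discrimination.
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"AUC = {auc:.3f}")
```

In the study itself, separate models would be fitted per fault severity level (high, medium, low) and their AUC values compared; the same scoring call applies to each.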
Afzal, W. (2007). Metrics in software test planning and test design processes. Ph.D. Dissertation.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2005). Software reuse metrics for object-oriented systems. In Proceedings of the Third ACIS Int’l Conference On Software Engineering Research, Management and Applications (SERA ‘05), 48–55.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2006a). Empirical study of object-oriented metrics. Journal of Object Technology, 5(8), 149–173.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2006b). Investigating the effect of coupling metrics on fault proneness in object-oriented systems. Software Quality Professional, 8(4), 4–16.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2007). Application of artificial neural network for predicting fault proneness models. International conference on information systems, technology and management (ICISTM 2007), March 12–13, New Delhi, India.
Aggarwal, K. K., Singh, Y., Kaur, A., & Malhotra, R. (2009). Empirical analysis for investigating the effect of object-oriented metrics on fault proneness: A replicated case study. Software Process: Improvement and Practice, 16(1), 39–62. doi:10.1002/spip.389s.
Barnett, V., & Price, T. (1995). Outliers in statistical data. London: Wiley.
Basili, V., Briand, L., & Melo, W. (1996). A validation of object-oriented design metrics as quality indicators. IEEE Transactions on Software Engineering, 22(10), 751–761. doi:10.1109/32.544352.
Bazman’s Testing Pages. (2006). A website containing articles and white papers on software testing. http://members.tripod.com/~bazman/classification.html?button7=Classification+of+Errors+by+Severity. Accessed November 2006.
Belsley, D., Kuh, E., & Welsch, R. (1980). Regression diagnostics: Identifying influential data and sources of collinearity. New York: Wiley.
Bieman, J., & Kang, B. (1995). Cohesion and reuse in an object-oriented system. In Proceedings of the ACM Symposium on Software Reusability (SSR’94), 259–262.
Binkley, A., & Schach, S. (1998). Validation of the coupling dependency metric as a risk predictor. In Proceedings of the International Conference on Software Engineering, 452–455.
Briand, L., Daly, W., & Wust, J. (1998). Unified framework for cohesion measurement in object-oriented systems. Empirical Software Engineering, 3(1), 65–117. doi:10.1023/A:1009783721306.
Briand, L., Daly, W., & Wust, J. (1999). A unified framework for coupling measurement in object-oriented systems. IEEE Transactions on Software Engineering, 25(1), 91–121. doi:10.1109/32.748920.
Briand, L., Daly, W., & Wust, J. (2000). Exploring the relationships between design measures and software quality. Journal of Systems and Software, 51(3), 245–273. doi:10.1016/S0164-1212(99)00102-8.
Briand, L., Wüst, J., & Lounis, H. (2001). Replicated case studies for investigating quality factors in object-oriented designs. Empirical Software Engineering, 6(1), 11–58.
Cartwright, M., & Shepperd, M. (2000). An empirical investigation of an object-oriented software system. IEEE Transactions on Software Engineering, 26(8), 786–796. doi:10.1109/32.879814.
Chidamber, S., Darcy, D., & Kemerer, C. (1998). Managerial use of metrics for object-oriented software: An exploratory analysis. IEEE Transactions on Software Engineering, 24(8), 629–639. doi:10.1109/32.707698.
Chidamber, S., & Kemerer, C. (1991). Towards a metrics suite for object oriented design. In Proceedings of the Conference on Object-Oriented Programming: Systems, Languages and Applications (OOPSLA’91). SIGPLAN Notices, 26(11), 197–211.
Chidamber, S., & Kemerer, C. (1994). A metrics suite for object-oriented design. IEEE Transactions on Software Engineering, 20(6), 476–493. doi:10.1109/32.295895.
Dreiseitl, S., & Ohno-Machado, L. (2002). Logistic regression and artificial neural network classification models: A methodology review. Journal of Biomedical Informatics, 35, 352–359. doi:10.1016/S1532-0464(03)00034-0.
Duman, E. (2006). Comparison of decision tree algorithms in identifying bank customers who are likely to buy credit cards. Seventh international Baltic conference on databases and information systems, Kaunas, Lithuania, July 3–6, 2006.
Eftekhar, B., Mohammad, K., Ardebili, H., Ghodsi, M., & Ketabchi, E. (2005). Comparison of artificial neural network and logistic regression models for prediction of mortality in head trauma based on initial clinical data. BMC Medical Informatics and Decision Making, 5(3), 3. doi:10.1186/1472-6947-5-3.
El Emam, K., Benlarbi, S., Goel, N., & Rai, S. (1999). A validation of object-oriented metrics. Technical report ERB-1063, NRC.
El Emam, K., Benlarbi, S., Goel, N., & Rai, S. (2001). The confounding effect of class size on the validity of object-oriented metrics. IEEE Transactions on Software Engineering, 27(7), 630–650. doi:10.1109/32.935855.
Fenton, N., & Neil, M. (1999). A critique of software defect prediction models. IEEE Transactions on Software Engineering, 25(3), 1–15.
Gyimothy, T., Ferenc, R., & Siket, I. (2005). Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Transactions on Software Engineering, 31(10), 897–910. doi:10.1109/TSE.2005.112.
Hair, J., Black, W., Anderson, R., & Tatham, R. (2006). Multivariate data analysis. London: Pearson Education.
Han, J., & Kamber, M. (2001). Data mining: Concepts and techniques. India: Harcourt India Private Limited.
Hanley, J., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143, 29–36.
Harrison, R., Counsell, S. J., & Nithi, R. V. (1998). An evaluation of MOOD set of object-oriented software metrics. IEEE Transactions on Software Engineering, 24(6), 491–496. doi:10.1109/32.689404.
Henderson-Sellers, B. (1996). Object-oriented metrics, measures of complexity. Englewood Cliffs, NJ: Prentice Hall.
Hitz, M., & Montazeri, B. (1995). Measuring coupling and cohesion in object-oriented systems. In Proceedings of the International Symposium on Applied Corporate Computing, Monterrey, Mexico.
Hopkins, W. G. (2003). A new view of statistics. Sport Science. http://www.sportsci.org/resource/stats/.
Horch, J. (2003). Practical guide to software quality management (2nd ed.). London: Artech House.
Hosmer, D., & Lemeshow, S. (1989). Applied logistic regression. New York: Wiley.
IEEE Std. 1044-1993. (1994). IEEE standard classification for software anomalies.
Khoshgaftaar, T. M., Allen, E. D., Hudepohl, J. P., & Aud, S. J. (1997). Application of neural networks to software quality modeling of a very large telecommunications system. IEEE Transactions on Neural Networks, 8(4), 902–909. doi:10.1109/72.595888.
Khoshgoftaar, T., Geleyn, E., Nguyen, L., & Bullard, L. (2002). Cost-sensitive boosting in software quality modeling. In Proceedings of 7th IEEE International Symposium on High Assurance Systems Engineering, 51–60.
Laird, L., & Brennan, M. (2006). Software measurement and estimation: A practical approach. NJ: Wiley.
Lake, A., & Cook, C. (1994). Use of factor analysis to develop OOP software complexity metrics. In Proceedings of the 6th Annual Oregon Workshop on Software Metrics, Silver Falls, Oregon.
Lee, Y., Liang, B., Wu, S., & Wang, F. (1995). Measuring the coupling and cohesion of an object-oriented program based on information flow. In Proceedings of the International Conference on Software Quality, Maribor, Slovenia.
Li, W., & Henry, S. (1993). Object-oriented metrics that predict maintainability. Journal of Systems and Software, 23(2), 111–122. doi:10.1016/0164-1212(93)90077-B.
Lorenz, M., & Kidd, J. (1994). Object-oriented software metrics. Englewood Cliffs, NJ: Prentice-Hall.
Lovin, C., & Yaptangco, T. (2006). Best practices: Measuring the success of enterprise testing. Dell Power Solutions. pp. 101–103.
Marini, F., Bucci, R., Magri, A. L., & Magri, A. D. (2008). Artificial neural networks in chemometrics: History, examples and perspectives. Microchemical Journal, 88(2), 178–185. doi:10.1016/j.microc.2007.11.008.
Menzies, T., Greenwald, J., & Frank, A. (2007). Data mining static code attributes to learn defect predictors. IEEE Transactions on Software Engineering, 33(1), 2–13.
NASA. (2004). Metrics data repository. http://www.mdp.ivv.nasa.gov.
Olague, H., Etzkorn, L., Gholston, S., & Quattlebaum, S. (2007). Empirical validation of three software metrics suites to predict fault-proneness of object-oriented classes developed using highly iterative or agile software development processes. IEEE Transactions on Software Engineering, 33(8), 402–419. doi:10.1109/TSE.2007.1015.
Pai, G. (2007). Empirical analysis of software fault content and fault proneness using Bayesian methods. IEEE Transactions on Software Engineering, 33(10), 675–686. doi:10.1109/TSE.2007.70722.
Phadke, A., & Allen, E. (2005). Predicting risky modules in open-source software for high-performance computing. In Proceedings of Second International Workshop on Software Engineering for High Performance Computing System Applications, 60–64.
Porter, A., & Selby, R. (1990). Empirically guided software development using metric-based classification trees. IEEE Software, 7(2), 46–54. doi:10.1109/52.50773.
Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society. Series B (Methodological), 36(2), 111–147.
Tang, M. H., Kao, M. H., & Chen, M. H. (1999). An empirical study on object-oriented metrics. In Proceedings of Metrics, 242–249.
Tegarden, D., Sheetz, S., & Monarchi, D. (1995). A software complexity model of object-oriented systems. Decision Support Systems, 13(3–4), 241–262. doi:10.1016/0167-9236(93)E0045-F.
Tian, J. (2005). Software quality engineering: Testing, quality assurance, and quantifiable improvement. NJ: Wiley.
Yu, P., Systa, T., & Muller, H. (2002). Predicting fault-proneness using OO metrics: An industrial case study. In Proceedings of Sixth European Conference on Software Maintenance and Reengineering, Budapest, Hungary, 99–107.
Zhou, Y., & Leung, H. (2006). Empirical analysis of object-oriented design metrics for predicting high severity faults. IEEE Transactions on Software Engineering, 32(10), 771–784. doi:10.1109/TSE.2006.102.
Singh, Y., Kaur, A. & Malhotra, R. Empirical validation of object-oriented metrics for predicting fault proneness models. Software Qual J 18, 3 (2010). https://doi.org/10.1007/s11219-009-9079-6
- Software quality
- Empirical validation
- Fault prediction
- Receiver operating characteristic analysis