Abstract
This paper presents an empirical case study in which faults in modules were predicted from the total information content of their operators. This metric is closely related to Harrison's average information content classification (AICC), which is the entropy of the operators. Most information-theory-based metrics proposed in the literature have not been subjected to empirical predictive studies of real-world software systems. In contrast, this study shows that, in the context of a commercial software development organization, a simple information-theory-based metric can be more useful for predicting software quality than comparable metrics based on counts.
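As an illustration of the quantities named above, the following minimal sketch computes the number of operators, the number of unique operators, the entropy of the operators, and a total information content for one module. The function name `operator_metrics` and the sample operator list are hypothetical. The abstract does not give the exact formula for total information content, so the sketch assumes it is the operator count multiplied by the entropy (in bits).

```python
from collections import Counter
from math import log2


def operator_metrics(operators):
    """Return (count, unique, entropy_bits, total_information_bits).

    `operators` is a list of operator tokens from one module.
    Entropy is sum(p_i * log2(1 / p_i)) over the relative
    frequencies p_i of the distinct operators.
    """
    total = len(operators)
    if total == 0:
        return 0, 0, 0.0, 0.0
    counts = Counter(operators)
    entropy = sum((c / total) * log2(total / c) for c in counts.values())
    # Assumption: total information content = count * entropy.
    return total, len(counts), entropy, total * entropy


# Hypothetical assembly-language operators for a tiny module.
module = ["MOV", "MOV", "ADD", "JMP"]
count, unique, entropy, information = operator_metrics(module)
print(count, unique, entropy, information)
```

Two modules with the same operator count can differ in this total when one uses its operators more evenly than the other, which is the kind of distinction a pure count cannot capture.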
Three models were considered, all based on operators as an abstraction of software. The model based on information content of the operators made more accurate predictions than two similar models based on the number of operators and the number of unique operators. The purpose of this paper is a fair comparison of the three metrics, rather than developing an optimal model. We have long advocated multivariate models for industrial use. The case study considered three large commercial systems, written in assembly language, and developed consecutively by professional programmers. The first system was used to estimate parameters of the models. The subsequent two were used to evaluate the accuracy of model predictions.
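The fit-then-evaluate design described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the model form or the accuracy measure, so the sketch assumes a one-variable least-squares line and average absolute error. The function names and all numbers are hypothetical, not data from the study.

```python
def fit_line(xs, ys):
    """Estimate (intercept, slope) by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def average_absolute_error(params, xs, ys):
    """Mean of |actual - predicted| over the evaluation modules."""
    intercept, slope = params
    errors = [abs(y - (intercept + slope * x)) for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Synthetic placeholder values: one metric value and one fault
# count per module. These are NOT measurements from the paper.
first_system = ([10.0, 20.0, 30.0], [1.0, 2.0, 3.0])
later_system = ([40.0, 50.0], [5.0, 5.0])

params = fit_line(*first_system)                  # estimate on the first system
error = average_absolute_error(params, *later_system)  # evaluate on a later one
print(params, error)
```

Running the same two steps once per metric, with identical model form and data split, is what makes the comparison among the three metrics fair.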
References
Alexander, R.T., Bieman, J.M., and Viega, J. 2000. Coping with Java programming stress, Computer 33(4): 30-38.
Allen, E.B. 1995. Information Theory and Software Measurement, Ph.D. thesis. Florida Atlantic University, Boca Raton, FL (Advised by Taghi M. Khoshgoftaar).
Berlinger, E. 1980. An information theory based complexity measure. Proc. National Computer Conf., Vol. 49, Anaheim, CA, pp. 773-779.
Cook, C.R. 1991. Information theory metric for assembly language. In: Proc. Annual Oregon Workshop on Software Metrics, Silver Falls, OR.
Cook, M.L. 1982. Software metrics: An introduction and annotated bibliography, ACM Software Eng. Notes 7(2): 41-60.
Côté, V., Bourque, P., Oligny, S., and Rivard, N. 1988. Software metrics: An overview of recent results, J. Syst. Software 8(2): 121-131.
Cover, T.M. and Thomas, J.A. 1991. Elements of Information Theory, New York, John Wiley & Sons.
Davis, J.S. and LeBlanc, R.J. 1988. A study of the applicability of complexity measures, IEEE Trans. Software Eng. 14(9): 1366-1372.
Evanco, W.M. and Agresti, W.W. 1994. A composite complexity approach for software defect modeling, Software Quality J. 3(1): 27-44.
Halstead, M.H. 1977. Elements of Software Science, New York, Elsevier.
Harrison, W. 1984. Bibliography on Software Complexity, ACM SIGPLAN Notices 19(2): 17-27.
Harrison, W. 1992. An entropy-based measure of software complexity, IEEE Trans. Software Eng. 18(11): 1025-1029.
Hellerman, L. 1972. A measure of computational work, IEEE Trans. Computers C-21(5): 439-446.
Henry, S. and Kafura, D. 1981. Software structure metrics based on information flow, IEEE Trans. Software Eng. SE-7(5): 510-518.
Khoshgoftaar, T.M. and Allen, E.B. 1994. Applications of information theory to software engineering measurement, Software Quality J. 3(2): 79-103.
Khoshgoftaar, T.M., Allen, E.B., Kalaichelvan, K.S., and Goel, N. 1996. Early quality prediction: A case study in telecommunications, IEEE Software 13(1): 65-71.
Khoshgoftaar, T.M., Szabo, R.M., and Guasti, P.J. 1995. Exploring the behavior of neural network software quality models, Software Engineering J. 10(3): 89-96.
Khoshgoftaar, T.M., Szabo, R.M., and Woodcock, T.G. 1994. An empirical study of program quality during testing and maintenance, Software Quality J. 3(3): 137-151.
Kitchenham, B.A., Pfleeger, S.L., and Fenton, N.E. 1995. Towards a framework for software measurement validation, IEEE Trans. Software Eng. 21(12): 929-944. (See comments in Kitchenham et al., 1997; Morasca et al., 1997.)
Kitchenham, B.A., Pfleeger, S.L., and Fenton, N.E. 1997. Reply to: Comments on “Towards a framework for software measurement validation,” IEEE Trans. Software Engineering 23(3): 189. (See Kitchenham et al., 1995; Morasca et al., 1997; Weyuker, 1988.)
Li, W. and Henry, S. 1993. Maintenance metrics for the object oriented paradigm. Proc. First Int. Software Metrics Symp., Baltimore, MD, pp. 52-60.
Mohanty, S.N. 1979. Models and measurements for quality assessment of software, Computing Surveys 11(3): 251-275.
Morasca, S., Briand, L.C., Basili, V.R., Weyuker, E.J., and Zelkowitz, M.V. 1997. Comments on “Towards a framework for software measurement validation,” IEEE Trans. Software Engineering 23(3): 187-188. (See Kitchenham et al., 1995; Weyuker, 1988.)
Munson, J.C. and Khoshgoftaar, T.M. 1989. The dimensionality of program complexity. Proc. 11th Int. Conf. Software Engineering, Pittsburgh, PA, pp. 245-253.
Myers, R.H. 1990. Classical and Modern Regression with Applications, Duxbury Series, Boston, PWS-KENT Publishing.
Porter, A.A. and Selby, R.W. 1990. Empirically guided software development using metric-based classification trees, IEEE Software 7(2): 46-54.
Schneidewind, N.F. 1992. Methodology for validating software metrics, IEEE Trans. Software Engineering 18(5): 410-422.
Shannon, C.E. and Weaver, W. 1949. The Mathematical Theory of Communication, Urbana, IL, University of Illinois Press.
van Emden, M.H. 1971. An Analysis of Complexity, No. 35 in Mathematical Centre Tracts, Amsterdam, Mathematisch Centrum.
Votta, L.G. and Porter, A.A. 1995. Experimental software engineering: A report on the state of the art. Proc. Seventeenth Int. Conf. on Software Engineering, Seattle, WA, pp. 277-279.
Waguespack, Jr., L.J. and Badlani, S. 1987. Software complexity assessments: An introduction and annotated bibliography, ACM Software Eng. Notes 12(4): 52-71.
Weyuker, E.J. 1988. Evaluating software complexity measures, IEEE Trans. Software Engineering 14(9): 1357-1365.
Zhuo, F., Lowther, B., Oman, P., and Hagemeister, J. 1993. Constructing and testing software maintainability assessment models. Proc. 1st Int. Software Metrics Symp., Baltimore, MD, pp. 61-70.
Cite this article
Khoshgoftaar, T.M., Allen, E.B. Empirical Assessment of a Software Metric: The Information Content of Operators. Software Quality Journal 9, 99–112 (2001). https://doi.org/10.1023/A:1016622818771