Abstract
Many software measures have been put forward simply because they show a high linear correlation coefficient with some measurable quantity. The linear correlation coefficient alone is an unreliable statistic for deciding whether an observed correlation indicates a significant association. Several published software measure experiments collected upwards of 20 different measurements or had fourteen or fewer observations. With many measurements taken on small samples, the probability of “discovering” a “significant” correlation is high. We present a computer simulation experiment in which the correlation between sets of randomly generated numbers is calculated. We also look at randomly generated numbers in the ranges that would be expected in Halstead’s Software Science measures. Our results show that the average maximum linear correlation for randomly generated numbers is 0.70 or higher if the sample size is low compared to the number of variables. Alternative statistical approaches for obtaining meaningful, significant results are presented.
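The kind of simulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' program: the function names, the use of uniform random numbers in [0, 1), the choice to take the maximum of the absolute correlation over all pairs of variables, and the trial count are all assumptions made for illustration.

```python
import itertools
import math
import random


def pearson_r(xs, ys):
    """Sample linear (Pearson) correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # constant column: correlation undefined, treated as 0 here
    return sxy / math.sqrt(sxx * syy)


def max_abs_correlation(n_obs, n_vars, rng):
    """Largest |r| over all pairs of n_vars unrelated random variables."""
    # Assumption: each "measure" is n_obs independent uniform random numbers.
    columns = [[rng.random() for _ in range(n_obs)] for _ in range(n_vars)]
    return max(abs(pearson_r(a, b))
               for a, b in itertools.combinations(columns, 2))


def average_max_correlation(n_obs, n_vars, trials=200, seed=0):
    """Average, over repeated trials, of the maximum |r| found by chance."""
    rng = random.Random(seed)
    total = sum(max_abs_correlation(n_obs, n_vars, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # 20 variables, as in the "upwards of 20 measurements" case above.
    for n_obs in (10, 14, 50, 200):
        print(n_obs, round(average_max_correlation(n_obs, n_vars=20), 2))
```

Because every column is generated independently, any large correlation the script reports arises purely by chance. Running it with few observations and many variables, then again with many observations, shows how strongly the best chance correlation depends on sample size.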
Bibliography
Albrecht, Allan J., and John E. Gaffney, Jr. “Software Function, Source Lines of Code and Development Effort Prediction: A Software Science Validation.” IEEE Transactions on Software Engineering. Vol. SE-9 (1983) pp. 639–648.
Baker, Albert L., James M. Bieman, David A. Gustafson, and Austin C. Melton. “Modeling and Measuring the Software Development Process.” Proc. of the Twentieth Hawaii International Conference on Systems Sciences. (January 1987) pp. 23–30.
Basili, Victor R., Richard W. Reiter, and Tsai-Yun Phillips. “Metric Analysis and Data Validation Across FORTRAN Projects.” IEEE Transactions on Software Engineering. Vol. SE-9 (1983) pp. 652–663.
Card, David N., and William W. Agresti. “Measuring Software Design Complexity.” Journal of Systems and Software. Vol. 8 (1988) pp. 185–197.
Courtney, Richard E. and David A. Gustafson. Preliminary Study of Shotgun Correlations in Software Measures. Tech Report TR-CS-90-9 Department of Computing and Information Sciences, Kansas State University, Manhattan, KS. (1990).
Edwards, William R., Chi-Ming Chung, and Ming-Gaey Yang. “A Study of Data Flow and Testing-Specific Metrics.” Proc. of 11th Minnowbrook Workshop on Software Reliability. (1988).
Halstead, M.H. Elements of Software Science. New York: North-Holland (Elsevier Computer Science Library), 1977.
Hwang, Chern-Hwang. “An Empirical Investigation of Halstead’s Software Length Formula.” Masters Report; Kansas State University (1988).
Kearney, Joseph K., Robert L. Sedlmeyer, William B. Thompson, Michael A. Gray, and Michael A. Adler. “Software Complexity Measurement.” Communications of the ACM. Vol. 29 (November 1986) pp. 1044–1050.
Kitchenham, Barbara A. and N. R. Taylor. “Software Project Development Cost Estimation.” Journal of Systems and Software. Vol. 5 (1985) pp. 267–278.
McCabe, T.J. “A Complexity Measure.” IEEE Transactions on Software Engineering. Vol. SE-2 (1976) pp. 308–320.
van der Poel, Klaas G., and Stephen R. Schach. “A Software Metric for Cost Estimation and Efficiency Measurement in Data Processing System Development.” Journal of Systems and Software. Vol. 3 (1983) pp. 187–191.
Press, Flannery, Teukolsky and Vetterling. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press (1988).
Takahasi, Mueno and Yuji Kamayachi. “An Empirical Study of a Model for Program Error Prediction.” IEEE Transactions on Software Engineering. Vol. SE-15 (1989) pp. 82–86.
Copyright information
© 1992 Springer-Verlag New York, Inc.
About this paper
Cite this paper
Courtney, R.E., Gustafson, D.A. (1992). The Fallacy of Shotgun Correlations for Software Measures. In: Page, C., LePage, R. (eds) Computing Science and Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-2856-1_45
DOI: https://doi.org/10.1007/978-1-4612-2856-1_45
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-97719-5
Online ISBN: 978-1-4612-2856-1
eBook Packages: Springer Book Archive