Abstract
A general formula (α) of which a special case is the Kuder-Richardson coefficient of equivalence is shown to be the mean of all split-half coefficients resulting from different splittings of a test. α is therefore an estimate of the correlation between two random samples of items from a universe of items like those in the test. α is found to be an appropriate index of equivalence and, except for very short tests, of the first-factor concentration in the test. Tests divisible into distinct subtests should be so divided before using the formula. The index \(\bar r_{ij}\), derived from α, is shown to be an index of inter-item homogeneity. Comparison is made to the Guttman and Loevinger approaches. Parallel split coefficients are shown to be unnecessary for tests of common types. In designing tests, maximum interpretability of scores is obtained by increasing the first-factor concentration in any separately-scored subtest and avoiding substantial group-factor clusters within a subtest. Scalability is not a requisite.
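The claim that α equals the mean of all split-half coefficients can be checked numerically. The sketch below is illustrative only and makes several assumptions not stated in the abstract: the score matrix is invented, the test has an even number of items, α is computed as n/(n-1) times one minus the ratio of summed item variances to total-score variance, each split-half coefficient is computed as 2(1 - (V_a + V_b)/V_t), and the mean inter-item index is obtained by applying the Spearman-Brown formula in reverse. The function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pvariance

# Invented data for illustration: 6 persons (rows) by 4 dichotomous items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]


def coefficient_alpha(rows):
    """alpha = n/(n-1) * (1 - sum of item variances / total-score variance)."""
    n_items = len(rows[0])
    item_vars = [pvariance([r[i] for r in rows]) for i in range(n_items)]
    total_var = pvariance([sum(r) for r in rows])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


def split_half_coefficients(rows):
    """One coefficient per distinct split into two equal halves.

    Assumes an even number of items and the variance form
    2 * (1 - (V_a + V_b) / V_t) for each split.
    """
    n_items = len(rows[0])
    total_var = pvariance([sum(r) for r in rows])
    coeffs = []
    for first in combinations(range(n_items), n_items // 2):
        if 0 not in first:
            continue  # keep item 0 in the first half so each split counts once
        second = [i for i in range(n_items) if i not in first]
        half_a = [sum(r[i] for i in first) for r in rows]
        half_b = [sum(r[i] for i in second) for r in rows]
        coeffs.append(2 * (1 - (pvariance(half_a) + pvariance(half_b)) / total_var))
    return coeffs


def mean_inter_item_index(alpha, n_items):
    """Spearman-Brown in reverse (assumed form): alpha / (n - (n - 1) * alpha)."""
    return alpha / (n_items - (n_items - 1) * alpha)


alpha = coefficient_alpha(scores)
splits = split_half_coefficients(scores)
r_bar = mean_inter_item_index(alpha, len(scores[0]))

print("alpha:", alpha)
print("mean of split-half coefficients:", mean(splits))
print("estimated mean inter-item index:", r_bar)
```

With four items there are three distinct splits, and the two printed reliability values should agree up to floating-point rounding. Population variances are used throughout; sample variances would give the same ratios as long as they are used consistently.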
Additional information
The assistance of Dora Damrin and Willard Warrington is gratefully acknowledged. Miss Damrin took major responsibility for the empirical studies reported. This research was supported by the Bureau of Research and Service, College of Education.
Cite this article
Cronbach, L.J. Coefficient alpha and the internal structure of tests. Psychometrika 16, 297–334 (1951). https://doi.org/10.1007/BF02310555
DOI: https://doi.org/10.1007/BF02310555