Psychometrika, Volume 16, Issue 3, pp 297–334

Coefficient alpha and the internal structure of tests

  • Lee J. Cronbach

Abstract

A general formula (α), of which a special case is the Kuder-Richardson coefficient of equivalence, is shown to be the mean of all split-half coefficients resulting from different splittings of a test. α is therefore an estimate of the correlation between two random samples of items from a universe of items like those in the test. α is found to be an appropriate index of equivalence and, except for very short tests, of the first-factor concentration in the test. Tests divisible into distinct subtests should be so divided before using the formula. The index \(\bar{r}_{ij}\), derived from α, is shown to be an index of inter-item homogeneity. Comparison is made to the Guttman and Loevinger approaches. Parallel split coefficients are shown to be unnecessary for tests of common types. In designing tests, maximum interpretability of scores is obtained by increasing the first-factor concentration in any separately-scored subtest and avoiding substantial group-factor clusters within a subtest. Scalability is not a requisite.
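The claim that α equals the mean of all split-half coefficients can be checked numerically. The sketch below is illustrative only and is not taken from the article: the function names, the simulated data, and the choice of NumPy are assumptions. It also assumes that the split-half coefficient is computed in its variance-based form, 2(1 − (σ²_a + σ²_b)/σ²_t), where a and b are the two half-test scores and t is the total score, and that the test has an even number of items split into equal halves.

```python
from itertools import combinations

import numpy as np


def cronbach_alpha(scores):
    """Coefficient alpha for a persons-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)


def mean_split_half(scores):
    """Mean of the variance-based split-half coefficient over all equal splits."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    if k % 2:
        raise ValueError("this sketch assumes an even number of items")
    total_var = scores.sum(axis=1).var(ddof=1)
    coeffs = []
    # Keep item 0 in the first half so each split is counted exactly once.
    for rest in combinations(range(1, k), k // 2 - 1):
        first = (0,) + rest
        second = [j for j in range(k) if j not in first]
        a = scores[:, list(first)].sum(axis=1)
        b = scores[:, second].sum(axis=1)
        coeffs.append(2 * (1 - (a.var(ddof=1) + b.var(ddof=1)) / total_var))
    return float(np.mean(coeffs))


# Hypothetical data: 200 persons, 6 items sharing one common factor.
rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 1))
scores = ability + rng.normal(size=(200, 6))

alpha = cronbach_alpha(scores)
split_mean = mean_split_half(scores)
print(f"alpha = {alpha:.6f}, mean split-half = {split_mean:.6f}")
```

Under these assumptions the two printed values agree up to floating-point error, whatever data are supplied.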


References

  1. Brogden, H. E. Variation in test validity with variation in the distribution of item difficulties, number of items, and degree of their intercorrelation. Psychometrika, 1946, 11, 197–214.
  2. Brown, W. Some experimental results in the correlation of mental abilities. Brit. J. Psychol., 1910, 3, 296–322.
  3. Brownell, W. A. On the accuracy with which reliability may be measured by correlating test halves. J. exper. Educ., 1933, 1, 204–215.
  4. Burt, C. The influence of differential weighting. Brit. J. Psychol., Stat. Sect., 1950, 3, 105–128.
  5. Clark, E. L. Methods of splitting vs. samples as sources of instability in test-reliability coefficients. Harvard educ. Rev., 1949, 19, 178–182.
  6. Coombs, C. H. The concepts of reliability and homogeneity. Educ. psychol. Meas., 1950, 10, 43–56.
  7. Cronbach, L. J. On estimates of test reliability. J. educ. Psychol., 1943, 34, 485–494.
  8. Cronbach, L. J. A case study of the split-half reliability coefficient. J. educ. Psychol., 1946, 37, 473–480.
  9. Cronbach, L. J. Test “reliability”: its meaning and determination. Psychometrika, 1947, 12, 1–16.
  10. Dressel, P. L. Some remarks on the Kuder-Richardson reliability coefficient. Psychometrika, 1940, 5, 305–310.
  11. Ferguson, G. The factorial interpretation of test difficulty. Psychometrika, 1941, 6, 323–329.
  12. Ferguson, G. The reliability of mental tests. London: Univ. of London Press, 1941.
  13. Festinger, L. The treatment of qualitative data by “scale analysis.” Psychol. Bull., 1947, 44, 149–161.
  14. Goodenough, F. L. A critical note on the use of the term “reliability” in mental measurement. J. educ. Psychol., 1936, 27, 173–178.
  15. Guilford, J. P., ed. Printed classification tests. Report No. 5, Army Air Forces Aviation Psychology Program. Washington: U. S. Govt. Print. Off., 1947.
  16. Guilford, J. P. Fundamental statistics in psychology and education. Second ed. New York: McGraw-Hill, 1950.
  17. Guilford, J. P., and Michael, W. B. Changes in common-factor loadings as tests are altered homogeneously in length. Psychometrika, 1950, 15, 237–249.
  18. Gulliksen, H. Theory of mental tests. New York: Wiley, 1950.
  19. Guttman, L. A basis for analyzing test-retest reliability. Psychometrika, 1945, 10, 255–282.
  20. Hoyt, C. Test reliability estimated by analysis of variance. Psychometrika, 1941, 6, 153–160.
  21. Humphreys, L. G. Test homogeneity and its measurement. Amer. Psychologist, 1949, 4, 245.
  22. Jackson, R. W., and Ferguson, G. A. Studies on the reliability of tests. Bull. No. 12, Dept. of Educ. Res., University of Toronto, 1941.
  23. Kelley, T. L. Note on the reliability of a test: a reply to Dr. Crum's criticism. J. educ. Psychol., 1924, 15, 193–204.
  24. Kelley, T. L. Statistical method. New York: Macmillan, 1924.
  25. Kelley, T. L. The reliability coefficient. Psychometrika, 1942, 7, 75–83.
  26. Kuder, G. F., and Richardson, M. W. The theory of the estimation of test reliability. Psychometrika, 1937, 2, 151–160.
  27. Loevinger, J. A systematic approach to the construction and evaluation of tests of ability. Psychol. Monogr., 1947, 61, No. 4.
  28. Loevinger, J. The technic of homogeneous tests compared with some aspects of “scale analysis” and factor analysis. Psychol. Bull., 1948, 45, 507–529.
  29. Mosier, C. I. A short cut in the estimation of split-halves coefficients. Educ. psychol. Meas., 1941, 1, 407–408.
  30. Richardson, M. Combination of measures, pp. 379–401 in Horst, P. (Ed.) The prediction of personal adjustment. New York: Social Science Res. Council, 1941.
  31. Rulon, P. J. A simplified procedure for determining the reliability of a test by split-halves. Harvard educ. Rev., 1939, 9, 99–103.
  32. Shannon, C. E. The mathematical theory of communication. Urbana: Univ. of Ill. Press, 1949.
  33. Spearman, C. Correlation calculated with faulty data. Brit. J. Psychol., 1910, 3, 271–295.
  34. Stouffer, S. A., et al. Measurement and prediction. Princeton: Princeton Univ. Press, 1950.
  35. Thurstone, L. L., and Thurstone, T. G. Factorial studies of intelligence, p. 37. Chicago: Univ. of Chicago Press, 1941.
  36. Tucker, L. R. Maximum validity of a test with equivalent items. Psychometrika, 1946, 11, 1–13.
  37. Vernon, P. E. An application of factorial analysis to the study of test items. Brit. J. Psychol., Stat. Sect., 1950, 3, 1–15.
  38. Wherry, R. J., and Gaylord, R. H. The concept of test and item reliability in relation to factor pattern. Psychometrika, 1943, 8, 247–264.
  39. Woodbury, M. A. On the standard length of a test. Res. Bull. 50–53, Educ. Test. Service, 1950.

Copyright information

© Psychometric Society 1951

Authors and Affiliations

  • Lee J. Cronbach, University of Illinois, USA
