Criterion-Referenced Assessment of Individual Differences

  • Ronald K. Hambleton
Part of the Perspectives on Individual Differences book series (PIDF)


The field of criterion-referenced testing has developed quickly since the first papers on the topic by Glaser (1963) and Popham and Husek (1969). Glaser, and later Popham and Husek, were interested in assessment methods that could provide information on which to base the individual and programmatic decisions that arise in connection with specific instructional objectives or competencies. Norm-referenced tests were judged inappropriate for this purpose because they provide information that facilitates comparisons among examinees on broad traits or constructs; they were not intended to measure specific objectives. Even if the items in a norm-referenced test could be matched to objectives, there would typically be too few items per objective to permit valid criterion-referenced interpretations of the scores.
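
To make the contrast concrete, the sketch below is a minimal, hypothetical illustration (not taken from the chapter) of the kind of objective-level interpretation the abstract describes: an examinee's responses are grouped by instructional objective, a proportion-correct domain score is computed for each objective, and a mastery decision is made against a preset cut score. The objective labels, the 0.80 cut score, and all function names are illustrative assumptions only; the chapter itself discusses standard setting and related technical issues in far greater depth (e.g., Berk, 1976; Hambleton & Powell, 1983).

```python
# Minimal illustrative sketch of criterion-referenced (objective-level) scoring.
# All names, labels, and the 0.80 cut score are hypothetical assumptions,
# not taken from the chapter.
from collections import defaultdict


def objective_scores(responses):
    """Proportion-correct (domain) score per objective.

    `responses` is a list of (objective_id, is_correct) pairs for one examinee.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for objective_id, is_correct in responses:
        totals[objective_id] += 1
        correct[objective_id] += int(is_correct)
    return {obj: correct[obj] / totals[obj] for obj in totals}


def mastery_decisions(scores, cut_score=0.80):
    """Classify each objective as mastered (True) or not against a fixed cut score."""
    return {obj: score >= cut_score for obj, score in scores.items()}


# Example: six items measuring two objectives for a single examinee.
responses = [("OBJ-1", True), ("OBJ-1", True), ("OBJ-1", False),
             ("OBJ-2", True), ("OBJ-2", True), ("OBJ-2", True)]
scores = objective_scores(responses)    # OBJ-1 ~= 0.67, OBJ-2 = 1.0
decisions = mastery_decisions(scores)   # OBJ-1: False, OBJ-2: True
print(scores, decisions)
```

Note that with only three items per objective, as here, the per-objective scores are coarse and the mastery classifications unreliable, which is exactly the concern raised above about re-using norm-referenced items for criterion-referenced interpretations.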


Keywords: Test Score · Test Item · Content Validity · Item Statistic · Educational Measurement




References

  1. Berk, R. A. Determination of optimal cutting scores in criterion-referenced measurement. Journal of Experimental Education, 1976, 45, 4–9.
  2. Berk, R. A. (Ed.). Criterion-referenced measurement: The state of the art. Baltimore, Md.: Johns Hopkins University Press, 1980a.
  3. Berk, R. A. A consumer's guide to criterion-referenced test reliability. Journal of Educational Measurement, 1980b, 17, 323–349.
  4. Campbell, D. T., & Fiske, D. W. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 1959, 56, 81–105.
  5. Cox, R. C., & Vargas, J. S. A comparison of item selection techniques for norm-referenced and criterion-referenced tests. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, 1966.
  6. Cronbach, L. J. Test validation. In R. L. Thorndike (Ed.), Educational measurement. Washington, D.C.: American Council on Education, 1971.
  7. Fitzpatrick, A. R. The meaning of content validity. Applied Psychological Measurement, 1983, 7, 3–13.
  8. Glaser, R. Instructional technology and the measurement of learning outcomes. American Psychologist, 1963, 18, 519–521.
  9. Gray, W. M. A comparison of Piagetian theory and criterion-referenced measurement. Review of Educational Research, 1978, 48, 223–249.
  10. Haladyna, T., & Roid, G. The role of instructional sensitivity in the empirical review of criterion-referenced test items. Journal of Educational Measurement, 1981, 18, 39–53.
  11. Hambleton, R. K. Test score validity and standard-setting methods. In R. A. Berk (Ed.), Criterion-referenced measurement: The state of the art. Baltimore, Md.: Johns Hopkins University Press, 1980.
  12. Hambleton, R. K. Advances in criterion-referenced testing technology. In C. R. Reynolds & T. B. Gutkin (Eds.), The handbook of school psychology. New York: Wiley, 1982.
  13. Hambleton, R. K. Validating the test scores. In R. Berk (Ed.), A guide to criterion-referenced test construction. Baltimore, Md.: Johns Hopkins University Press, 1984.
  14. Hambleton, R. K., & deGruijter, D. N. M. Application of item response models to criterion-referenced test item selection. Journal of Educational Measurement, 1983, 20, 355–367.
  15. Hambleton, R. K., & Eignor, D. R. Guidelines for evaluating criterion-referenced tests and test manuals. Journal of Educational Measurement, 1978, 15, 321–327.
  16. Hambleton, R. K., & Novick, M. R. Toward an integration of theory and method for criterion-referenced tests. Journal of Educational Measurement, 1973, 10, 159–170.
  17. Hambleton, R. K., & Powell, S. A framework for viewing the process of standard setting. Evaluation and the Health Professions, 1983, 6, 3–24.
  18. Hambleton, R. K., Swaminathan, H., Algina, J., & Coulson, D. B. Criterion-referenced testing and measurement: A review of technical issues and developments. Review of Educational Research, 1978, 48, 1–47.
  19. Kane, M. T. The validity of licensure examinations. American Psychologist, 1982, 37, 911–918.
  20. Kirsch, I., & Guthrie, J. T. Construct validity of functional reading tests. Journal of Educational Measurement, 1980, 17, 81–93.
  21. Linn, R. L. Issues of validity in measurement for competency-based programs. In M. A. Bunda & J. R. Sanders (Eds.), Practices and problems in competency-based measurement. Washington, D.C.: National Council on Measurement in Education, 1979.
  22. Linn, R. L. Issues of validity for criterion-referenced measures. Applied Psychological Measurement, 1980, 4, 547–561.
  23. Lord, F. M., & Novick, M. R. Statistical theories of mental test scores. Reading, Mass.: Addison-Wesley, 1968.
  24. Madaus, G. (Ed.). The courts, validity, and minimum competency testing. Boston, Mass.: Kluwer-Nijhoff Publishing Co., 1983.
  25. Messick, S. A. The standard problem: Meaning and values in measurement and evaluation. American Psychologist, 1975, 30, 955–966.
  26. Millman, J. Criterion-referenced measurement. In W. J. Popham (Ed.), Evaluation in education: Current applications. Berkeley, Calif.: McCutchan, 1974.
  27. Popham, W. J. An approaching peril: Cloud referenced tests. Phi Delta Kappan, 1974, 56, 614–615.
  28. Popham, W. J. Criterion-referenced measurement. Englewood Cliffs, N.J.: Prentice-Hall, 1978.
  29. Popham, W. J., & Husek, T. R. Implications of criterion-referenced measurement. Journal of Educational Measurement, 1969, 6, 1–9.
  30. Roid, G., & Haladyna, T. A technology for item writing. New York: Academic Press, 1982.
  31. Rovinelli, R. J., & Hambleton, R. K. On the use of content specialists in the assessment of criterion-referenced test item validity. Dutch Journal of Educational Research, 1977, 2, 49–60.
  32. Subkoviak, M. J. Estimating reliability for the single administration of a mastery test. Journal of Educational Measurement, 1976, 13, 265–276.
  33. van der Linden, W. J. Decision models for use with criterion-referenced tests. Applied Psychological Measurement, 1980, 4, 469–492.
  34. Ward, W. C., Frederiksen, N., & Carlson, S. B. Construct validity of free-response and machine-scorable forms of a test. Journal of Educational Measurement, 1980, 17, 11–29.

Copyright information

© Plenum Press, New York 1985

Authors and Affiliations

  • Ronald K. Hambleton
  1. Laboratory of Psychometric and Evaluative Research, University of Massachusetts, Amherst, USA