Criterion-Referenced Assessment of Individual Differences
Abstract
The field of criterion-referenced testing has developed quickly since the first papers on the topic by Glaser (1963) and Popham and Husek (1969). Glaser, and later Popham and Husek, were interested in assessment methods that could provide information on which to base a number of individual and programmatic decisions arising in connection with specific instructional objectives or competencies. Norm-referenced tests were judged to be inappropriate because they provide information that facilitates comparisons among examinees on broad traits or constructs; they were not intended to measure specific objectives. Even if items in a norm-referenced test could be matched to objectives, there would typically be too few test items per objective to permit valid criterion-referenced test score interpretations.
Keywords
Test Score, Test Item, Content Validity, Item Statistic, Educational Measurement
References
- Berk, R. A. Determination of optimal cutting scores in criterion-referenced measurement. Journal of Experimental Education, 1976, 45, 4–9.
- Berk, R. A. (Ed.). Criterion-referenced measurement: The state of the art. Baltimore, Md.: Johns Hopkins University Press, 1980a.
- Berk, R. A. A consumer’s guide to criterion-referenced test reliability. Journal of Educational Measurement, 1980b, 17, 323–349.
- Campbell, D. T., & Fiske, D. W. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 1959, 56, 81–105.
- Cox, R. C., & Vargas, J. S. A comparison of item selection techniques for norm-referenced and criterion-referenced tests. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, 1966.
- Cronbach, L. J. Test validation. In R. L. Thorndike (Ed.), Educational measurement. Washington, D.C.: American Council on Education, 1971.
- Fitzpatrick, A. R. The meaning of content validity. Applied Psychological Measurement, 1983, 7, 3–13.
- Glaser, R. Instructional technology and the measurement of learning outcomes. American Psychologist, 1963, 18, 519–521.
- Gray, W. M. A comparison of Piagetian theory and criterion-referenced measurement. Review of Educational Research, 1978, 48, 223–249.
- Haladyna, T., & Roid, G. The role of instructional sensitivity in the empirical review of criterion-referenced test items. Journal of Educational Measurement, 1981, 18, 39–53.
- Hambleton, R. K. Test score validity and standard-setting methods. In R. A. Berk (Ed.), Criterion-referenced measurement: The state of the art. Baltimore, Md.: Johns Hopkins University Press, 1980.
- Hambleton, R. K. Advances in criterion-referenced testing technology. In C. R. Reynolds & T. B. Gutkin (Eds.), The handbook of school psychology. New York: Wiley, 1982.
- Hambleton, R. K. Validating the test scores. In R. Berk (Ed.), A guide to criterion-referenced test construction. Baltimore, Md.: Johns Hopkins University Press, 1984.
- Hambleton, R. K., & deGruijter, D. N. M. Application of item response models to criterion-referenced test item selection. Journal of Educational Measurement, 1983, 20, 355–367.
- Hambleton, R. K., & Eignor, D. R. Guidelines for evaluating criterion-referenced tests and test manuals. Journal of Educational Measurement, 1978, 15, 321–327.
- Hambleton, R. K., & Novick, M. R. Toward an integration of theory and method for criterion-referenced tests. Journal of Educational Measurement, 1973, 10, 159–170.
- Hambleton, R. K., & Powell, S. A framework for viewing the process of standard setting. Evaluation and the Health Professions, 1983, 6, 3–24.
- Hambleton, R. K., Swaminathan, H., Algina, J., & Coulson, D. B. Criterion-referenced testing and measurement: A review of technical issues and developments. Review of Educational Research, 1978, 48, 1–47.
- Kane, M. T. The validity of licensure examinations. American Psychologist, 1982, 37, 911–918.
- Kirsch, I., & Guthrie, J. T. Construct validity of functional reading tests. Journal of Educational Measurement, 1980, 17, 81–93.
- Linn, R. L. Issues of validity in measurement for competency-based programs. In M. A. Bunda & J. R. Sanders (Eds.), Practices and problems in competency-based measurement. Washington, D.C.: National Council on Measurement in Education, 1979.
- Linn, R. L. Issues of validity for criterion-referenced measures. Applied Psychological Measurement, 1980, 4, 547–561.
- Lord, F. M., & Novick, M. R. Statistical theories of mental test scores. Reading, Mass.: Addison-Wesley, 1968.
- Madaus, G. (Ed.). The courts, validity, and minimum competency testing. Boston, Mass.: Kluwer-Nijhoff Publishing Co., 1983.
- Messick, S. A. The standard problem: Meaning and values in measurement and evaluation. American Psychologist, 1975, 30, 955–966.
- Millman, J. Criterion-referenced measurement. In W. J. Popham (Ed.), Evaluation in education: Current applications. Berkeley, Calif.: McCutchan, 1974.
- Popham, W. J. An approaching peril: Cloud referenced tests. Phi Delta Kappan, 1974, 56, 614–615.
- Popham, W. J. Criterion-referenced measurement. Englewood Cliffs, N.J.: Prentice-Hall, 1978.
- Popham, W. J., & Husek, T. R. Implications of criterion-referenced measurement. Journal of Educational Measurement, 1969, 6, 1–9.
- Roid, G., & Haladyna, T. A technology for item writing. New York: Academic Press, 1982.
- Rovinelli, R. J., & Hambleton, R. K. On the use of content specialists in the assessment of criterion-referenced test item validity. Dutch Journal of Educational Research, 1977, 2, 49–60.
- Subkoviak, M. J. Estimating reliability for the single administration of a mastery test. Journal of Educational Measurement, 1976, 13, 265–276.
- van der Linden, W. J. Decision models for use with criterion-referenced tests. Applied Psychological Measurement, 1980, 4, 469–492.
- Ward, W. C., Frederiksen, N., & Carlson, S. B. Construct validity of free-response and machine-scorable forms of a test. Journal of Educational Measurement, 1980, 17, 11–29.