Part of the book series: Evaluation in Education and Human Services Series ((EEHS,volume 28))

Abstract

One of the major changes in the testing field over the last 20 years has been the increased interest in and use of criterion-referenced tests (CRTs). Criterion-referenced tests provide a basis for assessing the performance of examinees in relation to well-defined domains of content rather than in relation to other examinees, as with norm-referenced tests. Criterion-referenced tests are now widely used (1) in the armed services, to assess the competencies of servicemen; (2) in industry, to assess the job skills of employees and to evaluate the results of training programs; (3) in the licensing and certification fields, to distinguish “masters” from “nonmasters” in over 900 professions in the United States alone; and (4) in educational settings such as schools, colleges, and universities, to assess the performance levels of students on competencies of interest.
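The distinction drawn above can be illustrated with a small sketch: a criterion-referenced interpretation compares each examinee to a fixed domain cut score, while a norm-referenced interpretation reports standing relative to the other examinees. The function names, the 20-item test, and the 80% cut score below are invented for illustration only.

```python
# Hypothetical sketch of the two score interpretations described above.
# The 80% cut score and the 20-item test are illustrative assumptions.

def criterion_referenced(score: int, n_items: int, cut: float = 0.80) -> str:
    """Classify an examinee against a fixed domain cut score."""
    return "master" if score / n_items >= cut else "nonmaster"

def norm_referenced(score: int, all_scores: list) -> float:
    """Report standing relative to other examinees (percentile rank)."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)

scores = [12, 15, 17, 18, 19]  # five examinees on a 20-item test
print(criterion_referenced(17, 20))        # -> master (17/20 = 0.85 >= 0.80)
print(norm_referenced(17, scores))         # -> 40.0 (two of five scored lower)
```

Note that the criterion-referenced classification of an examinee is unchanged by how the rest of the group performs, whereas the percentile rank depends entirely on it.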



Editor information

Ronald K. Hambleton, Jac N. Zaal


Copyright information

© 1991 Springer Science+Business Media New York

About this chapter

Cite this chapter

Hambleton, R.K., Rogers, H.J. (1991). Advances in Criterion-Referenced Measurement. In: Hambleton, R.K., Zaal, J.N. (eds) Advances in Educational and Psychological Testing: Theory and Applications. Evaluation in Education and Human Services Series, vol 28. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-2195-5_1

  • DOI: https://doi.org/10.1007/978-94-009-2195-5_1

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-7484-1

  • Online ISBN: 978-94-009-2195-5

  • eBook Packages: Springer Book Archive
