Abstract
This study evaluated the reliability of pass and fail classifications for several teacher certification tests. Because these tests use a cut score to classify examinees as pass or fail, evaluating the accuracy and consistency of these classifications is important. The classification accuracy and consistency statistics were estimated using the RELCLASS software. Results indicated the following. (1) The 29 teacher certification tests that were examined had relatively high classification accuracy (0.827 to 0.999) and consistency (0.760 to 0.999). (2) Both classification accuracy and consistency increased as the difference between the mean score and the cut score increased. (3) Classification accuracy and consistency were higher for tests consisting of multiple-choice (MC) items than for tests consisting only of constructed-response (CR) items or of a combination of CR and MC items.
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs12646-012-0147-9/MediaObjects/12646_2012_147_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs12646-012-0147-9/MediaObjects/12646_2012_147_Fig2_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs12646-012-0147-9/MediaObjects/12646_2012_147_Fig3_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs12646-012-0147-9/MediaObjects/12646_2012_147_Fig4_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs12646-012-0147-9/MediaObjects/12646_2012_147_Fig5_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs12646-012-0147-9/MediaObjects/12646_2012_147_Fig6_HTML.gif)
Notes
Researchers interested in conducting similar studies can use the procedural steps documented in Livingston and Lewis (1995) for computing the reliability of classification.
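As a rough illustration of what classification accuracy and consistency measure, the sketch below simulates them under simple classical test theory assumptions. It is not the Livingston and Lewis (1995) procedure and not what RELCLASS computes: it assumes normally distributed true scores and errors, and the function name `simulate_classification` and all parameter values (mean 50, standard deviation 10, reliability 0.85) are hypothetical choices made only for this example. Accuracy is taken here as agreement between the observed decision and the decision based on the true score; consistency is agreement between decisions on two parallel forms.

```python
import random


def simulate_classification(mean, sd, reliability, cut, n=20000, seed=0):
    """Estimate pass/fail classification accuracy and consistency by simulation.

    Assumption: true scores and measurement errors are normal and independent
    (classical test theory). This is an illustrative stand-in, not the
    Livingston-Lewis estimation procedure.
    """
    rng = random.Random(seed)
    true_sd = sd * reliability ** 0.5          # SD of true scores
    error_sd = sd * (1 - reliability) ** 0.5   # standard error of measurement
    accurate = 0
    consistent = 0
    for _ in range(n):
        true_score = rng.gauss(mean, true_sd)
        form_a = true_score + rng.gauss(0, error_sd)  # observed score, form A
        form_b = true_score + rng.gauss(0, error_sd)  # observed score, parallel form B
        # Accuracy: observed decision agrees with the true-score decision.
        accurate += (form_a >= cut) == (true_score >= cut)
        # Consistency: decisions on the two parallel forms agree.
        consistent += (form_a >= cut) == (form_b >= cut)
    return accurate / n, consistent / n


if __name__ == "__main__":
    # Move the cut score progressively further below the mean.
    for cut in (50, 40, 30):
        acc, con = simulate_classification(mean=50, sd=10, reliability=0.85, cut=cut)
        print(f"cut={cut}: accuracy={acc:.3f}, consistency={con:.3f}")
```

Under these assumptions, both indices should rise as the cut score moves away from the mean, which matches the direction of the second finding reported in the abstract. The simulated values themselves are not comparable to the study's results.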
References
Anderson, D. O., & Schneider, C. (2002, October). Reliability of tests used for classification. Paper presented at the annual conference of the Northeastern Educational Research Association, Kerhonkson, NY.
Breyer, F. J., & Lewis, C. (1994). Pass-fail reliability for tests with cut scores: A simplified method (ETS Research Report No. 94–39). Princeton: Educational Testing Service.
Lee, W., Hanson, B. A., & Brennan, R. L. (2000). Procedures for computing classification consistency and accuracy indices with multiple categories (ACT Research Report No. 2000–10). Iowa City: American College Testing.
Livingston, S. A., & Lewis, C. (1995). Estimating the consistency and accuracy of classifications based on test scores. Journal of Educational Measurement, 32(2), 179–197.
Livingston, S. A., & Wingersky, M. S. (1979). Assessing the reliability of tests used to make pass/fail decisions. Journal of Educational Measurement, 16(4), 247–260.
Mroczka, R. C. (2000). RELCLASS-COMP (SOSA8P Version 4.11). Princeton: Educational Testing Service.
Subkoviak, M. J. (1984). Estimating the reliability of mastery-nonmastery classifications. In R. A. Berk (Ed.), A guide to criterion-referenced test construction. Baltimore: Johns Hopkins University Press.
Cite this article
Puhan, G., Gall, L. Reliability of Pass and Fail Decisions on Tests Employing Cut Scores. Psychol Stud 57, 273–282 (2012). https://doi.org/10.1007/s12646-012-0147-9
DOI: https://doi.org/10.1007/s12646-012-0147-9