An approximation of the k out of n reliability of a test, and a scoring procedure for determining which items an examinee knows

Wilcox, Rand R.

doi:10.1007/BF02294016


Published: June 1983

Psychometrika, Volume 48, pages 211–222 (1983)


Abstract

Consider any scoring procedure for determining whether an examinee knows the answer to a test item. Let $x_i = 1$ if a correct decision is made about whether the examinee knows the $i$th item; otherwise $x_i = 0$. The $k$ out of $n$ reliability of a test is $\rho_k = \Pr\left(\sum x_i \ge k\right)$. That is, $\rho_k$ is the probability of making at least $k$ correct decisions for a typical (randomly sampled) examinee. This paper proposes an approximation of $\rho_k$ that can be estimated with an answer-until-correct test. The paper also suggests a scoring procedure that might be used when $\rho_k$ is judged to be too small under a conventional scoring rule where it is decided an examinee knows if and only if the correct response is given.
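To make the definition of $\rho_k$ concrete, the following is a minimal Python sketch that evaluates $\Pr\left(\sum x_i \ge k\right)$ in the simplest possible setting. It assumes the $x_i$ are independent with known probabilities $p_i = \Pr(x_i = 1)$; that independence assumption, the function name `k_out_of_n_reliability`, and the input values are illustrative only and are not taken from the paper. In particular, this is not the approximation the paper proposes, which is not described in the abstract.

```python
from typing import Sequence


def k_out_of_n_reliability(p: Sequence[float], k: int) -> float:
    """Return Pr(sum of x_i >= k) for independent Bernoulli x_i.

    p[i] is the (hypothetical) probability of a correct decision on item i.
    Independence across items is an assumption of this sketch only.
    """
    n = len(p)
    # dist[j] holds the probability of exactly j correct decisions
    # among the items processed so far.
    dist = [1.0] + [0.0] * n
    for p_i in p:
        # Update from high j to low j so each item is counted once.
        for j in range(n, 0, -1):
            dist[j] = dist[j] * (1.0 - p_i) + dist[j - 1] * p_i
        dist[0] *= 1.0 - p_i
    # Sum the upper tail: at least k correct decisions.
    return sum(dist[max(k, 0):])


if __name__ == "__main__":
    # Hypothetical per-item probabilities for a five-item test.
    probs = [0.9, 0.85, 0.8, 0.95, 0.7]
    for k in range(len(probs) + 1):
        print(k, round(k_out_of_n_reliability(probs, k), 4))
```

When the items are not independent, the tail probability depends on the joint distribution of the $x_i$, so a calculation of this kind can only serve as a reference point.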



Author information

Authors and Affiliations

Department of Psychology, SGM621, University of Southern California, 90089, Los Angeles, California
Rand R. Wilcox


About this article

Cite this article

Wilcox, R.R. An approximation of the k out of n reliability of a test, and a scoring procedure for determining which items an examinee knows. Psychometrika 48, 211–222 (1983). https://doi.org/10.1007/BF02294016


Revised: 05 January 1982
Issue Date: June 1983
DOI: https://doi.org/10.1007/BF02294016

Key Word

latent class models
