Abstract
According to Wollack and Schoenig (Cheating, in: Frey BB (ed) The SAGE encyclopedia of educational research, measurement, and evaluation, Sage, Thousand Oaks, pp 260–265, 2018), score differencing is one of six types of statistical methods used to detect test fraud. In this paper, we suggest the use of Bayes factors (e.g., Kass and Raftery in J Am Stat Assoc 90:773–795, 1995) for score differencing. A simulation study shows that the suggested approach performs slightly better than an existing frequentist approach. We demonstrate the usefulness of the suggested approach using a real data set that involves actual test fraud.
Notes
Note that the term “score differencing” was used in only one of these references. However, the methods suggested in these references are various versions of “score differencing.”
Note that \(\int _{\theta _1=-\infty }^{\theta _1=\infty }\int _{\theta _2=-\infty }^{\theta _2=\infty }2 \phi (\theta _1)\frac{1}{\sqrt{10}}\phi (\frac{\theta _2}{\sqrt{10}})I(\theta _2\ge \theta _1)\mathrm{d}\theta _1\mathrm{d}\theta _2=1\).
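The factor of 2 in the integrand above is the normalizing constant of the truncated prior: if \(\theta _1\) and \(\theta _2\) are independent with \(N(0,1)\) and \(N(0,10)\) densities, as the integrand indicates, then \(\theta _2-\theta _1\) is normal with mean 0, so the event \(\theta _2\ge \theta _1\) has probability 1/2. The following is a minimal numerical check of that normalization, not code from the paper. It assumes only the integrand as written; the function names and the grid settings are illustrative choices.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncated_prior_mass(var2=10.0, lo=-10.0, hi=10.0, n=4000):
    """Total mass of 2*phi(t1)*N(t2; 0, var2)*I(t2 >= t1).

    The inner integral over t2 >= t1 has the closed form 1 - Phi(t1/sd2),
    so only the outer integral over t1 is done numerically (trapezoid rule
    on [lo, hi]; the mass outside that range is negligible).
    """
    sd2 = math.sqrt(var2)
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        t1 = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * 2.0 * phi(t1) * (1.0 - Phi(t1 / sd2))
    return total * h


mass = truncated_prior_mass()
print(round(mass, 6))
```

The same symmetry argument holds for any prior variance of \(\theta _2\), so changing `var2` should leave the result unchanged.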
Sinharay (2017a) noted this phenomenon: when \(\hat{\theta }_1\) and \(\hat{\theta }_2\) are very close, the conclusion for the corresponding examinees is that the score difference is not significant.
That can be achieved by using 1.64 as the cutoff for the SLR statistic and a simulation-based cutoff for the Bayes factor.
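The two kinds of cutoff mentioned in this note can be illustrated with a short sketch. It assumes, as the note does not restate, that the goal is a 5% one-sided Type I error rate for both statistics and that large values of a statistic count as evidence of fraud. Under those assumptions, 1.64 is approximately the 95th percentile of the standard normal distribution, and a simulation-based cutoff is the empirical 95th percentile of values of the statistic simulated under the null hypothesis. The helper `empirical_cutoff` and the standard-normal stand-in draws are hypothetical; in an actual application the draws would be values of the Bayes factor computed from data simulated under the null.

```python
import math
import random
from statistics import NormalDist


def empirical_cutoff(null_values, alpha=0.05):
    """Empirical (1 - alpha) quantile of statistic values simulated under the null."""
    ordered = sorted(null_values)
    # Position of the smallest value that at least a (1 - alpha) share of draws do not exceed.
    k = math.ceil((1.0 - alpha) * len(ordered)) - 1
    return ordered[k]


# Normal-theory cutoff for a 5% one-sided test (about 1.645).
normal_cutoff = NormalDist().inv_cdf(0.95)

# Stand-in null draws (hypothetical): replace with simulated values of the
# statistic of interest, e.g. Bayes factors computed under the null hypothesis.
rng = random.Random(2020)
null_draws = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
sim_cutoff = empirical_cutoff(null_draws)

print(round(normal_cutoff, 3), round(sim_cutoff, 3))
```

Because the stand-in draws are standard normal, the two cutoffs should roughly agree here; for a statistic without a known null distribution, only the simulation-based cutoff is available.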
In those simulations, however, we noticed a slight tendency for the Bayes factor to increase as the prior variances of the ability distributions increase.
Sinharay and Johnson (2020) made some progress regarding the use of the posterior probability for score differencing.
References
Allen J, Ghattas A (2016) Estimating the probability of traditional copying, conditional on answer-copying statistics. Appl Psychol Meas 40:258–273
Chen W-H, Thissen D (1997) Local dependence indexes for item pairs using item response theory. J Educ Behav Stat 22:265–289
Cizek GJ, Wollack JA (2017) Handbook of detecting cheating on tests. Routledge, Washington, DC
Cox DR (2006) Principles of statistical inference. Cambridge University Press, New York
Drasgow F, Levine MV, Williams EA (1985) Appropriateness measurement with polychotomous item response models and standardized indices. Br J Math Stat Psychol 38:67–86
Finkelman M, Weiss DJ, Kim-Kang G (2010) Item selection and hypothesis testing for the adaptive measurement of change. Appl Psychol Meas 34:238–254
Fischer GH (2003) The precision of gain scores under an item response theory perspective: a comparison of asymptotic and exact conditional inference about change. Appl Psychol Meas 27:3–26
Fox J-P, Mulder J, Sinharay S (2017) Bayes factor covariance testing in item response models. Psychometrika 82:979–1006
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2014) Bayesian data analysis, 3rd edn. Chapman and Hall, New York
Gu X, Mulder J, Deković M, Hoijtink H (2014) Bayesian evaluation of inequality constrained hypotheses. Psychol Methods 19:511–527
Guo J, Drasgow F (2010) Identifying cheating on unproctored internet tests: the Z-test and the likelihood ratio test. Int J Sel Assess 18:351–364
Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29–36
Hoijtink H, Mulder J, van Lissa C, Gu X (2019) A tutorial on testing hypotheses using the Bayes factor. Psychol Methods. https://doi.org/10.1037/met0000201
Jeffreys H (1961) Theory of probability. Oxford University Press, Oxford
Kass RE, Raftery AE (1995) Bayes factors. J Am Stat Assoc 90:773–795
Klugkist I, Laudy O, Hoijtink H (2005) Inequality constrained analysis of variance: a Bayesian approach. Psychol Methods 10:477–493
Masson MEJ (2011) A tutorial on a practical Bayesian alternative to null-hypothesis significance testing. Behav Res Methods 43:679–690
Morey RD, Romeijn J-W, Rouder JN (2016) The philosophy of Bayes factors and the quantification of statistical evidence. J Math Psychol 72:6–18
Mulder J, Klugkist I, van de Schoot R, Meeus WHJ, Selfhout M, Hoijtink H (2009) Bayesian model selection of informative hypotheses for repeated measurements. J Math Psychol 53:530–546
Orlando M, Thissen D (2000) Likelihood-based item-fit indices for dichotomous item response theory models. Appl Psychol Meas 24:50–64
Schönbrodt FD, Wagenmakers E-J, Zehetleitner M, Perugini M (2017) Sequential hypothesis testing with Bayes factors: efficiently testing mean differences. Psychol Methods 22:322–339
Sinharay S (2017a) Detection of item preknowledge using likelihood ratio test and score test. J Educ Behav Stat 42:46–68
Sinharay S (2017b) Which statistic should be used to detect item preknowledge when the set of compromised items is known? Appl Psychol Meas 41:403–421
Sinharay S (2018) Application of Bayesian methods for detecting fraudulent behavior on tests. Meas Interdiscip Res Perspect 16:100–113
Sinharay S, Jensen JL (2019) Higher-order asymptotics and its application to testing the equality of the examinee ability over two sets of items. Psychometrika 84:484–510
Sinharay S, Johnson MS (2020) The use of the posterior probability in score differencing. J Educ Behav Stat (in press)
Sinharay S, Duong MQ, Wood SW (2017) A new statistic for detection of aberrant answer changes. J Educ Meas 54:200–217
Skorupski WP, Wainer H (2017) The case for Bayesian methods when investigating test fraud. In: Cizek GJ, Wollack JA (eds) Handbook of detecting cheating on tests. Routledge, Washington, DC, pp 214–231
Stern HS (2005) Model inference or model selection: discussion of Klugkist, Laudy, and Hoijtink (2005). Psychol Methods 10:494–499
Tendeiro JN, Meijer RR (2014) Detection of invalid test scores: the usefulness of simple nonparametric statistics. J Educ Meas 51:239–259
Tijmstra J, Hoijtink H, Sijtsma K (2015) Evaluating manifest monotonicity using Bayes factors. Psychometrika 80:880–896
van der Linden WJ (2009) A bivariate lognormal response-time model for the detection of collusion between test takers. J Educ Behav Stat 34:378–394
van der Linden WJ, Lewis C (2015) Bayesian checks on cheating on tests. Psychometrika 80:689–706
Verhagen J, Levy R, Millsap RE, Fox J-P (2016) Evaluating evidence for invariant items: a Bayes factor applied to testing measurement invariance in IRT models. J Math Psychol 72:171–182
Wagenmakers E-J (2007) A practical solution to the pervasive problems of p values. Psychon Bull Rev 14:779–804
Wang X, Liu Y, Hambleton RK (2017) Detecting item preknowledge using a predictive checking method. Appl Psychol Meas 41:243–263
Wang X, Liu Y, Robin F, Guo H (2019) A comparison of methods for detecting examinee preknowledge of items. Int J Test 19:207–226
Warm TA (1989) Weighted likelihood estimation of ability in item response theory. Psychometrika 54:427–450
Wasserman L (2004) All of statistics: a concise course in statistical inference. Springer, New York
Wasserstein RL, Lazar NA (2016) The ASA’s statement on p-values: context, process, and purpose. Am Stat 70:129–133
Wetzels R, Matzke D, Lee MD, Rouder JN, Iverson GJ, Wagenmakers E-J (2011) Statistical evidence in experimental psychology. Perspect Psychol Sci 6:291–298
Wollack JA, Schoenig RW (2018) Cheating. In: Frey BB (ed) The SAGE encyclopedia of educational research, measurement, and evaluation. Sage, Thousand Oaks, pp 260–265
Wollack JA, Cohen AS, Eckerly CA (2015) Detecting test tampering using item response theory. Educ Psychol Meas 75:931–953
Acknowledgements
The authors wish to express sincere appreciation and gratitude to Wim van der Linden and Kazuo Shigemasu, the editors. The authors thank Sooyeon Kim, Carol Eckerly, and Daniel McCaffrey for their helpful comments on an earlier version. Any opinions expressed in this publication are those of the authors and not necessarily of ETS or of Institute of Education Sciences. The research was supported by the Institute of Education Sciences, US Department of Education, through Grant R305D170026.
Additional information
Communicated by Kazuo Shigemasu.
Any opinions expressed in this publication are those of the authors and not necessarily of Educational Testing Service or Institute of Education Sciences.
About this article
Cite this article
Sinharay, S., Johnson, M.S. Detecting test fraud using Bayes factors. Behaviormetrika 47, 339–354 (2020). https://doi.org/10.1007/s41237-020-00113-9
DOI: https://doi.org/10.1007/s41237-020-00113-9