Detecting test fraud using Bayes factors

Sinharay, Sandip; Johnson, Matthew S.

doi:10.1007/s41237-020-00113-9

Invited Paper
Published: 02 May 2020

Volume 47, pages 339–354, (2020)

Behaviormetrika

Abstract

According to Wollack and Schoenig (Cheating, in: Frey BB (ed) The SAGE encyclopedia of educational research, measurement, and evaluation, Sage, Thousand Oaks, pp 260–265, 2018), score differencing is one of six types of statistical methods used to detect test fraud. In this paper, we suggest the use of Bayes factors (e.g., Kass and Raftery in J Am Stat Assoc 90:773–795, 1995) for score differencing. A simulation study shows that the suggested approach performs slightly better than an existing frequentist approach. We demonstrate the usefulness of the suggested approach using a real data set that involves actual test fraud.

Notes

Note that the term “score differencing” was used in only one of these references. However, the methods suggested in these references are various versions of “score differencing.”
Note that \(\int _{\theta _1=-\infty }^{\theta _1=\infty }\int _{\theta _2=-\infty }^{\theta _2=\infty }2 \phi (\theta _1)\frac{1}{\sqrt{10}}\phi (\frac{\theta _2}{\sqrt{10}})I(\theta _2\ge \theta _1)\mathrm{d}\theta _1\mathrm{d}\theta _2=1\).
Sinharay (2017a) noted this phenomenon, which occurs when \(\hat{\theta }_1\) and \(\hat{\theta }_2\) are very close: the corresponding examinees are concluded to show no significant score difference.
That can be achieved by using 1.64 as the cutoff for the SLR statistic and a simulation-based cutoff for the Bayes factor.
In those simulations, however, we noticed a slight tendency for the Bayes factor to increase as the prior variances of the ability distributions increased.
Sinharay and Johnson (2020) made some progress regarding the use of the posterior probability for score differencing.
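
Two numerical statements in the notes above can be checked with a short sketch. It assumes the integrand in the second note is read as a standard normal density for \(\theta _1\) times a normal density with variance 10 for \(\theta _2\), restricted to \(\theta _2\ge \theta _1\) and multiplied by 2; under that reading the integral equals \(2P(\theta _2\ge \theta _1)\), which is 1 by symmetry because \(\theta _2-\theta _1\) is normal with mean 0 and variance 11. The sketch also assumes that the cutoff 1.64 refers to a one-sided test at the 0.05 level under a standard normal reference distribution, which the notes do not state. The function name and the number of draws are illustrative choices, not taken from the paper.

```python
import random
from statistics import NormalDist


def normalizing_mass(n_draws=200_000, seed=0):
    """Monte Carlo estimate of 2 * P(theta2 >= theta1).

    Assumed reading of the note: theta1 ~ N(0, 1) and theta2 ~ N(0, 10)
    are independent, and the factor 2 rescales the truncated density.
    """
    rng = random.Random(seed)
    sd_theta2 = 10 ** 0.5  # standard deviation for a variance of 10
    hits = 0
    for _ in range(n_draws):
        theta1 = rng.gauss(0.0, 1.0)
        theta2 = rng.gauss(0.0, sd_theta2)
        if theta2 >= theta1:
            hits += 1
    return 2.0 * hits / n_draws


# Simulation-based value of the double integral.
estimate = normalizing_mass()

# Closed form: theta2 - theta1 ~ N(0, 11), so P(theta2 - theta1 >= 0) = 1/2.
exact = 2.0 * (1.0 - NormalDist(0.0, 11 ** 0.5).cdf(0.0))

# Assumption: 1.64 is the one-sided 5% critical value of N(0, 1).
one_sided_cutoff = NormalDist().inv_cdf(0.95)

print(f"Monte Carlo estimate of the integral: {estimate:.3f}")
print(f"Closed-form value of the integral:    {exact:.3f}")
print(f"One-sided 5% normal cutoff:           {one_sided_cutoff:.3f}")
```

The closed-form line is the more useful check; the simulation only confirms that the truncation to \(\theta _2\ge \theta _1\) removes half of the prior mass, which is why the factor 2 is needed.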

References

Allen J, Ghattas A (2016) Estimating the probability of traditional copying, conditional on answer-copying statistics. Appl Psychol Meas 40:258–273
Chen W-H, Thissen D (1997) Local dependence indexes for item pairs using item response theory. J Educ Behav Stat 22:265–289
Cizek GJ, Wollack JA (2017) Handbook of detecting cheating on tests. Routledge, Washington, DC
Cox DR (2006) Principles of statistical inference. Cambridge University Press, New York
Drasgow F, Levine MV, Williams EA (1985) Appropriateness measurement with polychotomous item response models and standardized indices. Br J Math Stat Psychol 38:67–86
Finkelman M, Weiss DJ, Kim-Kang G (2010) Item selection and hypothesis testing for the adaptive measurement of change. Appl Psychol Meas 34:238–254
Fischer GH (2003) The precision of gain scores under an item response theory perspective: a comparison of asymptotic and exact conditional inference about change. Appl Psychol Meas 27:3–26
Fox J-P, Mulder J, Sinharay S (2017) Bayes factor covariance testing in item response models. Psychometrika 82:979–1006
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2014) Bayesian data analysis, 3rd edn. Chapman and Hall, New York
Gu X, Mulder J, Deković M, Hoijtink H (2014) Bayesian evaluation of inequality constrained hypotheses. Psychol Methods 19:511–527
Guo J, Drasgow F (2010) Identifying cheating on unproctored internet tests: the Z-test and the likelihood ratio test. Int J Sel Assess 18:351–364
Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29–36
Hoijtink H, Mulder J, van Lissa C, Gu X (2019) A tutorial on testing hypotheses using the Bayes factor. Psychol Methods. https://doi.org/10.1037/met0000201
Jeffreys H (1961) Theory of probability. Oxford University Press, Oxford
Kass RE, Raftery AE (1995) Bayes factors. J Am Stat Assoc 90:773–795
Klugkist I, Laudy O, Hoijtink H (2005) Inequality constrained analysis of variance: a Bayesian approach. Psychol Methods 10:477–493
Masson MEJ (2011) A tutorial on a practical Bayesian alternative to null-hypothesis significance testing. Behav Res Methods 43:679–690
Morey RD, Romeijn J-W, Rouder JN (2016) The philosophy of Bayes factors and the quantification of statistical evidence. J Math Psychol 72:6–18
Mulder J, Klugkist I, van de Schoot R, Meeus WHJ, Selfhout M, Hoijtink H (2009) Bayesian model selection of informative hypotheses for repeated measurements. J Math Psychol 53:530–546
Orlando M, Thissen D (2000) Likelihood-based item-fit indices for dichotomous item response theory models. Appl Psychol Meas 24:50–64
Schönbrodt FD, Wagenmakers E-J, Zehetleitner M, Perugini M (2017) Sequential hypothesis testing with Bayes factors: efficiently testing mean differences. Psychol Methods 22:322–339
Sinharay S (2017a) Detection of item preknowledge using likelihood ratio test and score test. J Educ Behav Stat 42:46–68
Sinharay S (2017b) Which statistic should be used to detect item preknowledge when the set of compromised items is known? Appl Psychol Meas 41:403–421
Sinharay S (2018) Application of Bayesian methods for detecting fraudulent behavior on tests. Meas Interdiscip Res Perspect 16:100–113
Sinharay S, Jensen JL (2019) Higher-order asymptotics and its application to testing the equality of the examinee ability over two sets of items. Psychometrika 84:484–510
Sinharay S, Johnson MS (2020) The use of the posterior probability in score differencing. J Educ Behav Stat (in press)
Sinharay S, Duong MQ, Wood SW (2017) A new statistic for detection of aberrant answer changes. J Educ Meas 54:200–217
Skorupski WP, Wainer H (2017) The case for Bayesian methods when investigating test fraud. In: Cizek GJ, Wollack JA (eds) Handbook of detecting cheating on tests. Routledge, Washington, DC, pp 214–231
Stern HS (2005) Model inference or model selection: discussion of Klugkist, Laudy, and Hoijtink (2005). Psychol Methods 10:494–499
Tendeiro JN, Meijer RR (2014) Detection of invalid test scores: the usefulness of simple nonparametric statistics. J Educ Meas 51:239–259
Tijmstra J, Hoijtink H, Sijtsma K (2015) Evaluating manifest monotonicity using Bayes factors. Psychometrika 80:880–896
van der Linden WJ (2009) A bivariate lognormal response-time model for the detection of collusion between test takers. J Educ Behav Stat 34:378–394
van der Linden WJ, Lewis C (2015) Bayesian checks on cheating on tests. Psychometrika 80:689–706
Verhagen J, Levy R, Millsap RE, Fox J-P (2016) Evaluating evidence for invariant items: a Bayes factor applied to testing measurement invariance in IRT models. J Math Psychol 72:171–182
Wagenmakers E-J (2007) A practical solution to the pervasive problems of p values. Psychon Bull Rev 14:779–804
Wang X, Liu Y, Hambleton RK (2017) Detecting item preknowledge using a predictive checking method. Appl Psychol Meas 41:243–263
Wang X, Liu Y, Robin F, Guo H (2019) A comparison of methods for detecting examinee preknowledge of items. Int J Test 19:207–226
Warm TA (1989) Weighted likelihood estimation of ability in item response theory. Psychometrika 54:427–450
Wasserman L (2004) All of statistics: a concise course in statistical inference. Springer, New York
Wasserstein RL, Lazar NA (2016) The ASA’s statement on p-values: context, process, and purpose. Am Stat 70:129–133
Wetzels R, Matzke D, Lee MD, Rouder JN, Iverson GJ, Wagenmakers E-J (2011) Statistical evidence in experimental psychology. Perspect Psychol Sci 6:291–298
Wollack JA, Schoenig RW (2018) Cheating. In: Frey BB (ed) The SAGE encyclopedia of educational research, measurement, and evaluation. Sage, Thousand Oaks, pp 260–265
Wollack JA, Cohen AS, Eckerly CA (2015) Detecting test tampering using item response theory. Educ Psychol Meas 75:931–953

Acknowledgements

The authors wish to express sincere appreciation and gratitude to Wim van der Linden and Kazuo Shigemasu, the editors. The authors thank Sooyeon Kim, Carol Eckerly, and Daniel McCaffrey for their helpful comments on an earlier version. Any opinions expressed in this publication are those of the authors and not necessarily those of ETS or of the Institute of Education Sciences. The research was supported by the Institute of Education Sciences, US Department of Education, through Grant R305D170026.

Author information

Authors and Affiliations

Educational Testing Service, Princeton, NJ, USA
Sandip Sinharay & Matthew S. Johnson

Corresponding author

Correspondence to Sandip Sinharay.

Additional information

Communicated by Kazuo Shigemasu.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Any opinions expressed in this publication are those of the authors and not necessarily of Educational Testing Service or Institute of Education Sciences.

About this article

Cite this article

Sinharay, S., Johnson, M.S. Detecting test fraud using Bayes factors. Behaviormetrika 47, 339–354 (2020). https://doi.org/10.1007/s41237-020-00113-9

Received: 20 September 2019
Accepted: 09 April 2020
Published: 02 May 2020
Issue Date: July 2020
DOI: https://doi.org/10.1007/s41237-020-00113-9
