Modeling Covarying Responses in Complex Tasks

Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS,volume 393)

Abstract

In testing situations, participants are often asked for supplementary responses in addition to the primary response of interest, such as confidence ratings or reported difficulty. These additional responses can be incorporated into a psychometric model either as a predictor of the main response or as a secondary response. In this paper we explore both approaches for incorporating participants' reported difficulty into a psychometric model, using an error rate study of fingerprint examiners. Participants were asked to analyze print pairs and make determinations about their source, which can be scored as correct or incorrect decisions. Participants were also asked to rate the difficulty of each print pair on a five-point scale. We model (a) the scored responses of individual examiners with a Rasch model that does not incorporate reported difficulty, (b) the responses with reported difficulty as a predictor, and (c) the responses and reported difficulty jointly as a multivariate response. We find that approach (c) yields more balanced classification errors, but incorporating reported difficulty under either approach does not lead to substantive changes in proficiency or difficulty estimates. These results suggest that, while there are individual differences in reported difficulty, those differences appear to be unrelated to examiners' proficiency in correctly distinguishing matched from non-matched fingerprints.
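
To make the three modeling approaches concrete, the sketch below shows how models of this kind can be specified in brms (Bürkner, 2017), the R interface to Stan cited in the references. Under approach (a), the probability that an examiner correctly evaluates a print pair follows a Rasch form, logit P(correct) = theta_examiner - b_item, fit with crossed random effects; approach (b) adds reported difficulty as a predictor; approach (c) models the scored decision and the five-point reported difficulty jointly with correlated examiner and item effects. This is a minimal sketch rather than the authors' code: the data frame blackbox and the columns correct, reported_difficulty, examiner, and item are hypothetical stand-ins for the Black Box study variables.

    library(brms)

    # (a) Rasch-style model: correct/incorrect decisions with crossed random
    # effects for examiner proficiency and item (print-pair) difficulty.
    fit_rasch <- brm(
      correct ~ 1 + (1 | examiner) + (1 | item),
      data = blackbox, family = bernoulli()
    )

    # (b) Reported difficulty entered as a predictor of the scored decision.
    fit_pred <- brm(
      correct ~ 1 + reported_difficulty + (1 | examiner) + (1 | item),
      data = blackbox, family = bernoulli()
    )

    # (c) Joint model: the scored decision (Bernoulli) and the five-point
    # reported difficulty (ordinal) are modeled together; the shared |e| and
    # |i| terms allow examiner and item effects to correlate across responses.
    bf_correct    <- bf(correct ~ 1 + (1 | e | examiner) + (1 | i | item),
                        family = bernoulli())
    bf_difficulty <- bf(reported_difficulty ~ 1 + (1 | e | examiner) + (1 | i | item),
                        family = cumulative())
    fit_joint <- brm(bf_correct + bf_difficulty, data = blackbox)

Fits of this form can then be compared using posterior predictive checks and LOO/WAIC (Sinharay et al., 2006; Vehtari et al., 2017), which brms exposes through pp_check() and loo().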

Keywords

  • Item response theory
  • Forensic science
  • Bayesian statistics

This work was partially funded by the Center for Statistics and Applications in Forensic Evidence (CSAFE) through Cooperative Agreements 70NANB15H176 and 70NANB20H019 between NIST and Iowa State University, which includes activities carried out at Carnegie Mellon University, Duke University, University of California Irvine, University of Virginia, West Virginia University, University of Pennsylvania, Swarthmore College and University of Nebraska, Lincoln.

Notes

  1. These latent evaluation categories may vary depending on different laboratory practices. We use the categories that were recorded in the Black Box study (Ulery et al., 2011).

  2. Individualizations are no longer recommended in practice, in favor of ‘identification’ or ‘same source’ conclusions. Since the data used in this paper was collected in 2011 and used the ‘Individualization’ terminology, this is what we use throughout. See Friction Ridge Subcommittee of the Organization of Scientific Area Committees for Forensic Science (2017, 2019) for further discussion and current recommendations.

References

  • AAAS. (2017). Forensic science assessments: A quality and gap analysis - latent fingerprint examination. Technical report (prepared by William Thompson, John Black, Anil Jain, and Joseph Kadane).

  • Batchelder, W. H., & Romney, A. K. (1988). Test theory without an answer key. Psychometrika, 53(1), 71–92.

  • Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01

  • Bürkner, P. C. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, 80(1), 1–28.

  • Bürkner, P. C. (2019). Bayesian item response modeling in R with brms and Stan. Preprint, arXiv:1905.09501.

  • De Boeck, P., & Partchev, I. (2012). IRTrees: Tree-based item response models of the GLMM family. Journal of Statistical Software, Code Snippets, 48(1), 1–28. https://doi.org/10.18637/jss.v048.c01, https://www.jstatsoft.org/v048/c01

  • De Boeck, P., & Wilson, M. (2004). Explanatory item response models: A generalized linear and nonlinear approach. New York: Springer.

  • Dror, I. E., & Scurich, N. (2020). (Mis) use of scientific measurements in forensic science. Forensic Science International: Synergy, 2, 333–338.

  • Eldridge, H., De Donno, M., & Champod, C. (2021). Testing the accuracy and reliability of palmar friction ridge comparisons–a black box study. Forensic Science International, 318, 110457.

  • Ferrando, P. J., & Lorenzo-Seva, U. (2007). An item response theory model for incorporating response time data in binary personality items. Applied Psychological Measurement, 31(6), 525–543. https://doi.org/10.1177/0146621606295197

  • Fischer, G. H., & Molenaar, I. W. (2012). Rasch models: Foundations, recent developments, and applications. New York: Springer Science & Business Media.

  • Friction Ridge Subcommittee of the Organization of Scientific Area Committees for Forensic Science. (2017). Guideline for the articulation of the decision-making process leading to an expert opinion of source identification in friction ridge examinations. Online; accessed September 15, 2021.

  • Friction Ridge Subcommittee of the Organization of Scientific Area Committees for Forensic Science. (2019). Friction ridge process map (current practice). Online; accessed September 15, 2021.

  • Hofmann, H., Carriquiry, A., & Vanderplas, S. (2020). Treatment of inconclusives in the AFTE range of conclusions. Law, Probability and Risk, 19(3–4), 317–364.

  • Holland, P. W., & Wainer, H. (2012). Differential item functioning. Routledge.

  • Jeon, M., De Boeck, P., & van der Linden, W. (2017). Modeling answer change behavior: An application of a generalized item response tree model. Journal of Educational and Behavioral Statistics, 42(4), 467–490.

  • Koehler, J. J. (2007). Fingerprint error rates and proficiency tests: What they are and why they matter. Hastings LJ, 59, 1077.

  • Luby, A. (2019). Decision making in forensic identification tasks. In S. Tyner & H. Hofmann (Eds.), Open forensic science in R (Chap. 13). rOpenSci, US.

  • Luby, A., Mazumder, A., & Junker, B. (2020). Psychometric analysis of forensic examiner behavior. Behaviormetrika, 47, 355–384.

  • Luby, A., Mazumder, A., & Junker, B. (2021). Psychometrics for forensic fingerprint comparisons. In Quantitative psychology (pp. 385–397). Springer.

  • Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149–174.

  • R Core Team. (2013). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org/

  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Chicago: University of Chicago Press.

  • Sinharay, S., Johnson, M. S., & Stern, H. S. (2006). Posterior predictive assessment of item response theory models. Applied Psychological Measurement, 30(4), 298–321.

  • Stan Development Team. (2018a). RStan: The R interface to Stan. R package version 2.18.2. http://mc-stan.org/

  • Stan Development Team. (2018b). Stan modeling language users guide and reference manual. http://mc-stan.org

  • Thissen, D. (1983). Timed testing: An approach using item response theory. In D. J. Weiss (Ed.), New horizons in testing (pp. 179–203). San Diego: Academic.

  • Ulery, B. T., Hicklin, R. A., Buscaglia, J., & Roberts, M. A. (2011). Accuracy and reliability of forensic latent fingerprint decisions. Proceedings of the National Academy of Sciences, 108(19), 7733–7738.

  • Ulery, B. T., Hicklin, R. A., Buscaglia, J., & Roberts, M. A. (2012). Repeatability and reproducibility of decisions by latent fingerprint examiners. PloS One, 7(3), e32800.

  • van der Linden, W. J. (2006). A lognormal model for response times on test items. Journal of Educational and Behavioral Statistics, 31(2), 181–204.

  • van der Linden, W. J., Klein Entink, R. H., & Fox, J. P. (2010). IRT parameter estimation with response times as collateral information. Applied Psychological Measurement, 34(5), 327–347.

  • Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432.

  • Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. Journal of Machine Learning Research, 11(Dec), 3571–3594.

Author information

Corresponding author

Correspondence to Amanda Luby.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Luby, A., Thompson, R.E. (2022). Modeling Covarying Responses in Complex Tasks. In: Wiberg, M., Molenaar, D., González, J., Kim, JS., Hwang, H. (eds) Quantitative Psychology. IMPS 2021. Springer Proceedings in Mathematics & Statistics, vol 393. Springer, Cham. https://doi.org/10.1007/978-3-031-04572-1_6
