Abstract

The concepts and methods of psychometrics originated under trait and behavioral psychology, with relatively simple data, used mainly for purposes of prediction and selection. Ideas emerged that nevertheless hold value for the new psychological perspectives, contexts of use, and forms of data and analytic tools we are now seeing. In this chapter we review some fundamental models and ideas from psychometrics that can be profitably reconceived, extended, and augmented in the new world of assessment. Methods we address include classical test theory, generalizability theory, item response theory, latent class models, cognitive diagnosis models, factor analysis, hierarchical models, and Bayesian networks. Key concepts are these: (1) the essential nature of psychometric models (observations, constructs, latent variables, and probability-based reasoning); (2) the interplay of design and discovery in assessment; (3) understanding the measurement issues of validity, reliability, comparability, generalizability, and fairness as social values that pertain even as forms of data, analysis, context, and purpose evolve.
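As one concrete illustration of the probability-based reasoning these models share (written here in standard IRT notation; it is not an excerpt from the chapter), the two-parameter logistic model expresses the probability that person i answers item j correctly in terms of a latent proficiency \theta_i, an item discrimination a_j, and an item difficulty b_j:

    P(X_{ij} = 1 \mid \theta_i) = \frac{\exp\left[a_j(\theta_i - b_j)\right]}{1 + \exp\left[a_j(\theta_i - b_j)\right]}

Each of the model families named above links observable responses to latent variables through a conditional probability structure of this general kind.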


Notes

  1. Most of the other methodological chapters provide data and computer code for examples. The R or Python code for those chapters can be found at the GitHub repository of this book, https://github.com/jgbrainstorm/computational_psychometrics. This chapter is instead meant to survey a large number of models and discuss underlying concepts. Fortunately, the literature offers many examples, tutorials, and more technical presentations on psychometric models. Many useful R packages are freely available for the models we discuss; the CRAN project web site maintains a comprehensive listing and brief descriptions of such resources at https://cran.r-project.org/web/views/Psychometrics.html (see the illustrative sketch following these notes).

  2. Note that for some of the models, this notation is not the one typically used within that modeling framework.
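As a pointer to how such packages are used, the following is a minimal sketch, not taken from the chapter or its repository: it simulates dichotomous responses under a two-parameter logistic (2PL) IRT model and fits that model with the CRAN package mirt. The package choice, simulated-data setup, and parameter values are illustrative assumptions.

    # Minimal sketch (assumption: the 'mirt' package from the CRAN Psychometrics task view)
    library(mirt)

    set.seed(123)
    n_persons <- 500
    n_items   <- 10
    theta <- rnorm(n_persons)            # latent proficiencies
    a <- runif(n_items, 0.8, 2.0)        # item discriminations
    b <- rnorm(n_items)                  # item difficulties

    # 2PL response probabilities, then Bernoulli draws for each person-item pair
    p <- plogis(outer(theta, seq_len(n_items),
                      function(th, j) a[j] * (th - b[j])))
    resp <- matrix(rbinom(length(p), 1, p), nrow = n_persons)
    colnames(resp) <- paste0("item", seq_len(n_items))

    # Fit a unidimensional 2PL and report item parameters in IRT parameterization
    fit <- mirt(resp, model = 1, itemtype = "2PL", verbose = FALSE)
    coef(fit, simplify = TRUE, IRTpars = TRUE)$items

Analogous fits for the other model families surveyed in the chapter (latent class, cognitive diagnosis, factor analysis, and so on) are available through other packages listed in the CRAN Psychometrics task view linked above.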


Author information

Corresponding author

Correspondence to Robert J. Mislevy.


Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Mislevy, R.J., Bolsinova, M. (2021). Concepts and Models from Psychometrics. In: von Davier, A.A., Mislevy, R.J., Hao, J. (eds) Computational Psychometrics: New Methodologies for a New Generation of Digital Learning and Assessment. Methodology of Educational Measurement and Assessment. Springer, Cham. https://doi.org/10.1007/978-3-030-74394-9_6


  • DOI: https://doi.org/10.1007/978-3-030-74394-9_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-74393-2

  • Online ISBN: 978-3-030-74394-9

  • eBook Packages: Education, Education (R0)
