Future of Psychometrics: Ask What Psychometrics Can Do for Psychology

  • Published: 03 December 2011
  • Volume 77, pages 4–20, (2012)
  • Klaas Sijtsma

Abstract

I address two issues that were inspired by my work on the Dutch Committee on Tests and Testing (COTAN). The first issue is understanding the problems that test constructors and researchers who use tests have with psychometric knowledge. I argue that this understanding is important for a field like psychometrics, in which the dissemination of psychometric knowledge among test constructors and researchers in general is highly important. The second issue concerns the identification of psychometric research topics that are relevant for test constructors and test users but that, in my view, do not receive enough attention in psychometrics. I discuss the influence of test length on decision quality in personnel selection and on the quality of difference scores in therapy assessment, as well as theory development in test construction and validity research. I also briefly mention the issue of whether particular attributes are continuous or discrete.

Author information

Authors and Affiliations

  1. Department of Methodology and Statistics, TSB, Tilburg University, PO Box 90153, 5000 LE, Tilburg, The Netherlands

    Klaas Sijtsma

Corresponding author

Correspondence to Klaas Sijtsma.

Additional information

This article is based on the author’s Presidential Address, presented at the International Meeting of the Psychometric Society 2011, July 18–22, 2011, Hong Kong, China.

About this article

Cite this article

Sijtsma, K. Future of Psychometrics: Ask What Psychometrics Can Do for Psychology. Psychometrika 77, 4–20 (2012). https://doi.org/10.1007/s11336-011-9242-4

  • Issue Date: January 2012

  • DOI: https://doi.org/10.1007/s11336-011-9242-4

Key words

  • change assessment
  • decision quality based on short tests
  • didactics of psychometrics
  • personnel selection
  • test-quality assessment
  • test validity
  • theory construction