Psychometrics and its discontents: an historical perspective on the discourse of the measurement tradition

Schoenherr, Jordan Richard; Hamstra, Stanley J.

doi:10.1007/s10459-015-9623-z

Reflections
Published: 25 August 2015

Volume 21, pages 719–729, (2016)
Advances in Health Sciences Education

Jordan Richard Schoenherr^1,2 & Stanley J. Hamstra^3

Abstract

Psychometrics has recently come under extensive criticism within the medical education literature. Quantitative measurement with psychometric instruments such as response scales is thought to emphasize a narrow range of relevant learner skills and competencies. Recent reviews and commentaries suggest that a paradigm shift might be underway. We argue for caution: the psychometric approach and the quantitative account of competencies that it reflects are based on a rich discussion regarding measurement and scaling that led to the establishment of this paradigm. Rather than reflecting a homogeneous discipline focused on core competencies devoid of consideration of context, the psychometric community has a history of discourse and debate within the field, with an acknowledgement that the techniques and instruments developed within psychometrics are heuristics that must be used pragmatically.

Author information

Authors and Affiliations

Faculty of Medicine, University of Ottawa, Ottawa, Canada
Jordan Richard Schoenherr
Department of Psychology, Carleton University, Ottawa, Canada
Jordan Richard Schoenherr
Accreditation Council for Graduate Medical Education, 515 N. State Street, Suite 2000, Chicago, IL, 60654, USA
Stanley J. Hamstra

Corresponding author

Correspondence to Stanley J. Hamstra.

About this article

Cite this article

Schoenherr, J.R., Hamstra, S.J. Psychometrics and its discontents: an historical perspective on the discourse of the measurement tradition. Adv in Health Sci Educ 21, 719–729 (2016). https://doi.org/10.1007/s10459-015-9623-z

Received: 27 October 2014
Accepted: 13 July 2015
Published: 25 August 2015
Issue Date: August 2016
DOI: https://doi.org/10.1007/s10459-015-9623-z
