Advances in Health Sciences Education, Volume 17, Issue 2, pp 165–181

Validity considerations in the assessment of professionalism

  • Brian E. Clauser
  • Melissa J. Margolis
  • Matthew C. Holtman
  • Peter J. Katsufrakis
  • Richard E. Hawkins


During the last decade, interest in assessing professionalism in medical education has increased sharply and has led to the development of many new assessment tools. Efforts to validate the scores these tools produce have lagged well behind their development. This paper provides a structured framework for collecting evidence to support the validity of assessments of professionalism. It begins with a short history of the concept of validity in the context of psychological assessment and then describes Michael Kane’s approach to validity as a structured argument. The majority of the paper focuses on how Kane’s framework can be applied to assessments of professionalism. Examples are drawn from the literature, and recommendations for future investigation are made in areas where the literature is deficient.


Keywords: Validity argument; Assessment of professionalism



Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  • Brian E. Clauser (1)
  • Melissa J. Margolis (1)
  • Matthew C. Holtman (1)
  • Peter J. Katsufrakis (1)
  • Richard E. Hawkins (2)

  1. National Board of Medical Examiners, Philadelphia, USA
  2. American Board of Medical Specialties, Chicago, USA
