Advances in Health Sciences Education, Volume 19, Issue 3, pp 409–427

Exploring the role of first impressions in rater-based assessments

  • Timothy J. Wood


Medical education relies heavily on assessment formats that require raters to assess the competence and skills of learners. Unfortunately, the scores raters assign are often inconsistent and variable. To ensure the scores from these assessment tools have validity, it is important to understand the underlying cognitive processes that raters use when judging the abilities of their learners. The goal of this paper, therefore, is to contribute to a better understanding of the cognitive processes used by raters. Representative findings from the social judgment and decision making, cognitive psychology, and educational measurement literatures will be used to illuminate the underpinnings of these rater-based assessments. Of particular interest is the impact that judgments referred to as first impressions (or thin slices) have on rater-based assessments. These are judgments about people made very quickly and based on very little information. A narrative review will provide a synthesis of research in these three literatures (social judgment and decision making, educational psychology, and cognitive psychology) and will focus on the underlying cognitive processes, the accuracy, and the impact of first impressions on rater-based assessments. The application of these findings to the types of rater-based assessments used in medical education will then be reviewed. Gaps in understanding will be identified and suggested directions for future research studies will be discussed.


First impressions, Rater-based assessment, Rater-cognition



The author would like to thank Dr. Andrea Gingerich and Dr. Sydney Smee for their helpful discussions that led to this project and Dr. Stan Hamstra, Dr. Susan Humphrey-Murto, Kulamakan Mahan Kulasegaram, Sarah Lynch, Dr. Debra Pugh, Dr. Nikki Woods and the journal editor for their thoughtful review and suggestions regarding this manuscript.



Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  1. Academy for Innovation in Medical Education (AIME), RGN2206, Faculty of Medicine, University of Ottawa, Ottawa, Canada
