The Performance Appraisal Milieu: A Multilevel Analysis of Context Effects in Performance Ratings

  • Original Paper
Journal of Business and Psychology

Abstract

Purpose

The purpose of this study was to take an inductive approach in examining the extent to which organizational contexts represent significant sources of variance in supervisor performance ratings, and to explore various factors that may explain contextual rating variability.

Design/Methodology/Approach

Using archival field performance rating data from a large state law enforcement organization, we used a multilevel modeling approach to partition the variance in ratings attributable to ratees, raters, and rating contexts.
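
To make the idea of variance partitioning concrete, the following is a minimal sketch and not the authors' analysis. It swaps in classical method-of-moments (nested ANOVA) estimators for a multilevel model and assumes a perfectly balanced design in which each ratee is rated once by one rater, and raters are nested within contexts. The function name `partition_variance`, the data layout, and the toy numbers are all hypothetical.

```python
from statistics import mean


def partition_variance(ratings):
    """Split rating variance into context, rater, and ratee/residual shares.

    ratings[c][r] is the list of ratings given by rater r in context c.
    Assumes a balanced design (same number of raters per context and
    ratees per rater) and at least some variation in the ratings.
    """
    n_ctx = len(ratings)
    n_raters = len(ratings[0])
    n_ratees = len(ratings[0][0])

    # Means at each level of the hierarchy.
    rater_means = [[mean(cell) for cell in ctx] for ctx in ratings]
    ctx_means = [mean(row) for row in rater_means]
    grand_mean = mean(ctx_means)

    # Mean square among ratees within the same rater. With one rating per
    # ratee, ratee differences and residual error cannot be separated.
    ss_within = sum(
        (y - rater_means[c][r]) ** 2
        for c, ctx in enumerate(ratings)
        for r, cell in enumerate(ctx)
        for y in cell
    )
    ms_within = ss_within / (n_ctx * n_raters * (n_ratees - 1))

    # Mean square among raters within the same context.
    ss_rater = n_ratees * sum(
        (m - ctx_means[c]) ** 2
        for c, row in enumerate(rater_means)
        for m in row
    )
    ms_rater = ss_rater / (n_ctx * (n_raters - 1))

    # Mean square among contexts.
    ss_ctx = n_ratees * n_raters * sum((m - grand_mean) ** 2 for m in ctx_means)
    ms_ctx = ss_ctx / (n_ctx - 1)

    # Solve the expected-mean-square equations; truncate negatives at zero.
    components = {
        "ratee_residual": ms_within,
        "rater": max(0.0, (ms_rater - ms_within) / n_ratees),
        "context": max(0.0, (ms_ctx - ms_rater) / (n_ratees * n_raters)),
    }
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


# Hypothetical toy data: 2 contexts, 2 raters per context, 2 ratees per rater.
toy_ratings = [[[1, 3], [3, 5]], [[5, 7], [7, 9]]]
shares = partition_variance(toy_ratings)
print(shares)
```

In this toy example most of the variance falls at the context level, which is the kind of pattern that could be mistaken for rater idiosyncrasy if contexts were left out of the model. Real field data are rarely balanced, so a likelihood-based multilevel model would normally be used in place of these closed-form estimators.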

Findings

Results suggest that much of what is often interpreted as idiosyncratic rater variance may actually reflect systematic rating variability across contexts. In addition, performance-related and non-performance factors, including contextual rating tendencies, accounted for significant rating variability.

Implications

Supervisor ratings are the most common approach to measuring job performance, and understanding the nature and sources of rating variability is important for both research and practice. Given the many uses of performance rating data, our findings suggest that continuing to identify contextual sources of variability is particularly important for addressing criterion problems and improving ratings as a form of performance measurement.

Originality/Value

Numerous performance appraisal models suggest the importance of context; however, previous research had not partitioned the variance in supervisor ratings due to omnibus context effects in organizational settings. The use of a multilevel modeling approach allowed the examination of contextual influences, while controlling for ratee and rater characteristics.

Author information

Corresponding author

Correspondence to J. Kemp Ellington.

About this article

Cite this article

Ellington, J.K., Wilson, M.A. The Performance Appraisal Milieu: A Multilevel Analysis of Context Effects in Performance Ratings. J Bus Psychol 32, 87–100 (2017). https://doi.org/10.1007/s10869-016-9437-x

  • DOI: https://doi.org/10.1007/s10869-016-9437-x
