Previous literature shows that managers who evaluate employees differentiate insufficiently between strong and weak performers, which leads to disadvantageous organizational outcomes. Bol et al. (Acc Organ Soc 51:64–73, 2016) demonstrate in an independent bonuses context that when managers can base their evaluations on accurate information, they differentiate more when allocating bonuses, but only when evaluation outcomes are transparent. Our experiment replicates and extends Bol et al. (2016) in a fixed bonus pool context. We investigate the effects of information accuracy and of giving managers the opportunity to write a justification to their different employees when making their bonus allocations, as an alternative way to create transparency. We hypothesize and find that justification increases managers’ differentiation in bonus allocations, but only, as in Bol et al. (2016), when performance information accuracy is high. With a path analysis we disentangle the underlying process: we find that justification increases managers’ expectations that employees will perceive the differentiation in the bonus allocations as fair, especially when information accuracy is high. Finally, these expectations are positively related to the degree to which managers differentiate in their bonus allocations.

  1.

    We use ‘compression of performance ratings’ and ‘centrality bias’ as synonyms, in line with Bol (2011) and Moers (2005).

  2.

    For fair performance evaluations the employees should not perceive any difference between their rewards-to-input ratio and the rewards-to-input ratio of any colleague (Adams 1963, 1965; Golman and Bhatia 2012; Walster et al. 1973). Fair performance evaluations therefore differ from equal performance evaluations, in which the allocated rewards to each employee are the same or similar, irrespective of employees’ input, output or work contributions (Walster et al. 1973). Equal performance evaluations are the result of centrality bias.

  3.

    Performance information accuracy refers to the extent to which information is informative about employees' effort. It refers to the variability around a point estimate of an employee’s effort level (Bol et al. 2016). Information accuracy is similar to information precision (Banker and Datar 1989).

  4.

    See Colella et al. (2007) for an overview of the costs and benefits related to pay secrecy.

  5.

    When evaluating performance, managers may justify to their supervisor, their subordinates or both (Ferris et al. 2008). However, the justification of performance evaluations to subordinates is the norm (Libby et al. 2004).

  6.

    We introduce a voluntary disclosure (a possibility to justify one’s bonus decision) as in Bol et al. (2016) instead of a forced disclosure (a requirement to justify one’s bonus decision) in order to increase the comparability between our study and Bol et al. (2016).

  7.

    The university at which the experiment took place granted approval for the experiment.

  8.

    The accuracy of the information used in subjective performance evaluations could theoretically range from ‘no accuracy’ to ‘full accuracy’; in practice, however, these extremes are rare. Totally inaccurate information (‘no accuracy’) does not provide a good basis for a subjective performance evaluation and will therefore not be used. Fully accurate information (‘full accuracy’) is more often associated with formula-based evaluations, which explicitly weigh each performance measure in a formula, than with subjective performance evaluations. However, even under highly to fully accurate information, one could prefer subjective performance evaluations in order to avoid “‘game-playing’ associated with any formula-based plan, the possibility that bonuses will be paid even when performance is ‘unbalanced’ (i.e., overachievement on some objectives and underachievement on others)” (Ittner et al. 2003).

  9.

    However, justification is not always positive. Ashton (1990) discusses how a justification requirement (instead of the justification possibility, as in our paper) can also alter the focus of evaluators from making good evaluations towards making ‘justifiable’ evaluations and good justification of evaluations. Next, Bartlett et al. (2014) show how the additional processing of information due to justification can increase judgment biases in the presence of both relevant and irrelevant information, as it increases the processing of all performance measures instead of only the relevant ones.

  10.

    Informational fairness perceptions refer to the extent to which employees perceive that they receive timely, accurate, and reasonable explanations about decision-making processes or outcomes (Colquitt 2001).

  11.

    We conduct the experiment in the z-Tree experimental software (Fischbacher 2007).

  12.

    Firms often use scorecards that only contain performance measures that are common to all business units (Cardinaels and van Veen-Dirks 2010).

  13.

    Bol et al. (2016) measure differentiation in the bonus allocation by the difference in bonuses allocated to the strongest and the weakest performer, which is similar to our variable Bonus Range. Bol (2011) measures differentiation in the bonus allocation by the ratio between the standard deviation of the objective performance measures and the standard deviation of the subjective performance ratings provided by a manager to all employees in a reference group. Our variable Differentiation is similar to the measure of Bol (2011), as the objective performance measures are the same in all treatments of our paper and we only focus on the standard deviation of the subjective performance ratings of all employees.

  14.

    Every p-value mentioned in this paper is a two-sided p-value.

  15.

    Even in the Low Accuracy conditions, however, participants thought the performance measures were quite accurate. Their average response to the statement “the three performance measures provided a highly accurate image of the performance of each individual store manager” was 4.13 on a seven-point scale, which is not significantly different from the scale midpoint of 4 (t(145) = 1.089; p = 0.278), but still rather high.

  16.

    ANOVA can detect significant differences between cell means, but it cannot identify the functional form of the relationship among cell means. Contrast analysis is a refinement of ANOVA that tests a specific functional form of that relationship (i.e., a specific pattern predicted for the cell means). Contrast coding has greater statistical power than conventional ANOVA (Buckless and Ravenscroft 1990). Contrast analysis is therefore more suitable than conventional factorial ANOVA for an ordinal interaction effect: it provides more statistical power to reveal the specific pattern of hypothesis 1, in which we predict that differentiation in the bonus allocation is significantly higher in the High Accuracy–Justification condition than in the other three experimental conditions. We refer to Buckless and Ravenscroft (1990) and Bol et al. (2016) for a more extensive discussion of this matter.

  17.

    A semi-omnibus F-test on the remaining between-group variance indicates that the mean value of Maximum Bonus is not significantly different across the three remaining experimental conditions (F(2, 288) = 0.080; p = 0.923).

  18.

    A semi-omnibus F-test on the remaining between-group variance indicates that the mean value of Minimum Bonus is not significantly different across the three remaining experimental conditions (F(2, 288) = 0.364; p = 0.695).

  19.

    A semi-omnibus F-test on the remaining between-group variance indicates that the mean value of the average bonus allocated to the three mediocre store managers is not significantly different across the three remaining experimental conditions (F(2, 288) = 0.017; p = 0.983).

  20.

    A semi-omnibus F-test on the remaining between-group variance indicates that the mean value of the standard deviation of the bonus allocated to the three mediocre store managers is not significantly different across the three remaining experimental conditions (F(2, 288) = 0.176; p = 0.838).

  21.

    We collected these data on a seven-point Likert scale at the end of the experiment, both to avoid drawing participants’ attention to the presence or absence of the justification possibility in their condition and to avoid leading participants toward certain response patterns.
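
As an illustration of the measures in footnote 13 and the tests in footnotes 16 to 20, the following minimal sketch computes Bonus Range, Differentiation, a planned contrast F-statistic, and the semi-omnibus F-statistic on the remaining between-group variance. It is not the authors' analysis code. The function names, the contrast weights (3, -1, -1, -1), the use of the sample standard deviation, and all data values are assumptions made for illustration only.

```python
from statistics import mean, stdev


def bonus_range(bonuses):
    """Difference between the highest and the lowest allocated bonus."""
    return max(bonuses) - min(bonuses)


def differentiation(bonuses):
    """Standard deviation of the bonuses one manager allocates.

    Assumption: sample standard deviation; the text does not say
    whether the sample or the population formula is used.
    """
    return stdev(bonuses)


def planned_contrast(groups, weights):
    """Return (F_contrast, F_residual, df_error) for one planned contrast.

    groups  : one list of observations per experimental condition
    weights : contrast weights that sum to zero, one per condition
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [mean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Error term: pooled within-condition variance.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_error = n_total - k
    ms_within = ss_within / df_error

    # Total between-condition sum of squares (k - 1 degrees of freedom).
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )

    # Sum of squares captured by the predicted pattern (1 degree of freedom).
    estimate = sum(w * m for w, m in zip(weights, means))
    scale = sum(w ** 2 / len(g) for w, g in zip(weights, groups))
    ss_contrast = estimate ** 2 / scale
    f_contrast = ss_contrast / ms_within

    # Semi-omnibus test: between-condition variance the contrast leaves
    # unexplained, with k - 2 degrees of freedom.
    ss_residual = ss_between - ss_contrast
    f_residual = (ss_residual / (k - 2)) / ms_within
    return f_contrast, f_residual, df_error


# Hypothetical bonus allocation by one manager (not data from the paper).
allocation = [30, 20, 20, 20, 10]
print(bonus_range(allocation), differentiation(allocation))

# Hypothetical scores per condition; the first list stands for the
# High Accuracy, Justification condition.
conditions = [[8, 10, 12], [4, 6, 8], [4, 5, 6], [6, 7, 8]]
print(planned_contrast(conditions, [3, -1, -1, -1]))
```

With four conditions, the residual test has 2 numerator degrees of freedom, which matches the F(2, 288) statistics reported in footnotes 17 to 20. The sketch returns F-statistics only; obtaining p-values would additionally require the F distribution, for example from a statistics package.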


We thank the editor and two anonymous reviewers, Jasmijn Bol, Eddy Cardinaels, John Christensen, Thomas De Groot, Henri Dekker, Sophie De Winne, Kathryn Kadous, Victor Maas, Karl Schuhmacher, Marcel Van Rinsum, Eelke Wiersma, and participants at the European Network for Experimental Accounting Research Summer School 2015, at the Amsterdam Research Center in Accounting seminar on 7 March 2016 at Vrije Universiteit Amsterdam, at the Annual Conference for Management Accounting Research 2016, and at the Annual Conference of the European Accounting Association 2016 for helpful comments.

