Many economic activities secure both immediate returns and rewards to reputation. For example, a company that bolsters earnings gets not only those monies, but the reward of a higher price/earnings ratio and hence a higher stock price. Mutual funds that perform well get an inflow of funds. In some cases, a player’s reputation reward will swamp any benefits from his immediate return. In the corporate context, there may be a “cash out” event, such as the sale of the whole company. For a division manager, a big promotion or a strong outside offer would play much the same role. This paper addresses situations where reputations are a heavily weighted consideration.

Asymmetric information drives our analysis. We assume that each player, hereafter agent, knows his quality, but others do not. His reputation will be inferred from his risk choice and the payoff that results. Our central question in this paper is how much risk good and bad agents will choose. Given that there are significant reputation rewards, we argue that even risk-neutral players will choose their risk level strategically. They will sacrifice some expected direct payoff in the hope of burnishing their reputations.

We look at situations where players’ risk choices are observable. Transactions in which corporations are sold illustrate this case well. Financial markets can see whether a retail chain attempted to expand rapidly—swifter expansion entails higher risk—or how diversified a Real Estate Investment Trust (REIT) is across metropolitan areas.

The sequence of moves is as follows: 1. The agent chooses his risk level, and that choice is observable to all. 2. Both the agent and outsiders learn the outcome. 3. The outsiders then contract with the agent based on the inferences they draw from what they observe. That is, outsiders update their assessment of the agent using two signals: (1) the lottery outcome, and (2) the level of risk chosen, which serves as a “signal” in the Spence (1974) sense. We show, first, that agents will often pool on their risk choice, and second, that when they do, all agents, regardless of type, are induced into conspicuous conservatism: all choose a level of risk below the one that would maximize their expected direct payoff. The reason is that good agents are in the driver’s seat; low risk levels help them distinguish themselves. Bad agents must choose low risk levels as well, lest they reveal their type.

We proceed as follows. Section 1 presents the model, Section 2 develops our results, and Section 3 concludes.

1 The model

We assume that agents have private information regarding their quality. Agents care about the lottery outcome because they stand to gain from the reputation they acquire among outsiders, based in part on the lottery that they choose, and in part on the lottery outcome.

Agents choose a level of risk for their activity. We assume the risk choice to be continuous. Formally, each agent chooses from a one-dimensional family of random variables indexed by its variance, V, where \( \underline{V} < V < \overline{V} \).

The choice of V is common knowledge. That is, outsiders, who are drawing inferences about the agent’s type, see V and update accordingly. By contrast, when agents’ risk choices are unobservable, the situation considered earlier by Degeorge et al. (2004), hereafter DMZ, the results are quite different: the prime concepts in this paper, such as signaling strategies and pooling and separating equilibria, do not apply there. To facilitate contrasts, we retain the features of the DMZ model, with one crucial difference: in our setting risk choice is observable.

The random variable \( \widetilde{x}_{V} \) represents first-period performance (say, the test score, or company earnings). It is distributed normally with variance V and mean \( \mu {\left( {V,\theta } \right)} = \mu {\left( V \right)} + \theta \), where θ indexes the agent’s type. We will refer to the θ = 0 type agent as bad and the θ = Δ type as good, where Δ > 0. The prior probability that an agent is of the good type, denoted by p, is common knowledge, as is the mean-variance schedule μ(V).

Each agent chooses a point on the mean-variance schedule given by μ(V) and for given V an agent of type θ has performance

$$ \widetilde{x} \sim N{\left( {\mu {\left( V \right)} + \theta ,V} \right)}. $$

We posit that μ(V) is single-peaked, with its maximum at an interior point V*, and concave, i.e., the marginal benefit of adding variance decreases throughout, turning negative beyond V*. In fact, the mere existence of an interior maximum is sufficient for most of our theoretical results.Footnote 1 The existence of such an interior maximum follows naturally from the usual assumption that agents have only a finite supply of favorable lotteries.Footnote 2 Our theoretical results, apart from Claim 3, do not assume concavity. Single-peakedness of μ(V) is assumed for the proofs of Claim 1 and Claim 3.Footnote 3

It is convenient (though by no means essential) to assume that \( \mu {\left( V \right)} \to - \infty \) as \( V \to \underline{V} \) or \( V \to \overline{V} \), and we maintain this assumption throughout.
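To fix ideas, here is a minimal numerical sketch of this performance technology. The specific schedule \( \mu(V) = \log(V - \underline{V}) + \log(\overline{V} - V) \), the bounds, and the quality gap Δ are hypothetical illustrations, not values from the paper; the schedule is simply one concave choice with an interior peak that tends to \( -\infty \) at both endpoints.

```python
# Minimal numerical sketch of the performance technology (illustrative only).
# The schedule mu(V) below is a hypothetical choice, not the paper's: it is
# concave, peaks at an interior V*, and tends to -infinity at both endpoints.
import numpy as np

V_LO, V_HI = 1.0, 9.0          # hypothetical bounds (V_underline, V_overline)
DELTA = 1.0                    # quality gap: good type theta = DELTA, bad type theta = 0

def mu(V):
    """Hypothetical mean-variance schedule: concave, interior peak, -inf at the bounds."""
    return np.log(V - V_LO) + np.log(V_HI - V)

V_STAR = (V_LO + V_HI) / 2.0   # this mu is symmetric, so its peak sits at the midpoint

def draw_performance(V, theta, size=1, rng=None):
    """Draw first-period performance x ~ N(mu(V) + theta, V)."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(loc=mu(V) + theta, scale=np.sqrt(V), size=size)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_good = draw_performance(V_STAR, DELTA, size=100_000, rng=rng)
    x_bad = draw_performance(V_STAR, 0.0, size=100_000, rng=rng)
    print(f"mu(V*) = {mu(V_STAR):.3f}")
    print(f"mean good ~ {x_good.mean():.3f}, mean bad ~ {x_bad.mean():.3f}")  # differ by about DELTA
```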

At the beginning of period 0, an agent learns his type, and then chooses his desired variance V. His random performance x is then drawn according to the equation above, reaped by him, and observed by all, and the period ends. At the beginning of period 1, outsiders draw inferences, and the agent reaps the rewards to his reputation.

All agents are assumed to be risk-neutral, so that in a full-information setting agents would choose the mean-maximizing level of variance V*.Footnote 4 We assume risk neutrality to facilitate exposition and because we are confident the same qualitative results would apply—namely agents tilting toward conservatism in risk choice—if agents were risk averse, as is normally assumed. We investigate how asymmetric information leads to departures from this optimum. Choosing a variance less than V* represents risk-reducing behavior; choosing a variance greater than V* would increase risk.

We can formulate the last step in the game as the sale by the agent of his capital—human, organizational or intellectual—to a new long-term owner. It will clearly be optimal for the new owner to choose the level of variance V* that maximizes expected performance (we assume risk-neutrality throughout), so the expected return to an agent of type θ from next period onward is \( {\sum\limits_{t = 1}^\infty {\frac{{\mu {\left( {V^{*} } \right)} + \theta }} {{{\left( {1 + r} \right)}^{t} }}} } = \frac{{\mu {\left( {V^{*} } \right)} + \theta }} {r} \) where r is the discount rate. Thus, the expected present value (performance plus expected price) of an agent with performance x this period is

$$ x + \frac{{\mu {\left( {V^{*} } \right)}}} {r} + \frac{1} {r}E{\left[ {\left. {\theta \,} \right|{\text{all}}\,{\text{available}}\,{\text{information}}} \right]}, $$

where buyers use Bayesian analysis to compute the expectation term.

Agents choose V to maximize their expected total payoff, which is given by their present value, namely

$$ \Pi = {\mathop {\max }\limits_V }E{\left[ {\left. {\widetilde{x} + \frac{{\mu {\left( {V^{*} } \right)}}} {r} + \frac{1} {r}E{\left[ {\left. {\theta \,} \right|{\text{all}}\,{\text{available}}\,{\text{information}}} \right]}} \right|{\text{choice}}\,{\text{of}}\,V} \right]}. $$

If θ were observable, this problem would be trivial; we would have

$$ E{\left[ {\left. {\theta \,} \right|{\text{all}}\,{\text{available}}\,{\text{information}}} \right]} = \theta , $$

and agents would always set V = V*. We consider the case where θ is private information, so the agent faces a trade-off in maximizing his expected total payoff: setting V = V* clearly maximizes expected performance; however, deviating from V* changes the information that flows to outsiders, and may increase his expected reputation.

2 Results

All proofs are gathered in the Appendix.

In a standard signaling game, e.g., Cho and Kreps (1987), outsiders draw inferences from an agent’s choice, e.g., whether he goes to college. Our model adds one stage to a standard signaling game, namely a performance signal x that is produced by a probabilistic process and that is revealed to all. The agent receives the direct payment x, plus his expected price from an outsider. The total payment is given by the random variable

$$ \widetilde{x} + \frac{{\mu {\left( {V^{*} } \right)}}} {r} + \frac{1} {r}E{\left[ {\left. \theta \right|V,\widetilde{x}} \right]}. $$

We consider only pure strategy equilibria. When both types of agents choose the same V, a pooling equilibrium results. When agents choose different V’s, a separating equilibrium emerges. Even in a pooling equilibrium, the current performance is informative about each agent’s type. Thus our model involves both signaling and signal-jamming. The choice of variance is the signal in the Spence (1974) sense. The lottery outcome is the performance signal. Its degree of informativeness can be attenuated through a choice of high variance. When risk choice is observable, however, agents cannot surreptitiously manipulate variance, and we shall see that bad agents have a strong incentive to mimic the good agents’ choices of risk.

We posit no exogenous difference in costs between types: such differences are the factor that drives ordinary signaling models.Footnote 5 Instead, we find that an endogenous reputational difference emerges between types. That is, the good type has less to gain by deviating from the mean-maximizing variance choice than does the bad type, because, as the good type wishes, his ability will be revealed partly through his performance.

Outsider beliefs play an important role here, as they do in standard signaling games. First, after observing an agent’s choice of variance V, outsiders update their prior probability p that the agent is good to a posterior ξ(p, V) via Bayesian updating (where possible). Second, after observing the agent’s performance x, outsiders then further update ξ to a new posterior \( \widehat{\xi }{\left( {\xi ,x} \right)} \), using Bayesian updating.
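The two-stage updating can be sketched numerically as follows. The prior, the quality gap, and the off-path belief that an unexpected variance choice reveals a bad type are illustrative assumptions for this sketch, not part of the model statement above.

```python
# Sketch of the outsiders' two-stage Bayesian updating (illustrative; the
# prior, DELTA, and the off-path belief below are hypothetical choices).
import numpy as np
from scipy.stats import norm

DELTA = 1.0

def xi_after_variance(p, V, equilibrium_V):
    """Posterior that the agent is good after seeing the variance choice V.
    On a pooling path both types choose equilibrium_V, so observing it is
    uninformative; any other V is interpreted here (one possible off-path
    belief, assumed for illustration) as coming from the bad type."""
    return p if np.isclose(V, equilibrium_V) else 0.0

def xi_hat(xi, x, mean_V, V):
    """Posterior that the agent is good after also seeing performance x,
    when performance is N(mean_V + theta, V) with theta in {0, DELTA}."""
    like_good = norm.pdf(x, loc=mean_V + DELTA, scale=np.sqrt(V))
    like_bad = norm.pdf(x, loc=mean_V, scale=np.sqrt(V))
    return xi * like_good / (xi * like_good + (1.0 - xi) * like_bad)

# Example: prior p = 0.5, pooling at some V with mu(V) = 2.0 (hypothetical numbers).
xi = xi_after_variance(0.5, V=4.0, equilibrium_V=4.0)
print(xi_hat(xi, x=3.5, mean_V=2.0, V=4.0))   # high draw -> posterior above 0.5
print(xi_hat(xi, x=1.0, mean_V=2.0, V=4.0))   # low draw  -> posterior below 0.5
```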

So an agent of type θ = 0 or Δ who chooses variance V expects a total payoff of

$$ \Pi _{\theta } {\left( V \right)} = \mu {\left( V \right)} + \theta + \frac{1} {r}{\left( {\mu {\left( {V^{*} } \right)} + \Delta E{\left[ {\left. {\widehat{\xi }{\left( {\xi {\left( {p,V} \right)},\widetilde{x}} \right)}} \right|\theta ,V} \right]}} \right)}. $$
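A Monte Carlo sketch of this payoff, for a candidate pooling equilibrium in which both types choose the same V, follows. The schedule μ(V), the bounds, Δ, r, and p are the same hypothetical values as in the earlier sketch, chosen only for illustration.

```python
# Monte Carlo sketch of the expected total payoff Pi_theta(V) when both types
# pool at V (hypothetical parameters; mu(V) is the illustrative schedule above).
import numpy as np
from scipy.stats import norm

V_LO, V_HI, DELTA, R, P = 1.0, 9.0, 1.0, 0.1, 0.5

def mu(V):
    return np.log(V - V_LO) + np.log(V_HI - V)

V_STAR = (V_LO + V_HI) / 2.0

def expected_posterior(theta, V, p=P, n=200_000, seed=0):
    """E[ xi_hat(p, x) | theta, V ] when both types pool at V, so the variance
    choice itself is uninformative and the prior entering stage two is p."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu(V) + theta, np.sqrt(V), size=n)
    lg = norm.pdf(x, mu(V) + DELTA, np.sqrt(V))
    lb = norm.pdf(x, mu(V), np.sqrt(V))
    return np.mean(p * lg / (p * lg + (1 - p) * lb))

def payoff(theta, V):
    """Pi_theta(V) = mu(V) + theta + (1/r) * (mu(V*) + DELTA * E[posterior])."""
    return mu(V) + theta + (mu(V_STAR) + DELTA * expected_posterior(theta, V)) / R

print(payoff(DELTA, 4.0), payoff(0.0, 4.0))   # good vs. bad agent pooling at V = 4
```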

2.1 Pooling equilibria

Claim 1

A necessary and sufficient condition for \( \widehat{V} \) to represent a pooling equilibrium is that

$$ \frac{\Delta } {r}E{\left[ {\left. {\widehat{\xi }{\left( {p,\widetilde{x}} \right)}} \right|0,\widehat{V}} \right]} \geqslant \mu {\left( {V^{*} } \right)} - \mu {\left( {\widehat{V}} \right)}. $$

The set of \( \widehat{V} \) values leading to a pooling equilibrium includes V* and forms a union of closed intervals.

The left-hand side (LHS) of the inequality represents the gain to a bad agent from pooling at \( \widehat{V} \) instead of choosing his mean-maximizing variance, V*, and thereby admitting up front to being bad. The right-hand side (RHS) is the cost, in foregone performance, of choosing \( \widehat{V} \) rather than the mean-maximizing value. For \( \widehat{V} \) to be an equilibrium, both types must value the returns to pooling more than its cost, i.e., both must prefer pooling. It is easy to see that if the bad type prefers \( \widehat{V} \) to V*, so does the good type, since the good type has a higher expected reputation from pooling but pays the same price for deviating from V*. If \( \widehat{V} \) equals V*, then the right-hand side of the inequality is zero while the left-hand side is always non-negative, so V* itself is always a pooling equilibrium.
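Under the same illustrative parameters, the pooling set of Claim 1 can be traced out numerically by checking the inequality for the bad type on a grid of \( \widehat{V} \) values. The sketch below is purely illustrative; the resulting interval depends entirely on the hypothetical numbers chosen.

```python
# Sketch: scan candidate V_hat values and check the Claim 1 condition
#   (DELTA / r) * E[posterior | bad, V_hat] >= mu(V*) - mu(V_hat)
# under the illustrative mu(V) and parameters used above (not the paper's).
import numpy as np
from scipy.stats import norm

V_LO, V_HI, DELTA, R, P = 1.0, 9.0, 1.0, 0.1, 0.5
mu = lambda V: np.log(V - V_LO) + np.log(V_HI - V)
V_STAR = (V_LO + V_HI) / 2.0

def bad_expected_posterior(V, n=100_000, seed=1):
    rng = np.random.default_rng(seed)
    x = rng.normal(mu(V), np.sqrt(V), size=n)           # bad type's draws (theta = 0)
    lg = norm.pdf(x, mu(V) + DELTA, np.sqrt(V))
    lb = norm.pdf(x, mu(V), np.sqrt(V))
    return np.mean(P * lg / (P * lg + (1 - P) * lb))

grid = np.linspace(V_LO + 0.2, V_HI - 0.2, 60)
pooling = [V for V in grid
           if (DELTA / R) * bad_expected_posterior(V) >= mu(V_STAR) - mu(V)]
print(f"pooling set roughly spans [{min(pooling):.2f}, {max(pooling):.2f}]")  # includes V* = 5
```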

Welfare and efficiency properties of pooling equilibria

Relative to the full-information case, signaling through one’s choice of risk reduces aggregate welfare. The manipulation of reputations is a negative-sum game among bad and good agents. Consider the average of the “expected reputation”—defined formally below—over the two types of agents. This average must equal the prior reputation. Therefore any improvement in the expected reputation of one type of agent is exactly cancelled by the corresponding deterioration for the other. However, since agents depart from the performance-maximizing variance V* in order to enhance their reputations, value in direct payoffs is sacrificed, and the net efficiency effect is negative.Footnote 6

Claim 2 formalizes the comparison of equilibria. Informally, good agents prefer to pool at lower variance because it enables them to better distinguish themselves from bad agents. Bad agents prefer to pool at high variance, since noisier performance makes it more likely that they will produce performance typical of good agents. Using this result, we show that in general there exists a continuum of Pareto-unranked pooling equilibria, although some pooling equilibria are Pareto-ranked.Footnote 7 In particular, so long as it is increasingly costly to add risk, there will always be a pooling equilibrium at a variance higher than V*, which is Pareto-dominated by pooling at V*.

Definition

Given a pooling equilibrium \( {\left( {\widehat{V},\widehat{V}} \right)} \), the expected reputation in this equilibrium of an agent of type θ is defined as:

$$ \begin{array}{*{20}c} {ER{\left( {\mu ,\widehat{V},\theta } \right)} = \Delta E{\left[ {\left. {\widehat{\xi }{\left( {p,\widetilde{x}} \right)}} \right|\theta ,\widehat{V}} \right]}} \\ { = {\int_{ - \infty }^{ + \infty } {\Delta \widehat{\xi }{\left( {\left. {p,x} \right|\mu ,\widehat{V}} \right)}} }f{\left( {\left. x \right|\mu + \theta ,\widehat{V}} \right)}dx.} \\ \end{array} $$

Thus \( ER{\left( {\mu ,\widehat{V},\theta } \right)} \) gives the expected value of the Bayesian estimate of θ, taken over the performance draw, when the agent is in fact of type θ. Here \( \widehat{\xi }{\left( {\left. {p,x} \right|\mu ,\widehat{V}} \right)} \) is the posterior probability that the agent is good when p was the prior probability and x was observed, while \( f{\left( {\left. \cdot \right|\mu ,\widehat{V}} \right)} \) is the density function of a normally distributed random variable with mean μ and variance \( \widehat{V} \).
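The integral defining \( ER \) can be evaluated by quadrature. The sketch below uses the same hypothetical μ(V), Δ, and p as the earlier sketches; it also illustrates the point made above that the expected reputation, averaged over the prior, equals the prior reputation pΔ.

```python
# Quadrature sketch of the expected reputation ER(mu, V_hat, theta) defined
# above, using the illustrative mu(V), DELTA, and p from the earlier sketches.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

V_LO, V_HI, DELTA, P = 1.0, 9.0, 1.0, 0.5
mu = lambda V: np.log(V - V_LO) + np.log(V_HI - V)

def ER(V_hat, theta):
    m, s = mu(V_hat), np.sqrt(V_hat)
    def integrand(x):
        lg = norm.pdf(x, m + DELTA, s)
        lb = norm.pdf(x, m, s)
        xi_hat = P * lg / (P * lg + (1 - P) * lb)        # posterior after seeing x
        return DELTA * xi_hat * norm.pdf(x, m + theta, s)  # weight by type-theta density
    val, _ = quad(integrand, m - 10 * s, m + 10 * s + DELTA)
    return val

for V_hat in (3.0, 5.0, 7.0):
    good, bad = ER(V_hat, DELTA), ER(V_hat, 0.0)
    # Averaged over the prior, expected reputation equals the prior reputation p * DELTA.
    print(f"V_hat={V_hat}: ER_good={good:.3f}, ER_bad={bad:.3f}, "
          f"average={P * good + (1 - P) * bad:.3f}")
```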

Claim 2

Given two pooling equilibria \( {\left( {\widehat{V}_{1} ,\widehat{V}_{1} } \right)} \) and \( {\left( {\widehat{V}_{2} ,\widehat{V}_{2} } \right)} \), if \( \widehat{V}_{1} < \widehat{V}_{2} \) then good agents have a higher expected reputation under \( \widehat{V}_{1} \) than under \( \widehat{V}_{2} \) (so bad agents have a lower expected reputation).

In other words, as the signal becomes noisier, bad agents are better able to hide their type, and good agents less able to reveal theirs. We can make a number of general observations about the payoff profiles generated by different pooling equilibria. Figure 1, which is drawn for the case where good agents comprise one half of the population, illustrates.

Fig. 1 Welfare and efficiency properties of pooling equilibria

Claim 3

(1) For ɛ small and positive, the pooling equilibrium \( {\left( {V^{*} + \varepsilon ,V^{*} + \varepsilon } \right)} \) gives the good agent a strictly lower payoff than does the pooling equilibrium \( {\left( {V^{*} ,V^{*} } \right)} \) (point B on Figure 1), and the bad agent a strictly higher payoff. The opposite is true for pooling at \( {\left( {V^{*} - \varepsilon ,V^{*} - \varepsilon } \right)} \). In particular, this means that there is a continuum of Pareto-unranked pooling equilibria (the portion of the curve in Figure 1 between points A and C).

(2) If \( V^{*} < \widehat{V}_{1} < \widehat{V}_{2} \), then the good agent strictly prefers \( \widehat{V}_{1} \) to \( \widehat{V}_{2} \). Whenever \( \widehat{V}_{1} < \widehat{V}_{2} \), a good agent always gains more, or loses less, than a bad agent from a move to the lower-variance equilibrium:

$$ \Pi _{G} {\left( {\widehat{V}_{1} } \right)} - \Pi _{G} {\left( {\widehat{V}_{2} } \right)} > \Pi _{B} {\left( {\widehat{V}_{1} } \right)} - \Pi _{B} {\left( {\widehat{V}_{2} } \right)} $$

or, equivalently, \( {d\Pi _{G} } / {d\widehat{V}} < {d\Pi _{B} } / {d\widehat{V}} \).

(3) If the marginal cost of taking on variance is increasing (μ(V) is concave), then there are always pooling equilibria that are weakly (point D on Figure 1) and strongly (point E) Pareto-dominated by pooling at (V*, V*) (point B).

(4) There exists \( V' < V^{*} \) such that \( {\left( {V',V'} \right)} \) is a pooling equilibrium (point A on Figure 1), and it gives the highest payoff to good agents of all possible equilibria.

2.1.1 Three observations

1. The expected payoff in this game consists of two portions: the immediate payoff and the reputation. Averaged across types, the posterior reputation must equal the prior, so V affects only the expected direct payoff averaged across types, which is maximized at V*. Hence pooling at V* maximizes the aggregate expected payoff of all agents. That is, among all pooling equilibria it gives the highest value of \( p\Pi _{G} + {\left( {1 - p} \right)}\Pi _{B} \), so the tangent to the graph of payoff profiles at point B in Figure 1 is the equi-payoff line through B, whose equation is \( p\Pi _{G} + {\left( {1 - p} \right)}\Pi _{B} = {\left( {1 + r^{{ - 1}} } \right)}{\left( {\mu {\left( {V^{*} } \right)} + p\Delta } \right)} \). Other equi-payoff lines are parallel to this one.

2. Figure 1 shows a region of Pareto-ranked low-variance equilibria to the left of A on the curve. However, no simple general condition on \( \mu {\left( V \right)} \) guarantees the existence of such a region; that is, for low-variance equilibria there is no analog to part (3) of Claim 3.

3. The slope of the curve in Figure 1 is given by \( {d\Pi _{G} } / {d\Pi _{B} } = {\Pi ^{\prime }_{G} {\left( V \right)}} / {\Pi ^{\prime }_{B} {\left( V \right)}} \), so from part (2) of Claim 3 we can deduce that the curve is flatter than 45° to the left of A, and steeper than 45° below and to the left of C.

2.1.2 Equilibrium selection

As is common in signaling models, we have a continuum of equilibria. Alas, standard equilibrium refinements are not effective here. Focal point theory (Schelling 1960) may help the good agents coordinate on a beneficial outcome. Pareto optimality is a salient property, which suggests some location on the frontier between A and C. The point \( {\left( {V',V'} \right)} \), represented by point A, stands out: it is readily recognizable as the best equilibrium for the good agents. Moreover, the cost of deviating from it to a higher variance is less for bad agents than for good, suggesting that deviators will be branded as bad. Given the prominence of \( {\left( {V',V'} \right)} \) for good agents, they could readily coordinate in choosing it, whether on a tacit or open basis, and bad agents would have no place else to go.Footnote 8

2.2 Separating equilibria

The driving force behind separating equilibria in signaling models is cost differences across types. But our model has no such differences. Although two separating equilibria exist—in both of which the bad type chooses V*—neither is consistent with intuition or with the Banks and Sobel (1987) “divinity” equilibrium refinement, and we rule these equilibria out on those grounds.

Claim 4

This game has exactly two separating equilibria in pure strategies, in which the good type chooses \( V_{g} \) and the bad type chooses \( V^{*} \) (see Figure 2). In one separating equilibrium, the good type chooses \( V_{{g_{1} }} < V^{*} \); in the other, \( V_{{g_{2} }} > V^{*} \). In each of these equilibria, both types are indifferent between choosing \( V_{g} \) and choosing \( V^{*} \) (equivalently, for both types, the incentive compatibility constraint binds). Neither equilibrium is consistent with the Banks–Sobel “divinity” equilibrium refinement (extended to this model in a natural way).

Fig. 2 Separating equilibria

These separating equilibria are intuitively unappealing. The good type is indifferent between its own strategy and that of the bad type, even though the market assigns probability 1 to the agent being of the bad type if he chooses V*. If there were any probability that the market expected the agents to pool at V*, then a deviation to V* would be (in an unformalized sense) a dominant strategy for the good type. We therefore rule out these separating equilibria.Footnote 9

2.3 Discussion

Our results suggest that when the level of risk chosen by agents is conspicuous to outsiders, and thus functions as a signal in the Spence (1974) sense, agents will often pool at levels of risk below the performance-maximizing level, since good agents prefer that performance reveal true quality. For a bad agent, increasing noise in the hope of clouding the picture is a pointless venture. That very choice would admit to his low quality.

Strategies that reduce the noise of a signal are well known to induce significant amounts of pooling in various contexts. Thus, students applying to elite colleges that do not require the SAT know that not taking it conveys negative information about their type.

Our conspicuous conservatism argument might also explain the strong resistance of some institutions to change. Consider the French higher education system, marked by its “grandes écoles,” the springboards for most of the French elite. The system was essentially invented in Napoleon’s time, and it can be argued that it has evolved little since, despite cogent criticisms of its operation. What could account for such stability? In order to enter a “grande école,” students must pass a series of examinations that provide a rather precise assessment of ability. Any alternative educational system—such as one that would put less emphasis on testing students up front and more emphasis on student learning—faces huge hurdles in establishing itself. Students opting for such an alternative system would automatically be labeled as bad. In fact, many students with little chance of entering a “grande école” try anyway, in the small hope of securing the good label.

For bad types, conservatism pays if it is conspicuous. In contrast, given that the agent’s choice of risk is unobservable in the situation studied by DMZ, it cannot function there as a Spence-type signal; outsiders only observe the outcome x of the lottery. That prior work shows that in such contexts, good agents choose low levels of risk and bad agents choose high levels—provided outsiders have no strong priors about whether agents are good or bad. Good agents seek to reduce noise so as to stand out. Bad agents seek to increase noise in the hope of producing the results of good agents. For example, a strong student might choose a low-risk strategy, e.g., avoiding guesses on a multiple-choice test that deducts for wrong answers, so as to reduce noise and maximize information flow. A weak student might choose otherwise.

A somewhat similar intuition emerges from the work of Tsetlin, Gaba and Winkler (2004). They analyze the strategic choice of risk in multiround contests, and in contests with handicaps (but without private information). They find that contestants in a weak position (e.g., low mean, high handicap, or low previous performance in a multiround contest) should maximize risk, and those in a strong position should minimize it.Footnote 10 In this vein, Chevalier and Ellison (1997) document that mutual fund managers with poor January–September performance increase the risk of their investment strategy in the fourth quarter, while managers with strong year-to-date performance reduce it. Investors can, at a cost, assess the risk associated with a manager’s portfolio choice, since portfolios are announced periodically. Since poor early-year performers select distinctively high risk levels, presumably not many investors accept the cost of such monitoring. Were rating firms like Morningstar to make risk assessments of mutual funds widely and cheaply available, gambles to “catch up” late in the year would become less common.

3 Conclusion

We analyzed the levels of risk that good and bad agents take on when they know their quality but outsiders do not. If the agents’ risk choices are observed by outsiders, then, invoking reasonable criteria about market beliefs, a single pooling equilibrium is likely to emerge. Good types set the standard, so they select the equilibrium that is most favorable for them. In that equilibrium, both good and bad types choose a risk level below the one that maximizes their respective expected performance. We conclude that when risk choice is observable, agents with private information about their quality face strong incentives, regardless of their quality, to pick performance lotteries with low risk. When risk choice is conspicuous, conservatism helps good types separate themselves. Bad types will not like the equilibrium. Nevertheless, they will choose conservatism, because that is better than maximizing their expected outcome while admitting their type.