The principal-agent problem with smooth ambiguity

We study a principal-agent model in which the (effort-dependent) realisation of output levels is ambiguous, and the agent is ambiguity averse (while the principal is ambiguity neutral). We show that introducing ambiguity aversion will lower profits if the action that the principal wants to implement is the most ambiguous one, while they may increase otherwise. Regarding the design of the optimal contract, we show that under ambiguity aversion the optimal incentive scheme may not be monotone even if a natural generalization of the monotone likelihood ratio property is satisfied, and illustrate how this fact could affect the design of contracts in an applied economic context. We also find that the individual rationality constraint need not bind in the presence of ambiguity aversion unless preferences satisfy constant absolute ambiguity aversion.


Introduction
In its most standard form the principal-agent problem is typically described in the following way. The owner of a firm, the principal, has to decide how to remunerate a manager, the agent, who is in charge of running the firm. The effort of the agent determines, to a large extent, the quantity produced by the firm, but the firm's output is also influenced by random events that are beyond the control of the manager. The problem arises as the manager's effort cannot be observed (or otherwise inferred) by the principal.
As it is commonly assumed that the principal is risk neutral and the agent risk averse, the solution to this problem arises as the optimal trade-off between risk-sharing and incentives. Thus it should be apparent that the optimal wage scheme is sensitive to the agent's reaction to non-deterministic payoffs. Most of the literature assumes that the agent (and the principal) are expected utility maximisers. In the context of the principal-agent problem, the expected utility framework requires the agent to treat the uncertainty about her monetary payoffs in the same way as a lottery over monetary prizes with known probabilities. The well-known Ellsberg paradox, however, has convincingly cast doubt on the validity of such an assumption (Ellsberg 1961). In response, the decision theory literature has generalised the expected utility model to accommodate aversion (as well as other attitudes) towards ambiguity, which can explain the behaviour observed in the Ellsberg paradox. Specifically, ambiguity is defined as subjective uncertainty about outcome distributions. Our paper addresses the question of to what extent the traditional analysis of the principal-agent problem remains valid in situations where ambiguity about the consequences of the agent's actions prevails.
Why might ambiguity be of particular interest in the principal-agent problem? In many cases where moral hazard arises in economic interactions, it seems reasonable to assume that the consequences of the agent's actions cannot be described with great confidence by a single probability distribution. A good example may be a firm that employs a scientist to develop a new production technique. Yet, in other instances ambiguity may well be less of an issue. If the principal routinely contracts with some agent to do essentially the same task each time, then it should be possible to have a precise understanding of the consequences of each of the agent's actions. This might be true for the relationship between a firm and one of its sales agents. Thus, our approach also enables us to address the question of how the optimal incentive scheme will differ between such situations. Additionally, ambiguity can vary between the actions available to the agent. Consider again the scientist: for her, exerting a lot of effort to develop an entirely new approach may have very ambiguous benefits. Exerting low effort (by merely adopting an existing technology) may still not lead to a deterministic outcome, but there may be less ambiguity about the outcome distribution. For the case of a "routine" principal-agent situation, only the action which is in fact implemented in equilibrium might be well understood. The other actions available to the agent might very well have ambiguous consequences (as they are never actually chosen).
In the principal-agent problem the optimal incentive contract is, in general, inefficient. That is, if the principal could observe the action chosen by the agent she could offer a different wage scheme that makes her better off without harming the agent. As long as the optimal incentive scheme does not leave any rent to the agent, the degree of inefficiency is directly linked to the principal's profit. Therefore, we want to find out whether profits necessarily decrease when ambiguity matters. Ambiguity may also affect other properties of the optimal incentive scheme. Wage schedules are often expected to be monotonic. This means that if an outcome realises that is better for the principal, the agent receives a higher wage as well. Even without ambiguity, additional assumptions are needed to ensure the optimality of monotonic schemes. Thus we seek to show whether similar assumptions are sufficient in the presence of ambiguity, or if stronger assumptions are needed.
To address these questions we adopt a recent model of decision making under ambiguity, the smooth ambiguity model (Klibanoff et al. 2005). This is a key difference between this paper and the work by Ghirardato (1994) (which predates the smooth model), who studies the principal-agent problem with a different model of ambiguity. In his paper, the agents use capacities instead of probabilities, which are evaluated using Schmeidler's Choquet expected utility theory (Schmeidler 1989). A major advantage of using the smooth ambiguity model is that it suggests a clear distinction between ambiguity and attitude towards ambiguity, and allows each to vary separately, while in the Choquet expected utility model these two concepts cannot be separated. For most of the paper, we assume that absolute ambiguity aversion is constant. We will show that without this assumption the individual rationality constraint might not bind in the optimal contract, so that the agent can be strictly better off than under her outside option.
Lang (forthcoming) contains a result complementary to this paper for the case of infinitely many effort levels. He finds that under the smooth ambiguity model (and other second-order models) a strictly positive effort level is typically optimal, while under first-order models of ambiguity aversion, like the maxmin expected utility model (Gilboa and Schmeidler 1989) or the Choquet expected utility model, zero effort may be optimal in some circumstances. Lopomo et al. (2011) show that given Bewley preferences (which can be interpreted as ambiguity averse preferences), the optimal contract may be coarser due to ambiguity. Contracting under vague information is studied in Viero (2012), and the effects of loss aversion are investigated in Herweg et al. (2010). Carroll (2015) finds that ambiguity about available actions can lead to linear contracts. Di Tillio et al. (2014) and Bose and Renou (2014) provide rationales for designing ambiguous contracts even if there is initially no ambiguity about the relevant distributions. We restrict ourselves instead to studying standard (i.e. unambiguous) contracts in an exogenously ambiguous setting. For the case of more than one agent, Kellner (2015) shows that ambiguity aversion can make the use of tournaments more attractive.
Other papers related to moral hazard and ambiguity are Mukerji (1998, 2003). For other more distantly related papers on the implications of ambiguity in economics see the survey by Mukerji and Tallon (2004). Also of interest, as another application of the smooth ambiguity model, in this case to portfolio choice, is Gollier (2011). While it is beyond the scope of this introduction to give an overview of the contributions to the principal-agent literature, it is worth mentioning that our formulation of the problem most directly corresponds to Grossman and Hart (1983) (henceforth GH), which we also use as a reference when we compare our results to the case without ambiguity aversion.
Even in the absence of ambiguity, very few general results regarding properties of the optimal contract are available. Hence, we will follow the established practice of restricting the model to special cases, most notably the case of two actions and two outcomes, as this allows us to highlight important channels through which ambiguity and ambiguity aversion matter.
Regarding the shape of the optimal contract, we will build on the observation of Ghirardato (1994), who finds that, under ambiguity, non-monotonicity may arise even in the case of only two outcomes. We will however also show that in many cases the optimal wage contract will nevertheless entail a positive bonus to the agent if the better outcome realises, for instance if it is in a certain sense unambiguous that the high-cost action is more likely to result in a better outcome. On the other hand, we identify a new source of non-monotonicity for the case of at least three outcomes: non-monotonicities may arise if there is more ambiguity about the probability of some outcomes than others. Then the principal might aim at reducing the ambiguity about wages by keeping them similar between the more ambiguous outcomes. We will argue using an example that this effect is not simply a curiosity, but may be quite relevant in the context of real-world incentive contracts.
In addition to describing the optimal wage contract, we provide some insight on how the principal's profits change in response to changes in the problem. In particular, as the smooth ambiguity model provides a separation between ambiguity and ambiguity aversion, we can study an increase in ambiguity and an increase in ambiguity aversion separately. We will show that these two changes may have very different effects. Replacing an ambiguity neutral agent with an ambiguity averse agent, for instance, will be bad for the principal if she wants to implement the most ambiguous action. If the principal implements the least ambiguous action, then ambiguity aversion might actually increase profits, in marked contrast to the introduction of risk aversion in the standard model. The intuition behind this finding is that, while ambiguity aversion always makes the individual rationality constraint harder to satisfy, the effect on the incentive constraints depends on which action is the more ambiguous one. The situation is different, however, for an increase in ambiguity. Using a suitable definition of a uniform increase in ambiguity, we find that if ambiguity increases uniformly, then the principal's payoff may increase only if the most ambiguous action is the one to be implemented. This follows since, given our specification of preferences, a payment scheme that is based on the most ambiguous action is less affected by a further increase in ambiguity than any less ambiguous payment scheme.
Finally, we look at the case of more than two actions, and show that, given smooth ambiguity aversion, it may be the case that the binding incentive constraint in an optimal contract pertains to a more expensive action. Table 1 summarises the main differences between the expected utility approach (assuming ambiguity neutrality), the approach taken here, and that of Ghirardato (1994).
The remainder of this paper is organised as follows: Sect. 2 introduces the model. Section 3 discusses the properties of the optimal contract. It also motivates why we restrict the model to the case of constant absolute ambiguity aversion in the remaining parts of the paper. Section 4 discusses two properties of the optimal contract for the case of two outcomes only, assuming for the larger part that there are only two actions: first, we discuss sufficient conditions for the monotonicity of the optimal contract. Second, we discuss comparative statics in ambiguity and ambiguity aversion. Section 5 discusses the robustness of these results, first to the case of more than two outcomes, and second to the case of more than two actions.

The model
In the standard formulation of the principal-agent problem, a principal (typically the owner of a firm) has to hire an agent (a worker or manager) to perform a certain task. The agent's effort choice while doing this task cannot be observed directly, but it determines the distribution of a random variable (the output of the firm). Only the principal, not the agent, directly cares about the realisation of this random variable. The principal's problem is to design a payment scheme that incentivises the agent to choose an action that is in the principal's best interest, given the constraint that the scheme cannot depend on the action choice itself.
We will assume that there are only finitely many output levels that the firm might possibly achieve, which are elements of the finite set $Q = \{q_1, \dots, q_I\}$. As a convention we require that $q_i < q_j$ if $i < j$. As the principal can reward the agent based on the output level only, the wage scheme can be represented by a vector $w = (w_1, \dots, w_I)$. The agent can choose from a finite set of actions, $A$. Choosing an action $a \in A$ incurs an effort cost $c_a$ to the agent. The standard treatment of the principal-agent problem would now assume that in addition to the effort cost, each action $a$ is identified with a probability distribution $p^a \in \Delta(Q)$. That is, when the agent chooses action $a$, the probability that the firm achieves output level $q_i$ is given by $p^a_i$. Our approach differs here since we introduce ambiguity by assuming that both the principal and the agent have a common but imperfect understanding of the (stochastic) relationship between the actions and the consequences [as axiomatised by Klibanoff et al. (2005)]. Specifically, they are unsure about these probabilities and thus they think that a set of probabilities might better describe the implications of each action. However, they do not necessarily consider all members of that set equally likely, but assign different likelihoods to them, which are specified by a probability measure $\mu_a$ defined on $\Delta(Q)$. To rule out any effects that may arise from information asymmetries, we assume that both principal and agent use the same measure to evaluate each act.
To close the description of the agent's preferences we assume, based on the smooth ambiguity model, that they can be represented by the utility function

$$U(a, w) = \int_{\Delta(Q)} \phi\Big(\sum_{i=1}^{I} p_i\, u(w_i) - c_a\Big)\, d\mu_a(p).$$

In this representation, the increasing function $u : (\underline{w}, \infty) \to \mathbb{R}$ represents the agent's attitude towards risk. We allow $\underline{w}$ to be $-\infty$, but require that $\lim_{w \to \underline{w}} u(w) = -\infty$. The function $\phi : \mathbb{R} \to \mathbb{R}$ represents the agent's attitude towards ambiguity. We assume that the agent is ambiguity averse, corresponding to a concave $\phi$, and strictly risk averse, corresponding to a strictly concave $u$. Typically our results carry over to the case where the agent is risk neutral, provided she is strictly ambiguity averse. To interpret the utility representation, observe first that for all actions that can be described by a single, unambiguous probability distribution (i.e. for all actions $a$ where $\mu_a$ puts weight 1 on some $p^a$), the equation specialises to

$$U(a, w) = \phi\Big(\sum_{i=1}^{I} p^a_i\, u(w_i) - c_a\Big).$$

Thus such actions are ranked by the agent in the same way as by an expected-utility maximiser.
For ambiguous actions, the agent first computes the expected utility associated with each possible probability distribution in the same way. Then she weights the resulting utilities according to her confidence in each probability distribution (given by $\mu_a$), after she has applied the transformation $\phi$, which represents her ambiguity attitude. Note that the function $u(a, w) = u(w) - c_a$ determines the agent's preferences between effort costs and wage payments. We have implicitly assumed that this function is additively separable, which is a common simplifying assumption. Note, however, that due to the presence of ambiguity this does not necessarily mean that the function $U(a, w)$ is additively separable. Sometimes it is convenient to assume that the support of $\mu_a$ is finite for all actions. In this case we write $p^a_{ij}$ for the probability of realising output $q_i$ with action $a$ according to the $j$-th probability distribution, and the number $\mu^a_j$ denotes the weight assigned to this distribution given action $a$. Then the agent's utility of choosing action $a$ given wage scheme $w$ becomes

$$U(a, w) = \sum_{j} \mu^a_j\, \phi\Big(\sum_{i=1}^{I} p^a_{ij}\, u(w_i) - c_a\Big).$$

Finally, we turn to the principal's payoff. We denote the firm's payoff for action $a$ under payment scheme $w$ by the function $\Pi : A \times (\underline{w}, \infty)^I \to \mathbb{R}$. We assume the firm is both ambiguity and risk neutral (Klibanoff et al. (2005) show that ambiguity neutrality corresponds to $\phi$ affine). Thus, the firm's payoff can be expressed as

$$\Pi(a, w) = \sum_{i=1}^{I} \Big(\int p_i\, d\mu_a\Big)(q_i - w_i).$$

The principal maximises expected net output (i.e. profit), where the expectation is taken with respect to the $\mu$-average distribution $\int p_i\, d\mu_a$.
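To make the finite-support representation concrete, the following minimal sketch evaluates the agent's utility and the principal's payoff for one action with two possible distributions. All numbers, the choice $u(w) = \ln w$, the CAAA form of $\phi$ and the function names are illustrative assumptions, not taken from the paper.

```python
import math


def smooth_utility(mu, dists, wages, cost, u, phi):
    """Finite-support smooth ambiguity utility:
    U(a, w) = sum_j mu_j * phi(sum_i p_ij * u(w_i) - c_a)."""
    total = 0.0
    for weight, dist in zip(mu, dists):
        expected_u = sum(p * u(w) for p, w in zip(dist, wages))
        total += weight * phi(expected_u - cost)
    return total


def principal_payoff(mu, dists, outputs, wages):
    """Expected profit under the mu-average distribution over outputs."""
    n = len(outputs)
    p_bar = [sum(m * d[i] for m, d in zip(mu, dists)) for i in range(n)]
    return sum(p * (q - w) for p, q, w in zip(p_bar, outputs, wages))


# Hypothetical primitives, chosen only for illustration.
alpha = 1.0
phi_caaa = lambda x: -math.exp(-alpha * x)   # assumed ambiguity attitude
phi_inv = lambda y: -math.log(-y) / alpha    # inverse of phi_caaa
mu = [0.5, 0.5]                              # weights on the two distributions
dists = [[0.2, 0.8], [0.6, 0.4]]             # rows: (P(q_1), P(q_2))
wages = [1.0, 3.0]
cost = 0.5

# Ambiguity neutral benchmark (phi = identity) versus the averse agent,
# the latter expressed in utility units via phi^{-1}.
neutral = smooth_utility(mu, dists, wages, cost, math.log, lambda x: x)
averse_ce = phi_inv(smooth_utility(mu, dists, wages, cost, math.log, phi_caaa))

# A flat wage removes all ambiguity, so both agents value it identically.
flat_ce = phi_inv(smooth_utility(mu, dists, [2.0, 2.0], cost, math.log, phi_caaa))

profit = principal_payoff(mu, dists, outputs=[0.0, 10.0], wages=wages)
print(neutral, averse_ce, flat_ce, profit)
```

In this sketch the ambiguity averse agent values the non-constant wage scheme strictly less than the ambiguity neutral one, whereas the two valuations coincide for the flat scheme.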
The principal's problem
As the formulation of the principal's problem differs from Grossman and Hart (1983) only by the way the uncertainty is modelled, we will be brief. It is generally useful to think about the principal's problem in two stages: first, she determines the cost of implementing each of the actions available to the agent. If it is impossible to implement an action, we will say that the cost is infinity. Second, she compares the benefits of each action with the implementation costs to determine which action she wants to implement.
Implementing a given action
Suppose that the principal wants to implement action $a$ using reward scheme $w$. The agent will work for the principal only if she prefers doing so to the outside option, which corresponds to a certain utility level, $\phi(u_0)$, which does not depend on the output distribution. Moreover, the agent must (weakly) prefer this action over any other action. Thus, the problem of optimally implementing a given action $a$ can be summarised as to maximise $\Pi(a, w)$ by choosing $w$ subject to the constraints

$$U(a, w) \ge \phi(u_0) \qquad \text{(IR)}$$
$$U(a, w) \ge U(a', w) \quad \text{for all } a' \in A \qquad \text{(IC)}$$

It is important to observe that, when the agent is ambiguity neutral ($\phi$ affine) or there is no ambiguity ($\mu_a$ degenerate for each action), this is exactly the standard GH model. The very same arguments as used there establish that this problem always has a solution. (This is formally stated and proven as Lemma 1 in the Appendix.)
A remark on separability: we could maintain the assumption that $U(a, w)$ is additively separable by choosing $U(a, w) = \phi^{-1}\big(\int \phi\big(\sum_i p_i\, u(w_i)\big)\, d\mu_a(p)\big) - c_a$. Also in this case, the agents would act identically to expected utility decision makers in the absence of ambiguity. Assuming constant absolute ambiguity aversion, the two approaches become equivalent.

The optimal incentive scheme
As mentioned in the introduction, even in the standard model it is not generally straightforward to describe the optimal incentive scheme, at least not without making additional assumptions about the model. Yet many interesting properties of the optimal incentive scheme can be stated at least for some special cases of the model. In this section we will present those (limited) results that hold even if we do not impose any constraints on the structure of the problem (that is, we do not limit the number of actions or outcomes), while in the larger part of Sect. 4 we focus on the most important special case (two outcomes and two actions).
The first question we address is whether the agent's IR constraint will bind, or whether she might end up strictly better off than with her outside option. Recall that in the standard model, the IR constraint will bind if utility is either additively (as we have assumed before) or multiplicatively separable in monetary reward and effort costs.
Example 1 In this example there are two actions, $H$ and $L$, with $c_H > c_L$, and two outcomes. Consider a decision maker with ambiguity attitude represented by the function φ(u) = −exp(−√u/2). The high-cost action is ambiguous: weight $\mu^H_1 = .7$ is attributed to the distribution where the better outcome obtains for sure ($p^H_{21} = 1$), and weight .3 to the probability distribution where the worse outcome is certain, i.e. $p^H_{22} = 0$. Action $L$ unambiguously results in the better outcome with a probability of .5, i.e. $p^L_{21} = p^L_{22} = .5$. In this example, the individual rationality constraint does not bind in the optimal contract, for reasons illustrated in Fig. 1.
The graph depicts the agent's indifference curves in a state-preference graph, where the axes represent $u(w_1)$ and $u(w_2)$, the interim utilities the agent gets (resulting from some wage scheme) in the two different states. As $u$ is increasing, each point in the graph corresponds to a particular wage scheme $(w_1, w_2)$. The curve $U_H$ represents all wage schemes that leave the agent indifferent between not working for the principal (which results in the outside option without any effort costs), and choosing the high-cost action $H$ (after accepting the principal's incentive scheme). In other words, $U_H$ shows all the points $(u(w_1), u(w_2))$ such that the IR constraint binds, $U_H = \{(u(w_1), u(w_2)) : U(H, (w_1, w_2)) = \phi(u_0)\}$, and points to the right satisfy the constraint with a strict inequality. The curve $U_L$ represents all incentive contracts that leave the agent indifferent between choosing the low-cost action (if she accepts the contract) and refusing to work for the principal (which again results in the outside option and the absence of any effort costs). Formally, $U_L = \{(u(w_1), u(w_2)) : U(L, (w_1, w_2)) = \phi(u_0)\}$. As the agent is ambiguity averse, these indifference curves are convex.

Fig. 1 The IR constraint need not bind

Now suppose that the principal wants to implement the high-cost action. If the IR constraint binds in an optimal wage scheme, the utility pair corresponding to any feasible wage schedule would have to lie not only on the schedule $U_H$ but also on or below the schedule $U_L$. This follows since the incentive compatibility constraint requires that the low-cost action cannot result in a higher utility than the high-cost action, which in turn is indifferent to the outside option. But the figure shows that this is impossible, as the $U_L$ schedule is always below the $U_H$ schedule.
The schedule $U'_H$ depicts the indifference curve of the agent (assuming she chooses $H$) that corresponds to a slightly higher utility level. Similarly, $U'_L$ is the indifference curve corresponding to the new utility level and the low-cost action. Assuming the principal uses a payment scheme that is on the $U'_H$ curve, she meets the IR constraint (but it does not bind), and all points on this schedule that are also on or below the curve $U'_L$ meet the incentive constraint. Now the two curves do intersect (even at two points, marked by small circles), so the points between the two intersections meet both constraints. To gain intuition, note that $H$ was disadvantaged compared to $L$ not only by being more costly, but also by being more ambiguous. At a higher utility level, the second disadvantage is reduced given preferences with decreasing ambiguity aversion, making it easier to implement $H$. Thus it is possible to implement the high-cost action at finite cost.
As the principal is ambiguity neutral, the benefit of implementing action $H$ is directly proportional to $q_2 - q_1$, and hence can be made arbitrarily large by choosing a high enough $q_2$. Therefore, there is indeed a concrete principal-agent problem where it is optimal to implement the high-cost action, even if this is impossible with a binding IR constraint. Note that such an example cannot be constructed if ambiguity is modelled using the non-additive probabilities framework. Formally, this is established in Ghirardato (1994), Lemma 1, which states that the IR constraint always binds in his model. In a two-outcome world it is easy to understand this result, as indifference curves are piecewise linear (while they are linear under expected utility), parallel lines with only one kink at the forty-five degree line. So if the indifference curves associated with the two actions do not intersect at the reservation utility, they never intersect. If they do, the intersections corresponding to higher utility levels will be shifted out parallel to the forty-five degree line, and so they will always be more expensive than the equivalent of the outside option. The same applies to the expected utility framework, where indifference curves are linear rather than piecewise linear.
We will show now that this result still extends to some classes of smooth ambiguity preferences. To do so, we specialise our model to the case of constant absolute ambiguity aversion preferences (CAAA), as defined by Klibanoff et al. (2005). Such preferences correspond to the case where $\phi(u) = -\exp(-\alpha u)$, with $\alpha$ being a parameter measuring the degree of absolute ambiguity aversion. The analogy to constant absolute risk aversion is obvious.

Proposition 1
1. The individual rationality constraint may not bind in an optimal contract.
2. The individual rationality constraint binds if φ is of the CAAA variety.
A proof is given in the Appendix. To understand the intuition, note that only for CAAA preferences can the standard proof of GH be followed: assume a contract satisfies both constraints, but the IR constraint does not bind. Create a second contract, where utility $u(w_i)$ given any outcome $i$ is lowered by a constant amount. The second contract is preferred by the principal. If the amount is small enough, the IR constraint is still satisfied. Moreover, the expected utility under any action (and any possible distribution) decreases by this constant. Hence, under any action, uncertainty about the expected utility remains the same, except that the mean changes by an equal amount. This has no effect on the ranking of the different actions (and consequently the IC constraint) precisely if preferences are of the CAAA variety (including the ambiguity neutral case). Constant ambiguity attitude ensures that the ambiguity premium is translation invariant, which is in general not the case given the smooth model. In principle a similar argument would hold if relative ambiguity aversion (CRAA, $\phi(u) = u^\alpha$) were constant, and the utility function $u$ were multiplicatively separable in costs and wage, i.e. $u(w, a) = c_a u(w)$. However, we have seen that to ensure that the IR constraint binds, it is at least necessary that the image of $u$ can never reach a lower bound. But then preserving existence actually requires that also $\lim_{w \to \underline{w}} u(w) = -\infty$. But this is incompatible with constant relative ambiguity aversion.
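The translation invariance behind this argument can be checked numerically. The sketch below compares the ambiguity premium (mean expected utility minus its certainty equivalent under φ) before and after lowering all expected utilities by a constant. The numbers, the helper names and the logarithmic φ used as a non-CAAA comparison are illustrative assumptions only.

```python
import math


def certainty_equivalent(mu, values, phi, phi_inv):
    """phi^{-1} of the mu-weighted average of phi over expected utilities."""
    return phi_inv(sum(m * phi(v) for m, v in zip(mu, values)))


def ambiguity_premium(mu, values, phi, phi_inv):
    """Mean expected utility minus its certainty equivalent under phi."""
    mean = sum(m * v for m, v in zip(mu, values))
    return mean - certainty_equivalent(mu, values, phi, phi_inv)


alpha = 1.0
# CAAA attitude phi(x) = -exp(-alpha x) together with its inverse.
caaa = (lambda x: -math.exp(-alpha * x), lambda y: -math.log(-y) / alpha)
# A non-CAAA comparison (assumed for illustration): phi = log on x > 0.
log_phi = (math.log, math.exp)

mu = [0.5, 0.5]
values = [4.0, 1.0]   # hypothetical expected utilities under two distributions
shift = 0.5           # every utility level is lowered by this constant
shifted = [v - shift for v in values]

caaa_before = ambiguity_premium(mu, values, *caaa)
caaa_after = ambiguity_premium(mu, shifted, *caaa)
log_before = ambiguity_premium(mu, values, *log_phi)
log_after = ambiguity_premium(mu, shifted, *log_phi)

print(caaa_before, caaa_after)  # equal up to rounding error
print(log_before, log_after)    # the premium grows after the downward shift
```

Under the CAAA form the premium is unchanged by the shift, so the ranking of actions is preserved. Under the logarithmic form the premium rises as utility falls, which is the decreasing ambiguity aversion effect that allows the IR constraint to be slack in Example 1.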
In order to facilitate comparison with the standard model we will henceforth assume that preferences are of the CAAA variety. To simplify the notation we will sometimes continue to use φ(u) even if we have this particular functional form in mind. Moreover we define the payment necessary to obtain a utility level u by the (increasing) function h(u) so that h(u(w)) = w. Strict risk aversion implies that h is strictly convex.
We can now state a result for the incentive constraints.
Proposition 2 Suppose the agent's preferences are CAAA and the principal chooses not to implement the least-cost action. Then at least one of the incentive constraints will bind in any optimal solution.
The intuition behind this result is the following: suppose $w$ is a payment scheme where no IC constraint binds. Consider a payment scheme $w'$ where utility levels $u(w'_i)$ are a convex combination of those of $w$ and the reservation utility. The scheme $w'$ still satisfies the IR constraint, as it is less ambiguous, and, if sufficient weight is given to $w$, all incentive constraints, while it is cheaper for the principal, as it is less risky.
Note that often, but not always, this result would also hold if the agent is risk neutral, but strictly ambiguity averse. To see this, assume the optimal contract is not unambiguous (i.e. does not yield the same expected utility under all possible distributions). Since the payments are more similar between outcomes given $w'$, this scheme also strictly reduces ambiguity about the payments for the agent. Thus, under $w'$ the IR constraint does not bind either, so that the principal can find yet another contract that satisfies all the constraints, but entails a lower ambiguity premium. Hence, unless the optimal contract achieves unambiguous wages, at least one incentive constraint must bind in the optimal contract even under risk neutrality, provided ambiguity aversion is strict. It might, however, well be possible that the distribution of wages in an optimal contract is unambiguous. If, for instance, all possible distributions agree about the probability that one of the highest two outcomes obtains, a contract which pays the same wage level for these two outcomes, and a different wage level for all other outcomes, would result in an unambiguous wage distribution.
We conclude this section by introducing a notation that decomposes the individual's beliefs about each action into two parts. Given our restriction to CAAA preferences it will turn out useful in the following sections to distinguish between the information about each action that is relevant also for an ambiguity neutral agent, and the part that is relevant only to ambiguity averse agents. Thus, we will denote the average probability distribution over output levels by $\bar p^a \equiv \int p\, d\mu_a$, and the difference between a distribution $p^a$ and the average $\bar p^a$ by $\hat p^a$. Hence the ambiguity associated with each action can be described by the measure $\hat\mu_a$, where

$$\hat\mu_a(\hat p) = \mu_a(\bar p^a + \hat p).$$

Writing $u_i = u(w_i)$, this allows us to restate the problem of implementing action $a$ as

$$\min_{u_1, \dots, u_I} \sum_{i} \bar p^a_i\, h(u_i)$$

subject to

$$\sum_{i} \bar p^a_i\, u_i - c_a - F_a(u) \ge u_0 \qquad \text{(IR)}$$
$$\sum_{i} \bar p^a_i\, u_i - c_a - F_a(u) \ge \sum_{i} \bar p^{a'}_i\, u_i - c_{a'} - F_{a'}(u) \quad \text{for all } a' \in A \qquad \text{(IC)}$$

The advantage of this approach is that it highlights the effect of ambiguity aversion. Here $F_a(u)$ represents the ambiguity premium that the principal needs to pay to the agent. In the individual rationality constraint, the ambiguity premium is the only difference to the standard framework with an unambiguous distribution of $\bar p^a$. For the incentive constraints, the effect of ambiguity aversion corresponds to the difference between the ambiguity premia of the two actions. Via the IR constraint ambiguity aversion will always have a non-positive effect on the principal's profits, whereas the effect via the IC constraint could go either way.
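One way to see where the ambiguity premium comes from is to substitute the CAAA form φ(u) = −exp(−αu) into the agent's utility and factor out the part that depends only on the average distribution. The derivation below is a sketch under that assumption. The explicit expression for the premium is our reconstruction, written $F_a(u)$ to match the text, and may differ in normalisation from the authors' definition.

```latex
\begin{align*}
U(a,w)
  &= \int -\exp\Big(-\alpha\Big(\sum_i p_i\,u(w_i) - c_a\Big)\Big)\,d\mu_a(p) \\
  &= -\exp\Big(-\alpha\Big(\sum_i \bar p^a_i\,u(w_i) - c_a\Big)\Big)
     \int \exp\Big(-\alpha\sum_i \hat p_i\,u(w_i)\Big)\,d\hat\mu_a(\hat p), \\[4pt]
\phi^{-1}\big(U(a,w)\big)
  &= \sum_i \bar p^a_i\,u(w_i) - c_a - F_a(u), \\[4pt]
F_a(u)
  &\equiv \frac{1}{\alpha}\,
     \ln \int \exp\Big(-\alpha\sum_i \hat p_i\,u(w_i)\Big)\,d\hat\mu_a(\hat p)
  \;\ge\; 0 .
\end{align*}
```

The inequality follows from Jensen's inequality, since the exponential is convex and the deviations $\hat p$ average to zero under $\hat\mu_a$. The premium vanishes for a constant utility profile, because the components of each $\hat p$ sum to zero.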

Optimal incentives: the two outcome case
To get a first understanding of how the addition of ambiguity affects the solution to the principal-agent problem, we reduce the model to the case of two outcomes only. In particular, we show how the properties of the optimal contract differ from the case of no ambiguity aversion, and how changes in ambiguity as well as ambiguity aversion affect the principal's profits. An additional advantage of the two-outcome case is that it is relatively easy to give a definition of the relative degree of ambiguity of two actions, which will turn out to be an important factor in determining the optimal incentive scheme.
In this section, we call outcome 2 'success', and outcome 1 'failure'. The ambiguous view of the outcome distribution for each action $a$ can now be described by a cumulative distribution function $\mu_a(p)$ (defined for $p \in [0, 1]$), representing the confidence in potential success probabilities, or alternatively by the average success probability $\bar p^a$ and the cdf $\hat\mu_a(\hat p)$. The latter represents the cumulative likelihood attributed to deviations about the average success rate; its support is a subset of the interval $[-\bar p^a, 1 - \bar p^a]$. With two outcomes the payment scheme can be understood as a guaranteed payment, $w_1$, and a bonus $w_2 - w_1$ that is paid only in case of success. Correspondingly we define $u_F = u(w_1)$ to be the utility level corresponding to a failure, and $u_\Delta = u(w_2) - u(w_1)$ to be the "utility bonus" in case of success.
We will focus on explaining how ambiguity affects the principal's problem of implementing a given action $a$. The principal wishes to minimise the costs of implementation,

$$\bar p^a\, h(u_F + u_\Delta) + (1 - \bar p^a)\, h(u_F),$$

subject to the constraints that her payment scheme given by $(u_F, u_F + u_\Delta) \in [\mathrm{img}(u)]^2$ induces action $a$ and is acceptable for the agent. After straightforward manipulations these constraints become

$$u_F + \bar p^a u_\Delta - c_a - F_a(u_\Delta) \ge u_0 \qquad (1)$$
$$(\bar p^a - \bar p^{a'})\, u_\Delta - \big[F_a(u_\Delta) - F_{a'}(u_\Delta)\big] \ge c_a - c_{a'} \quad \text{for all } a' \in A \qquad (2)$$

This reformulation facilitates identifying the effects that ambiguity aversion has on the constraint set. It is easy to see that the first-best outcome is not affected by ambiguity aversion: if the action were contractible, the IC constraint would become obsolete and the optimal incentive scheme would satisfy the IR constraint with a constant wage scheme ($u_F = u_0 + c_a$ with $u_\Delta = 0$), so that all ambiguity disappears. But as the IC constraint has to be satisfied, a non-constant incentive scheme has to be used, so that ambiguity will matter. Compared to an agent without ambiguity aversion the term $F_a(u_\Delta)$ describes the effect of ambiguity aversion on the individual rationality constraint, while $F_a(u_\Delta) - F_{a'}(u_\Delta)$ denotes the effect on the IC constraint. If action $a$ is ambiguous, it follows immediately that for all non-constant contracts, ambiguity aversion tightens the IR constraint, while the effect on the IC constraint could go either way. We introduce the following definition to identify the cases in which an IC constraint becomes more or less restrictive.
Definition 1 Action $a$ is more ambiguous than $a'$ if $\hat\mu_a$ is a mean preserving spread of $\hat\mu_{a'}$.

Note that our definition implies that if action $a'$ is preferred to action $a$ (which is more ambiguous) by an ambiguity neutral agent under a given wage contract, action $a'$ will (ceteris paribus) also be preferred by an ambiguity averse agent, as stated in the following proposition. The result is very intuitive: if an action is more ambiguous, it becomes less attractive to an ambiguity averse decision maker.

Proposition 3 Suppose the principal wants to implement action $a$ (which is not the cheapest action), and ambiguity aversion rises from zero to a positive level. Then the incentive constraint associated with action $a'$ becomes more restrictive if $a$ is more ambiguous, and less restrictive if $a$ is less ambiguous than $a'$.
Proof The result follows from (2) since if $a$ is more ambiguous, $F_a(u_\Delta) - F_{a'}(u_\Delta) \ge 0$, while if $a'$ is more ambiguous, the inequality is reversed.
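The ordering used in this proof can be checked numerically. The following is a minimal sketch, assuming a CAAA ambiguity index $\varphi(x) = -e^{-\alpha x}$, under which the ambiguity term can be written as $F_a(u_\Delta) = \frac{1}{\alpha}\ln E[e^{-\alpha \hat p\, u_\Delta}]$ with $\hat p$ the mean-zero deviation of the success probability. This functional form, the helper names and the two discrete distributions are illustrative assumptions, not taken from the paper.

```python
import math

def ambiguity_premium(deviations, weights, u_delta, alpha):
    """F_a(u_delta) under an ASSUMED CAAA index phi(x) = -exp(-alpha * x).

    deviations: possible deviations of the success probability from its mean
    weights:    second-order probabilities attached to those deviations
    """
    mean_dev = sum(w * d for w, d in zip(weights, deviations))
    assert abs(mean_dev) < 1e-12, "deviations must have mean zero"
    expected = sum(w * math.exp(-alpha * d * u_delta)
                   for w, d in zip(weights, deviations))
    return math.log(expected) / alpha

# Hypothetical actions: a is a mean preserving spread of a_prime.
DEVIATIONS = [-0.2, 0.0, 0.2]
WEIGHTS_A = [0.5, 0.0, 0.5]
WEIGHTS_A_PRIME = [0.25, 0.5, 0.25]

def premium_gap(u_delta, alpha=5.0):
    """F_a(u_delta) - F_a'(u_delta); non-negative when a is more ambiguous."""
    return (ambiguity_premium(DEVIATIONS, WEIGHTS_A, u_delta, alpha)
            - ambiguity_premium(DEVIATIONS, WEIGHTS_A_PRIME, u_delta, alpha))
```

In this sketch the gap is zero for a constant contract and positive for any non-zero utility bonus, whatever its sign, which is the inequality the proof relies on.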
Yet, this observation is not enough to study the effects of ambiguity aversion on the principal's profits. The fact that it becomes harder to implement the action that is optimal in the absence of ambiguity, does not necessarily mean that the profits of the principal have to decrease. It might still be the case that another action becomes cheaper to implement, so that the principal might actually benefit when she changes the action she implements with the optimal contract. To abstract from such concerns, we turn to the case of two actions.
Two actions

Suppose there are only two actions, so that $A = \{H, L\}$ with $c_H > c_L$. It follows immediately that the low-cost action is optimally implemented with a constant incentive scheme. Thus the problem is only interesting if $\bar p_H > \bar p_L$, as otherwise the principal prefers to implement the low-cost action. Another convenient feature of this simplification is that whenever we find that the principal can implement the high-cost action in some principal-agent problem, we can easily construct another problem (by raising $q_2 - q_1$) such that implementing the high-cost action is not only efficient, but also optimal for the principal. Thus we now seek to characterise the incentive scheme that optimally implements the high-cost action. (Whether it is in fact optimal to do so will only depend on the size of $q_2 - q_1$.) In the standard model, the optimal contract pays a positive wage bonus if the agent achieves the high outcome. Wage and bonus are determined by the unique point at which the IR constraint and the (single) IC constraint bind. Under ambiguity (at least assuming constant absolute ambiguity attitude) we also know (from above) that the two constraints bind in the optimal solution. Recall that for any $(u_F, u_F + u_\Delta) \in [\mathrm{img}(u)]^2$ these two constraints take the forms (3) and (4).

Monotonicity
We now introduce an example that illustrates that the IC constraint can bind for both a positive and a negative value of $u_\Delta$, and that it can be optimal to choose the negative bonus.
Example 2 In this example, both the high-cost and the low-cost action result in the good outcome with almost the same (low) probability, but only the low-cost action is ambiguous. The details are the following: the high-cost action leads to success with the unambiguous probability $\bar p_H = .22$, while the low-cost action has an average success rate of $\bar p_L = .19$. For the low-cost action, the agent attributes probability .5 to each of the deviations $-.19$ and $.19$. The costs of the actions are $c_L = 1$ and $c_H = 1.05$, and the reservation utility is $\varphi(u_0) = -1$. Ambiguity attitude is CAAA with an ambiguity coefficient $\alpha = 5$, and risk attitude is given by the function $u(x) = x^{2/11}$ for $x \ge 0$. Figure 2 illustrates that the optimal contract uses a negative wage bonus.
To understand the intuition for the optimality of the negative bonus in this example, consider first a monotonic contract (CM) corresponding to an expected utility level of $u_0$ (given the distribution $\bar p_H$) and a utility difference of $u_\Delta$. If, as in this example, $\bar p_H < 1/2$, the potential gain $(1 - \bar p_H) u_\Delta$ is larger, but less likely, than the potential loss of $\bar p_H u_\Delta$ (high upside risk). Now compare this with an analogous non-monotonic contract (CN) with the same expected utility (under $\bar p_H$) as CM, where however the difference in utility corresponds to $-u_\Delta$. The potential gain of $\bar p_H u_\Delta$ now realises with a probability of $1 - \bar p_H$, so it is smaller, but more likely, than the possible loss (high downside risk). Whether the monotonic or the non-monotonic contract is cheaper for the principal depends on the agent's utility function. In particular, the non-monotonic contract can be shown to be preferred if $h''' > 0$.
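The comparison between CM and CN can be made concrete with a small computation. The sketch below evaluates the principal's expected wage cost for both contracts, where `h` stands for the cost of delivering a given utility level (the inverse of $u$). The two cost functions, the utility level 1.0 and the spread 0.5 are hypothetical choices made only to show how the sign of the third derivative of `h` tips the comparison.

```python
def expected_cost(h, p_success, u_fail, u_success):
    """Expected wage cost of delivering the utility pair (u_fail, u_success)."""
    return (1 - p_success) * h(u_fail) + p_success * h(u_success)

def contract_costs(h, p, u_bar, spread):
    """Costs of CM (bonus +spread) and CN (bonus -spread).

    Both contracts give expected utility u_bar under success probability p.
    """
    cm = expected_cost(h, p, u_bar - p * spread, u_bar + (1 - p) * spread)
    cn = expected_cost(h, p, u_bar + p * spread, u_bar - (1 - p) * spread)
    return cm, cn

P_H = 0.22  # success probability of the high-cost action in Example 2

# Hypothetical cost function with positive third derivative: h(u) = u^3.
cm_cubic, cn_cubic = contract_costs(lambda u: u ** 3, P_H, 1.0, 0.5)

# Hypothetical cost function with negative third derivative: h(u) = u^1.5.
cm_conc, cn_conc = contract_costs(lambda u: u ** 1.5, P_H, 1.0, 0.5)
```

With the cubic cost function the non-monotonic contract is cheaper, while with the second cost function the monotonic contract is cheaper. Both contracts have the same mean and variance of utility, so the difference is driven entirely by skewness.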
In the absence of ambiguity aversion, however, only the first of the two suggested contracts can provide incentives for the high-cost action. Given ambiguity aversion, this is not the case any more. The reason is that in this example, given a monotonic contract like CM, the ambiguity averse decision maker effectively acts as if she assigns a success probability of somewhere between 0 and .19 (depending on the level of ambiguity aversion) to the low-cost action. Given a non-monotonic contract like CN, however, she acts as if she assigns a success probability somewhere between .19 and .38 to the low-cost action. If ambiguity aversion is high enough so that the effective probability is above .22, incentives for $H$ can be provided with such a contract. Hence, if CM is actually the best monotonic contract, CN will probably fail to provide incentives, but a very similar contract with slightly more powerful incentives would succeed. Provided the shape of $u$ favours the second type of contract (with high downside risk), it could be the case that even the required modification of CN remains cheaper than CM.

Fig. 2 Negative bonus is optimal
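The "effective probability" described here can be computed directly once a functional form is fixed. The following sketch assumes the CAAA index $\varphi(x) = -e^{-\alpha x}$ and defines the effective probability as the certainty equivalent of the ambiguous expected bonus divided by the bonus. The definition, the function name and the two bonus levels (.6 and −1) are illustrative assumptions; the success probabilities 0 and .38 and $\alpha = 5$ are those of Example 2.

```python
import math

def effective_success_probability(success_probs, weights, u_delta, alpha):
    """Probability at which an ambiguity-neutral agent would value the bonus
    like the ambiguity-averse agent does (ASSUMED CAAA index -exp(-alpha x))."""
    expected = sum(w * math.exp(-alpha * p * u_delta)
                   for w, p in zip(weights, success_probs))
    certainty_equivalent = -math.log(expected) / alpha
    return certainty_equivalent / u_delta

LOW_COST_PROBS = (0.0, 0.38)   # .19 - .19 and .19 + .19
LOW_COST_WEIGHTS = (0.5, 0.5)
ALPHA = 5.0

# Hypothetical bonus levels: one monotonic, one non-monotonic contract.
p_monotonic = effective_success_probability(
    LOW_COST_PROBS, LOW_COST_WEIGHTS, 0.6, ALPHA)
p_non_monotonic = effective_success_probability(
    LOW_COST_PROBS, LOW_COST_WEIGHTS, -1.0, ALPHA)
```

Under these assumptions the positive bonus yields an effective probability below .19, and the negative bonus yields one above .22, so the high-cost action (success probability .22) carries the smaller chance of triggering the penalty.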
A similar example is discussed in Ghirardato (1994, example 1 in Sect. 3.4) for the case of the non-additive utility model, which is also driven by the fact that a less ambiguous action is implemented. We will now identify conditions that ensure that the optimal incentive scheme uses a positive bonus.
The following proposition provides sufficient conditions that ensure that the optimal contract is monotonic even under ambiguity aversion. To do so, we introduce a definition of symmetric ambiguity.
In words, this definition requires that the likelihood that the agent assigns to the possibility that the success probability exceeds the average success probability by more than any non-negative number $\hat p$ is the same as the likelihood she assigns to a deviation of less than $-\hat p$.

Proposition 4
Suppose that there are only two actions and two outcomes. Then, the optimal incentive scheme to implement the high-cost action will use a positive bonus if at least one of these three conditions holds: 1. The high-cost action is at least as ambiguous as the low-cost action. 2. It is unambiguous that the success probability of the low-cost action is lower than the average success rate for the high-cost action, in the sense that $p \in \mathrm{supp}(\mu_L)$ implies $\bar p_H > p$.
A complete proof is given in the Appendix. It proceeds by showing that, provided the low-cost action is more ambiguous, it can be possible to implement the action with strictly higher costs even if the average success rates are the same. As discussed for the example, the sign of $h'''$ characterises whether or not the positive bonus is optimal in this case. The proof concludes with the observation that any increase in the average success rate of the high-cost action ceteris paribus favours the monotone contract.
Note that the second point implies that, if the likelihood ratio for the high-cost action is unambiguously higher than for the low-cost action, the optimal contract is monotonic. In Sect. 5 we will show that such a condition would however not be sufficient for the case of three outcomes. First, though, we will complete the analysis of the two-outcome case by examining some comparative statics in ambiguity aversion and in ambiguity.

An increase in ambiguity aversion
We now evaluate the impact of the introduction of ambiguity aversion on the profits of the principal, and more generally, the effects of an increase in ambiguity aversion. Recall that in the standard model the introduction of risk aversion can never be beneficial to the principal. Moreover, for the case of two outcomes, GH establish that an increase in risk aversion always lowers the payoff of the principal if the agent has CARA preferences that are multiplicatively separable in effort costs and wages. This raises the question whether corresponding results are valid for the case of ambiguity aversion. The answer is no, as shown by the following example.
Example 3 In this example there is no ambiguity about the consequences of the more costly action $H$. That is, the (stochastic) consequences of the action $H$ are perfectly understood. However, the consequences of the low-cost action $L$ are ambiguous (see footnote 16). Figure 3 illustrates how to implement action $H$. Recall that for the principal, outcome 2 is more desirable (in the sense that $q_2 > q_1$). The curve $U_H$ identifies all those incentive schemes which leave an agent with low ambiguity aversion indifferent between selecting the high-cost action and declining to work for the principal (her outside option), i.e. $U_H = \{(u(w_1), u(w_2)) : U(H, (w_1, w_2)) = \varphi(u_0)\}$. This curve is linear as the high-cost action is not ambiguous. $U_L$ is the equivalent for the low-cost action. Any optimal incentive scheme corresponds to an intersection of these two curves, as has been argued in the context of Fig. 2. Finally, the iso-profit curve through $V$ (the shape of which is determined by $h$) depicts all incentive schemes which result in the same profits as $V$ (assuming that $H$ is chosen), and points below it increase profits.
The curves $U'_L$ and $U'_H$ represent the same indifference curves after an increase in ambiguity aversion. As $H$ is not ambiguous, only $U_L$ changes; its curvature increases. We can see that the optimal incentive scheme is given by $V'$ in this case, and as $V'$ lies below the iso-profit curve through $V$, $V'$ is cheaper for the principal than $V$. Thus, the principal benefits from ambiguity aversion in this example.

16 In particular, uncertainty is described by $p^H_{21} = p^H_{22} = .8$, $p^L_{21} = 1$, $p^L_{22} = 0$ and $\mu^L_1 = .6$. This implies that while the high-cost action is unambiguous, the other action is extremely ambiguous. Costs are $c_H = .6$, $c_L = .5$. Ambiguity aversion rises from $\alpha = .5$ to $\alpha = 2$, and $u(x) = \sqrt{x}$, with the outside option $\varphi(u_0) = -1$.
The idea behind this example is intuitive: If the action we want to implement is not ambiguous, then an increase in ambiguity aversion makes it less attractive for the agent to pick the other action. Therefore, the principal can use a less high-powered incentive scheme, which is cheaper for her, due to the agent's risk aversion.
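The mechanism can be reproduced numerically with the parameters reported for Example 3. The sketch below is not the authors' computation: it assumes a CAAA index $\varphi(x) = -e^{-\alpha x}$ (so that $\varphi(u_0) = -1$ corresponds to $u_0 = 0$), effort costs that are subtracted from utility, and binding IR and IC constraints. All function names are hypothetical.

```python
import math

# Parameters as reported for Example 3.
P_H, C_H, C_L = 0.8, 0.6, 0.5
L_PROBS, L_WEIGHTS = (1.0, 0.0), (0.6, 0.4)

def ic_slack(u_delta, alpha):
    """Utility of H minus certainty equivalent of L, net of u_F (ASSUMED form)."""
    expected = sum(w * math.exp(-alpha * p * u_delta)
                   for w, p in zip(L_WEIGHTS, L_PROBS))
    ce_low = -math.log(expected) / alpha
    return (P_H * u_delta - C_H) - (ce_low - C_L)

def binding_bonus(alpha, lo=0.0, hi=1.0):
    """Bisection for the positive utility bonus at which the IC binds."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if ic_slack(mid, alpha) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def implementation_cost(alpha):
    """Expected wage when IR (with u_0 = 0) and IC both bind; h(u) = u^2."""
    u_delta = binding_bonus(alpha)
    u_fail = C_H - P_H * u_delta
    assert u_fail >= 0, "utility levels must lie in the image of sqrt"
    return (1 - P_H) * u_fail ** 2 + P_H * (u_fail + u_delta) ** 2

def neutral_cost():
    """Benchmark without ambiguity aversion: L is evaluated at its mean."""
    mean_low = sum(w * p for w, p in zip(L_WEIGHTS, L_PROBS))
    u_delta = (C_H - C_L) / (P_H - mean_low)
    u_fail = C_H - P_H * u_delta
    return (1 - P_H) * u_fail ** 2 + P_H * (u_fail + u_delta) ** 2
```

Under these assumptions the required bonus falls as $\alpha$ rises, and the cost of implementing $H$ falls with it, in line with the intuition that a less high-powered scheme suffices.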
So when will it be true that an increase in ambiguity aversion leads to a decrease in profits? To keep things simple, we use the case of no ambiguity aversion as a benchmark and identify the cases in which ambiguity aversion is beneficial to the principal. Note that ambiguity aversion affects the set of feasible incentive schemes in two different ways. Whenever the action that is supposed to be implemented is ambiguous, and a non-constant incentive scheme is used, ambiguity aversion makes the individual rationality constraint harder to satisfy. On the other hand, ambiguity aversion may have a positive or a negative effect on the incentive constraint: if the high-cost action is less ambiguous, then ambiguity aversion makes the high-cost action relatively more attractive than the low-cost action. This intuition leads to the following proposition:

Proposition 5 Consider a principal-agent problem ($P$) with an ambiguity neutral agent, and the same problem but with an ambiguity averse agent ($P'$). Suppose that the high-cost action is more ambiguous than the low-cost action. Then the profits in problem $P'$ can be no higher than the profits in problem $P$. If, instead, the high-cost action is less ambiguous, then profits may be higher in problem $P'$.
Proof Assume that the high-cost action is more ambiguous. We will prove the proposition by arguing that the set of incentive schemes that implements the high-cost action in problem $P'$ is contained in the feasible set of problem $P$ [compare (3) and (4)]. Consider first any incentive scheme (given by $u_F$ and $u_\Delta$) that does not satisfy the individual rationality constraint in problem $P$. The IR constraint of problem $P'$ differs by $F_H(u_\Delta)$, which is non-negative. Thus, the IR constraint cannot hold in problem $P'$. Similarly, suppose the IC constraint is violated in $P$. The incentive constraint of problem $P'$ differs by $F_H(u_\Delta) - F_L(u_\Delta)$. As the high-cost action is more ambiguous, $F_H(u_\Delta) - F_L(u_\Delta) \ge 0$, and thus the incentive constraint cannot be satisfied in problem $P'$ either, establishing the claim. For the case that the high-cost action is less ambiguous, assume, for example, that it is not ambiguous at all (while the low-cost action is ambiguous). Consider the optimal contract in ($P$). Given strict risk aversion, the IC constraint is binding. For this contract, in problem ($P'$), $F_H(u_\Delta) = 0$ and $F_L(u_\Delta) > 0$. Hence, the IR constraint holds with equality, while the IC constraint holds with a strict inequality. As the proof of Proposition 2 demonstrates, this means that a strictly better contract is available in problem ($P'$).
In principle we could try to obtain a similar statement for an increase in ambiguity aversion that starts from an already positive level. However, this would require stronger assumptions about the dominance relationship between the two actions. (The problem we face is essentially caused by the fact that CARA preferences are ordered by Arrow-Pratt risk aversion but not by the stronger relationship of Ross risk aversion. The same issue arises for CAAA preferences.) In Example 8 of the Appendix we present a case where, in the context of Proposition 5, the initial effect of introducing ambiguity aversion goes in the opposite direction of a further increase in ambiguity aversion. In this example, the high-cost action is very ambiguous compared to the low-cost action, such that the (negative) effects of ambiguity about the high-cost action are strong already for a small level of ambiguity aversion, while the positive effects of ambiguity about the low-cost action become relevant only after a further increase in ambiguity aversion.
In conclusion, we can observe that while in the standard model with two outcomes and CARA preferences an increase in absolute risk aversion reduces profits, in our model with CAAA preferences the introduction of ambiguity aversion (or a marginal increase of ambiguity aversion) may also benefit the principal. As a change in ambiguity aversion does not alter the first-best outcome, the change in the principal's problem reflects the change in the severity of the inefficiency due to moral hazard. To understand this difference between risk aversion and ambiguity aversion intuitively, note that in the standard model risk aversion is the only source of inefficiencies. Whenever the high-cost action is not very risky, but the low cost action is, the incentive problem is not very severe in the first place. For CARA preferences the positive effect of risk aversion, providing stronger incentives in a given contract, is dominated by the negative effect (a higher risk premium is required keeping incentives fixed).

Increases in ambiguity
It is easy to see that an increase in ambiguity that affects only the high-cost action can never benefit the principal (both constraints become more restrictive), while an increase in ambiguity of the low-cost action can never harm the principal (the IC constraint becomes less restrictive). However, it is less immediate whether an increase in ambiguity that affects both actions in the same way is beneficial to the principal. One might expect that such a uniform increase in ambiguity has effects similar to those of an increase in ambiguity aversion. However, this turns out not to be true: while Proposition 5 showed that introducing ambiguity aversion cannot benefit the principal if the high-cost action is more ambiguous, a uniform increase in ambiguity can never benefit the principal in the opposite case, that is, if initially the high-cost action is less ambiguous. To state this result more precisely, we introduce the following definition.
Definition 3 Suppose that for both actions $a \in \{H, L\}$, $\hat\mu'_a$ represents ambiguity for the principal-agent problem $P'$, while $\hat\mu_a$ describes ambiguity in $P$, and that the two problems are identical otherwise. Then problem $P'$ differs from $P$ by a uniform increase in ambiguity if ambiguity about all actions increases by the same mean preserving spread, i.e. $\hat\mu'_H - \hat\mu_H = \hat\mu'_L - \hat\mu_L = \tilde\mu$, where $\tilde\mu$ is a mean preserving spread.

Proposition 6 Suppose the high-cost action is more ambiguous. Then a uniform increase in ambiguity may increase the principal's profit. However, if the high-cost action is less ambiguous, then a uniform increase in ambiguity can never be beneficial to the principal.
The proof for the case that the high-cost action is less ambiguous is given in the Appendix. The intuition behind the result is the following: when the less ambiguous action is to be implemented, and ambiguity increases uniformly, the positive effect of ambiguity aversion on the incentive constraint remains. However, the relative difference in the ambiguity of the two actions decreases, reducing this positive effect (operating via the IC constraint). In addition, the increase in ambiguity reinforces the negative effect (via the IR constraint). On the other hand, if the more ambiguous action is to be implemented, the fact that the relative difference in ambiguity between the two actions decreases reduces the negative effect of ambiguity aversion on the IC constraint. Still, the negative effect on the IR constraint remains, so that the outcome depends on the relative size of these two effects. The example below shows that the net effect can still be positive.

Example 4
In this example, the low-cost action is not ambiguous, but the high-cost action is. $V$ denotes the optimal incentive scheme. The primed counterparts correspond to a modified problem, which differs only by a uniform increase in ambiguity. As $V'$ lies below the curve of incentive schemes that are equally expensive as $V$, the principal prefers the uniform increase in ambiguity.
Comparing Propositions 5 and 6 shows that the effects of ambiguity aversion and a uniform increase in ambiguity can be quite different. Consider for instance the case where the high-cost action is not ambiguous. In this case an increase in ambiguity aversion can never decrease the principal's payoff (even if ambiguity aversion starts at some already positive level), but a uniform increase in ambiguity can never increase the payoff. So in this case the two changes have opposite effects on profits. The opposing effects of ambiguity and ambiguity aversion illustrate an advantage of working with a model that can separate ambiguity and ambiguity aversion, like the smooth ambiguity model.

Extensions
We now discuss the robustness of our findings to more general formulations of the principal agent problem. We begin with dropping the restriction to two outcomes, and revisit the issues of monotonicity and the effects of ambiguity and ambiguity aversion. Finally, we look at the case of multiple actions.

Multiple outcomes: monotonicity
Recall that in the absence of ambiguity the monotone likelihood ratio property (MLRP) is a sufficient condition to guarantee the monotonicity of the optimal incentive scheme, as long as there are only two actions. For the case of only two outcomes we have introduced a modification of this property for ambiguous outcome distributions that ensures that the optimal contract is monotonic even given ambiguity aversion. We will now demonstrate using an example that for more than two outcomes even a natural, but very demanding analogue of the MLRP property under ambiguity is not sufficient to ensure monotonicity of the reward scheme.

Definition 4
The actions $H$ and $L$ satisfy the strong monotone likelihood ratio property (SMLRP) if for all $p^H, \tilde p^H \in \mathrm{supp}(\mu_H)$ and $p^L, \tilde p^L \in \mathrm{supp}(\mu_L)$, and all outcomes $i < j$, $p^H_i / p^L_i \le \tilde p^H_j / \tilde p^L_j$. This is actually a very restrictive adaptation of the MLRP, as we require that the agent's likelihood ratios have to be monotone even if we use two different (possible) probability distributions to compute the likelihood ratio for two different outcomes. Note that in particular this assumption also implies that the likelihood ratios computed using the average probabilities ($\bar p^H$ and $\bar p^L$) are monotonic. However, it does not suffice to ensure monotonicity, as the analysis of the example below will demonstrate. The example is motivated by an applied economic context, to illustrate that the optimality of non-monotonic schemes is not a purely theoretical possibility.
Example 5 Consider the relationship between a firm (as a principal) and an agent who is hired to increase sales of the principal's company. That is, the agent's job is to do marketing for the principal's product. The outcome is the quantity of products sold, which is the only contractible measurement of the agent's performance. Assume the agent has access to an innovative marketing technology. Assume that the firm has a good understanding of the distribution of sales in the absence of any marketing efforts (and this information can be made available to the agent). With respect to the effects of the marketing technology, suppose that there are two types of consumers. The firm knows how the marketing technology affects the sales for one type of consumer (type 1). These could be, for example, the younger consumers whose social media activities generate lots of data about them. There is however not much information about the remaining (type 2) consumers, so the probability that such a consumer buys after marketing efforts can be considered ambiguous.
In designing the optimal incentive scheme, the principal combines the unambiguous information about the type 1 consumers with the ambiguous information about the other types. If the chance of buying for a consumer is around .5, there could be reason to expect that intermediate sales are less ambiguous than deviations to either side of the average. We will illustrate this using the simple case of only two consumers, one of each type. In the absence of any marketing efforts, both consumers buy one unit of the good with probability .5 − γ . As motivated above, if the agent exerts effort, her marketing technology increases the probability that type 1 consumes the good to .5. In the absence of better information, principal and agent consider the three values .5 + d, .5 and .5 − d as possible consumption probabilities for type 2 consumers (and they consider them to be equally likely). The parameter d could thus be interpreted as representing a dimension of ambiguity. Table 2 summarizes this example.
Note that if the agent exerts effort, the probabilities associated to a sales quantity of 0 and 2, but not to a quantity of 1, are ambiguous.
Obviously, there are many combinations of the parameters $\gamma$ and $d$ such that SMLRP holds. However, for all of them, given a fixed level of ambiguity aversion, the optimal contract will not be monotonic whenever risk aversion is sufficiently small. (Lemma 5 in the Appendix states this observation in a slightly more general context and gives a proof.) The result is largely driven by the fact that if ambiguity and ambiguity aversion are high compared to risk aversion, the principal wants to reduce ambiguity by devising a wage scheme that does not vary a lot between the two ambiguous outcomes. Suppose $\bar p^H_2 > \bar p^L_2$, as in this example, and consider a payment scheme that is even constant between the two ambiguous outcomes (1 and 3, corresponding to 0 or 2 units sold). Since $\bar p^L_1 + \bar p^L_3 > \bar p^H_1 + \bar p^H_3$, the payment for the unambiguous outcome 2 (1 unit) must be the highest in order to provide the necessary incentives. If risk aversion is strict, however, such a scheme will be, informally speaking, too risky, as in the absence of ambiguity aversion, the shape of the contract should reflect the average (increasing) likelihood ratios. In the presence of risk and ambiguity aversion, the optimal incentive scheme will thus reward the highest outcome more than the lowest, but if risk aversion is small compared to ambiguity aversion, the result could be that outcome 2 (one unit sold) is still rewarded most.

Fig. 5 Ambiguity and the optimal wage schedule
To further illustrate which factors contribute to the optimality of non-monotonic incentive schemes, Fig. 5 (obtained using numerical methods) shows that high ambiguity (reflected by increases in the parameter $d$) plays a similar role as low risk aversion. We chose the case of $\gamma = .2$, with $\mu^H_1 = \mu^H_3 = .5$, which implies that SMLRP holds whenever $d < .28$. The optimal incentive scheme is non-monotonic as soon as ambiguity is large enough, even in situations where the likelihood ratios satisfy SMLRP.
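The threshold on $d$ can be verified from the primitives of Example 5. The sketch below builds the sales distributions for one consumer of each type and checks likelihood-ratio monotonicity across all pairs of distributions in the supports. The reading of SMLRP encoded in `smlrp_holds` (ratios of high-effort to no-effort probabilities increase in the outcome, even across different possible distributions) is an assumption, and the function names are hypothetical.

```python
from itertools import product

def sales_distribution(p_type1, p_type2):
    """Distribution over 0, 1, 2 units sold with one independent consumer
    of each type."""
    return (
        (1 - p_type1) * (1 - p_type2),
        p_type1 * (1 - p_type2) + (1 - p_type1) * p_type2,
        p_type1 * p_type2,
    )

def example_supports(gamma, d):
    """Possible outcome distributions with effort (high) and without (low)."""
    low = [sales_distribution(0.5 - gamma, 0.5 - gamma)]
    high = [sales_distribution(0.5, q) for q in (0.5 - d, 0.5, 0.5 + d)]
    return high, low

def smlrp_holds(high, low):
    """ASSUMED reading of SMLRP: p_H[i]/p_L[i] <= q_H[j]/q_L[j] for all i < j
    and all distributions p, q in the respective supports."""
    for (ph, pl), (qh, ql) in product(product(high, low), repeat=2):
        for i in range(3):
            for j in range(i + 1, 3):
                if ph[i] / pl[i] > qh[j] / ql[j]:
                    return False
    return True
```

For $\gamma = .2$ the check passes at $d = .27$ and fails at $d = .3$, consistent with the stated bound. The probability of selling exactly one unit equals .5 for every possible type 2 probability, which is why the intermediate outcome is unambiguous.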
With only one member of each type the example is arguably not very general, but note also that when we have $n$ members of each group, the probability assigned to selling exactly $n$ units is still unambiguous. The fact that one outcome is not ambiguous at all is driven by the assumption that average success probabilities are .5 in both markets. Without this assumption, typically no outcome will be completely unambiguous, but a non-monotone incentive scheme might still be optimal.
Additionally, note that it may sometimes be possible for the agent to hide parts of the output that is produced (as she contributes to its creation), and in a non-monotonic incentive scheme it is in her interest to do so. In such cases the principal would not want to use a non-monotonic incentive scheme. If this additional constraint binds, the optimal incentive scheme would treat some outcomes identically, resulting in a coarser incentive scheme than one would expect in the absence of ambiguity. Figure 5 also points to a further difference between moral hazard under risk and ambiguity. Within the expected utility framework, the optimal contract can be viewed as if it was the solution to an inference problem. The optimal contract looks as if the principal makes use of any signal that informs her about the unobservable action of the agent. Here, this is not the case since the distinction between the output level 1 and 2 is not used for a certain value of d in the optimal contract, even if this is certainly informative about the chosen action. 19

Multiple outcomes: comparative statics
In our results on the effect of ambiguity and ambiguity aversion, the definition of the relationship "more ambiguous" was the only aspect specific to the case of only two outcomes. Hence, these findings generalise quite directly to the case of many outcomes if a more general notion of the relation 'more ambiguous' between two acts can be found. Recall that our definition of this relationship for the two-outcome case had the property that whenever two actions could be ranked, the disutility attributed to ambiguity alone [which we denoted by F a (u)] was (weakly) larger for the more ambiguous action irrespective of the wage schedule that is used.
In fact, one can also define a relation between two actions that has this property for the case of more than two outcomes. One possibility to do so is:

Definition 5 Let, for a fixed $u$, $F(u, \hat\mu_a)$ denote the cdf of the real-valued random variable $\sum_i \hat p_i u_i$, where $\hat p$ is distributed according to $\hat\mu_a$. Action $a$ is at least as ambiguous as $a'$ if for every possible $u$, $F(u, \hat\mu_a)$ is a mean preserving spread of $F(u, \hat\mu_{a'})$.
In fact, this definition is similar to the one introduced by Jewitt and Mukerji (2015) to compare ambiguous acts. To see an example of two actions which are ranked according to this criterion in a three outcome world, let $\hat p = (2\varepsilon, -\varepsilon, -\varepsilon)$ and suppose that the supports of $\mu_a$ and $\mu_{a'}$ contain at least the three possible probability distributions $p^1 = \bar p + \hat p$, $p^2 = \bar p$ and $p^3 = \bar p - \hat p$. If $\mu_1 = \mu_3 = \alpha$ and $\mu_2 = \beta - 2\alpha$, ambiguity increases as $\alpha$ (or $\varepsilon$) increases according to the above definition. Based on such a definition, we can state the following result:

Proposition 7 Suppose there are only two actions. Suppose the high-cost action is at least as ambiguous as the low-cost action. Suppose problems $P$ and $P'$ are identical, except that the agent in $P$ is ambiguity neutral, while she is ambiguity averse in $P'$. Then profits are (weakly) lower in problem $P'$.
This result includes the cases where only the low-cost action is unambiguous, and where the two actions are equally ambiguous. The proof does not differ in any essential way from the two outcome case, and is hence omitted. If H is at least as ambiguous according to the above definition, ambiguity aversion makes both constraints harder to satisfy.
Comparing ambiguity of two actions in this way has two limitations. On the one hand it is not necessarily easy to check whether two actions can be ranked according to this criterion (unless the actions are equally ambiguous or only one of them is ambiguous). On the other hand, often this definition cannot rank two actions. As an example, it is impossible to compare the action which is perceived to result with equal likelihood in the two probability distributions $\bar p \pm (\varepsilon, 0, -\varepsilon)$ with the action that may result in $\bar p \pm (0, \varepsilon, -\varepsilon)$ with equal chances. However, under any monotone incentive scheme the first distribution is more vulnerable to ambiguity aversion than the latter. Hence, if the first action was considered more ambiguous, our results would still apply if contracts are restricted to be monotonic. (In fact, Jewitt and Mukerji (2015) provide a characterisation of this relationship under the restriction to monotonic utility profiles as well.) However, our results suggest that if risk aversion is small compared to ambiguity aversion, the optimal contract will not be monotonic in many cases. Hence, a broader definition of the relationship "more ambiguous than" might be particularly useful in cases where risk aversion is large, or agents have the possibility to hide output, so that the optimal contract is forced to be monotonic.
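The non-comparability in this example is easy to check by hand or by machine. For two-point, equally likely deviations $\pm\hat p$ with a common mean, the induced distribution of expected utility is wider (a mean preserving spread) exactly when $|\sum_i \hat p_i u_i|$ is larger. The sketch below uses that observation; the value of `EPS` and the utility profiles are arbitrary illustrative choices.

```python
import random

EPS = 0.05                      # arbitrary size of the deviation
FIRST = (EPS, 0.0, -EPS)        # deviations +/- (eps, 0, -eps)
SECOND = (0.0, EPS, -EPS)       # deviations +/- (0, eps, -eps)

def utility_spread(deviation, u):
    """Half-width of the two-point distribution of expected utility."""
    return abs(sum(d * ui for d, ui in zip(deviation, u)))

def first_at_least_as_spread(u):
    """True if the first action's expected utility is at least as dispersed."""
    return utility_spread(FIRST, u) >= utility_spread(SECOND, u) - 1e-12

# A monotone profile favours ranking FIRST as more ambiguous ...
monotone_profile = (0.0, 1.0, 2.0)
# ... while a non-monotone profile reverses the comparison.
hump_profile = (0.0, 1.0, 0.0)
```

With the monotone profile the first action is strictly more dispersed, with the hump-shaped profile the second one is, so neither action is at least as ambiguous as the other for every $u$. Restricting attention to monotone profiles restores the ranking, since $u_3 - u_1 \ge u_3 - u_2$ whenever $u_1 \le u_2$.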
If the action chosen in the optimal contract is unambiguous, we can also show how marginal increases in ambiguity aversion affect the principal's payoff, even if there are many outcomes.
Proposition 8 Suppose there is no ambiguity about the action which is implemented in the optimal contract. Then any increase in ambiguity aversion (weakly) improves the payoffs of the principal (and thus reduces the severity of the incentive problem).
The proof, located in the Appendix, exploits parallels between risk aversion and ambiguity aversion. It is intuitive since if an ambiguity averse agent finds the ambiguous off-equilibrium actions not attractive enough given a certain contract, a more ambiguity averse agent will find them even less attractive.
Recall our earlier observation that in many actual principal-agent problems the equilibrium actions are likely to be well understood, while the consequences of deviations, which never actually occur, are ambiguous. Hence, this proposition suggests that in many relevant cases the effects of increases in ambiguity aversion are very different from the effects of risk aversion, which is more typically associated with lower profits. Obviously an analogous result exists if the principal wants to implement an action that is ambiguous, while the other is not: in this case, an increase in ambiguity aversion can only harm the principal.
Also, working with more than two outcomes enables us to contrast the effects of increases in ambiguity about the outcome distribution with an increase in risk about the outcomes. In the example below, we show that the fact that the high-cost action becomes more risky does not mean that profits decrease.
Example 6 Suppose Q = {0, 1, 2} and p̄_a = (γ, 1 − 2γ, γ) (for γ ∈ [0, .5]). As γ increases, the risk inherent in the outcome distribution increases. Yet, that does not mean that any payment scheme based on these three outcomes necessarily becomes worse for the agent: any payment scheme that treats the lower two outcomes similarly and rewards the high output with the highest wage in fact becomes better for the agent as γ increases. If a is the high-cost action, and the low-cost action is described by, say, the distribution (.5, .25, .25), values of γ that are close to 1/2 will bring the principal close to the first-best outcome, even though the outcome distribution is most risky in this case. This follows as the high-cost action comes close to not having full support in this case, so that the principal can 'punish' the agent in a state that is likely only under the low-cost action.
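The first claim in Example 6 can be checked numerically. The following is a minimal sketch and not part of the original analysis: the utility profile (0, 0, 1) is a hypothetical scheme that treats the two lower outcomes alike and rewards the high output, and the function names are illustrative.

```python
def distribution(gamma):
    """Outcome distribution (gamma, 1 - 2*gamma, gamma) on Q = {0, 1, 2}."""
    return (gamma, 1 - 2 * gamma, gamma)


def output_variance(gamma):
    """Variance of output, used here as a simple measure of risk."""
    outcomes = (0, 1, 2)
    probs = distribution(gamma)
    mean = sum(p * q for p, q in zip(probs, outcomes))
    return sum(p * (q - mean) ** 2 for p, q in zip(probs, outcomes))


def expected_utility(gamma, utilities):
    """Agent's expected utility from a profile of utility levels per outcome."""
    return sum(p * u for p, u in zip(distribution(gamma), utilities))


# Hypothetical utility profile: lower two outcomes treated alike, top outcome rewarded.
scheme = (0.0, 0.0, 1.0)

for gamma in (0.1, 0.3, 0.5):
    print(gamma, output_variance(gamma), expected_utility(gamma, scheme))
```

Under this profile both the variance of output and the agent's expected utility rise with γ, which is the sense in which a riskier outcome distribution need not make the payment scheme worse for the agent.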

Multiple actions
In general, increasing the number of actions available to the agent does not change the problem in a fundamental way: the principal first determines the cost of implementing each possible action (where she now has to satisfy multiple incentive constraints), and then compares these costs to the expected output the corresponding action generates. However, smooth ambiguity aversion introduces one important difference: it is no longer true that one of the binding incentive constraints has to correspond to an action with lower costs. (This has to be true in the standard model and if ambiguity is modelled using capacities.) We illustrate using an example.
Example 7 In this example, there are two outcomes, and three actions, H , M and L.
Here the medium-cost action (represented by the indifference curve U_M) yields the highest expected benefits for the principal, followed by the high-cost action (U_H). The cheapest action (U_L) is also the least beneficial to the principal. All of them are ambiguous, but note that the ambiguity about the high-cost (medium-benefit) action is such that it becomes significant only if the wage scheme differs a lot between the two outcomes (which is impossible in the non-additive model). If the principal wants to implement the medium-cost action, her profits are highest if she uses the incentive scheme V (Fig. 6). In this scheme, the agent is indifferent between the medium-cost and the high-cost action, but prefers both to the low-cost action. Note that the principal actually wants to use this scheme rather than using scheme V′ to implement H when the benefits of the higher outcomes are large enough.

Conclusion
Given ambiguity about the outcome distributions, we find that allowing for ambiguity averse preferences has important consequences for the optimal contract. First, unless preferences display constant absolute ambiguity aversion, some rents may be left with the agent (the individual rationality constraint need not bind in the optimal contract). Second, we find that the optimal contract will more often be non-monotonic. In settings where output can be hidden, this means that the optimal contract will be coarse. This fits well with the often stated belief that the optimal contract is more coarse than what the expected utility framework predicts. Third, we find that even in a simple model with binary outcomes, increases in ambiguity aversion can lead to very different effects compared to increases in risk aversion. While more risk aversion typically leads to lower profits, more ambiguity aversion can easily lead to higher profits, e.g. if there is only very little ambiguity about the action which is to be implemented, while ambiguity about possible deviations is large. Additionally, we demonstrate that, given ambiguity averse preferences, the principal's problem of finding the optimal incentive contract can no longer be viewed as an inference problem.

Proof of Proposition 1
Part 1 is established by the example in the text. For part 2, assume preferences are of the CAAA variety. Then the constraints can be rewritten as IR' and IC'. We use the same technique as GH: Suppose the constraint IR' does not bind, and IC' holds. Replace the payment scheme w with another payment scheme that reduces utilities given each outcome by the same amount ε, where ε is small enough so that the IR' constraint still holds. Such an ε exists since the image of u has no lower bound. Then the IC' constraints also still hold, as both sides of the inequality get multiplied by exp(αε). But as u is increasing in w, this change clearly benefits the principal, as implementation costs decrease.
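The scaling step can be made explicit under one common representation of CAAA. This representation is an assumption made here for illustration, not something stated in this appendix: φ(x) = −exp(−αx), with the agent evaluating action a by ∫ φ(Σ_i p_i u(w_i) − c_a) dμ_a. A sketch of the computation:

```latex
% Assumption: CAAA is represented by \varphi(x) = -e^{-\alpha x},
% and the cost c_a enters additively in utility terms.
\begin{align*}
\int \exp\Big(-\alpha\Big[\sum_i p_i\,\big(u(w_i)-\varepsilon\big) - c_a\Big]\Big)\,d\mu_a
 &= \int \exp\Big(-\alpha\Big[\sum_i p_i\,u(w_i) - c_a\Big]
      + \alpha\varepsilon \sum_i p_i\Big)\,d\mu_a \\
 &= e^{\alpha\varepsilon}\int \exp\Big(-\alpha\Big[\sum_i p_i\,u(w_i) - c_a\Big]\Big)\,d\mu_a ,
\end{align*}
% using \sum_i p_i = 1 for every distribution p in the support of \mu_a.
```

Since the factor exp(αε) is the same for every action, any comparison between two actions is unaffected by the uniform shift in utilities.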

Proof of Proposition 2
Suppose that no IC binds in an optimal solution. Denote the optimal payment scheme by w*, and note that it cannot be constant (otherwise, the agent would choose the least-cost action). Now consider, for some γ ∈ (0, 1), an alternative payment scheme w′ defined by u(w′_i) = γ u(w*_i) + (1 − γ) ∫ Σ_j u(w*_j) p_j dμ_a. This payment scheme makes the principal strictly better off: since the incentive scheme w* is not constant, expected wage payments are strictly lower under w′ than under w*, which follows from the strict convexity of h, which in turn is implied by the agent's strict risk aversion. Also, the contract w′ is (weakly) beneficial to the agent, so that the individual rationality constraint continues to hold; this follows from the concavity of φ. Thus w′ makes the principal better off and ensures that the IR constraint holds. Assuming γ is close enough to 1, continuity ensures that all IC constraints also continue to hold given w′. Hence, there is a contract w′ that makes the principal better off and satisfies all constraints, contradicting the optimality of the solution w*.
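The two inequalities used in this argument can be illustrated numerically. The sketch below is not taken from the paper: it assumes a hypothetical two-outcome problem, a second-order belief μ_a with two equally likely distributions, h(u) = u² (the inverse of u(w) = √w) and φ(x) = −exp(−2x). All names and numbers are illustrative.

```python
import math


def mix_scheme(u_star, mu, gamma):
    """Utilities gamma * u*_i + (1 - gamma) * m, where m is the mean utility under mu."""
    m = sum(w * sum(p * u for p, u in zip(probs, u_star)) for w, probs in mu)
    return [gamma * u + (1 - gamma) * m for u in u_star]


def expected_wage(utils, mu, h):
    """Principal's expected wage bill, evaluated at the reduced probabilities."""
    return sum(w * sum(p * h(u) for p, u in zip(probs, utils)) for w, probs in mu)


def agent_value(utils, mu, phi):
    """Smooth ambiguity value: expectation of phi over first-order expected utilities."""
    return sum(w * phi(sum(p * u for p, u in zip(probs, utils))) for w, probs in mu)


def h(u):
    return u ** 2                      # assumed inverse utility (strictly convex)


def phi(x):
    return -math.exp(-2.0 * x)         # assumed concave ambiguity index


# mu: list of (second-order weight, first-order distribution over the two outcomes).
mu = [(0.5, (0.5, 0.5)), (0.5, (0.1, 0.9))]
u_star = [1.0, 3.0]                    # a non-constant utility profile
u_mix = mix_scheme(u_star, mu, gamma=0.9)

print(expected_wage(u_star, mu, h), expected_wage(u_mix, mu, h))
print(agent_value(u_star, mu, phi), agent_value(u_mix, mu, phi))
```

In this instance the mixed scheme is cheaper for the principal and weakly better for the agent, in line with the convexity of h and the concavity of φ invoked in the proof.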

Proof of Proposition 4
We begin by stating a preliminary lemma.
Lemma 2 Suppose two incentive schemes V = (u_F, u_F + u_Δ) and V′ = (u′_F, u′_F + u′_Δ) both satisfy the IR and IC constraints with equality and u_Δ u′_Δ > 0. Then the principal will weakly prefer incentive scheme V iff |u_Δ| ≤ |u′_Δ|. Assuming either strict risk aversion or strict ambiguity aversion, she will strictly prefer V iff |u_Δ| < |u′_Δ|.
Proof Pick any two incentive schemes where u_Δ u′_Δ > 0 and both constraints bind. Assume wlog |u_Δ| < |u′_Δ|. As both schemes satisfy IR with equality, it follows from the concavity of φ that u_F + p̄_H u_Δ ≤ u′_F + p̄_H u′_Δ. The principal's implementation costs using V are (1 − p̄_H) h(u_F) + p̄_H h(u_F + u_Δ). Using the above inequality, these costs do not exceed those of a scheme with the same bonus u_Δ but the (weakly higher) mean utility of V′. Finally, due to the convexity of h, and since |u_Δ| < |u′_Δ|, the costs of that scheme in turn do not exceed the implementation costs for V′. Combining the inequalities gives the result. Note that assuming either strict risk aversion or strict ambiguity aversion ensures that one of the weak inequalities is in fact strict, as needed for the result.
We now turn to the proof of the three parts of the proposition.

Part 1 As
is positive in this case, it is clear that the IC constraint in (4) cannot hold for any negative value of u_Δ.
Part 2 Let p = sup(supp(μ_L)). Suppose p̄_H ≥ p but, by contradiction, u_Δ < 0. Then the term F_H(u_Δ) − F_L(u_Δ) as it appears in the IC constraint (4) is larger than (p − p̄_L)u_Δ, since the numerator (of the fraction in the definition of F_H(u_Δ) − F_L(u_Δ)) is at least 1 and the denominator is at most exp(−α(p − p̄_L)u_Δ). Therefore it is necessary for the IC constraint to hold that (p̄_H − p)u_Δ > c_H − c_L, which contradicts our assumptions.
Part 3 The proof of part 3 proceeds in the following steps. First, we turn to the case of a principal who tries to implement the more costly action even though the reduced probabilities are the same. This case is interesting only as a benchmark, since if average probabilities are the same, the ambiguity neutral principal will always implement the low-cost action using a constant incentive scheme. Lemma 3 below characterises the sign of the bonus in this case. Then, we turn to the more interesting case that p̄_H > p̄_L and argue that in this case it becomes cheaper to use the positive bonus (Lemma 4), so that the conditions we identify in Lemma 3 are still sufficient for the positive bonus (and moreover necessary for the negative bonus), establishing the result.
Proof Consider any incentive scheme V = (u_F, u_F + u_Δ) that implements H. Let u′_F = u_F + 2p̄u_Δ and u′_Δ = −u_Δ. It follows from Eqs. (3) and (4) that V′ = (u′_F, u′_F + u′_Δ) implements H as well. To see this, observe that … (and similarly for L). Therefore plugging V′ instead of V into the constraints leaves the RHS of both constraints unchanged. The LHS of the IR constraints (3) for V and V′ are identical as well (by construction), and the LHS of (4) is zero in both cases. Now let V be the incentive scheme that satisfies both constraints with the smallest possible positive bonus. It remains to check whether the expected wage payment associated with V is larger than the expected payment needed for V′.
To do so, rewrite the two incentive schemes in terms of the deviation about their mean, ū = u_F + p̄u_Δ. Then V = (ū − p̄u_Δ, ū + (1 − p̄)u_Δ) and V′ = (ū + p̄u_Δ, ū − (1 − p̄)u_Δ). Thus, the principal prefers the positive bonus if
(1 − p̄)[h(ū − p̄u_Δ) − h(ū + p̄u_Δ)] + p̄[h(ū + (1 − p̄)u_Δ) − h(ū − (1 − p̄)u_Δ)] ≤ 0.
The LHS of this inequality equals 0 if u_Δ = 0. Thus the LHS is negative for u_Δ > 0 if it is decreasing in u_Δ. The first derivative of the LHS w.r.t. u_Δ is given by
p̄(1 − p̄)[h′(ū + (1 − p̄)u_Δ) + h′(ū − (1 − p̄)u_Δ) − h′(ū + p̄u_Δ) − h′(ū − p̄u_Δ)].
Now suppose p̄ > 1/2. Since h″ > 0, this derivative is negative (and thus the positive bonus is optimal) if, equivalently, h′ is convex or h‴ > 0. The relationship is reversed if either p̄ < 1/2 or h′ is concave, and if h′ is linear, there is no difference between the positive and negative bonus. (Thus, the principal prefers V iff an expected utility maximizer with concave utility function −h prefers the (zero mean) lottery (−p̄u_Δ, 1 − p̄; (1 − p̄)u_Δ, p̄) to the lottery (p̄u_Δ, 1 − p̄; −(1 − p̄)u_Δ, p̄) at wealth level ū. Another way of proving the proposition would be to verify that the sign of p̄_H − 1/2 establishes which of the two lotteries third-order stochastically dominates the other, and then to apply the (unnumbered) Theorem in Whitmore (1970) to obtain the result.)
Lemma 4 Suppose it is optimal for the principal to implement the high-cost action using a positive bonus in a principal-agent problem P_E, where the reduced success probabilities are equal (or she is indifferent). Then she will strictly prefer the positive bonus in any principal-agent problem P that differs from P_E only by a decrease in p̄_L.
Proof Consider the cheapest scheme that implements H in problem P using a negative bonus. This incentive scheme also implements H in problem P_E: The individual rationality constraints are the same in both cases. Regarding the IC (Eq. 4), the LHS in problem P equals (p̄_H − p̄_L)u_Δ, which is negative provided that u_Δ < 0, while the LHS in problem P_E equals zero, which makes the constraint easier to satisfy. Thus the principal weakly prefers the best non-monotone incentive scheme in problem P_E to the best non-monotone incentive scheme of problem P.
And in fact this preference must be strict, as for the optimal incentive scheme of P_E both constraints must hold with equality. By assumption, this scheme is (weakly) worse than the best monotone incentive scheme of problem P_E. By similar arguments, the best monotone incentive scheme of problem P_E is also a feasible (monotone) scheme in problem P. Thus, this scheme is strictly better than any non-monotone incentive scheme in problem P, and thus the optimal incentive scheme must be monotone.
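The cost comparison between a positive-bonus scheme and its mirror image with the same mean utility, as used in the benchmark case with equal reduced probabilities, can be illustrated with a short numerical sketch. The inverse utility functions are hypothetical choices made only for illustration: h(u) = exp(u) (the inverse of logarithmic utility, so that h′ is convex) and h(u) = u² (so that h′ is linear). Function names are illustrative as well.

```python
import math


def expected_wage(h, u_bar, p_bar, bonus):
    """Expected wage of a two-outcome scheme with mean utility u_bar and given bonus.

    The failure outcome pays utility u_bar - p_bar * bonus, the success outcome
    u_bar + (1 - p_bar) * bonus, so the mean utility equals u_bar for any bonus.
    """
    low = u_bar - p_bar * bonus
    high = u_bar + (1 - p_bar) * bonus
    return (1 - p_bar) * h(low) + p_bar * h(high)


def positive_minus_negative(h, u_bar, p_bar, size):
    """Cost of the positive-bonus scheme minus the cost of its mirror image."""
    return expected_wage(h, u_bar, p_bar, size) - expected_wage(h, u_bar, p_bar, -size)


# h(u) = exp(u): h' is convex, so the sign should follow the sign of 1/2 - p_bar.
print(positive_minus_negative(math.exp, 0.0, 0.7, 1.0))   # negative: positive bonus cheaper
print(positive_minus_negative(math.exp, 0.0, 0.3, 1.0))   # positive: negative bonus cheaper

# h(u) = u**2: h' is linear, so the two schemes cost the same (up to rounding).
print(positive_minus_negative(lambda u: u * u, 1.0, 0.7, 1.0))
```

The signs agree with the condition above: with h′ convex the positive bonus is cheaper when p̄ > 1/2 and more expensive when p̄ < 1/2, while with h′ linear the principal is indifferent.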

Example 8
Example 8 In this example, there is very little ambiguity about the low-cost action, while ambiguity about the high-cost action is large. The (relevant) details for the example are the following: Possible deviations about the average success probability are {−.2, .2} for H and {−.2, 0, .2} for L. For the high-cost action the agent puts the same likelihood on both deviations (μ_H1 = μ_H2 = .5), while for the low-cost action she considers the average success probability to be by far the most likely: 0005. The graph in Fig. 7 shows, as usual, the indifference curves that correspond to the outside option for both actions and for three different levels of ambiguity aversion: The straight lines U_H and U_L correspond to no ambiguity aversion. The corresponding dashed curves represent a positive level of ambiguity aversion (U′_H and U′_L), and an even higher level (U″_H and U″_L). For wage schemes that are close to being constant, the introduction of ambiguity aversion affects only the high-cost action significantly, making it harder to implement the high-cost action. But if ambiguity aversion rises to more extreme levels, it has a sizeable effect on the low-cost action, while the additional effect on the high-cost action is negligible. The optimal incentive schemes, given by intersections of the respective two indifference curves, are labelled V, V′ and V″ in increasing order of ambiguity aversion. Even if the agent is risk neutral (so that the principal's iso-profit curve is a straight line parallel to U_H), V″ is less expensive than V′ despite the fact that the high-cost action is more ambiguous and V″ is the solution for a problem where the agent is more ambiguity averse than in the problem corresponding to V′. Hence implementation costs are non-monotonic in ambiguity aversion.

Proof of Proposition 6
Let unprimed variables correspond to the principal-agent problem before a uniform increase in ambiguity aversion, and let their primed counterparts denote the problem after the change. Thus, for both a ∈ {H, L} we can write μ′_a = μ_a + μ. The new incentive constraint of the primed problem can now be rewritten as …
The primed IC constraint differs from the unprimed IC only by the last term, which used to be F_H(u_Δ) − F_L(u_Δ) = (1/α) log( ∫ exp(−αp_j u_Δ) dμ_H / ∫ exp(−αp_j u_Δ) dμ_L ). Fix an arbitrary u_Δ that did not satisfy the incentive constraint. Suppose the high-cost action is less ambiguous. Then, as the fraction inside this term is less than 1, and since ∫ exp(−αp_j u_Δ) dμ_a > 0, one can conclude that F′_H(u_Δ) − F′_L(u_Δ) > F_H(u_Δ) − F_L(u_Δ). Thus, the incentive constraint still does not hold. Finally, suppose only the IR constraint fails in the unprimed problem. Then, as ambiguity of the high-cost action has increased, the individual rationality constraint continues to fail in the primed problem. Therefore there is no incentive scheme that implements the high-cost action only in the primed problem, establishing the claim. For the case of a more ambiguous high-cost action see Example 4.

Formal analysis of Example 5
We prove a lemma of which the example is a special case.
values of u_Δ. Finally, observe that irrespective of whether or not this condition still holds, C′(u_Δ) > 0 in region 4. Thus, C(u_Δ) is initially decreasing, then convex, and finally increasing. It therefore has a unique minimum, which lies in either region 2 or region 3. Its local behavior at the boundary between those two regions characterizes the location of the optimal u_Δ.
Denote by ũ_Δ the value that makes the payment scheme constant over outcomes 2 and 3, such that u_2(ũ_Δ) = u_3(ũ_Δ). Then it is a necessary and sufficient condition for the optimal incentive scheme to be monotonic that C′(ũ_Δ) < 0. Note that u_1(ũ_Δ) < u_2(ũ_Δ) = u_3(ũ_Δ). Then the condition for monotonicity is equivalent to … (8). Note that ũ_Δ does not depend on the agent's attitudes towards risk (as captured by h). The first term is negative, since u_1(ũ_Δ) < u_2(ũ_Δ) and we assume SMLRP. We can interpret this term as the benefit (of reducing costs) due to reducing the riskiness of the wage scheme, by using a wage scheme that reflects the average likelihood ratio difference between the second and the third outcome. Now consider the second term. By standard arguments, as long as u_Δ > 0 we have that F_H(ũ_Δ) > 0. Thus the second term captures the costs of increasing ambiguity. Consequently, if the utility function is sufficiently close to representing risk neutral preferences, the second effect will dominate the first effect, so that the optimal contract is in region 2 and hence not monotonic. The case where p̄_H2/p̄_L2 < 1 is entirely analogous, with the exception that an optimal non-monotonic contract will now be in a region where u_3 > u_1 > u_2.

Proof of Proposition 8
Suppose that w is the optimal incentive scheme in a given principal-agent problem. Suppose the optimal contract implements action a. We prove the claim by showing that w remains feasible after an increase in ambiguity aversion. Note first that this increase does not affect the individual rationality constraint. As w satisfies each incentive constraint with equality, before the increase in ambiguity aversion, … We can interpret this condition in a language borrowed from the literature dealing with risk aversion by noting that the certainty equivalent of the ambiguous utility lottery on the left hand side is below Σ_{i=1}^{I} p̄_{a,i} u(w_i) − c_a + c_a′ for each alternative action a′. What in the context of risk aversion is a (certain) monetary payment corresponds to a certain utility level in the ambiguity framework. Similarly, a lottery offering monetary payments corresponds to a lottery over utility levels. Consider an increase in ambiguity aversion. We know from the definition of the term "more ambiguity averse" (analogous to more risk averse) that an agent who is more ambiguity averse will have an even lower certainty equivalent.
Hence all IC constraints still hold after the increase in ambiguity aversion, and w remains feasible.
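The certainty-equivalent argument can be illustrated numerically under an assumed functional form. The sketch below is an illustration only: it assumes constant absolute ambiguity aversion with φ(x) = −exp(−αx), whereas the proposition itself is not restricted to this family, and the utility lottery and function name are hypothetical.

```python
import math


def certainty_equivalent(utility_levels, weights, alpha):
    """Certain utility level that an agent with phi(x) = -exp(-alpha * x)
    finds as good as the lottery over expected-utility levels."""
    if alpha == 0:
        # Ambiguity neutrality: the certainty equivalent is the mean utility.
        return sum(w * x for w, x in zip(weights, utility_levels))
    total = sum(w * math.exp(-alpha * x) for w, x in zip(weights, utility_levels))
    return -math.log(total) / alpha


# Hypothetical ambiguous deviation: two equally likely expected-utility levels.
ambiguous = ([0.2, 0.8], [0.5, 0.5])
# Unambiguous equilibrium action: a single expected-utility level.
unambiguous = ([0.5], [1.0])

for alpha in (0, 1, 5):
    print(alpha,
          certainty_equivalent(*ambiguous, alpha),
          certainty_equivalent(*unambiguous, alpha))
```

The certainty equivalent of the ambiguous deviation falls as α rises, while that of the unambiguous action stays at its expected utility, so an incentive constraint that holds at a low α continues to hold at a higher one.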