
Optimal contracts with random monitoring


We study an optimal contract problem under moral hazard in a principal-agent framework where contracts are implemented through random monitoring. This is a monitoring instrument that reveals the precise action taken by the agent with some nondegenerate probability, and otherwise reveals no information. The agent’s cost of performing the action depends on a random state of nature. This state is private information to the agent, but can be non-verifiably communicated, allowing the contract to specify wages as a function of the agent’s message. We show that the optimal contract partitions the set of types into three regions. The most efficient types exert effort and receive a reward when monitored. Moderately efficient types exert effort but are paid the same wage with monitoring as without. The least efficient types do not exert effort. More intense monitoring increases the value of a contract when the agent is risk averse.


Fig. 1
Fig. 2


  1. For instance, repairing an item, resolving issues in a service center, or providing health care services in a hospital.

  2. For instance, most car dealerships send emails to customers of their service center asking them to complete a survey about general satisfaction, but also about particular aspects, such as the quality of service performed by the manager, technician, billing department, etc. The response rate is lower than 100%, making such monitoring random.

  3. To protect environmentally sensitive areas, environmental agencies design voluntary payment schemes that incentivize farmers to exert effort towards protecting the environment, for instance, by reducing the amount of pesticides or fertilizer applied. Compliance monitoring is often executed randomly by “making unannounced visits on the farm so as to observe the level of effort being exerted when undertaking agri-environmental activities” (Choe and Fraser 1999). Often, these “contractual relationships are subject to asymmetric information between landowners and conservation agents” (Ferraro 2008). For instance, the opportunity cost of the farmer’s effort in a given period may depend on a variety of unpredictable factors such as weather, pest prevalence, etc., and environmental agencies “know less than landowners know about the costs of contractual compliance” (Ferraro 2008). This private information can be communicated, or equivalently, communication can take the form of a choice by the farmer of an option from a menu. “In UK, farmers are invited to choose from different tiers of contract, which vary in terms of livestock stocking rates, fertiliser use and construction of environmental public goods, such as stone walls” (Ozanne and White 2007).

  4. Another paper where monitoring is random and is performed by a strategic supervisor is Rahman (2012).

  5. In an application to agri-environmental policy, Ozanne and White (2007) consider a random monitoring technology in a model with moral hazard and adverse selection, but their analysis of the incentive scheme does not fully account for all possible deviations of the agent in such an environment. Instead, they simply bundle the set of incentive compatibility conditions from the models with pure moral hazard and with pure adverse selection. Additionally, deviations from the required action under moral hazard are only prevented locally, rather than globally.

  6. For instance, in the version of the model from Jost (1996) where the principal can commit to a monitoring intensity, the agent strictly prefers being monitored, whereas we show that the agent can be indifferent.

  7. Other relevant papers from this literature are Eeckhout et al. (2010) and Dechenaux and Samuel (2014) who expand on the work of Lazear (2006) to model other specific real world situations.

  8. More recent contributions to this literature are Ben-Porath et al. (2014) and Mylovanov and Zapechelnyuk (2017) who study optimal allocation problems with state verification when no transfers are allowed.

  9. The Inada condition is not necessary for the analysis to go through, but simplifies the exposition of the results.

  10. We assume for simplicity that the set of feasible effort levels is unconstrained from above or, equivalently, that such a constraint does not bind. In appendix A11 we study the case where an upper bound exists and it may bind.

  11. Note that when no monitoring is executed, \({\mathcal {P}}\) does not observe y(e) either. MacLeod (2003) is another paper where it is assumed that the principal may not observe the contribution of a particular worker to its payoff. MacLeod (2003) also provides several real-world examples of such situations, close to those we use for motivating our study.

  12. An interesting extension of this model would consider situations where r could be chosen by \({\mathcal {P}}\) depending on s (at a cost of increasing this probability). In such an environment \({\mathcal {P}}\) has an additional dimension of incentive provision besides wages, but while it is likely that the optimal contract will balance these incentive tools, it is not evident ex-ante what this optimal contract looks like. One related issue is that it never makes sense to monitor agents who are recommended zero effort. However, even if the probability of monitoring of such types is positive, the optimal contract would prescribe the same wage for these types whether they are monitored or not. Therefore, a version of the model where such types are never monitored would yield the same contract problem and optimal contract.

  13. Two assumptions are made here. First, when monitoring is performed, \( {\mathcal {P}}\) obtains publicly verifiable evidence of \({\mathcal {A}}\)’s effort. Second, \({\mathcal {P}}\) can credibly promise different wages depending on whether or not monitoring is performed.

  14. A slightly technical observation is that for this benchmark, we drop the assumption \(u^{\prime }(0)=+\infty \), so as to allow for a concave function u over \( {\mathbb {R}} \). While this makes the benchmark slightly inconsistent with our underlying model, as discussed earlier, the assumption \(u^{\prime }(0)=+\infty \) is made solely for simplifying the exposition of the results. The benchmark we consider here thus remains relevant.

  15. Note that it is not optimal to set e(s) higher than \({\overline{e}}\) which satisfies \(u\left( y({\overline{e}})\right) =c({\underline{s}},{\overline{e}})\) since the marginal benefit of doing so is lower than the marginal cost of compensating \({\mathcal {A}}\) for the additional effort. It can then be argued that this implies that, optimally, the wages w(s) must also belong to a compact set. By the continuity of the relevant functions, it follows then that an optimal contract exists. Moreover, it is unique up to a set of zero measure.

  16. Since the density function \(f(\cdot )\) is continuous, the optimality conditions need to be satisfied only almost everywhere, i.e., everywhere but on a set of measure zero. To slightly simplify the exposition of the analysis and results, we focus on contracts where this incentive compatibility constraint is satisfied everywhere, as required by (5).

  17. Instead of requiring \({\mathcal {A}}\) to communicate a message about his type, \( {\mathcal {P}}\) can equivalently ask \({\mathcal {A}}\) to select an option from a menu of wage pairs \(\left\{ w^{n}(e),w(e)\right\} _{e\ge 0}\), with the commitment that \({\mathcal {A}}\) will be paid a wage \(w^{n}(e)\) if monitoring is not performed, a wage w(e) if monitoring reveals effort e, and a wage of 0 if a different effort is observed (we employ here the insight of Lemma 2(i)). Ozanne and White (2007) suggest examples of such an approach in agri-environmental policy. In this alternative interpretation, \({\mathcal {A}}\) makes a promise about an effort level that he will exert instead of communicating his private information. By designing the menu appropriately, \({\mathcal {P}}\) can ensure that \({\mathcal {A}}\) selects the pair aimed at his type. Since \({\mathcal {P}}\) can also define the wage paid to \({\mathcal {A}}\) when monitoring is not executed as a function of \({\mathcal {A}}\)’s promised effort level, the two specifications are strategically equivalent and their analysis is essentially identical.

  18. We do not specify in the text of the lemma \(e({\widehat{s}})\) since this could be strictly positive if \({\widehat{s}}={\overline{s}}\), or zero, if \({\widehat{s}}< {\overline{s}}\).

  19. More precisely, \({\varvec{1}}_{{\widehat{s}}<{\overline{s}}}=1\), when \( {\widehat{s}}<{\overline{s}}\), and \(=0\), when \({\widehat{s}}={\overline{s}}\). \( {\varvec{1}}_{{\widehat{s}}={\overline{s}}}\), employed below, is defined analogously.

  20. Since the contract may not be differentiable in s, the first-order condition is only required almost everywhere.

  21. See Chapter 6 in Caputo (2005) for a comprehensive discussion of problems with state and control constraints, and Chapter 5 in Seierstad and Sydsæter (1987) for a more detailed discussion of problems with pure state constraints.

  22. Seierstad and Sydsæter (1987) and Caputo (2005) constitute good introductions to the application of deterministic optimal control methods to solving economics problems. The complete literature, which by now spans almost 50 years, is however probably too large to be surveyed in any single manuscript.

  23. Recall that constraints (10) and (11) preempt deviations such as those labeled by (IC2) and (IC4) in Fig. 1.

  24. Together with w(s), the wage \(w^{n}\left( s\right) \) is also employed in the incentive scheme for truthful type revelation, but these incentives could also be provided in an equally efficient manner solely by means of \( w\left( s\right) \).

  25. The fact that \(w^{n}(s)\) varies with s is another consequence of the adverse selection assumed in this model. In the pure moral hazard benchmark, \(w^{n}(s)\) is constant across all states of the world so as to minimize \( {\mathcal {A}}\)’s risk exposure.

  26. Note that the above argument does not necessarily require \(ru\left( w(s)\right) +\left( 1-r\right) u\left( w^{n}(s)\right) \) to be strictly decreasing in s; weak monotonicity suffices. Equivalently, by Corollary 7, it is enough that \( e^{\prime }(s)\le 0\). The key insight of the proposition, regarding the partition of types, thus continues to hold in situations where the monotonicity constraint \(e^{\prime }(s)\le 0\), elicited in Lemma 6, may bind on certain intervals in \([{\underline{s}}, {\widehat{s}}]\). However, on such intervals we have \(w^{\prime }(s)=0\).

  27. Note also that the monitoring risk is endogenously generated in optimal contracts with random monitoring, whereas in insurance contracts, the risk some types of the agent are subjected to is a share of some exogenous risk.

  28. Clearly, these indifferences hold vacuously if \(e\left( {\widehat{s}}\right) =0 \), in which case \(e\left( s\right) \) is continuous at \({\widehat{s}}\). Moreover, since u is strictly concave, from (9) and (24), it would then follow that \(w\left( {\widehat{s}}\right) =w^{n}\left( {\widehat{s}}\right) =w_{0}\). However, continuity of the contract parameters at \({\widehat{s}}\) does not appear to be a necessary condition for optimality, as the ex-ante participation constraint restricts the choice of \(w_{0}\) and may preclude a smooth wage schedule at \({\widehat{s}}\).

  29. This discussion also alludes to the effects of limited liability on the optimal contract. If there were no limited liability (dropping also the Inada condition so as to allow u to be concave everywhere), it would follow that \( \phi >0\). Therefore, the term in square brackets from equation (27) must equal zero. The economic implication would be that without limited liability, there is no situation where the shadow cost of the outside option \({\overline{u}}\) is zero, implying that if the agent’s outside option were to increase, the principal would be strictly worse off.

  30. The social cost equals the amount required to compensate \({\mathcal {A}}\) for marginal effort given his current wage w(s).

  31. Unlike many other agency models, where both players’ utilities are quasilinear in transfers and thus these transfers vanish from the expression of the virtual surplus, what we refer to here slightly improperly as the virtual surplus also depends on wages. The same observation is valid for the expression which we refer to above as the social surplus.

  32. The intuition for the form of the distorting factor \(c_{s}\left( s,e\left( s\right) \right) \frac{1}{f(s)}\mathop {\displaystyle \int }\nolimits _{{\underline{s}}}^{s}\left[ \frac{1}{u^{\prime }(w(\sigma ))}-\phi \right] f(\sigma )d\sigma \) from the expression of the virtual surplus is familiar from other models of contracting under adverse selection. Thus, note that since effort can only be induced with the wage \(w(\cdot )\), when the required effort of type s increases by one unit, each type \(\sigma >s\) must be paid an additional amount \(c_{s}\left( s,e\left( s\right) \right) \frac{1}{u^{\prime }(w(\sigma ))}\) in information rent. This wage adjustment leads to an increase in the ex-ante utility delivered to the agent, and thus \({\mathcal {P}}\) can reduce the insurance transfers \(\left\{ \left\{ w^{n}(s)\right\} _{s\in [{\underline{s}},{\widehat{s}}]},w_{0}\right\} \) so as to maintain \({\mathcal {A}}\) ’s resulting ex-ante expected utility at the same level. Recalling the interpretation of \(\phi \), it follows that this amount is \(c_{s}\left( s,e\left( s\right) \right) \mathop {\displaystyle \int }\nolimits _{{\underline{s}}}^{s}\phi f(\sigma )d\sigma \). The distorting factor is then the difference of these two effects.

  33. Exceptions are some models with multidimensional screening; see, for instance, Rochet and Stole (2002).

  34. While this shows that the optimal contract can in principle be computed with the conditions identified in Proposition 12, a practical numerical implementation would be to construct a system of differential equations in e(s), \(u^{n}(s)\) and \(\lambda _{2}(s)\) and their derivatives, with two equations obtained from the second derivatives of (33) and (34) with respect to s (see the computation of \( \frac{d^{2}}{ds^{2}}\left( \frac{\partial H}{\partial x}\right) \) and \(\frac{ d^{2}}{ds^{2}}\left( \frac{\partial H}{\partial k}\right) \) in appendix A8) and the third equation from (36).

  35. In a pure moral hazard model, where information rents are unnecessary, the first best can be attained for any \({\overline{u}}\).

  36. While such models are interesting and would have practical real-world applications, we believe that our modeling specification captures many of the situations where random monitoring is employed. As previously stated, such situations are likely to be those where the employer contracts with many employees who perform numerous tasks. Therefore, on the one hand, commitment can be accomplished through repeated interaction. On the other hand, implementing a sophisticated monitoring technology that incorporates the agent’s message may be too costly.

  37. We cannot use Topkis’ Monotonicity Theorem here since we aim to show not that the maximizer \({\widetilde{s}}\) from (IC1) is monotonic in s, but that e(s) is. However, the argument we employ is an adaptation of the proof of that theorem.

  38. With a slight abuse of notation, \(\frac{\partial }{\partial {\widetilde{s}}} \Phi \left( {\widetilde{s}},{\widetilde{s}}\right) \) is the partial derivative with respect to \({\widetilde{s}}\) of \(\Phi \left( {\widetilde{s}},s\right) \) evaluated at \(s={\widetilde{s}}\).

  39. Recall that we are solving the relaxed problem where we drop the monotonicity condition \(e^{\prime }(s)\le 0\) and thus the domain of x is \( {\mathbb {R}} \). If instead we incorporate that condition, the solution may involve a so-called bang-singular-bang control, with \(\frac{\partial H}{\partial x}=0\) when \(x(s)<0\), and \(\frac{\partial H}{\partial x}>0\) when \(x(s)=0\). See appendix A7 for the details.

  40. The various features of the optimal control problem (14)–(22), discussed in Sect. 3.2, are usually analyzed separately in the optimal control literature. We need, thus, to employ several different results to state the set of necessary conditions for a solution to our problem. See Theorem 4.2 on page 81 in Caputo (2005) for a more common version of this result. Theorem 1 on page 178 in Seierstad and Sydsæter (1987) presents a version that accounts for the state constraints at \({\widehat{s}}\) in (21) and (22). Finally, Theorem 4.1 in Hartl et al. (1995) presents a version that accounts for the state constraints in (19) and (20). The formal proof of Theorem 4.1 from the latter reference corresponding to our optimal control problem, in which we have no mixed constraints with both state and control variables (so condition 2.3 from their problem does not exist), is presented in the references cited therein.

  41. The theorem does not state that the costate variables \(\lambda _{1},\lambda _{2},\lambda _{3},\lambda _{4}\) are continuous, but it states that at all points where the constraint (19) becomes or ceases to be active, the function \(\lambda _{4}\) may have discontinuities given by the following jump conditions \(\lambda _{4}(s^{-})=\lambda _{4}(s^{+})+\eta (s)\frac{ \partial }{\partial u^{n}(s)}\left[ u_{0}-(1-r)u^{n}(s)\right] \) for some positive function \(\eta (s)\), and similarly where (20) or \(u^{n}( {\widehat{s}})\ge 0\) bind (see, for instance, Note 1 on page 318 in Seierstad and Sydsæter 1987). The same result applies to \(\lambda _{1}\), \(\lambda _{2}\) and \(\lambda _{3}\), and since those constraints are independent of the corresponding state variables e(s), u(s) and v(s) except at \(\widehat{s }\), it follows that \(\lambda _{1}\), \(\lambda _{2}\) and \(\lambda _{3}\) are continuous everywhere except possibly at \({\widehat{s}}\).

  42. Given (33) and (34), (41) implies (40), so the latter condition is not used in eliciting the optimal contract.

  43. See Theorem 10.2 on page 266 in Caputo (2005) for a standard version of the necessary condition for free end-time optimal control problems without state constraints. The result we apply here is Theorem 4 on page 337 in Seierstad and Sydsæter (1987). An alternative way to derive this result is by defining \(v\left( s\right) \equiv \int _{{\underline{s}}}^{s}\left\{ ru\left( \sigma \right) +(1-r)u^{n}(\sigma )-c\left( \sigma ,e\left( \sigma \right) \right) -u_{0}{\varvec{1}}_{{\widehat{s}}<{\overline{s}}}\right\} f(\sigma )d\sigma \) and writing the transversality condition as \(v\left( {\widehat{s}} \right) \ge {\overline{u}}-u_{0}{\varvec{1}}_{{\widehat{s}}<{\overline{s}}}\). In this control problem, the constraints do not depend directly on \({\widehat{s}}\), and thus applying the same theorem from Seierstad and Sydsæter (1987), the condition is written solely in terms of the corresponding Hamiltonian.

  44. Let \({\mathcal {K}}\) be the value function of the problem in (14)–(22) when \({\widehat{s}}\) is fixed at \({\overline{s}}\). Then, the optimal value for \(u_{0}\) is \({u_{0}=\arg \max _{{\widetilde{u}}_{0}\ge 0}}\left\{ {\mathcal {V}}({\widetilde{u}}_{0})-h({\widetilde{u}}_{0})\right\} \) if \({\mathcal {V}}(u_{0})-h(u_{0})>{\mathcal {K}}\). If \({\mathcal {V}} (u_{0})-h(u_{0})\le {\mathcal {K}}\), then the optimal value of \({\widehat{s}}\) is \({\overline{s}}\), and \(u_{0}\) does not need to be specified in the contract. It follows that (49) is necessary when \({\widehat{s}}< {\overline{s}}\).

  45. Differentiating twice each side of the equality \(u(h(v))=v\), we obtain \( h^{\prime \prime }(v)=-u^{\prime \prime }(h(v))\left[ h^{\prime }(v)\right] ^{2}/u^{\prime }(h(v))>0\).

  46. The existence result presented below, Lemma 18, states the existence of a solution to the optimal control problem where the contract variables are absolutely continuous in s.

  47. Note that this conclusion would continue to hold even if instead of \( e^{\prime }<0\), we had \(e^{\prime }\le 0\).

  48. The version of this envelope theorem that we employ for our problem with state constraints is Theorem 1 in LaFrance and Barney (1991), which, as argued above in the proof of claim 20, can be applied. The optimal control problem considered in that theorem does not have end-time state constraints as our problem has in (21) and (22), but the argument of the proof goes through when such constraints are incorporated in the problem. The only modification that is required is that these constraints be accounted for when taking the derivative of the value function. The final result needs to be modified accordingly, and the version of the theorem we employ accounts for this modification.
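
The differentiation step in footnote 45 can be spelled out as follows. This is only a worked sketch, assuming (as the surrounding text suggests) that h denotes the inverse of u, with \(u^{\prime }>0\) and \(u^{\prime \prime }<0\). Differentiating \(u(h(v))=v\) once and then a second time gives

$$\begin{aligned} u^{\prime }(h(v))\,h^{\prime }(v)&=1\quad \Longrightarrow \quad h^{\prime }(v)=\frac{1}{u^{\prime }(h(v))}>0\text {,} \\ u^{\prime \prime }(h(v))\left[ h^{\prime }(v)\right] ^{2}+u^{\prime }(h(v))\,h^{\prime \prime }(v)&=0\quad \Longrightarrow \quad h^{\prime \prime }(v)=-\frac{u^{\prime \prime }(h(v))\left[ h^{\prime }(v)\right] ^{2}}{u^{\prime }(h(v))}>0\text {,} \end{aligned}$$

where the sign of \(h^{\prime \prime }\) follows from \(u^{\prime \prime }<0<u^{\prime }\).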


  • Azar OH (2004) Optimal monitoring with external incentives: the case of tipping. South Econ J 70:170–181

  • Baiman S, Demski JS (1980) Economically optimal performance evaluation and control systems. J Acc Res 18:184–220

  • Barbos A (2017) Random monitoring without communication. University of South Florida

  • Barbos A (2019) Dynamic contracts with random monitoring. J Math Econ 85:1–16

  • Ben-Porath E, Dekel E, Lipman B (2014) Optimal allocation with costly verification. Am Econ Rev 104:3779–3813

  • Border KC, Sobel J (1987) Samurai accountant: a theory of auditing and plunder. Rev Econ Stud 54:525–540

  • Bryson AE, Ho YC (1975) Applied optimal control. Hemisphere Publishing Corp, Washington, D.C

  • Caputo MR (2005) Foundations of dynamic economic analysis. Cambridge University Press, Cambridge

  • Cesari L (1983) Optimization—theory and applications. Springer, New York

  • Choe C, Fraser I (1999) Compliance monitoring and agri-environmental policy. J Agric Econ 50:468–487

  • Dechenaux E, Samuel A (2014) Announced vs. surprise inspections with tipping-off. Eur J Polit Econ 34:167–183

  • Eeckhout J, Persico N, Todd P (2010) A theory of optimal random crackdowns. Am Econ Rev 100:1104–1135

  • Ferraro PJ (2008) Asymmetric information and contract design for payments for environmental services. Ecol Econ 65:810–821

  • Hartl RF, Sethi SP, Vickson RG (1995) A survey of the maximum principles for optimal control problems with state constraints. SIAM Rev 37:181–218

  • Holmstrom B (1979) Moral hazard and observability. Bell J Econ 10:74–91

  • Jost P-J (1991) Monitoring in principal-agent relationships. J Inst Theor Econ 147:517–538

  • Jost P-J (1996) On the role of commitment in a principal-agent relationship with an informed principal. J Econ Theory 68:510–530

  • Kihlstrom R, Laffont J (1979) A general equilibrium entrepreneurial theory of firm formation based on risk aversion. J Polit Econ 87:719–748

  • Krener AJ (1977) The high order maximal principle and its applications to singular extremals. SIAM J Control Optim 15:256–293

  • LaFrance JT, Barney LD (1991) The envelope theorem in dynamic optimization. J Econ Dyn Control 15:355–385

  • Lazear EP (2006) Speeding, terrorism, and teaching to the test. Q J Econ 121:1029–1061

  • MacLeod WB (2003) Optimal contracting with subjective evaluation. Am Econ Rev 93:216–240

  • Milgrom P, Segal I (2002) Envelope theorems for arbitrary choice sets. Econometrica 70:583–601

  • Mookherjee D, Png I (1989) Optimal auditing, insurance and redistribution. Q J Econ 104:399–415

  • Mylovanov T, Zapechelnyuk A (2017) Optimal allocation with ex post verification and limited penalties. Am Econ Rev 107:2666–2694

  • Ozanne A, Hogan T, Colman D (2001) Moral hazard, risk aversion and compliance monitoring in agri-environmental policy. Eur Rev Agric Econ 28:329–347

  • Ozanne A, White B (2007) Equivalence of input quotas and input charges under asymmetric information in agri-environmental schemes. J Agric Econ 58:260–268

  • Rahman D (2012) But who will monitor the monitor? Am Econ Rev 102:2767–2797

  • Rochet J-C, Stole L (2002) Nonlinear pricing with random participation. Rev Econ Stud 69:277–311

  • Rothschild M, Stiglitz J (1976) Equilibrium in competitive insurance markets: an essay on the economics of imperfect information. Q J Econ 90:629–649

  • Seierstad A (1984) Sufficient conditions in free final time optimal control problems. A comment. J Econ Theory 32:367–370

  • Seierstad A, Sydsæter K (1987) Optimal control theory with economic applications. North-Holland, Amsterdam

  • Seywald H, Cliff EM (1993) The generalized Legendre–Clebsch condition on state/control constrained arcs. In: AIAA Guidance, Navigation and Control Conference, Monterey, CA. AIAA Paper 93–3746

  • Simon CP, Blume L (1994) Mathematics for economists. Norton, New York

  • Strausz R (1997) Delegation of monitoring in a principal-agent relationship. Rev Econ Stud 64:337–357

  • Strausz R (2006) Timing of verification procedures: monitoring versus auditing. J Econ Behav Organ 59:89–107

  • Townsend R (1979) Optimal contracts and competitive markets with costly state verification. J Econ Theory 21:265–293

  • Varas F, Marinovic I, Skrzypacz A (2020) Random inspections and periodic reviews: optimal dynamic monitoring. Rev Econ Stud:1–45

Author information

Corresponding author

Correspondence to Andrei Barbos.

Ethics declarations

Conflict of interest

The author declares that there is no conflict of interest.

Additional information

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 359 KB)



Appendix A1: Proofs of the results from Sect. 3.1

Proof of Lemma 2

The condition in constraint (5) can be rewritten for all \(s\in [{\underline{s}},{\overline{s}}]\) as

$$\begin{aligned}&ru\left( w(s,e(s))\right) +\left( 1-r\right) u\left( w^{n}(s)\right) -c\left( s,e\left( s\right) \right) \\&\quad \ge \underset{{\widetilde{s}}\in [{\underline{s}},{\overline{s}}],e\in {\mathbb {R}} _{+}}{\max }\left[ ru\left( w({\widetilde{s}},e)\right) +\left( 1-r\right) u\left( w^{n}({\widetilde{s}})\right) -c\left( s,e\right) \right] \text {.} \end{aligned}\tag{30}$$

Part (i) of the lemma follows immediately from the fact that setting \(w(s,e)=0\) for all \(e\ne e(s)\) and \(s\in [{\underline{s}}, {\overline{s}}]\) weakens this constraint. Since these wages appear in neither the objective function in (4) nor the participation constraint in (6), weakening constraint (5) weakly increases the optimal value. Employing this finding, the term on the right-hand side of (30) can be rewritten as \({\max _{{\widetilde{s}}\in [{\underline{s}},{\overline{s}}]}}\max \left\{ \left[ ru\left( w({\widetilde{s}},e({\widetilde{s}}))\right) +\left( 1-r\right) u\left( w^{n}({\widetilde{s}})\right) -c\left( s,e({\widetilde{s}})\right) \right] , {\max _{e\in [e({\widetilde{s}}),\infty )}}\left[ \left( 1-r\right) u\left( w^{n}({\widetilde{s}})\right) -c\left( s,e\right) \right] \right\} \). Since \(c_{e}>0\) and \(c(s,0)=0\), we have \({\max _{e\in [e( {\widetilde{s}}),\infty )}} \left[ \left( 1-r\right) u\left( w^{n}( {\widetilde{s}})\right) -c\left( s,e\right) \right] \le \left( 1-r\right) u\left( w^{n}({\widetilde{s}})\right) \), which implies part (ii) of the lemma. \(\square \)
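
As a rough numerical illustration of the bound behind part (ii), the sketch below evaluates the payoff from an off-path effort choice on a grid. The functional forms \(u(w)=\sqrt{w}\) and \(c(s,e)=se^{2}\), the parameter values, and the name `deviation_payoff` are assumptions made only for this example and are not taken from the paper. The sketch also assumes \(u(0)=0\), so that a zero wage under monitoring adds nothing to the agent's utility.

```python
import math


def deviation_payoff(s, e, w_n, r, u=math.sqrt):
    """Payoff of type s from an off-path effort e (hypothetical forms).

    If monitored (probability r) the wage is 0 and, by assumption, u(0) = 0;
    otherwise the agent receives the no-monitoring wage w_n.
    The effort cost is the assumed form c(s, e) = s * e**2.
    """
    return (1 - r) * u(w_n) - s * e ** 2


# Illustrative parameters (assumptions, not from the paper).
s, w_n, r = 0.8, 1.0, 0.3

efforts = [i / 10 for i in range(31)]  # grid on [0, 3]
payoffs = [deviation_payoff(s, e, w_n, r) for e in efforts]

best_effort = efforts[payoffs.index(max(payoffs))]
upper_bound = (1 - r) * math.sqrt(w_n)

print("best off-path effort:", best_effort)
print("max deviation payoff:", max(payoffs), "bound:", upper_bound)
```

Under these assumptions the best off-path effort on the grid is zero, so the deviation payoff never exceeds \(\left( 1-r\right) u\left( w^{n}({\widetilde{s}})\right) \), which is the bound used in the proof.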

Proof of Lemma 3

Let \({\mathcal {C}}\equiv \left\{ e(s),w(s),w^{n}\left( s\right) \right\} _{s\in \left[ {\underline{s}},{\overline{s}}\right] }\) be the solution to problem (4)–(6) and let \({\widehat{S}}\equiv \{s\in \left[ {\underline{s}},{\overline{s}}\right] |e(s)>0\}\). Note now that in order for (5) to be satisfied, \(ru\left( w(s)\right) +\left( 1-r\right) u\left( w^{n}(s)\right) \) must equal the same value for all \(s\in \left[ {\underline{s}},{\overline{s}}\right] \backslash {\widehat{S}}\); otherwise, some types \(s\in \left[ {\underline{s}},{\overline{s}}\right] \backslash {\widehat{S}}\) would not truthfully disclose their information. Assume however by contradiction that \(w^{n}(s)\ne w(s)\) on a subset of \(\left[ \underline{s },{\overline{s}}\right] \backslash {\widehat{S}}\) of strictly positive measure. Let then \(w_{0}\) be the non-negative value satisfying \(u(w_{0})\int _{s\in \left[ {\underline{s}},{\overline{s}}\right] \backslash {\widehat{S}} }f(s)ds=\int _{s\in \left[ {\underline{s}},{\overline{s}}\right] \backslash {\widehat{S}}}\left[ ru\left( w\left( s\right) \right) +\left( 1-r\right) u\left( w^{n}(s)\right) \right] f(s)ds\). If u is strictly concave, by Jensen’s inequality and our contradiction assumption, it follows that

$$\begin{aligned} w_{0}\int _{s\in \left[ {\underline{s}},{\overline{s}}\right] \backslash {\widehat{S}}}f(s)ds<\int _{s\in \left[ {\underline{s}},{\overline{s}}\right] \backslash {\widehat{S}}}\left[ rw\left( s\right) +\left( 1-r\right) w^{n}(s)\right] f(s)ds\text {.} \end{aligned}\tag{31}$$

Consider then the contract \({\mathcal {C}}_{1}\equiv \left\{ e_{1}(s),w_{1}(s),w_{1}^{n}\left( s\right) \right\} _{s\in \left[ {\underline{s}},{\overline{s}}\right] }\), where \(e_{1}(s)\equiv e(s)\) for all \(s\in \left[ {\underline{s}},{\overline{s}}\right] \), \(w_{1}^{n}\left( s\right) =w_{1}\left( s\right) \equiv w_{0}\) for all \(s\in \left[ {\underline{s}},{\overline{s}}\right] \backslash {\widehat{S}}\), and \(w_{1}^{n}\left( s\right) \equiv w^{n}\left( s\right) \) and \(w_{1}\left( s\right) \equiv w\left( s\right) \) for \(s\in {\widehat{S}}\). Since \({\mathcal {C}}\) is incentive compatible, while \( ru\left( w(s)\right) +\left( 1-r\right) u\left( w^{n}(s)\right) =ru\left( w_{1}(s)\right) +\left( 1-r\right) u\left( w_{1}^{n}(s)\right) =u(w_{0})\) for all \(s\in \left[ {\underline{s}},{\overline{s}}\right] \backslash {\widehat{S}} \), the contract \({\mathcal {C}}_{1}\) is incentive compatible as well. For the same reason, \({\mathcal {C}}_{1}\) delivers the same ex-ante expected utility to \({\mathcal {A}}\) as \({\mathcal {C}}\), so the participation constraint (6) continues to hold. Moreover, by (31), \({\mathcal {C}}_{1}\) delivers a strictly higher payoff to \( {\mathcal {P}}\), contradicting the optimality of \({\mathcal {C}}\). This implies the claim of the lemma. \(\square \)
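
The Jensen step in this proof can be illustrated with a small numerical sketch for a single zero-effort type. The utility function \(u(w)=\sqrt{w}\), the wage values, the monitoring probability, and the helper names below are assumptions chosen only for this example. The sketch computes the constant wage \(w_{0}\) that gives the agent the same expected utility as the lottery over \(w(s)\) and \(w^{n}(s)\), and compares it with the expected wage the principal pays under that lottery.

```python
import math


def certainty_equivalent_wage(w, w_n, r, u=math.sqrt, u_inv=lambda v: v * v):
    """Constant wage w0 with u(w0) = r*u(w) + (1-r)*u(w_n).

    u and u_inv are an assumed strictly concave utility and its inverse.
    """
    expected_utility = r * u(w) + (1 - r) * u(w_n)
    return u_inv(expected_utility)


def expected_wage(w, w_n, r):
    """Expected wage bill of the lottery: w with prob. r, w_n with prob. 1-r."""
    return r * w + (1 - r) * w_n


# Illustrative values (assumptions): wages differ with and without monitoring.
r, w, w_n = 0.3, 4.0, 1.0

w0 = certainty_equivalent_wage(w, w_n, r)
wage_bill = expected_wage(w, w_n, r)

print("constant wage w0:", w0)          # roughly 1.69
print("expected wage bill:", wage_bill)  # roughly 1.9
```

With a strictly concave u and \(w\ne w^{n}\), the constant wage is strictly cheaper for the principal while leaving the agent's expected utility unchanged, which is the content of inequality (31).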

Proof of Lemma 4

Let \({\mathcal {C}}\equiv \left\{ e(s),w(s),w^{n}\left( s\right) \right\} _{s\in \left[ {\underline{s}},{\overline{s}}\right] }\) be the solution to problem (4)–(6). Let again \({\widehat{S}}\equiv \{s\in \left[ {\underline{s}},{\overline{s}}\right] |e(s)>0\}\). Assume \(\left[ {\underline{s}},{\overline{s}}\right] \backslash {\widehat{S}}\ne \varnothing \) since otherwise the claim of the lemma is vacuously satisfied. By Lemma 3, there exists \(w_{0}\) such that \( w(s)=w^{n}\left( s\right) =w_{0}\) for all \(s\in \left[ {\underline{s}}, {\overline{s}}\right] \backslash {\widehat{S}}\). Take then some type \(s\in {\widehat{S}}\). Since \({\mathcal {C}}\) is incentive compatible, it must be that \( ru\left( w(s)\right) +\left( 1-r\right) u\left( w^{n}(s)\right) -c\left( s,e(s)\right) \ge u(w_{0})\). Assume by contradiction that there exists a type \(s^{\prime }<s\) such that \(e(s^{\prime })=0\), which thus receives a payoff \(u(w_{0})\). Since \(c_{s}>0\), we have \(ru\left( w(s)\right) +\left( 1-r\right) u\left( w^{n}(s)\right) -c\left( s^{\prime },e(s)\right) >ru\left( w(s)\right) +\left( 1-r\right) u\left( w^{n}(s)\right) -c\left( s,e(s)\right) \ge u(w_{0})\), implying that type \(s^{\prime }\) is better off claiming to be type s and exerting e(s). This contradicts the assumption that \({\mathcal {C}}\) is incentive compatible. \(\square \)
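
The mimicking argument above can be checked numerically in a minimal sketch. The cost function \(c(s,e)=se^{2}\) (which satisfies \(c_{s}>0\) for \(e>0\)), the numbers, and the function names are hypothetical and serve only to illustrate the inequality chain; they are not taken from the paper.

```python
def cost(s, e):
    """Assumed effort cost c(s, e) = s * e**2, increasing in the type s."""
    return s * e ** 2


def payoff_from_report(true_type, gross_utility, effort):
    """Payoff of true_type from taking a contract option and exerting its effort.

    gross_utility stands for r*u(w(s)) + (1-r)*u(w_n(s)) of the chosen option.
    """
    return gross_utility - cost(true_type, effort)


# Illustrative values (assumptions): the option designed for type s.
gross_utility = 1.5   # r*u(w(s)) + (1-r)*u(w_n(s))
effort = 1.0          # e(s) > 0
u_w0 = 0.4            # utility u(w0) of a zero-effort type

s, s_prime = 1.0, 0.6  # s_prime < s is the more efficient type

payoff_s = payoff_from_report(s, gross_utility, effort)
payoff_s_prime = payoff_from_report(s_prime, gross_utility, effort)

print("type s, truthful:", payoff_s)
print("type s', mimicking s:", payoff_s_prime)
```

In this example type s weakly prefers its own option to \(u(w_{0})\), and the lower-cost type \(s^{\prime }\) earns strictly more by mimicking s than the \(u(w_{0})\) it would receive with zero effort, so a zero-effort assignment for \(s^{\prime }\) cannot be incentive compatible.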

Proof of Lemma 5

We argue first that a contract satisfying (IC1) will satisfy (IC2) if and only if

$$\begin{aligned}&ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}\left( {\widehat{s}}\right) \right) -c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) \nonumber \\&\quad \ge \max \left\{ u\left( w_{0}\right) {\varvec{1}}_{{\widehat{s}}< {\overline{s}}},{ \max _{{\widetilde{s}}\in [{\underline{s}},{\widehat{s}}]}}\left[ \left( 1-r\right) u\left( w^{n}({\widetilde{s}})\right) \right] \right\} \text { } \end{aligned}$$

The fact that (IC2) implies (32) is immediate. In the other direction, note that \(ru\left( w(s)\right) +\left( 1-r\right) u\left( w^{n}\left( s\right) \right) -c\left( s,e\left( s\right) \right) \ge ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}\left( {\widehat{s}}\right) \right) -c\left( s,e\left( {\widehat{s}}\right) \right) \ge ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}\left( {\widehat{s}}\right) \right) -c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) \) for all \(s\in [{\underline{s}},{\widehat{s}}]\), where the first inequality is implied by (IC1) and the second by \(s\le {\widehat{s}} \). Therefore, (32) implies (IC2) as well. We therefore substitute (32) for (IC2) in the following.

Now, if \({\widehat{s}}={\overline{s}}\), then (32) is written as \( ru\left( w({\overline{s}})\right) +\left( 1-r\right) u\left( w^{n}({\overline{s}} )\right) -c\left( {\overline{s}},e\left( {\overline{s}}\right) \right) \ge {\max _{{\widetilde{s}}\in [{\underline{s}},{\overline{s}}]}}\left( 1-r\right) u\left( w^{n}({\widetilde{s}})\right) \), which is the condition from part (i) of the lemma.

Assume that \({\widehat{s}}<{\overline{s}}\). We will show first that (IC3), (IC4), and (32) imply \(ru\left( w({\widehat{s}} )\right) +\left( 1-r\right) u\left( w^{n}({\widehat{s}})\right) -c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) =\) \(u\left( w_{0}\right) \) and \(u\left( w_{0}\right) \ge {\max _{s\in [{\underline{s}},\widehat{ s}]}}\left( 1-r\right) u\left( w^{n}(s)\right) \). From (32) we have that \(ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}\left( {\widehat{s}}\right) \right) -c\left( {\widehat{s}},e\left( \widehat{ s}\right) \right) \ge u\left( w_{0}\right) \). Assuming by contradiction that \(ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}\left( {\widehat{s}}\right) \right) -c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) >u\left( w_{0}\right) \), it follows by the continuity of \(c\left( \cdot \right) \) in s that there exists \(s^{\prime }\in \left( {\widehat{s}}, {\overline{s}}\right) \) such that \(ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}\left( {\widehat{s}}\right) \right) -c\left( s^{\prime },e\left( {\widehat{s}}\right) \right) >u\left( w_{0}\right) \). But this implies then that for some \(s^{\prime }\in \left( {\widehat{s}},{\overline{s}} \right) \) we have \({\max _{{\widetilde{s}}\in [{\underline{s}},\widehat{ s}]}}\left[ ru\left( w({\widetilde{s}})\right) +\left( 1-r\right) u\left( w^{n}({\widetilde{s}})\right) -c\left( s^{\prime },e\left( \widetilde{s }\right) \right) \right] >u\left( w_{0}\right) \), contradicting (IC3). Therefore, indeed \(ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}\left( {\widehat{s}}\right) \right) -c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) =u\left( w_{0}\right) \). Given this, (32) implies then that it must be that \(u\left( w_{0}\right) \ge {\max _{s\in [{\underline{s}},{\widehat{s}}]}}\left( 1-r\right) u\left( w^{n}(s)\right) \). 
For the converse, (IC4) and (32) follow immediately from the two assumptions. To show that (IC3) is also satisfied, note that for all \(s\in ({\widehat{s}},{\overline{s}}]\), we have \({\max _{{\widetilde{s}}\in [{\underline{s}},{\widehat{s}}]}}\left[ ru\left( w({\widetilde{s}})\right) +\left( 1-r\right) u\left( w^{n}(\widetilde{ s})\right) -c\left( s,e\left( {\widetilde{s}}\right) \right) \right] < {\max _{{\widetilde{s}}\in [{\underline{s}},{\widehat{s}}]} }\left[ ru\left( w({\widetilde{s}})\right) +\left( 1-r\right) u\left( w^{n}({\widetilde{s}} )\right) -c\left( {\widehat{s}},e\left( {\widetilde{s}}\right) \right) \right] =ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}\left( {\widehat{s}}\right) \right) -c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) =u\left( w_{0}\right) \), where the inequality follows from the fact that \(c_{s}>0\), while the first equality from (IC1). Therefore, (IC3) is also satisfied. \(\square \)

Proof of Lemma 6

We argue first that any contract satisfying (IC1) must satisfy the conditions stated in the lemma. To show that \(e\left( s\right) \) is a.e. decreasing in s, assume by contradiction that there exist \(s_{2}>s_{1}\) with \(e(s_{2})>e\left( s_{1}\right) \) (footnote 37). Let \(\Phi \left( {\widetilde{s}},s\right) \equiv ru\left( w({\widetilde{s}})\right) +(1-r)u(w^{n}({\widetilde{s}}))-c\left( s,e\left( {\widetilde{s}}\right) \right) \). From the fact that \(s_{2}\in {\arg \max _{{\widetilde{s}}\in [{\underline{s}},{\widehat{s}}]}} \Phi \left( {\widetilde{s}},s_{2}\right) \) it follows that \(\Phi \left( s_{2},s_{2}\right) -\Phi \left( s_{1},s_{2}\right) \ge 0\Longrightarrow -c\left( s_{2},e\left( s_{2}\right) \right) +c\left( s_{2},e\left( s_{1}\right) \right) \ge \left[ ru\left( w(s_{1})\right) +(1-r)u(w^{n}(s_{1}))\right] -\left[ ru\left( w(s_{2})\right) +(1-r)u(w^{n}(s_{2}))\right] \), while from \(s_{1}\in {\arg \max _{{\widetilde{s}} \in [{\underline{s}},{\widehat{s}}]}}\Phi \left( {\widetilde{s}} ,s_{1}\right) \), it follows that \(0\ge \Phi \left( s_{2},s_{1}\right) -\Phi \left( s_{1},s_{1}\right) \Longrightarrow \left[ ru\left( w(s_{1})\right) +(1-r)u(w^{n}(s_{1}))\right] -\left[ ru\left( w(s_{2})\right) +(1-r)u(w^{n}(s_{2}))\right] \ge -c\left( s_{1},e\left( s_{2}\right) \right) +c\left( s_{1},e\left( s_{1}\right) \right) \). Combining these two results, we obtain that \(c\left( s_{2},e\left( s_{2}\right) \right) -c\left( s_{2},e\left( s_{1}\right) \right) \le c\left( s_{1},e\left( s_{2}\right) \right) -c\left( s_{1},e\left( s_{1}\right) \right) \). But since \( s_{2}>s_{1} \) and \(e(s_{2})>e\left( s_{1}\right) \), this contradicts the fact that \(c_{es}>0\). Thus, (IC1) implies that e(s) must be a.e. decreasing in s (this also implies that e(s) is a.e. differentiable). The necessity of the other condition from the text of the lemma follows from the first-order condition in \({\mathcal {A}}\)’s problem in (IC1).
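The contradiction with \(c_{es}>0\) in the paragraph above can be spelled out through a double-integral identity. The display is a sketch that assumes \(c\) is twice continuously differentiable; \(t\) and \(\varepsilon \) are dummy variables of integration introduced here.

$$\begin{aligned}&\left[ c\left( s_{2},e\left( s_{2}\right) \right) -c\left( s_{2},e\left( s_{1}\right) \right) \right] -\left[ c\left( s_{1},e\left( s_{2}\right) \right) -c\left( s_{1},e\left( s_{1}\right) \right) \right] \\&\quad =\int _{s_{1}}^{s_{2}}\int _{e\left( s_{1}\right) }^{e\left( s_{2}\right) }c_{es}\left( t,\varepsilon \right) d\varepsilon \,dt>0\text {.} \end{aligned}$$

The integral is strictly positive because \(s_{2}>s_{1}\), \(e(s_{2})>e(s_{1})\) and \(c_{es}>0\), whereas the combined incentive constraints require the left-hand side to be weakly negative.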

Next, we show that a contract satisfying the two conditions from the text of the lemma will satisfy (IC1). Employing again the notation for \( \Phi \left( {\widetilde{s}},s\right) \) defined above, we will argue that \(\Phi \left( s,s\right) \ge \Phi \left( {\widetilde{s}},s\right) \) for all \( {\widetilde{s}},s\in [{\underline{s}},{\widehat{s}}]\), which will be enough to complete the proof. We have \(\frac{\partial }{\partial {\widetilde{s}}}\Phi \left( {\widetilde{s}},s\right) =ru^{\prime }\left( w({\widetilde{s}})\right) w^{\prime }({\widetilde{s}})+(1-r)u^{\prime }(w^{n}({\widetilde{s}}))w^{n\prime }({\widetilde{s}})-c_{e}\left( s,e\left( {\widetilde{s}}\right) \right) e^{\prime }({\widetilde{s}})\), and so note that \(\frac{\partial }{\partial {\widetilde{s}}}\Phi \left( {\widetilde{s}},s\right) \ge \frac{\partial }{ \partial {\widetilde{s}}}\Phi \left( {\widetilde{s}},{\widetilde{s}}\right) \) if and only if \(-c_{e}\left( s,e\left( {\widetilde{s}}\right) \right) e^{\prime }( {\widetilde{s}})\ge -c_{e}\left( {\widetilde{s}},e\left( {\widetilde{s}}\right) \right) e^{\prime }({\widetilde{s}})\) (footnote 38). Since \(c_{es}(\cdot )>0\) and \(e^{\prime }(\cdot )\le 0\), it follows that \(\frac{\partial }{\partial {\widetilde{s}}} \Phi \left( {\widetilde{s}},s\right) \ge \frac{\partial }{\partial {\widetilde{s}}}\Phi \left( {\widetilde{s}},{\widetilde{s}}\right) \) if and only if \( {\widetilde{s}}\le s\). But by \(ru^{\prime }\left( w(s)\right) w^{\prime }(s)+\left( 1-r\right) u^{\prime }\left( w^{n}(s)\right) w^{n\prime }(s)=c_{e}(s,e(s))e^{\prime }(s)\), we have \(\frac{\partial }{\partial {\widetilde{s}}}\Phi \left( {\widetilde{s}},{\widetilde{s}}\right) =0\). Thus, \( \Phi \left( {\widetilde{s}},s\right) \) is increasing (decreasing) in \( {\widetilde{s}}\) for \({\widetilde{s}}\le s\) (\({\widetilde{s}}\ge s\)), implying immediately that it is indeed maximized when \({\widetilde{s}}\) equals s. \(\square \)
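The final monotonicity step can also be written in integral form. The display is a sketch that assumes \(\Phi \left( \cdot ,s\right) \) is absolutely continuous in its first argument, so that it can be recovered from its a.e. derivative; \(t\) is a dummy variable introduced here.

$$\begin{aligned} \Phi \left( s,s\right) -\Phi \left( {\widetilde{s}},s\right)&=\int _{{\widetilde{s}}}^{s}\frac{\partial }{\partial t}\Phi \left( t,s\right) dt\ge 0\quad \text {if }{\widetilde{s}}\le s\text {,} \\ \Phi \left( s,s\right) -\Phi \left( {\widetilde{s}},s\right)&=-\int _{s}^{{\widetilde{s}}}\frac{\partial }{\partial t}\Phi \left( t,s\right) dt\ge 0\quad \text {if }{\widetilde{s}}\ge s\text {.} \end{aligned}$$

In the first line the integrand is nonnegative because \(t\le s\) on the range of integration, and in the second it is nonpositive because \(t\ge s\).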

Appendix A2: Optimality conditions for the contract problem

Lemma 18 from appendix A6 states the existence of a solution to the optimal control problem in (14)–(22) under a compactness condition on the set of feasible effort levels. The Hamiltonian associated with this problem is

$$\begin{aligned} H\left( \cdot ,s\right)\equiv & {} \left[ y\left( e\right) -rh\left( u\right) -\left( 1-r\right) h\left( u^{n}\right) +h(u_{0}){\varvec{1}}_{{\widehat{s}}< {\overline{s}}}\right] f(s)\\&+\lambda _{1}x+\lambda _{2}\left[ -\frac{1-r}{r}k+ \frac{1}{r}c_{e}\left( s,e\right) x\right] \\&+\lambda _{3}\left[ ru+\left( 1-r\right) u^{n}-c\left( s,e\right) \right] f(s)+\lambda _{4}k\text {,} \end{aligned}$$

while the Lagrangian, which accounts for the state constraints (19), (20) and (22), is

$$\begin{aligned} L\left( \cdot ,s\right)\equiv & {} H\left( \cdot ,s\right) \text { }+\gamma \left( s\right) \left[ u_{0}-(1-r)u^{n}\left( s\right) \right] {\varvec{1}} _{{\widehat{s}}<{\overline{s}}}\text { }f(s) \\&+\theta \left( s\right) \left[ ru({\widehat{s}})+\left( 1-r\right) u^{n}( {\widehat{s}})-c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) -\left( 1-r\right) u^{n}(s)\right] {\varvec{1}}_{{\widehat{s}}={\overline{s}}}f(s) \end{aligned}$$

where \(\lambda _{1}\), \(\lambda _{2}\), \(\lambda _{3}\), \(\lambda _{4}\), \( \gamma \), and \(\theta \) are functions defined on \([{\underline{s}},{\widehat{s}}]\), while \(\mu \), \(\phi \) and \(\rho \) are scalars. Since the Lagrangian is linear in the control variables x and k, while the domain of these variables is unbounded (footnote 39), a solution to this problem necessarily involves a so-called singular control, i.e., it must satisfy \(\frac{ \partial H}{\partial x}=0\) and \(\frac{\partial H}{\partial k}=0\) for all s (see, for instance, page 247 in Bryson and Ho (1975) for a discussion of singular controls on unbounded domains). By Pontryagin’s Maximum Principle, which provides the necessary first-order conditions in optimal control problems (footnote 40), there exist almost everywhere differentiable functions \(\lambda _{1},\lambda _{2},\lambda _{3},\lambda _{4}\) (footnote 41), almost everywhere continuous functions \(\gamma \) and \(\theta \), and scalars \(\mu \), \(\phi \) and \(\rho \) such that conditions (33)–(47) below are satisfied almost everywhere.

$$\begin{aligned} \frac{\partial H}{\partial x}= & {} \lambda _{1}\left( s\right) +\lambda _{2}\left( s\right) \frac{1}{r}c_{e}\left( s,e\left( s\right) \right) =0 \end{aligned}$$
$$\begin{aligned} \frac{\partial H}{\partial k}= & {} -\frac{1-r}{r}\lambda _{2}\left( s\right) +\lambda _{4}\left( s\right) =0 \end{aligned}$$

Equations (33) and (34) are the first-order conditions that follow from the singularity of the control, and they ensure that the two control variables maximize the Hamiltonian. In dynamic optimization problems, the costate variable \(\lambda _{i}(s)\) is interpreted as the shadow price of the associated state variable at s (see page 243 in Caputo (2005) for this interpretation of the costate variables). Condition (33) thus equates the marginal benefit and marginal cost for \({\mathcal {P}}\) of increasing the level of effort required from type s, while condition (34) equates the marginal costs of delivering utility to the agent through u(s) and \( u^{n}(s)\), respectively.
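To see why linearity in the controls forces these two conditions, it may help to regroup the Hamiltonian by collecting the terms in x and k. The display only rearranges the definition of \(H\) given above and introduces no new symbols.

$$\begin{aligned} H\left( \cdot ,s\right)&=\left[ y\left( e\right) -rh\left( u\right) -\left( 1-r\right) h\left( u^{n}\right) +h(u_{0}){\varvec{1}}_{{\widehat{s}}<{\overline{s}}}\right] f(s)+\lambda _{3}\left[ ru+\left( 1-r\right) u^{n}-c\left( s,e\right) \right] f(s) \\&\quad +x\left[ \lambda _{1}+\lambda _{2}\frac{1}{r}c_{e}\left( s,e\right) \right] +k\left[ \lambda _{4}-\frac{1-r}{r}\lambda _{2}\right] \text {.} \end{aligned}$$

The two brackets multiplying x and k are \(\frac{\partial H}{\partial x}\) and \(\frac{\partial H}{\partial k}\). If either were nonzero, \(H\) would be unbounded in the corresponding control, so both must vanish, and along the solution \(H\) reduces to the first line.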

$$\begin{aligned} \lambda _{1}^{\prime }(s)= & {} -\frac{\partial L}{\partial e}=\left[ -y^{\prime }\left( e(s)\right) +\lambda _{3}\left( s\right) c_{e}\left( s,e\left( s\right) \right) \right] f(s) \nonumber \\&-\lambda _{2}\left( s\right) \frac{1}{ r}c_{ee}\left( s,e\left( s\right) \right) x\left( s\right) \end{aligned}$$
$$\begin{aligned} \lambda _{2}^{\prime }(s)= & {} -\frac{\partial L}{\partial u}=r\left[ h^{\prime }(u\left( s\right) )-\lambda _{3}\left( s\right) \right] f(s) \end{aligned}$$
$$\begin{aligned} \lambda _{3}^{\prime }(s)= & {} -\frac{\partial L}{\partial v}=0 \end{aligned}$$
$$\begin{aligned} \lambda _{4}^{\prime }(s)= & {} -\frac{\partial L}{\partial u^{n}}=\left( 1-r\right) \left\{ h^{\prime }\left( u^{n}(s)\right) \right. \nonumber \\&\left. +\left[ \gamma \left( s\right) {\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+\theta \left( s\right) {\varvec{1}}_{{\widehat{s}}={\overline{s}}}\right] -\lambda _{3}\left( s\right) \right\} f(s) \end{aligned}$$

Equations (35)–(38) are the equations of motion of the costate variables. Defining

$$\begin{aligned} {\widetilde{L}}\left( {\widehat{s}}\right)\equiv & {} \int _{{\underline{s}}}^{{\widehat{s}}}L\left( \cdot ,s\right) \text { }f(s)ds+\mu \left[ ru\left( {\widehat{s}} \right) +(1-r)u^{n}({\widehat{s}})-c\left( {\widehat{s}},e\left( {\widehat{s}} \right) \right) -u_{0}\right] {\varvec{1}}_{{\widehat{s}}<{\overline{s}}} \nonumber \\&+\phi \left\{ v\left( {\widehat{s}}\right) -{\overline{u}}+\left[ 1-F\left( {\widehat{s}}\right) \right] u_{0}\right\} +\left( 1-r\right) \rho u^{n}( {\widehat{s}})\text {,} \end{aligned}$$

we have the following boundary conditions on the costate variables,

$$\begin{aligned} \lambda _{1}({\underline{s}})= & {} 0\text {; }\lambda _{1}({\widehat{s}})=\frac{ \partial }{\partial e\left( {\widehat{s}}\right) }{\widetilde{L}}\left( \widehat{ s}\right) \nonumber \\= & {} -\left[ \mu {\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+\varvec{ 1}_{{\widehat{s}}={\overline{s}}}\int _{{\underline{s}}}^{{\widehat{s}}}\theta (s)f(s)ds\right] c_{e}\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) \end{aligned}$$
$$\begin{aligned} \lambda _{2}({\underline{s}})= & {} 0\text {; }\lambda _{2}({\widehat{s}})=\frac{ \partial }{\partial u\left( {\widehat{s}}\right) }{\widetilde{L}}\left( \widehat{ s}\right) =r\left[ \mu {\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+\varvec{ 1}_{{\widehat{s}}={\overline{s}}}\int _{{\underline{s}}}^{{\widehat{s}}}\theta (s)f(s)ds\right] \end{aligned}$$
$$\begin{aligned} \lambda _{3}({\underline{s}})\in & {} {\mathbb {R}} \text {; }\lambda _{3}({\widehat{s}})=\frac{\partial }{\partial v\left( {\widehat{s}}\right) }{\widetilde{L}}\left( {\widehat{s}}\right) =\phi \end{aligned}$$
$$\begin{aligned} \lambda _{4}({\underline{s}})= & {} 0\text {; }\lambda _{4}({\widehat{s}})=\frac{ \partial }{\partial u^{n}\left( {\widehat{s}}\right) }{\widetilde{L}}\left( {\widehat{s}}\right) \nonumber \\= & {} \left( 1-r\right) \left[ \mu {\varvec{1}}_{{\widehat{s}}< {\overline{s}}}+{\varvec{1}}_{{\widehat{s}}={\overline{s}}}\int _{{\underline{s}}}^{ {\widehat{s}}}\theta (s)f(s)ds+\rho \right] \end{aligned}$$

which are determined by the constraints on the state variables at the two end-points (footnote 42). When such constraints do not exist (as, for instance, on \(u( {\underline{s}})\)), then the shadow price of the state variable at that end-point must be zero (otherwise, the state variable could be perturbed to increase the value of the objective function). When such constraints do exist (as, for instance, on \(u\left( {\widehat{s}}\right) \)), then the shadow price must be equal to the effect that a perturbation of that variable would have on the value of the objective function through the relaxation of the constraints in which it appears. If, for instance, instead of the equality from (41), we had \(\lambda _{2}({\widehat{s}})<\frac{\partial }{ \partial u\left( {\widehat{s}}\right) }{\widetilde{L}}\left( {\widehat{s}}\right) \), then an increase in \(u\left( {\widehat{s}}\right) \) would increase the optimal value of the objective function, since the shadow value of the relaxation of the constraints (20) and (21) would more than offset the corresponding shadow price of \(u\left( {\widehat{s}}\right) \).

$$\begin{aligned} \gamma (s)\ge & {} 0\text {, and}=0\text { if }\left[ u_{0}-\left( 1-r\right) u^{n}\left( s\right) \right] {\varvec{1}}_{{\widehat{s}}<{\overline{s}}}\text { }>0\text {, for }s\in [{\underline{s}},{\widehat{s}}] \end{aligned}$$
$$\begin{aligned} \theta (s)\ge & {} 0\text {, and}=0\text { if }\left[ ru({\widehat{s}})+\left( 1-r\right) u^{n}({\widehat{s}})-c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) \right. \nonumber \\&\qquad \qquad \;\;\left. -\left( 1-r\right) u^{n}(s)\right] {\varvec{1}}_{{\widehat{s}}= {\overline{s}}}\text { }>0\text {, for }s\in [{\underline{s}},{\widehat{s}}] \end{aligned}$$
$$\begin{aligned} \phi\ge & {} 0\text {, and}=0\text { if }v\left( {\widehat{s}}\right) >\overline{u }-\left[ 1-F\left( {\widehat{s}}\right) \right] u_{0}\text { } \end{aligned}$$
$$\begin{aligned} \rho\ge & {} 0\text {, and}=0\text { if }u^{n}({\widehat{s}})>0 \end{aligned}$$

Conditions (44)–(47) are the complementary slackness conditions, with (46) and (47) determined by the two transversality conditions in (22). Note that (46) combined with the expression for \(\phi \) to be derived in the Proof of Lemma 11 implies (27). Since \({\widehat{s}}\) is a choice variable, we have the condition (footnote 43)

$$\begin{aligned}&H\left( e\left( {\widehat{s}}\right) ,u\left( {\widehat{s}}\right) ,u^{n}( {\widehat{s}}),v\left( {\widehat{s}}\right) ,x\left( {\widehat{s}}\right) ,k( {\widehat{s}}),\lambda _{1}\left( {\widehat{s}}\right) ,\lambda _{2}\left( {\widehat{s}}\right) ,\lambda _{3}\left( {\widehat{s}}\right) ,\lambda _{4}( {\widehat{s}}),{\widehat{s}}\right) \nonumber \\&\quad -\phi f({\widehat{s}})u_{0}{\varvec{1}}_{ {\widehat{s}}<{\overline{s}}}\text { }\ge 0\text {, and}\quad =0\text { if }{\widehat{s}}< {\overline{s}} \end{aligned}$$

To understand condition (48), note that there are two effects of an increase in \({\widehat{s}}\). First, the increase generates an additional (virtual) surplus from type \({\widehat{s}}\), which now exerts effort; this surplus is captured by the Hamiltonian evaluated at \({\widehat{s}}\). Second, the increase in \({\widehat{s}}\) also tightens the constraint \(v\left( {\widehat{s}}\right) \ge {\overline{u}}-\left[ 1-F\left( {\widehat{s}}\right) \right] u_{0} \); the corresponding effect is captured by the term \(-\phi f({\widehat{s}})u_{0}{\varvec{1}}_{{\widehat{s}}<{\overline{s}}}\). Equation (48) imposes that the sum of these two effects be nonnegative, since otherwise \({\widehat{s}}\) would optimally be decreased, and be 0 whenever \({\widehat{s}}<{\overline{s}}\), since otherwise \({\widehat{s}}\) would be adjusted depending on the sign.

Lemma 19 from appendix A6 identifies a condition under which the solution to the necessary conditions (33)–(48) is guaranteed to solve problem (14)–(22), and states the uniqueness of the solution. There also exists a second-order necessary condition, which in the case of a singular control takes the form of the so-called generalized Legendre–Clebsch condition (see, for instance, page 246 in Bryson and Ho 1975). As we show in “Appendix A7”, in our problem, this condition is satisfied if, for instance, \(c_{ees}\ge 0\) along the trajectory of the solution to (33)–(48).

To complete the derivation of the necessary conditions for the problem in (7)–(12), we denote by \({\mathcal {V}}(u_{0})\) the value function of the optimal control problem in (14)–(22), as a function of \(u_{0}\). Claim 20 in appendix A6 establishes that the function \({\mathcal {V}}(u_{0})\) is twice continuously differentiable. If \({\widehat{s}}<{\overline{s}}\), then accounting for the term in the objective function in (7) that is not incorporated into (14), it is necessary that (footnote 44)

$$\begin{aligned} \frac{\partial }{\partial u_{0}}\left\{ {\mathcal {V}}(u_{0})-h(u_{0})\right\} \le 0\quad \text {, and }\quad =0\text { if }u_{0}>0\text {.} \end{aligned}$$

Appendix A3: Proofs of results from Sect. 3.3

Proof of Lemma 8

Differentiating the equality in (34) with respect to s, we have \(-\frac{1-r}{r}\lambda _{2}^{\prime }\left( s\right) +\lambda _{4}^{\prime }\left( s\right) =0\). By (36) and (38), it must be that

$$\begin{aligned} \gamma (s){\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+\theta (s){\varvec{1}} _{{\widehat{s}}={\overline{s}}}=h^{\prime }(u\left( s\right) )-h^{\prime }(u^{n}\left( s\right) )\text {.} \end{aligned}$$
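The substitution that leads to (50) can be written out as follows. The display plugs (36) and (38) into the differentiated version of (34); the last step assumes \(f(s)>0\) and uses \(r<1\), which holds because the monitoring probability is nondegenerate.

$$\begin{aligned} 0&=-\frac{1-r}{r}\lambda _{2}^{\prime }\left( s\right) +\lambda _{4}^{\prime }\left( s\right) \\&=-\left( 1-r\right) \left[ h^{\prime }(u\left( s\right) )-\lambda _{3}\left( s\right) \right] f(s)+\left( 1-r\right) \left\{ h^{\prime }\left( u^{n}(s)\right) +\left[ \gamma \left( s\right) {\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+\theta \left( s\right) {\varvec{1}}_{{\widehat{s}}={\overline{s}}}\right] -\lambda _{3}\left( s\right) \right\} f(s) \\&=\left( 1-r\right) f(s)\left\{ h^{\prime }\left( u^{n}(s)\right) -h^{\prime }(u\left( s\right) )+\gamma \left( s\right) {\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+\theta \left( s\right) {\varvec{1}}_{{\widehat{s}}={\overline{s}}}\right\} \text {.} \end{aligned}$$

The terms in \(\lambda _{3}\) cancel, and dividing by \(\left( 1-r\right) f(s)\) yields (50).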

Since \(\gamma (s){\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+\theta (s) {\varvec{1}}_{{\widehat{s}}={\overline{s}}}\ge 0\), it follows that \(h^{\prime }(u\left( s\right) )-h^{\prime }(u^{n}\left( s\right) )\ge 0\), and thus, since h is strictly convex as the inverse of a concave and increasing function (footnote 45), \(u\left( s\right) \ge u^{n}\left( s\right) \) and then \(w\left( s\right) \ge w^{n}\left( s\right) \). Finally, if (i) \({\widehat{s}}={\overline{s}}\) and \(ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}({\widehat{s}})\right) -c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) -\left( 1-r\right) u\left( w^{n}\left( s\right) \right) >0\) for some \(s\in [{\underline{s}},{\widehat{s}}]\), or (ii) \({\widehat{s}}<{\overline{s}}\) and \(u\left( w_{0}\right) -\left( 1-r\right) u\left( w^{n}\left( s\right) \right) >0\) for some \(s\in [{\underline{s}},{\widehat{s}}]\), then from the complementary slackness conditions (44) and (45) it follows that \(\gamma (s){\varvec{1}} _{{\widehat{s}}<{\overline{s}}}+\theta (s){\varvec{1}}_{{\widehat{s}}={\overline{s}}}=0\) for that particular s, and thus from (50) we conclude that \(w\left( s\right) =w^{n}\left( s\right) \). Now, if \(u_{M}({\widehat{s}} )-\left( 1-r\right) u\left( w\left( s\right) \right) >0\), but \(w\left( s\right) >w^{n}\left( s\right) \), it would follow from (50) that \( \gamma (s){\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+\theta (s){\varvec{1}} _{{\widehat{s}}={\overline{s}}}>0\), and thus from (44) and (45) that \(u_{M}({\widehat{s}})-\left( 1-r\right) u\left( w^{n}\left( s\right) \right) =0\), which leads to a contradiction. We conclude that \(u_{M}( {\widehat{s}})-\left( 1-r\right) u\left( w\left( s\right) \right) >0\) implies \( w\left( s\right) =w^{n}\left( s\right) \), proving the second statement of Lemma 8. \(\square \)

Proof of Proposition 9

If either of the two conditions from the second statement in Lemma 8 is satisfied for some value of s, by the continuity of \( w^{n}\left( s\right) \) (footnote 46), that condition must hold on an interval \((s^{\prime },s^{\prime \prime })\subset [{\underline{s}},{\widehat{s}}]\), and thus by the result of Lemma 8, \(w\left( s\right) =w^{n}\left( s\right) \) for \(s\in (s^{\prime },s^{\prime \prime })\). This then implies \(w^{\prime }\left( s\right) =w^{n\prime }\left( s\right) \) on \((s^{\prime },s^{\prime \prime })\). Since (8) and \(e^{\prime }<0\) imply \(ru^{\prime }\left( w(s)\right) w^{\prime }(s)+\left( 1-r\right) u^{\prime }\left( w^{n}(s)\right) w^{n\prime }(s)<0\), it follows that \(w^{n\prime }\left( s\right) <0\), and therefore that the condition from Lemma 8 is also satisfied for higher values of s (footnote 47). Therefore, once \(w\left( s\right) =w^{n}\left( s\right) \) for some value s, it must be that \( w\left( {\widetilde{s}}\right) =w^{n}\left( {\widetilde{s}}\right) \) for all \( {\widetilde{s}}\in (s,{\widehat{s}}]\). Denote by \(\overleftrightarrow {s}\equiv {\inf _{s\in [{\underline{s}},{\widehat{s}}]}}\left\{ s|w\left( s\right) =w^{n}\left( s\right) \right\} \). By the above argument, \(w\left( s\right) =w^{n}\left( s\right) \) and \(w^{\prime }\left( s\right) <0\) for \( s\in (\overleftrightarrow {s},{\widehat{s}}]\). For \(s\in [{\underline{s}}, \overleftrightarrow {s})\), by Lemma 8, we have \(w\left( s\right) >w^{n}\left( s\right) \) and \(u\left( w^{n}\left( s\right) \right) =\frac{1}{1-r}\left\{ u\left( w_{0}\right) {\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+\left[ ru\left( w({\widehat{s}})\right) +\left( 1-r\right) u\left( w^{n}({\widehat{s}} )\right) -c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) \right] {\varvec{1}}_{{\widehat{s}}={\overline{s}}}\right\} \), the latter implying \( w^{n\prime }\left( s\right) =0\). From \(ru^{\prime }\left( w(s)\right) w^{\prime }(s)+\left( 1-r\right) u^{\prime }\left( w^{n}(s)\right) w^{n\prime }(s)<0\), it follows that \(w^{\prime }(s)<0\). \(\square \)

A consequence of the fact that \(w^{n\prime }(s)\le 0\) is that in the solution to the relaxed problem where we only imposed the non-negativity constraint on \(w^{n}({\widehat{s}})\), we have that in the optimal solution \( w^{n}(s)\ge 0\) for all \(s\in [{\underline{s}},{\widehat{s}}]\). Moreover, Lemma 8 implies that \(w(s)\ge 0\) for all \(s\in [{\underline{s}},{\widehat{s}}]\) as well. Therefore, the solution to the relaxed problem that we identify also solves the original problem.

Proof of Lemma 10

Integrating (36) between \({\underline{s}}\) and any arbitrary \(s\in [{\underline{s}},{\widehat{s}}]\), and accounting for the facts that \( \lambda _{3}(s)\) must be equal to a constant for all \(s\in [{\underline{s}} ,{\widehat{s}}]\) (from (37)), which by (42) equals \(\phi \), and that \(\lambda _{2}({\underline{s}})=0\) (from (41)), we have

$$\begin{aligned} \lambda _{2}(s)=r\int _{{\underline{s}}}^{s}\left[ h^{\prime }(u\left( t\right) )-\phi \right] f(t)dt\text {, for }s\in [{\underline{s}},{\widehat{s}}] \text {.} \end{aligned}$$

Applying this result at \({\widehat{s}}\), where \(\lambda _{2}({\widehat{s}})=r \left[ \mu {\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+{\varvec{1}}_{ {\widehat{s}}={\overline{s}}}\int _{{\underline{s}}}^{{\widehat{s}}}\theta (s)f(s)ds \right] \) by (41), it follows that

$$\begin{aligned} \mu {\varvec{1}}_{{\widehat{s}}<{\overline{s}}}+{\varvec{1}}_{{\widehat{s}}= {\overline{s}}}\int _{{\underline{s}}}^{{\widehat{s}}}\theta (s)f(s)ds=\int _{ {\underline{s}}}^{{\widehat{s}}}\left[ h^{\prime }(u\left( s\right) )-\phi \right] f(s)ds\text {.} \end{aligned}$$

If \({\widehat{s}}<{\overline{s}}\), then from (21) it must be that \( ru\left( {\widehat{s}}\right) +(1-r)u^{n}({\widehat{s}})-c\left( {\widehat{s}} ,e\left( {\widehat{s}}\right) \right) -u_{0}=0\), and thus accounting also for (33) and (34), the condition (48) becomes \( y\left( e\left( {\widehat{s}}\right) \right) -rh\left( u\left( {\widehat{s}} \right) \right) -\left( 1-r\right) h\left( u^{n}\left( {\widehat{s}}\right) \right) +h(u_{0})=0\), which can be written as in (24) after observing that \(h^{\prime }(u(s))=\left( u^{-1}\right) ^{\prime }(u(w(s)))= \frac{1}{u^{\prime }(u^{-1}(u(w(s))))}=\frac{1}{u^{\prime }(w(s))}\), and similarly that \(h^{\prime }(u^{n}(s))=\frac{1}{u^{\prime }(w^{n}(s))}\) and \( h^{\prime }(u_{0})=\frac{1}{u^{\prime }(w_{0})}\).

On the other hand, if \({\widehat{s}}={\overline{s}}\), then (48) becomes \(\left[ y\left( e\left( {\overline{s}}\right) \right) -rh\left( u\left( {\overline{s}}\right) \right) -\left( 1-r\right) h\right. \left. \left( u^{n}\left( {\overline{s}}\right) \right) \right] +\phi \left[ ru\left( {\overline{s}} \right) +\left( 1-r\right) u^{n}\left( {\overline{s}}\right) -c\left( {\overline{s}},e\left( {\overline{s}}\right) \right) \right] \ge 0\). But by (52), when \({\widehat{s}}={\overline{s}}\), we have \(\phi =\int _{ {\underline{s}}}^{{\overline{s}}}\left[ h^{\prime }(u\left( s\right) )\right] f(s)ds-\int _{{\underline{s}}}^{{\overline{s}}}\theta (s)f(s)ds\). Since by (50), we have \(\theta (s)=h^{\prime }(u\left( s\right) )-h^{\prime }(u^{n}\left( s\right) )\), it follows that \(\phi =\int _{{\underline{s}}}^{ {\overline{s}}}\left[ h^{\prime }(u^{n}\left( s\right) )\right] f(s)ds\). Thus, (48) can again be written as in (25). \(\square \)
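The identity \(h^{\prime }(u(s))=\frac{1}{u^{\prime }(w(s))}\) used in this proof can be checked on a concrete case. The square-root utility below is an illustrative assumption only and is not a specification used in the model.

$$\begin{aligned} u(w)=\sqrt{w}\quad&\Longrightarrow \quad h(u)=u^{-1}(u)=u^{2}\text {,}\quad h^{\prime }(u)=2u\text {,} \\ u^{\prime }(w)=\frac{1}{2\sqrt{w}}\quad&\Longrightarrow \quad \frac{1}{u^{\prime }(w)}=2\sqrt{w}=2u(w)=h^{\prime }\left( u(w)\right) \text {.} \end{aligned}$$

In this example \(h\) is strictly convex, as expected for the inverse of a strictly concave increasing function, and \(h^{\prime }(0)=0\) because \(u^{\prime }(w)\rightarrow \infty \) as \(w\rightarrow 0\).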

Proof of Lemma 11

Differentiating \(\lambda _{1}\left( s\right) +\lambda _{2}\left( s\right) \frac{1}{r}c_{e}\left( s,e\left( s\right) \right) =0\) from (33) with respect to s, we obtain \(\lambda _{1}^{\prime }\left( s\right) +\lambda _{2}^{\prime }\left( s\right) \frac{1}{r}c_{e}\left( s,e\left( s\right) \right) +\lambda _{2}\left( s\right) \frac{1}{r}c_{es}\left( s,e\left( s\right) \right) +\lambda _{2}\left( s\right) \frac{1}{r} c_{ee}\left( s,e\left( s\right) \right) e^{\prime }(s)=0\). Plugging in \( \lambda _{1}^{\prime }\left( s\right) \) and \(\lambda _{2}^{\prime }\left( s\right) \) from (35) and (36), it follows that

$$\begin{aligned}&\left[ -y^{\prime }\left( e\left( s\right) \right) f(s)-\lambda _{2}\left( s\right) \frac{1}{r}c_{ee}\left( s,e\left( s\right) \right) x\left( s\right) +\lambda _{3}\left( s\right) c_{e}\left( s,e\left( s\right) \right) f(s)\right] \nonumber \\&\qquad +\left[ rh^{\prime }(u\left( s\right) )f(s)-\lambda _{3}\left( s\right) rf(s) \right] \frac{1}{r}c_{e}\left( s,e\left( s\right) \right) +\lambda _{2}\left( s\right) \frac{1}{r}c_{es}\left( s,e\left( s\right) \right) \nonumber \\&\qquad +\lambda _{2}\left( s\right) \frac{1}{r}c_{ee}\left( s,e\left( s\right) \right) e^{\prime }(s) \nonumber \\&\quad =-y^{\prime }(e\left( s\right) )f(s)+h^{\prime }(u\left( s\right) )c_{e}\left( s,e\left( s\right) \right) f(s)+\lambda _{2}\left( s\right) \frac{1}{r}c_{es}\left( s,e\left( s\right) \right) =0\text {,} \end{aligned}$$

where we used the fact that \(x\left( s\right) =e^{\prime }(s)\). By (51), we have \(y^{\prime }(e\left( s\right) )f(s)=h^{\prime }(u(s))c_{e}\left( s,e\left( s\right) \right) f(s)+c_{es}\left( s,e\left( s\right) \right) \int _{{\underline{s}}}^{s}\left\{ h^{\prime }(u(\sigma ))-\phi \right\} f(\sigma )d\sigma \), which can be immediately written as in (28). To complete the Proof of Lemma 11, it remains to show that \(\phi \) equals the expression defined in (26).

Now, if \({\widehat{s}}<{\overline{s}}\), then given the definition of \(\widetilde{ L}\left( {\widehat{s}}\right) \) from (39), we have

$$\begin{aligned} \frac{\partial }{\partial u_{0}}{\mathcal {V}}(u_{0})&=\frac{\partial }{ \partial u_{0}}{\widetilde{L}}\left( {\widehat{s}}\right) =\int _{{\underline{s}}}^{ {\widehat{s}}}\left\{ h^{\prime }(u_{0})+\gamma (s)\frac{\partial }{\partial u_{0}}\left[ u_{0}-(1-r)u^{n}(s)\right] \right\} f(s)ds \\&\quad +\frac{\partial }{\partial u_{0}}\phi \left\{ v\left( {\widehat{s}}\right) - {\overline{u}}+\left[ 1-F\left( {\widehat{s}}\right) \right] u_{0}\right\} \\&\quad +\mu \frac{\partial }{\partial u_{0}}\left[ ru\left( {\widehat{s}}\right) +(1-r)u^{n}({\widehat{s}})-c\left( {\widehat{s}},e\left( {\widehat{s}}\right) \right) -u_{0}\right] \\&=\phi \left[ 1-F\left( {\widehat{s}}\right) \right] -\mu +\int _{{\underline{s}} }^{{\widehat{s}}}\left[ h^{\prime }(u_{0})+\gamma \left( s\right) \right] f(s)ds \\&=\phi \left[ 1-F({\widehat{s}})\right] +h^{\prime }(u_{0})F({\widehat{s}} )+\int _{{\underline{s}}}^{{\widehat{s}}}\left[ h^{\prime }(u\left( s\right) )-h^{\prime }(u^{n}\left( s\right) )\right] f(s)ds\\&\quad -\int _{{\underline{s}}}^{ {\widehat{s}}}\left[ h^{\prime }(u\left( s\right) )-\phi \right] f(s)ds \\&=\phi +h^{\prime }(u_{0})F({\widehat{s}})-\int _{{\underline{s}}}^{{\widehat{s}} }h^{\prime }(u^{n}\left( s\right) )f(s)ds\text {,} \end{aligned}$$

where for the first equality we employed the Dynamic Envelope Theorem (footnote 48) with respect to \(u_{0}\), \(\gamma \left( s\right) \) is substituted from (50), while \(\mu \) is substituted from (52) (we also took into account throughout that \({\widehat{s}}<{\overline{s}}\)). Thus, from (49), it follows that (*): \(\phi -h^{\prime }(u_{0})\left[ 1-F({\widehat{s}})\right] -\int _{{\underline{s}}}^{ {\widehat{s}}}h^{\prime }(u^{n}\left( s\right) )f(s)ds\le 0\), and \(=0\) if \( u_{0}>0\). Now, if \(u_{0}=0\), then from (19), \(u^{n}({\widehat{s}} )\ge 0\), and the fact that \(u^{n}\) is weakly decreasing (as demonstrated by Proposition 9), it follows that \(u^{n}(s)=0\) for all \( s\le {\widehat{s}}\). Employing (*) and (42) we have that \( 0\le \phi \le h^{\prime }(0)\). Since by the Inada condition, \(h^{\prime }(0)=\frac{1}{u^{\prime }(0)}=0\), this implies that \(\phi =0\), which satisfies the definition of \(\phi \) from (26) for this case. On the other hand, if \(u_{0}>0\), then from (*) we have \(\phi =h^{\prime }(u_{0})\left[ 1-F({\widehat{s}})\right] +\int _{{\underline{s}}}^{ {\widehat{s}}}h^{\prime }(u^{n}\left( s\right) )f(s)ds\). Therefore, when \( {\widehat{s}}<{\overline{s}}\),

$$\begin{aligned} \phi =h^{\prime }(u_{0})\left[ 1-F({\widehat{s}})\right] +\int _{{\underline{s}} }^{{\widehat{s}}}h^{\prime }(u^{n}\left( t\right) )f(t)dt\text {, if }u_{0}>0 \text {, and }=0\text { if }u_{0}=0\text {, }\nonumber \\ \end{aligned}$$

which can then be immediately rewritten as in (26).
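For completeness, the passage from (49) to the inequality labelled (*) above, and the bound on \(\phi \) in the case \(u_{0}=0\), can be written out as follows. The display uses only the expression for \(\frac{\partial }{\partial u_{0}}{\mathcal {V}}(u_{0})\) derived above, and the second line assumes \(F({\underline{s}})=0\), so that \(\int _{{\underline{s}}}^{{\widehat{s}}}f(s)ds=F({\widehat{s}})\).

$$\begin{aligned} \frac{\partial }{\partial u_{0}}\left\{ {\mathcal {V}}(u_{0})-h(u_{0})\right\}&=\phi +h^{\prime }(u_{0})F({\widehat{s}})-\int _{{\underline{s}}}^{{\widehat{s}}}h^{\prime }(u^{n}\left( s\right) )f(s)ds-h^{\prime }(u_{0}) \\&=\phi -h^{\prime }(u_{0})\left[ 1-F({\widehat{s}})\right] -\int _{{\underline{s}}}^{{\widehat{s}}}h^{\prime }(u^{n}\left( s\right) )f(s)ds\le 0\text {,} \\ u_{0}=0\text {, }u^{n}\equiv 0\quad \Longrightarrow \quad \phi&\le h^{\prime }(0)\left[ 1-F({\widehat{s}})\right] +h^{\prime }(0)F({\widehat{s}})=h^{\prime }(0)\text {.} \end{aligned}$$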

If \({\widehat{s}}={\overline{s}}\), then as argued in the Proof of Lemma 10, we have \(\phi =\int _{{\underline{s}}}^{{\overline{s}}} \left[ h^{\prime }(u^{n}\left( s\right) )\right] f(s)ds\), which can be written as in (26) as well. This completes the Proof of Lemma 11. \(\square \)

Cite this article

Barbos, A. Optimal contracts with random monitoring. Int J Game Theory 51, 119–154 (2022).

Keywords

  • Optimal contracts
  • Random monitoring
  • Moral hazard
  • Optimal control

JEL Classification

  • D82
  • D86