1 Introduction

Bank supervision, unlike bank regulation, has not until recently been the subject of much academic interest. As stated by Eisenbach et al. (2016), regulation involves the establishment of rules under which banks operate, while supervision involves the assessment of safety and soundness of banks through monitoring, and the use of this information to request corrective actions. In contrast to regulation, which is based on verifiable information, supervision is about supervisory actions (partly) based on nonverifiable information.

This paper contributes to the theoretical literature on bank supervision by constructing a stylized model of a supervisor that collects nonverifiable information on the solvency of a bank and, on the basis of this information, decides on its early liquidation. The quality of the information on the bank’s solvency depends on the intensity of supervision (the nonverifiable costly effort of the supervisor). I assume that the supervisor is not a social welfare maximizer. In particular, its payoff function incorporates a liquidation cost, which may be associated with either reputational concerns or supervisory capture (e.g. revolving doors). The paper characterizes the effort and the liquidation decisions of the supervisor, and shows that supervision will be more intense the lower the costs of supervisory effort and the lower the supervisory bias against liquidation.

The model can be interpreted as a model of decentralized supervision, in which the bank is a local bank and the supervisor is a local supervisor, or as a model of centralized supervision, in which the bank is still a local bank but the supervisor is a central (or supranational) supervisor. It can also be used as a building block for a model of hierarchical supervision, in which the central and the local supervisors jointly supervise the bank in order to observe a nonverifiable signal of the bank’s solvency, and then the central supervisor decides on the liquidation of the bank. Under hierarchical supervision, the central and the local supervisors simultaneously choose their efforts, so they will be playing a game. The Nash equilibrium of this game describes the outcome of the hierarchical supervision model.

The main contribution of the paper is to characterize the conditions under which one of the three institutional arrangements, namely decentralized, hierarchical, and centralized supervision, dominates in welfare terms the other two. The analysis is based on two key assumptions: (i) the cost of effort is higher for the central than for the local supervisor, and (ii) the cost of liquidating the bank is lower for the central supervisor than for the local supervisor. The first assumption may be justified by reference to distance between the central supervisor and the local bank. In the words of Torres (2015): “The central supervisor has informational disadvantages relative to the national authorities, due to their better knowledge of banks, banking systems and regulatory frameworks, as well as their geographical and cultural proximity to them.” The second assumption may be justified by reference to the looser connections between the central supervisor and the bank. In the words of Torres (2015): “The existence of a supranational supervisor allows to increase the distance between supervisors and national lobbies and politicians, which in principle should reduce the risk of supervisors implementing excessively lax policies.”Footnote 1

The results show that hierarchical supervision dominates decentralized supervision when the bias of the local supervisor is high and the costs of getting local information from the center are low. But when these forces exceed a certain threshold, it is better to concentrate all responsibilities in the central supervisor. The trade-off underlying these results is clear: Decentralized supervision is better because the local supervisor finds it cheaper to gather information, but it is worse because its objective function is biased against liquidation.Footnote 2 The results also show that hierarchical supervision is more likely to dominate when bank profitability is low (e.g. as a result of high competition) and when bank risk-taking is high (e.g. as a result of soft regulation).

Interestingly, the model of hierarchical supervision is isomorphic to a model in which the central supervisor gets a signal of the bank’s solvency, the local supervisor gets another signal, which it truthfully reports to the central supervisor, who then decides on the liquidation of the bank. The original model corresponds to an institutional arrangement in which the supervisors work in teams, while the alternative model corresponds to an arrangement in which the supervisors work independently, but there is no problem of strategic information transmission à la Crawford and Sobel (1982).Footnote 3

The results provide a rationale for the design of the Single Supervisory Mechanism (SSM), the new structure of bank supervision in Europe that comprises the European Central Bank (ECB) and the national supervisory authorities of the participating countries. Currently, the ECB is responsible for the supervision of significant (i.e. large) banks that comprise about 80% of the banking assets of the euro area. For these banks, supervision is carried out in cooperation with the national supervisors via the so-called Joint Supervisory Teams. National supervisors are in charge of the less significant (i.e. small) banks. To the extent that (i) the cost advantage of local supervisors is smaller for larger, more complex banks, and (ii) supervisory capture is more relevant for larger banks, the results of the model are consistent with the design of the SSM.

The model can also shed light on issues related to the organization of supervision in jurisdictions in which multiple agencies are involved in supervising banks. For example, state-chartered banks in the US are under a dual supervisory framework involving both federal and local supervisors.Footnote 4 Also, the Federal Reserve is responsible for the supervision of bank holding companies, but subsidiaries may be supervised by other federal agencies.

Two extensions of the model are discussed. The first one examines what happens when the supervisory bias is linked to being responsible for the liquidation decision. This means that moving from decentralized to hierarchical supervision implies a reduction in the liquidation cost of the local supervisor and an increase in the liquidation cost of the central supervisor. In this case, the results show that there will be an increase in the (cheaper) effort of the local supervisor that will compensate the reduction in the effort of the central supervisor, so hierarchical supervision is more likely to dominate.

A second extension shows that limiting the size of the central supervisor in the hierarchical supervision setup is welfare improving. The result is closely related to the well-known result that a Stackelberg leader that optimizes over the reaction function of the other agent does better than by playing its Nash equilibrium strategy. The intuition is that putting the central supervisor in a situation of overload forces the local supervisor to increase its (cheaper) effort in a way that compensates the reduction in the effort of the central supervisor.Footnote 5

Literature review Carletti et al. (2016) explore the working of a supervisory structure in which a centralized agency has legal power over decisions regarding banks, but has to rely on biased local supervisors to collect the information necessary to act. They focus on how this institutional design affects supervisors’ incentives to collect information and on how this, in turn, influences bank behavior, showing that when the agency problem between the central and the local supervisor is severe, centralized supervision leads to lower information collection and increased risk-taking.Footnote 6

Colliard (2015) considers a model in which local supervisors are more lenient, so that banks also have weaker incentives to hide information from them. These two forces can make a joint supervisory architecture optimal. However, more centralized supervision encourages banks to integrate more cross-border. Due to this complementarity, the economy can be trapped in an inferior equilibrium with both too little central supervision and too little financial integration.

Calzolari et al. (2016) study the impact of shared liability and deposit insurance arrangements on supervisors’ incentives to acquire information on the activities of multinational banks, showing that centralized supervision can induce these banks to expand abroad through branches rather than subsidiaries.

In an earlier contribution, Boyer and Ponce (2012) analyze whether banking supervision responsibilities should be concentrated in the hands of a single supervisor, showing that splitting supervisory powers among different supervisors is a superior arrangement in terms of welfare when the capture of supervisors by bankers is a concern.

The model in the paper may be considered as a special case of models in the literature on the theory of organizations.Footnote 7 As noted in the seminal paper by Jensen and Meckling (1992), “The assignment of decision rights influences incentives to acquire information. (...) Determining the optimal level of decentralization requires balancing the costs of bad decisions owing to poor information and those owing to inconsistent objectives.” It is also closely related to the recent literature on strategic information acquisition, which departs from the literature on strategic information transmission by assuming that information is not exogenously given, but is obtained through costly effort; see Argenziano et al. (2016).

Structure of the paper Section 2 presents the basic model of bank supervision in which a supervisor collects information on the solvency of a bank and, on the basis of this information, decides on its early liquidation. This setup may be interpreted as a model of decentralized supervision in which supervisory responsibilities are allocated to a local supervisor or a model of centralized supervision in which supervisory responsibilities are allocated to a central supervisor. Building on this setup, Sect. 3 presents the model of hierarchical supervision in which a central and a local supervisor jointly collect information and then the central supervisor decides on liquidation. Section 4 compares in terms of welfare three possible institutional arrangements: decentralized, hierarchical, and centralized supervision. Section 5 contains the extensions, and Sect. 6 the concluding remarks. The proofs of the analytical results are in the “Appendix”.

2 Model setup

Consider an economy with three dates \((t=0,1,2)\) and two agents: a bank and a supervisor. The bank raises a unit amount of deposits at \(t=0,\) and invests them in an asset that has a random final return R at \( t=2. \) The asset can be liquidated at \(t=1,\) in which case it yields a random liquidation return L. Deposits are insured and the deposit rate is normalized to zero.

It is assumed that

$$\begin{aligned} \left[ \begin{array}{c} L \\ R \end{array} \right] \sim N\left( \overline{R}\left[ \begin{array}{c} a \\ 1 \end{array} \right] ,\sigma ^{2}\left[ \begin{array}{cc} b &{} \quad c \\ c &{}\quad 1 \end{array} \right] \right) , \end{aligned}$$
(1)

where \(\overline{R}>1\), \(a<1\), \(b<1,\) and \(c>0.\) Moreover, to ensure that the covariance matrix is positive-definite it must be the case that \(c^{2}<b\), so \(b<1\) implies \(c<1.\) Thus, the expected final return \(E(R)=\overline{R}\) is greater than the unit face value of the deposits, and it is also greater than the expected liquidation return \(E(L)=a\overline{R}.\) Moreover, the final return has a higher variance than the liquidation return, and both returns are positively correlated. Except for the assumption of normality, which is made for tractability, the rest of the assumptions capture realistic features of the distribution of banks’ returns.

The supervisor chooses at \(t=0\) the intensity of supervision e (nonverifiable effort of the supervisor), which leads to the observation at \( t=1\) of a nonverifiable signal

$$\begin{aligned} s=R+\varepsilon \end{aligned}$$
(2)

on the final return of the bank’s investment. The noise term \(\varepsilon \) is independent of L and R,  and has a distribution \(N(0,\sigma ^{2}/e).\) Footnote 8 Thus, the effort e of the supervisor increases the precision (inverse of the variance) of the noise term.

From here it follows that

$$\begin{aligned} \left[ \begin{array}{c} L \\ R \\ s \end{array} \right] \sim N\left( \overline{R}\left[ \begin{array}{c} a \\ 1 \\ 1 \end{array} \right] ,\sigma ^{2}\left[ \begin{array}{ccc} b &{} c &{} \quad c \\ c &{} \quad 1 &{} \quad 1 \\ c &{}\quad 1 &{} \quad 1+e^{-1} \end{array} \right] \right) . \end{aligned}$$
(3)

By the properties of normal distributions we have

$$\begin{aligned}&\left. E(L\mid s)=a\overline{R}+\frac{c(s-\overline{R})}{1+e^{-1}},\right. \end{aligned}$$
(4)
$$\begin{aligned}&\left. E(R\mid s)=\overline{R}+\frac{s-\overline{R}}{1+e^{-1}}.\right. \end{aligned}$$
(5)

Since \(c<1\) the slope of \(E(L\mid s)\) is smaller than the slope of \(E(R\mid s),\) which implies

$$\begin{aligned} E(L\mid s)>E(R\mid s)\,\text { if and only if }\,s<\overline{s}, \end{aligned}$$

where

$$\begin{aligned} \overline{s}=\left[ 1-\frac{1-a}{1-c}(1+e^{-1})\right] \overline{R} \end{aligned}$$
(6)

is the efficient liquidation threshold. Note that higher supervisory effort e increases the efficient liquidation threshold \( \overline{s}.\) Also, note that for \(e=0\) we have \(\overline{s}=-\infty ,\) so with a completely noisy signal the bank should never be liquidated.
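As a quick numerical illustration of (4), (5), and (6), the following Python sketch evaluates the two conditional expectations and the efficient liquidation threshold. The parameter values are assumptions chosen only to satisfy \(\overline{R}>1\), \(a<1\), and \(0<c<1\); they are not the values used in the numerical analysis of the paper, and the function names are hypothetical.

```python
# Illustrative parameters (assumed for this sketch, not taken from the paper).
R_BAR, A, C = 1.1, 0.8, 0.5   # expected return, liquidation ratio a, covariance parameter c


def cond_exp_L(s, e):
    """E(L | s) from equation (4), for effort e > 0."""
    return A * R_BAR + C * (s - R_BAR) / (1.0 + 1.0 / e)


def cond_exp_R(s, e):
    """E(R | s) from equation (5), for effort e > 0."""
    return R_BAR + (s - R_BAR) / (1.0 + 1.0 / e)


def efficient_threshold(e):
    """Efficient liquidation threshold s-bar from equation (6)."""
    return (1.0 - (1.0 - A) / (1.0 - C) * (1.0 + 1.0 / e)) * R_BAR


if __name__ == "__main__":
    # At s-bar the two conditional expectations should coincide,
    # and s-bar should rise with effort.
    for e in (0.5, 1.0, 4.0):
        s_bar = efficient_threshold(e)
        print(e, s_bar, cond_exp_L(s_bar, e), cond_exp_R(s_bar, e))
```

Under these assumed values the printed conditional expectations at \(\overline{s}\) should be equal for every effort level, which is the intersection property used in the text.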

Substituting \(\overline{s}\) from (6) into (4) and (5) gives

$$\begin{aligned} E(L\mid \overline{s})=E(R\mid \overline{s})=\frac{a-c}{1-c}\overline{R}. \end{aligned}$$

I will assume that parameter values satisfy

$$\begin{aligned} \frac{a-c}{1-c}\overline{R}\le 1. \end{aligned}$$
(7)

This means that the efficient liquidation threshold \(\overline{s}\) is such that the corresponding expected final return is less than or equal to the face value of the deposits, so efficient liquidation occurs only if the bank is effectively bankrupt.Footnote 9

The supervisor chooses its effort e at \(t=0,\) observes the signal s at \( t=1,\) and based on this observation decides on the liquidation of the bank at this date. Supervisory effort is costly. Specifically, I assume that the cost function takes the simple quadratic form

$$\begin{aligned} c(e)=\gamma _{0}+\frac{\gamma }{2}e^{2}, \end{aligned}$$
(8)

where \(\gamma _{0}>0\) is a fixed cost of setting up the supervisory arrangements, and \(\gamma >0\) is the key cost of effort parameter.

The supervisor liquidates the bank at \(t=1\) if

$$\begin{aligned} E(L\mid s)-\delta >E(R\mid s), \end{aligned}$$
(9)

where \(\delta >0\) is a liquidation cost. This cost may be associated with either reputational concerns or supervisory capture (e.g. revolving doors). Thus, the supervisor liquidates the bank when the social benefits of liquidation, which are \(E(L\mid s)-E(R\mid s),\) are greater than the liquidation cost \(\delta \).

Substituting (4) and (5) into (9) implies that the bank will be liquidated by the supervisor when \(s<\widehat{s},\) where

$$\begin{aligned} \widehat{s}=\overline{s}-\frac{\delta }{1-c}(1+e^{-1}) \end{aligned}$$
(10)

is the supervisor’s liquidation threshold. Footnote 10
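The wedge between the two thresholds can be checked numerically. The sketch below, under assumed parameter values and with hypothetical function names, computes \(\widehat{s}\) from (10) and confirms that at \(\widehat{s}\) the social gain from liquidation, \(E(L\mid s)-E(R\mid s)\), equals the liquidation cost \(\delta \).

```python
# Illustrative parameters (assumptions, not the paper's calibration).
R_BAR, A, C = 1.1, 0.8, 0.5


def efficient_threshold(e):
    """s-bar from equation (6)."""
    return (1.0 - (1.0 - A) / (1.0 - C) * (1.0 + 1.0 / e)) * R_BAR


def supervisor_threshold(e, delta):
    """s-hat from equation (10): s-bar shifted down by the bias term."""
    return efficient_threshold(e) - delta / (1.0 - C) * (1.0 + 1.0 / e)


def liquidation_gain(s, e):
    """E(L | s) - E(R | s), obtained by subtracting (5) from (4)."""
    return (A - 1.0) * R_BAR - (1.0 - C) * (s - R_BAR) / (1.0 + 1.0 / e)


if __name__ == "__main__":
    e, delta = 2.0, 0.05   # assumed effort and liquidation cost
    s_hat = supervisor_threshold(e, delta)
    print("s_bar =", efficient_threshold(e), "s_hat =", s_hat)
    print("gain at s_hat =", liquidation_gain(s_hat, e))
```

For signals between the two printed thresholds the gain from liquidation is positive but smaller than \(\delta \), so the biased supervisor keeps the bank open.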

Figure 1 shows the determination of the efficient liquidation threshold \( \overline{s}\) by the intersection of the lines \(E(L\mid s)\) and \(E(R\mid s),\) and of the supervisor’s liquidation threshold \(\widehat{s}\) by the intersection of the lines \(E(L\mid s)-\delta \) and \(E(R\mid s).\) For signals in the range between \(\widehat{s}\) and \(\overline{s}\) the supervisor does not liquidate the bank when it would be efficient to do so.

The supervisor’s effort decision at \(t=0\) is obtained by maximizing its expected payoff at \(t=1,\) given by \(E[\max \left\{ E(L\mid s)-\delta ,E(R\mid s)\right\} ],\) net of the cost of effort c(e),  that is

$$\begin{aligned} \widehat{e}=\arg \max _{e}v(e), \end{aligned}$$

where

$$\begin{aligned} v(e)=\int _{-\infty }^{\widehat{s}}\left[ E(L\mid s)-\delta \right] dF(s)+\int _{\widehat{s}}^{\infty }E(R\mid s)dF(s)-c(e), \end{aligned}$$
(11)

F(s) denotes the cdf of the signal s,  and the liquidation threshold \( \widehat{s}\) is given by (10).

The following result provides a closed form expression for v(e). For the result it is convenient to write \(s=E(s)+SD(s)x,\) where \(x\sim N(0,1),\) and define \(\widehat{x}\) to be such that \(\widehat{s}=E(s)+SD(s)\widehat{x}.\)

Fig. 1 Supervisor’s liquidation threshold

Proposition 1

The supervisor’s payoff function may be written as

$$\begin{aligned} v(e)=\overline{R}-\left[ (1-a)\overline{R}+\delta \right] \left[ \Phi ( \widehat{x})+\frac{\phi (\widehat{x})}{\widehat{x}}\right] -c(e), \end{aligned}$$
(12)

where \(\Phi (x)\) is the normal cdf and \(\phi (x)\) is the normal density, and

$$\begin{aligned} \widehat{x}=-\frac{\left[ (1-a)\overline{R}+\delta \right] }{(1-c)\sigma } \left( 1+e^{-1}\right) ^{1/2}. \end{aligned}$$
(13)
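The closed form in Proposition 1 can be cross-checked against a direct numerical evaluation of the integral in (11). The sketch below does this with a midpoint rule over the standardized signal x, using only the Python standard library. All parameter values are illustrative assumptions, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

# Illustrative parameters (assumptions, not the paper's calibration).
R_BAR, A, C, SIGMA = 1.1, 0.8, 0.5, 0.2
DELTA, GAMMA0, GAMMA = 0.02, 0.001, 0.01
STD_NORMAL = NormalDist()


def x_hat(e):
    """Standardized liquidation threshold from equation (13)."""
    return -((1 - A) * R_BAR + DELTA) / ((1 - C) * SIGMA) * math.sqrt(1 + 1 / e)


def cost(e):
    """Quadratic cost of effort from equation (8)."""
    return GAMMA0 + 0.5 * GAMMA * e ** 2


def payoff_closed_form(e):
    """Supervisor's payoff v(e) from equation (12), for e > 0."""
    x = x_hat(e)
    bracket = STD_NORMAL.cdf(x) + STD_NORMAL.pdf(x) / x
    return R_BAR - ((1 - A) * R_BAR + DELTA) * bracket - cost(e)


def payoff_by_integration(e, n=20000, width=10.0):
    """Approximate v(e) in (11) by a midpoint rule over x in [-width, width]."""
    k = 1 + 1 / e                      # Var(s) / sigma^2
    sd_s = SIGMA * math.sqrt(k)        # SD(s)
    xh = x_hat(e)
    h = 2 * width / n
    total = 0.0
    for i in range(n):
        x = -width + (i + 0.5) * h
        s = R_BAR + sd_s * x
        exp_L = A * R_BAR + C * (s - R_BAR) / k      # equation (4)
        exp_R = R_BAR + (s - R_BAR) / k              # equation (5)
        value = exp_L - DELTA if x < xh else exp_R   # liquidate iff s < s-hat
        total += value * STD_NORMAL.pdf(x) * h
    return total - cost(e)


if __name__ == "__main__":
    for e in (0.5, 1.0, 3.0):
        print(e, payoff_closed_form(e), payoff_by_integration(e))
```

The two columns should agree up to the discretization error of the midpoint rule, which is small because the integrand is continuous with a single kink at \(\widehat{x}\).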

Figure 2 plots the payoff function of the supervisor v(e) for the parameters that will be used in the numerical analysis below.Footnote 11 For \(e=0\) we have \(\widehat{x}=-\infty ,\) so the bank is never liquidated, and \(v(0)=\overline{R}-\gamma _{0}\) (the expected final return minus the fixed cost of supervision). For \(e=\infty \) we have \(v(e)=-\infty ,\) since the expected payoff at \(t=1\) is clearly bounded while the cost of supervisory effort goes to infinity. The function v(e) is initially convex, and then becomes concave, so there may be corner solutions with \(\widehat{e}=0\) or interior solutions with \(\widehat{e}>0.\) Footnote 12

Fig. 2 Supervisor’s payoff function

The following result presents some comparative statics for the case where the solution is interior. In particular, it shows that supervisory effort \( \widehat{e}\) is decreasing in the cost of effort parameter \(\gamma \) and in the liquidation cost \(\delta \) incurred by the supervisor. It also shows that \(\widehat{e}\) is decreasing in the expected return \(\overline{R}\) and increasing in the standard deviation \(\sigma \) of the bank’s investment return.

Proposition 2

Whenever the supervisor chooses a positive level of effort \(\widehat{e}\) we have

$$\begin{aligned} \frac{\partial \widehat{e}}{\partial \gamma }<0,\ \frac{\partial \widehat{e} }{\partial \delta }<0,\text { }\frac{\partial \widehat{e}}{\partial \overline{ R}}<0, \textit{and}\ \frac{\partial \widehat{e}}{\partial \sigma }>0. \end{aligned}$$

Figure 3 illustrates these results. Panels A and B show that increases in the cost of effort parameter \(\gamma \) and in the liquidation cost \(\delta \) reduce supervisory effort \(\widehat{e}\), which jumps to zero for sufficiently high values of \(\gamma \) and \(\delta .\) Panel C shows that increases in the expected return \(\overline{R}\) reduce supervisory effort \( \widehat{e},\) which jumps to zero for sufficiently high values of \(\overline{ R}.\) Finally, Panel D shows that supervisory effort \(\widehat{e}\) is zero for sufficiently low values of the standard deviation \(\sigma ,\) jumping to positive and increasing levels for values of \(\sigma \) beyond a critical point.

Fig. 3 Comparative statics. a Effect of cost of effort. b Effect of supervisory liquidation costs. c Effect of expected asset return. d Effect of volatility of asset return

The comparative statics results are not surprising. Supervision will be more intense when banks are easier to supervise (lower \(\gamma \)), or less profitable (lower \(\overline{R}\)), or riskier (higher \(\sigma \)). And it will be less intense when the supervisor is closer to lobbies and pressure groups (higher \(\delta \)) that always prefer delaying intervention and gambling for resurrection.
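These comparative statics can be explored numerically with a simple grid search over effort. The sketch below is only an approximation of \(\widehat{e}=\arg \max _{e}v(e)\) on a finite grid, using the closed form (12); the parameter values and function names are assumptions made for illustration, not those behind Figure 3.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
GRID = [0.01 * i for i in range(0, 1001)]   # candidate efforts in [0, 10]


def payoff(e, r_bar, a, c, sigma, delta, gamma, gamma0=0.001):
    """v(e) from (12); at e = 0 the bank is never liquidated, so v(0) = r_bar - gamma0."""
    if e <= 0.0:
        return r_bar - gamma0
    scale = (1 - a) * r_bar + delta
    x = -scale / ((1 - c) * sigma) * math.sqrt(1 + 1 / e)   # equation (13)
    bracket = STD_NORMAL.cdf(x) + STD_NORMAL.pdf(x) / x
    return r_bar - scale * bracket - gamma0 - 0.5 * gamma * e ** 2


def optimal_effort(**params):
    """Grid approximation to e-hat; returns 0.0 when the corner solution is best."""
    return max(GRID, key=lambda e: payoff(e, **params))


if __name__ == "__main__":
    # Assumed baseline values, chosen only to satisfy the model's restrictions.
    base = dict(r_bar=1.1, a=0.8, c=0.5, sigma=0.2, delta=0.02, gamma=0.01)
    for gamma in (0.005, 0.01, 0.02, 0.05):
        print("gamma", gamma, "effort", optimal_effort(**{**base, "gamma": gamma}))
    for delta in (0.0, 0.02, 0.05, 0.1):
        print("delta", delta, "effort", optimal_effort(**{**base, "delta": delta}))
```

A grid search is used instead of a first-order condition because v(e) is not globally concave, so a local root finder could stop at the wrong stationary point or miss the corner solution.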

Summing up, I have set up a simple model of a bank and a supervisor in which the supervisor exerts costly effort in order to observe a nonverifiable signal of the bank’s solvency, which is used to decide the bank’s early liquidation. Importantly, the supervisor is not a social welfare maximizer: it has a bias against liquidation. I have characterized the supervisor’s effort decision, and derived some comparative statics results on its determinants.

The model may be interpreted as a model of decentralized supervision, in which the bank is a local bank and the supervisor is a local supervisor, or as a model of centralized supervision, in which the bank is still a local bank but the supervisor is a central (or supranational) supervisor. In this setup, it would be reasonable to assume that the cost of effort is higher for the central supervisor than for the local supervisor, an assumption that may be justified by reference to geographical as well as cultural distance between the central supervisor and the local bank. It may also be reasonable to assume that the cost of liquidating the bank is lower for the central supervisor than for the local supervisor, an assumption that may be justified by reference to the looser connections between the central supervisor and national lobbies and pressure groups. The trade-off between the higher costs of supervision and the lower incidence of supervisory capture will be examined in Sect. 4. But before doing this, the following section presents a model of hierarchical supervision in which the central and the local supervisors jointly supervise the bank in order to obtain a nonverifiable signal of the bank’s solvency, and then the central supervisor decides on the bank’s early liquidation.

3 Hierarchical supervision

Consider an economy with a local bank and two supervisors, a central and a local supervisor, denoted by subindices c and l. The supervisors independently choose at \(t=0\) nonverifiable efforts \(e_{c}\) and \(e_{l}\), respectively, which lead to the observation at \(t=1\) of a single nonverifiable signal

$$\begin{aligned} s=R+\varepsilon \end{aligned}$$
(14)

on the final return of the bank’s investment.Footnote 13 The noise term \(\varepsilon \) is independent of L and R,  and has a distribution \(N(0,\sigma ^{2}(e_{c}+e_{l})^{-1}).\) Thus, the higher the efforts \(e_{c}\) and \(e_{l}\) the higher the precision of the noise term.

From here it follows that

$$\begin{aligned} \left[ \begin{array}{c} L \\ R \\ s \end{array} \right] \sim N\left( \overline{R}\left[ \begin{array}{c} a \\ 1 \\ 1 \end{array} \right] ,\sigma ^{2}\left[ \begin{array}{ccc} b &{} \quad c &{} \quad c \\ c &{} \quad 1 &{} \quad 1 \\ c &{} \quad 1 &{} \quad 1+(e_{c}+e_{l})^{-1} \end{array} \right] \right) . \end{aligned}$$
(15)

Compared to the case of a single supervisor presented in Sect. 2, the only difference is that now the precision of the signal s depends on the sum \( e_{c}+e_{l}\) of the efforts of the two supervisors instead of on the effort e of the single supervisor.Footnote 14

Supervisory effort is costly, and the cost of effort is assumed to be higher for the (distant) central supervisor than for the (close) local supervisor. Specifically, I assume that parameter \(\gamma \) in the cost function (8) takes the value \(\gamma _{c}\) for the central supervisor and \(\gamma _{l}\) for the local supervisor, where \(\gamma _{c}>\gamma _{l}>0.\) The corresponding cost functions will be written \(c_{c}(e_{c})\) and \( c_{l}(e_{l}),\) respectively.

The central supervisor decides whether to liquidate the bank based on the observation of the signal s at \(t=1.\) I assume that the liquidation cost \( \delta _{c}\) of the central supervisor is lower than that of the local supervisor \(\delta _{l}\). Moreover, to simplify the presentation I will assume that the central supervisor does not have a bias against early liquidation, so \(\delta _{c}=0.\) Footnote 15 Thus, the central supervisor liquidates the bank at \(t=1\) if

$$\begin{aligned} E(L\mid s)>E(R\mid s). \end{aligned}$$

By the properties of normal distributions we have

$$\begin{aligned}&\left. E(L\mid s)=a\overline{R}+\frac{c(s-\overline{R})}{ 1+(e_{c}+e_{l})^{-1}},\right. \end{aligned}$$
(16)
$$\begin{aligned}&\left. E(R\mid s)=\overline{R}+\frac{s-\overline{R}}{1+(e_{c}+e_{l})^{-1}} .\right. \end{aligned}$$
(17)

Hence, it follows that

$$\begin{aligned} E(L\mid s)>E(R\mid s)\text { if and only if }s<s^{*}, \end{aligned}$$

where

$$\begin{aligned} s^{*}=\left[ 1-\frac{1-a}{1-c}[1+(e_{c}+e_{l})^{-1}]\right] \overline{R} \end{aligned}$$
(18)

is the efficient liquidation threshold.

As before, I assume that parameter values satisfy assumption (7). This implies that the threshold \(s^{*}\) is such that \(E(L\mid s^{*})=E(R\mid s^{*})\le 1,\) so liquidation takes place when the bank is effectively bankrupt.

At \(t=0\) the central and the local supervisors independently choose their efforts \(e_{c}\) and \(e_{l},\) so they will be playing a game. I will characterize the Nash equilibrium of this game, and show some comparative static results.

The payoff function of the central supervisor is

$$\begin{aligned} v_{c}(e_{c},e_{l})=\int _{-\infty }^{s^{*}}E(L\mid s)dF(s)+\int _{s^{*}}^{\infty }E(R\mid s)dF(s)-c_{c}(e_{c}), \end{aligned}$$
(19)

and the payoff function of the local supervisor is

$$\begin{aligned} v_{l}(e_{c},e_{l})=\int _{-\infty }^{s^{*}}[E(L\mid s)-\delta _{l}]dF(s)+\int _{s^{*}}^{\infty }E(R\mid s)dF(s)-c_{l}(e_{l}), \end{aligned}$$
(20)

where F(s) denotes the cdf of the signal s,  and the liquidation threshold \(s^{*}\) is given by (18). It should be noted that in these expressions the central (local) supervisor does not take into account the cost of effort of the local (central) supervisor.Footnote 16 More importantly, since the threshold \(s^{*}\) depends on the efforts \(e_{c}\) and \(e_{l}\) of the two supervisors, it must be the case that each supervisor observes the level of effort chosen by the other supervisor.Footnote 17 This may be rationalized by assuming that the efforts of the supervisors are related to the quality of the staff that they independently choose ex-ante, but which can be observed ex-post.

The reaction functions of the two supervisors are given by

$$\begin{aligned} e_{c}(e_{l})= & {} \arg \max _{e_{c}}v_{c}(e_{c},e_{l}), \end{aligned}$$
(21)
$$\begin{aligned} e_{l}(e_{c})= & {} \arg \max _{e_{l}}v_{l}(e_{c},e_{l}). \end{aligned}$$
(22)

The intersection of these functions is a Nash equilibrium of the game played by the supervisors.

The following result provides closed form expressions for \(v_{c}(e_{c},e_{l}) \) and \(v_{l}(e_{c},e_{l}).\) As before, it is convenient to write \( s=E(s)+SD(s)x,\) where \(x\sim N(0,1),\) and define \(x^{*}\) to be such that \(s^{*}=E(s)+SD(s)x^{*}.\)

Proposition 3

The supervisors’ payoff functions may be written as

$$\begin{aligned} v_{c}(e_{c},e_{l})= & {} \overline{R}-(1-a)\overline{R}\left[ \Phi (x^{*})+ \frac{\phi (x^{*})}{x^{*}}\right] -c_{c}(e_{c}), \end{aligned}$$
(23)
$$\begin{aligned} v_{l}(e_{c},e_{l})= & {} \overline{R}-(1-a)\overline{R}\left[ \Phi (x^{*})+ \frac{\phi (x^{*})}{x^{*}}\right] -\delta _{l}\Phi (x^{*})-c_{l}(e_{l}), \end{aligned}$$
(24)

where

$$\begin{aligned} x^{*}=-\frac{(1-a)\overline{R}}{(1-c)\sigma } [1+(e_{c}+e_{l})^{-1}]^{1/2}. \end{aligned}$$
(25)

Fig. 4 Nash equilibrium

The analysis in the previous section shows that the supervisors’ payoff functions are not everywhere concave. For this reason, the reaction functions (21) and (22) may have corner or interior solutions. In what follows, I will restrict attention to parameter values for which the solutions are interior.

Figure 4 shows the Nash equilibrium of the game played by the two supervisors, denoted \((e_{c}^{*},e_{l}^{*})\). Notice that for the chosen parameter values the reaction functions satisfy \(\left. e_{c}^{\prime }(e_{l})<0,\right. \) \(e_{l}^{\prime }(e_{c})<0,\) and \(e_{c}^{\prime }(e_{l})e_{l}^{\prime }(e_{c})<1.\) Footnote 18 That is, they are both downward sloping (strategic substitutes), and the slope of the reaction function of the local supervisor is (in absolute value) lower than that of the central supervisor, so the Nash equilibrium is stable. A sufficient condition for this to obtain is that the cost of effort parameters \(\gamma _{c}\) and \(\gamma _{l}\) are not too large.Footnote 19
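One way to approximate the intersection of the reaction functions (21) and (22) is to alternate grid-based best responses until neither effort changes. The sketch below is a minimal illustration using the closed forms (23), (24), and (25). The parameter values and function names are assumptions, the fixed costs are dropped because they do not affect the best responses, and convergence of the iteration is not guaranteed in general, so the routine reports whether a grid fixed point was actually reached.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
# Illustrative parameters (assumptions, not the paper's calibration).
R_BAR, A, C, SIGMA = 1.1, 0.8, 0.5, 0.2
GAMMA_C, GAMMA_L, DELTA_L = 0.02, 0.01, 0.02   # gamma_c > gamma_l > 0, delta_c = 0
GRID = [0.02 * i for i in range(0, 301)]       # candidate efforts in [0, 6]


def gross_payoffs(total_effort):
    """Return (central, local) payoffs before effort costs, from (23)-(25)."""
    if total_effort <= 0.0:
        return R_BAR, R_BAR                    # uninformative signal: never liquidate
    scale = (1 - A) * R_BAR
    x = -scale / ((1 - C) * SIGMA) * math.sqrt(1 + 1 / total_effort)
    common = R_BAR - scale * (STD_NORMAL.cdf(x) + STD_NORMAL.pdf(x) / x)
    return common, common - DELTA_L * STD_NORMAL.cdf(x)


def best_response_c(e_l):
    """Grid version of the central supervisor's reaction function (21)."""
    return max(GRID, key=lambda e: gross_payoffs(e + e_l)[0] - 0.5 * GAMMA_C * e ** 2)


def best_response_l(e_c):
    """Grid version of the local supervisor's reaction function (22)."""
    return max(GRID, key=lambda e: gross_payoffs(e_c + e)[1] - 0.5 * GAMMA_L * e ** 2)


def find_equilibrium(max_rounds=200):
    """Alternate best responses; the flag is True only at a mutual best response."""
    e_c, e_l = GRID[-1], GRID[-1]
    for _ in range(max_rounds):
        new_c = best_response_c(e_l)
        new_l = best_response_l(new_c)
        if (new_c, new_l) == (e_c, e_l):
            return e_c, e_l, True
        e_c, e_l = new_c, new_l
    return e_c, e_l, False


if __name__ == "__main__":
    print(find_equilibrium())
```

When the returned flag is true, the pair is a Nash equilibrium of the game restricted to the grid. When it is false, a finer grid, a different starting point, or a damped update would be natural things to try.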

Figure 5 illustrates some comparative statics results of the game between the two supervisors. Panel A shows that an increase in the cost of effort \(\gamma _{c}\) of the central supervisor shifts its reaction function to the left, leading to a reduction in the equilibrium effort \(e_{c}^{*}\) of the central supervisor and an increase in the equilibrium effort \(e_{l}^{*} \) of the local supervisor. Panel B shows that an increase in the liquidation cost \(\delta _{l}\) of the local supervisor shifts its reaction function down, leading to a reduction in the equilibrium effort \( e_{l}^{*}\) of the local supervisor and an increase in the equilibrium effort \(e_{c}^{*}\) of the central supervisor. Panel C shows that an increase in the expected return \(\overline{R}\) of the bank’s investment shifts the reaction function of the central supervisor to the left and shifts the reaction function of the local supervisor down, leading to a reduction in the equilibrium effort of at least one of the two supervisors, although in the numerical results both \(e_{c}^{*}\) and \( e_{l}^{*}\) go down. Finally, Panel D shows that a decrease in the standard deviation \(\sigma \) of the bank’s investment return shifts the reaction function of the central supervisor to the left and shifts the reaction function of the local supervisor down, leading to a reduction in the equilibrium effort of at least one of the two supervisors, although in the numerical results both \(e_{c}^{*}\) and \(e_{l}^{*}\) go down.

Fig. 5 Comparative statics of Nash equilibrium. a Increase in cost of effort of central supervisor. b Increase in liquidation costs of local supervisor. c Increase in expected asset return. d Decrease in volatility of asset return

These comparative static results are in line with the ones obtained in the model with a single supervisor, where supervisory effort is decreasing in the cost of effort \(\gamma ,\) the liquidation cost \(\delta ,\) and the expected return \(\overline{R},\) and is increasing in the standard deviation \( \sigma \). But in the game between the two supervisors the first two changes lead to an increase in the equilibrium effort of the supervisor not affected by the parameter changes.

It is interesting to note that the model of hierarchical supervision presented in this section is completely isomorphic to a model in which the central supervisor gets a signal \(\left. s_{c}=R+\varepsilon _{c},\right. \) where \(\varepsilon _{c}\sim N(0,\sigma ^{2}/e_{c})\), the local supervisor gets a signal \(s_{l}=R+\varepsilon _{l},\) where \(\varepsilon _{l}\sim N(0,\sigma ^{2}/e_{l}),\) and then it truthfully reports it to the central supervisor, who decides on the liquidation of the bank.Footnote 20 By the properties of normal distributions we have

$$\begin{aligned}&\left. E(L\mid s_{c},s_{l})=a\overline{R}+\frac{c(s_{cl}-\overline{R})}{ 1+(e_{c}+e_{l})^{-1}},\right. \end{aligned}$$
(26)
$$\begin{aligned}&\left. E(R\mid s_{c},s_{l})=\overline{R}+\frac{s_{cl}-\overline{R}}{ 1+(e_{c}+e_{l})^{-1}},\right. \end{aligned}$$
(27)

where \(s_{cl}\) is a weighted average of the two signals with weights proportional to their precision, that is

$$\begin{aligned} s_{cl}=\frac{e_{c}}{e_{c}+e_{l}}s_{c}+\frac{e_{l}}{e_{c}+e_{l}}s_{l}. \end{aligned}$$
(28)

The random variable \(s_{cl}\) is normally distributed, with \(E(s_{cl})= \overline{R}\) and \(Var(s_{cl})=\sigma ^{2}\left[ 1+(e_{c}+e_{l})^{-1}\right] ,\) so it has the same distribution as the random variable s in (15). Moreover, it is also the case that \(Cov(s_{cl},L)=c\sigma ^{2}\) and \( Cov(s_{cl},R)=\sigma ^{2}, \) as in (15). Since \(s_{cl}\) has the same properties as s,  it follows that all the results for the original model of hierarchical supervision, in which the two supervisors put in effort to get a single signal, extend to the alternative model, in which each supervisor puts in effort to get a signal and then the local supervisor truthfully sends its signal to the central supervisor. The original model corresponds to an institutional design in which the supervisors work in teams, while the alternative model corresponds to a design in which the supervisors work independently, but there is no problem of strategic information transmission, that is, there are procedures in place that prevent the local supervisor from misrepresenting its signal.
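The variance claim behind this equivalence is easy to verify. The sketch below computes the noise variance of the precision-weighted average in (28) and compares it with the noise variance \(\sigma ^{2}(e_{c}+e_{l})^{-1}\) of the single joint signal. It assumes that \(\varepsilon _{c}\) and \(\varepsilon _{l}\) are independent of each other, which is needed for the variances to add; the function names and input values are hypothetical.

```python
def pooled_noise_variance(e_c, e_l, sigma):
    """Noise variance of s_cl, the precision-weighted average in (28)."""
    w_c = e_c / (e_c + e_l)           # weight on the central signal
    w_l = e_l / (e_c + e_l)           # weight on the local signal
    var_c = sigma ** 2 / e_c          # Var(eps_c)
    var_l = sigma ** 2 / e_l          # Var(eps_l)
    # Assumes eps_c and eps_l are independent, so variances add with squared weights.
    return w_c ** 2 * var_c + w_l ** 2 * var_l


def joint_noise_variance(e_c, e_l, sigma):
    """Noise variance of the single joint signal s in Sect. 3."""
    return sigma ** 2 / (e_c + e_l)


if __name__ == "__main__":
    for e_c, e_l in ((0.7, 1.9), (1.0, 1.0), (0.1, 5.0)):
        print(pooled_noise_variance(e_c, e_l, 0.2), joint_noise_variance(e_c, e_l, 0.2))
```

Because the weights sum to one, the loading of \(s_{cl}\) on R is also one, so only the noise variance needs checking.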

4 Optimal institutional design

This section compares in welfare terms three possible institutional arrangements: decentralized, hierarchical, and centralized supervision. Under decentralized supervision, only the local supervisor collects information and decides on the liquidation of the bank. Under hierarchical supervision, the central and the local supervisor jointly collect information and then the central supervisor decides on the liquidation of the bank. Finally, under centralized supervision, only the central supervisor collects information and decides on the liquidation of the bank. The aim is to characterize the conditions under which one of the three institutional arrangements dominates the other two.

The comparison between these institutional arrangements focusses on two key parameters of the model: the cost of effort of the central supervisor, captured by parameter \(\gamma _{c},\) which is higher than parameter \(\gamma _{l}\) corresponding to the local supervisor, and the liquidation cost of the local supervisor, captured by parameter \( \delta _{l},\) which is higher than parameter \(\delta _{c}=0\) corresponding to the central supervisor. As noted above, the first assumption may be justified by reference to geographical as well as cultural distance between the central supervisor and the local bank, while the second may be justified by reference to the looser connections between the central supervisor and local lobbies and pressure groups.

Social welfare has two components. The first is the expected bank return, given the effort and the liquidation decisions of the supervisors. The second, which enters with a negative sign, is the cost of supervisory effort. Supervisory liquidation costs are not taken into account, since they are assumed to be linked to supervisory capture (e.g., possible transfers from banks to supervisors that cancel out in welfare terms; see footnote 21).

By the results in Sect. 2, social welfare under decentralized supervision is given by

$$\begin{aligned} w_{l}= & {} \int _{-\infty }^{\widehat{s}_{l}}E(L\mid s)dF(s)+\int _{\widehat{s} _{l}}^{\infty }E(R\mid s)dF(s)-c_{l}(\widehat{e}_{l})\nonumber \\= & {} \overline{R}-\left[ (1-a)\overline{R}+\delta _{l}\right] \left[ \Phi ( \widehat{x}_{l})+\frac{\phi (\widehat{x}_{l})}{\widehat{x}_{l}}\right] +\delta _{l}\Phi (\widehat{x}_{l})-c_{l}(\widehat{e}_{l}), \end{aligned}$$
(29)

where \(\widehat{e}_{l}\) is the effort chosen by the local supervisor and

$$\begin{aligned} \widehat{x}_{l}=-\frac{\left[ (1-a)\overline{R}+\delta _{l}\right] }{ (1-c)\sigma }\big ( 1+\widehat{e}_{l}{}^{-1}\big ) ^{1/2}. \end{aligned}$$

Similarly, social welfare under centralized supervision is given by

$$\begin{aligned} w_{c}= & {} \int _{-\infty }^{\widehat{s}_{c}}E(L\mid s)dF(s)+\int _{\widehat{s} _{c}}^{\infty }E(R\mid s)dF(s)-c_{c}(\widehat{e}_{c}) \nonumber \\= & {} \overline{R}-(1-a)\overline{R}\left[ \Phi (\widehat{x}_{c})+\frac{\phi ( \widehat{x}_{c})}{\widehat{x}_{c}}\right] -c_{c}(\widehat{e}_{c}), \end{aligned}$$
(30)

where \(\widehat{e}_{c}\) is the effort chosen by the central supervisor and

$$\begin{aligned} \widehat{x}_{c}=-\frac{(1-a)\overline{R}}{(1-c)\sigma }\big ( 1+\widehat{e} _{c}{}^{-1}\big ) ^{1/2}. \end{aligned}$$

Finally, by the results in Sect. 3, social welfare under hierarchical supervision is given by

$$\begin{aligned} w_{h}= & {} \int _{-\infty }^{s^{*}}E(L\mid s)dF(s)+\int _{s^{*}}^{\infty }E(R\mid s)dF(s)-c_{c}({e}_{c}^{*})-c_{l}({e}_{l}^{*}) \nonumber \\= & {} \overline{R}-(1-a)\overline{R}\left[ \Phi (x^{*})+\frac{\phi (x^{*})}{x^{*}}\right] -c_{c}({e}_{c}^{*})-c_{l}({e}_{l}^{*}), \end{aligned}$$
(31)

where \((e_{c}^{*},e_{l}^{*})\) is the Nash equilibrium of the game played by the supervisors and

$$\begin{aligned} x^{*}=-\frac{(1-a)\overline{R}}{(1-c)\sigma }[1+(e_{c}^{*}+e_{l}^{*})^{-1}]^{1/2}. \end{aligned}$$
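The second equalities in (30) and (31) can be recovered with a short computation. The following sketch is not taken from the original derivation: it considers a decision maker with a generic liquidation cost \(\delta \) and total effort \(e\) (notation introduced here only for illustration), assumes that the conditional expectations have the form (26) and (27) with \(e\) in place of \(e_{c}+e_{l}\), and writes \(s=\overline{R}+\sigma (1+e^{-1})^{1/2}z\) with \(z\sim N(0,1)\). Since \(\int E(R\mid s)dF(s)=\overline{R}\) and \(\int _{-\infty }^{x}z\phi (z)dz=-\phi (x),\) we have

$$\begin{aligned} \int _{-\infty }^{\widehat{s}}E(L\mid s)dF(s)+\int _{\widehat{s}}^{\infty }E(R\mid s)dF(s)= & {} \overline{R}-\int _{-\infty }^{x}\left[ (1-a)\overline{R}+\frac{(1-c)\sigma z}{(1+e^{-1})^{1/2}}\right] \phi (z)dz \\= & {} \overline{R}-(1-a)\overline{R}\,\Phi (x)+\frac{(1-c)\sigma }{(1+e^{-1})^{1/2}}\phi (x) \\= & {} \overline{R}-(1-a)\overline{R}\,\Phi (x)-\left[ (1-a)\overline{R}+\delta \right] \frac{\phi (x)}{x}, \end{aligned}$$

where \(x=-\left[ (1-a)\overline{R}+\delta \right] (1+e^{-1})^{1/2}/[(1-c)\sigma ]\) is the standardized liquidation threshold, and the last step uses this definition to substitute \((1-c)\sigma /(1+e^{-1})^{1/2}=-\left[ (1-a)\overline{R}+\delta \right] /x.\) Setting \(\delta =0\) gives the expressions in (30) and (31). For \(\delta >0\) the last line can equivalently be written as \(\overline{R}-\left[ (1-a)\overline{R}+\delta \right] \left[ \Phi (x)+\phi (x)/x\right] +\delta \Phi (x),\) which is the form that appears in Sect. 5.1.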

I can now compare in terms of welfare the three alternative institutional arrangements by computing \(w_{l},\) \(w_{h},\) and \(w_{c}\) for different values of the cost of effort of the central supervisor \(\gamma _{c}\) and the liquidation cost of the local supervisor \(\delta _{l}.\)
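As an illustration of how such welfare values could be computed, the following Python sketch evaluates the expected return term in closed form and cross-checks it against direct numerical integration of \(E(L\mid s)\) below the liquidation threshold and \(E(R\mid s)\) above it. It is a minimal sketch under stated assumptions: the function names and parameter values are hypothetical, the conditional expectations are taken to have the form (26) and (27) with a single effort level, and the cost of effort is passed in as a number because the cost function (8) is not reproduced in this section.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def threshold_x(r_bar, a, c, sigma, delta, effort):
    """Standardized liquidation threshold for a decision maker with cost delta."""
    k = (1.0 - a) * r_bar + delta
    return -k / ((1.0 - c) * sigma) * math.sqrt(1.0 + 1.0 / effort)


def expected_return_closed_form(r_bar, a, c, sigma, delta, effort):
    """R_bar - (1-a) R_bar Phi(x) - [(1-a) R_bar + delta] phi(x) / x."""
    x = threshold_x(r_bar, a, c, sigma, delta, effort)
    k = (1.0 - a) * r_bar + delta
    return r_bar - (1.0 - a) * r_bar * norm_cdf(x) - k * norm_pdf(x) / x


def _midpoint(f, lo, hi, steps):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def expected_return_numeric(r_bar, a, c, sigma, delta, effort, steps=20_000):
    """Integrate E(L|s) below the threshold and E(R|s) above it.

    The signal is standardized as s = R_bar + sigma * sqrt(1 + 1/e) * z, so
    (s - R_bar) / (1 + 1/e) = scale * z. Tails beyond |z| = 10 are ignored,
    which assumes the threshold lies inside (-10, 10).
    """
    x = threshold_x(r_bar, a, c, sigma, delta, effort)
    scale = sigma / math.sqrt(1.0 + 1.0 / effort)

    def liquidate(z):
        return (a * r_bar + c * scale * z) * norm_pdf(z)

    def keep_open(z):
        return (r_bar + scale * z) * norm_pdf(z)

    return _midpoint(liquidate, -10.0, x, steps) + _midpoint(keep_open, x, 10.0, steps)


def welfare(r_bar, a, c, sigma, delta, effort, total_cost):
    """Expected return under the decision maker's rule minus effort costs."""
    return expected_return_closed_form(r_bar, a, c, sigma, delta, effort) - total_cost
```

Under these assumptions, decentralized supervision corresponds to `delta` equal to \(\delta _{l}\) and `effort` equal to \(\widehat{e}_{l}\), while centralized and hierarchical supervision correspond to `delta` equal to zero and `effort` equal to \(\widehat{e}_{c}\) and \(e_{c}^{*}+e_{l}^{*}\), respectively. The efforts themselves are inputs to the sketch and have to be obtained from the supervisors' maximization problems.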

The forces driving the comparison are easy to explain. The higher the cost of effort of the central supervisor \(\gamma _{c},\) the lower the likelihood that centralized supervision will be optimal, because as shown in Fig. 3a the central supervisor will have an incentive to exert too little effort. Similarly, the higher the liquidation cost of the local supervisor \(\delta _{l},\) the lower the likelihood that decentralized supervision will be optimal, because as shown in Fig. 3b the local supervisor will have an incentive to exert too little effort. Relative to the case of a single supervisor, hierarchical supervision entails incurring twice the fixed cost \(\gamma _{0}, \) so by the convexity of the cost function (8) it will only be optimal when both supervisors have an incentive to exert significant effort. This requires that the bias of the local supervisor is high but not too high and the costs of getting local information from the center are low but not too low.

Figure 6 shows the results (see footnote 22). Decentralized supervision dominates in Region D, where \(\gamma _{c}\) is large and \(\delta _{l}\) is small, that is when the cost advantage of the local supervisor is sufficiently large to compensate for the bias in its liquidation decision. Hierarchical supervision dominates in Region H, where the cost advantage of the local supervisor is not too large or the liquidation cost of the local supervisor is not too small. Finally, centralized supervision dominates in Region C, where \(\gamma _{c}\) is small and \(\delta _{l}\) is large, that is when the cost disadvantage of the central supervisor is sufficiently small to compensate for the bias of the local supervisor.

Fig. 6 Optimal institutional design

Next, I analyze the effect on the regions in Fig. 6 of changes in the expected return \(\overline{R}\) and in the standard deviation \(\sigma \) of the bank’s investment return. Figure 7 illustrates the results. Panel A shows that an increase in \(\overline{R}\) expands Regions D and C, where decentralized and centralized supervision are optimal, at the expense of Region H, where hierarchical supervision is optimal. Similarly, Panel B shows that a decrease in \(\sigma \) expands Regions D and C at the expense of Region H (see footnote 23). The intuition is that when the bank is more profitable (higher \(\overline{R}\)) or safer (lower \(\sigma \)) the supervisors will exert less effort (see Fig. 5c, d). Thus, given the convexity of the cost function (8), the relative advantage of having two supervisors will decrease.

Fig. 7 Comparative statics of optimal institutional design. (a) Increase in expected asset return. (b) Decrease in volatility of asset return

Summing up, I have shown that hierarchical supervision dominates decentralized supervision when the possible capture of the local supervisor is a significant concern and the costs of getting local knowledge are not too large. But when these two forces go beyond a certain threshold, it is better to eliminate the local supervisor and concentrate all responsibilities in the central supervisor. Moreover, hierarchical supervision is more likely to dominate when the risk of bank failure (due to low profitability or high risk-taking) is high. These latter results illustrate the way in which banking supervision should be adjusted in response to changes in the environment, and hence may be interpreted as macroprudential policies.

5 Extensions

5.1 Linking liquidation costs to supervisory responsibilities

This extension examines what happens when part of the liquidation cost incurred by the supervisors is linked to whether they are responsible for the liquidation decision.

To formalize this idea, suppose that \(\delta _{c}=0\) and \(\delta _{l}>0\) are the liquidation costs for the central and the local supervisor, respectively, in the absence of supervisory responsibilities, and that such responsibilities add \(\Delta \) to these costs. Thus, the liquidation cost of the local supervisor will be \(\delta _{l}+\Delta \) under decentralized supervision and \(\delta _{l}\) under hierarchical supervision, whereas the liquidation cost of the central supervisor will be \(\Delta \) under both centralized and hierarchical supervision.

By our previous results, under decentralized supervision the effort decision of the local supervisor is given by

$$\begin{aligned} \widehat{e}_{l}=\arg \max _{e_{l}}\left\{ \overline{R}-\left[ (1-a)\overline{R }+\delta _{l}+\Delta \right] \left[ \Phi (x_{l})+\frac{\phi (x_{l})}{x_{l}} \right] -c_{l}(e_{l})\right\} , \end{aligned}$$

where

$$\begin{aligned} x_{l}=x_{l}(e_{l})=-\frac{(1-a)\overline{R}+\delta _{l}+\Delta }{(1-c)\sigma }\big ( 1+e_{l}^{-1}\big ) ^{1/2}. \end{aligned}$$

Social welfare under decentralized supervision is

$$\begin{aligned} w_{l}=\overline{R}-\left[ (1-a)\overline{R}+\delta _{l}+\Delta \right] \left[ \Phi (\widehat{x}_{l})+\frac{\phi (\widehat{x}_{l})}{\widehat{x}_{l}}\right] +(\delta _{l}+\Delta )\Phi (\widehat{x}_{l})-c_{l}(\widehat{e}_{l}), \end{aligned}$$

where \(\widehat{x}_{l}=x_{l}(\widehat{e}_{l}).\) Similarly, under centralized supervision the effort decision of the central supervisor is given by

$$\begin{aligned} \widehat{e}_{c}=\arg \max _{e_{c}}\left\{ \overline{R}-\left[ (1-a)\overline{R }+\Delta \right] \left[ \Phi (x_{c})+\frac{\phi (x_{c})}{x_{c}}\right] -c_{c}(e_{c})\right\} , \end{aligned}$$

where

$$\begin{aligned} x_{c}=x_{c}(e_{c})=-\frac{(1-a)\overline{R}+\Delta }{(1-c)\sigma }\big ( 1+e_{c}^{-1}\big ) ^{1/2}. \end{aligned}$$

Social welfare under centralized supervision is

$$\begin{aligned} w_{c}=\overline{R}-\left[ (1-a)\overline{R}+\Delta \right] \left[ \Phi ( \widehat{x}_{c})+\frac{\phi (\widehat{x}_{c})}{\widehat{x}_{c}}\right] +\Delta \Phi (\widehat{x}_{c})-c_{c}(\widehat{e}_{c}), \end{aligned}$$

where \(\widehat{x}_{c}=x_{c}(\widehat{e}_{c}).\)

Finally, since the liquidation cost of the central supervisor is now \(\Delta >0,\) the payoff functions of the supervisors become

$$\begin{aligned} v_{c}(e_{c},e_{l})= & {} \overline{R}-\left[ (1-a)\overline{R}+\Delta \right] \left[ \Phi (x)+\frac{\phi (x)}{x}\right] -c_{c}(e_{c}), \\ v_{l}(e_{c},e_{l})= & {} \overline{R}-\left[ (1-a)\overline{R}+\Delta \right] \left[ \Phi (x)+\frac{\phi (x)}{x}\right] -(\delta _{l}-\Delta )\Phi (x)-c_{l}(e_{l}), \end{aligned}$$

where

$$\begin{aligned} x=x(e_{c}+e_{l})=-\frac{(1-a)\overline{R}+\Delta }{(1-c)\sigma } [1+(e_{c}+e_{l})^{-1}]^{1/2}. \end{aligned}$$

Fig. 8 Linking liquidation costs to supervisory responsibilities

Let \((e_{c}^{*},e_{l}^{*})\) denote the Nash equilibrium of this game. Social welfare under hierarchical supervision is then

$$\begin{aligned} w_{h}=\overline{R}-[(1-a)\overline{R}+\Delta ]\left[ \Phi (x^{*})+\frac{ \phi (x^{*})}{x^{*}}\right] +\Delta \Phi (x^{*})-c_{c}(e_{c}^{*})-c_{l}(e_{l}^{*}), \end{aligned}$$

where \(x^{*}=x(e_{c}^{*}+e_{l}^{*}).\)
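To make the structure of this game concrete, the following Python sketch searches for a Nash equilibrium by alternating best responses on a grid of effort levels and then evaluates \(w_{h}\). It is only a sketch under loud assumptions: the quadratic cost function `cost` is a hypothetical stand-in for the cost function (8), which is not reproduced in this section; the parameter dictionary `p`, its keys, and all function names are illustrative; and alternating best responses on a grid is an approximation that is not guaranteed to converge, which is why the function also returns a convergence flag.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def x_of(total_effort, p):
    """Standardized threshold x(e_c + e_l) when the central cost is Delta."""
    k = (1.0 - p["a"]) * p["r_bar"] + p["Delta"]
    return -k / ((1.0 - p["c"]) * p["sigma"]) * math.sqrt(1.0 + 1.0 / total_effort)


def liq_prob(total_effort, p):
    """Phi(x); with no effort the signal is uninformative and x tends to -inf."""
    return norm_cdf(x_of(total_effort, p)) if total_effort > 0.0 else 0.0


def common_term(total_effort, p):
    """R_bar - [(1-a) R_bar + Delta] [Phi(x) + phi(x)/x]; R_bar with no effort."""
    if total_effort <= 0.0:
        return p["r_bar"]
    k = (1.0 - p["a"]) * p["r_bar"] + p["Delta"]
    x = x_of(total_effort, p)
    return p["r_bar"] - k * (norm_cdf(x) + norm_pdf(x) / x)


def cost(e, gamma, p):
    """HYPOTHETICAL cost gamma_0 + gamma * e^2 / 2 (a stand-in for (8))."""
    return p["gamma_0"] + 0.5 * gamma * e * e


def v_c(e_c, e_l, p):
    """Payoff of the central supervisor."""
    return common_term(e_c + e_l, p) - cost(e_c, p["gamma_c"], p)


def v_l(e_c, e_l, p):
    """Payoff of the local supervisor."""
    total = e_c + e_l
    return (common_term(total, p)
            - (p["delta_l"] - p["Delta"]) * liq_prob(total, p)
            - cost(e_l, p["gamma_l"], p))


def w_h(e_c, e_l, p):
    """Social welfare under hierarchical supervision at efforts (e_c, e_l)."""
    total = e_c + e_l
    return (common_term(total, p) + p["Delta"] * liq_prob(total, p)
            - cost(e_c, p["gamma_c"], p) - cost(e_l, p["gamma_l"], p))


def best_response(payoff, other, grid):
    """Own effort on the grid that maximizes payoff(own, other)."""
    return max(grid, key=lambda e: payoff(e, other))


def nash_by_best_response(p, grid, max_rounds=200):
    """Alternate best responses; returns (e_c, e_l, converged)."""
    e_c, e_l = grid[0], grid[0]
    for _ in range(max_rounds):
        new_c = best_response(lambda e, o: v_c(e, o, p), e_l, grid)
        new_l = best_response(lambda e, o: v_l(o, e, p), new_c, grid)
        if new_c == e_c and new_l == e_l:
            return e_c, e_l, True   # mutual best responses on the grid
        e_c, e_l = new_c, new_l
    return e_c, e_l, False
```

When the flag is true, the returned pair consists of mutual best responses on the grid, and `w_h` evaluated at that pair approximates the welfare level defined above. Repeating the exercise over a range of values of `gamma_c` and `delta_l`, and comparing with the corresponding single-supervisor problems, is one way regions such as those in Fig. 8 could be traced out.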

Figure 8 shows the results (see footnote 24). The assumption that part of the liquidation cost incurred by the supervisors is linked to whether they are responsible for the liquidation decision expands Region H, where hierarchical supervision is optimal, at the expense of Regions D and C, where decentralized and centralized supervision are optimal.

The intuition is as follows. Consider the effect of introducing the additional cost \(\Delta \) on a point in Fig. 8 on the original boundary between Regions D and H. By our comparative statics results in Sect. 3, this will lead to a shift to the left of the reaction function of the central supervisor (due to the increase in its liquidation cost by \(\Delta \)) and a shift upwards of the reaction function of the local supervisor (due to the equivalent reduction in its liquidation cost by \(\Delta \)). The result is a reduction in the equilibrium effort \(e_{c}^{*}\) of the central supervisor and an increase in the (cheaper) equilibrium effort \(e_{l}^{*}\) of the local supervisor, which explains why the original boundary point is now inside Region H.

Similarly, consider the effect of introducing the additional cost \(\Delta \) on a point in Fig. 8 on the original boundary between Regions H and C. By our comparative statics results in Sect. 3, this will lead to a shift to the left of the reaction function of the central supervisor (due to the increase in its liquidation cost by \(\Delta \)), leading to a reduction in the equilibrium effort \(e_{c}^{*}\) of the central supervisor and an increase in the (cheaper) equilibrium effort \(e_{l}^{*}\) of the local supervisor. This increase, which does not obtain under centralized supervision, explains why the original boundary point is now inside Region H.

5.2 Limiting the size of the central supervisor

This extension considers whether it would be desirable from a welfare perspective to limit (in some statutory manner) the size of the central supervisor. In terms of the model, a size limit implies an upper bound on the effort of the central supervisor. How could this be welfare improving?

To answer this question, it is convenient to start by considering the case in which the central supervisor is a Stackelberg leader. By standard results for games with strategic substitutes, in a Stackelberg equilibrium the central supervisor would reduce its effort \(e_{c}\) and the local supervisor would increase its effort \(e_{l}\), relative to the Nash equilibrium \((e_{c}^{*},e_{l}^{*})\). Figure 9 illustrates the result. By the definition of Nash equilibrium, the indifference curve of the central supervisor at the point \((e_{c}^{*},e_{l}^{*})\) is tangent to the horizontal line \( e_{l}=e_{l}^{*}.\) But since the reaction function of the local supervisor is downward sloping, a reduction in the effort of the central supervisor, which moves the outcome up along this reaction function, increases its payoff.

Fig. 9 The central supervisor as a Stackelberg leader

The question is whether this argument also applies when we replace the indifference curve of the central supervisor by the social indifference curves. To check that this is indeed the case, notice that using the definition (23) of the payoff function of the central supervisor \( v_{c}(e_{c},e_{l})\) and the definition (31) of the social welfare function \(w(e_{c},e_{l}),\) the social indifference curve going through the Nash equilibrium point \((e_{c}^{*},e_{l}^{*})\) may be written as

$$\begin{aligned} w(e_{c},e_{l})=v_{c}(e_{c},e_{l})-c_{l}(e_{l})=w_{h}^{*}, \end{aligned}$$

where \(w_{h}^{*}=w(e_{c}^{*},e_{l}^{*}).\) This implies

$$\begin{aligned} \left. \frac{\partial w(e_{c},e_{l})}{\partial e_{c}}\right| _{(e_{c}^{*},e_{l}^{*})}=\left. \frac{\partial v_{c}(e_{c},e_{l})}{ \partial e_{c}}\right| _{(e_{c}^{*},e_{l}^{*})}=0, \end{aligned}$$

where the second equality follows from the definition of Nash equilibrium. Thus, the social indifference curve at the point \((e_{c}^{*},e_{l}^{*})\) is also tangent to the horizontal line \(e_{l}=e_{l}^{*},\) so moving up the reaction function of the local supervisor increases social welfare. Figure 10 illustrates the result.
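The step from tangency to the welfare gain can be spelled out as follows. This is a sketch under assumptions that are not stated in this form in the text: the equilibrium is interior, the local supervisor's payoff in Sect. 3 can be written as \(v_{l}(e_{c},e_{l})=w(e_{c},e_{l})+c_{c}(e_{c})-\delta _{l}\Phi (x)\) (the case \(\Delta =0\) of the payoff in Sect. 5.1), and \(r_{l}(e_{c})\) is notation introduced here for the local supervisor's reaction function. The first-order condition of the local supervisor then gives

$$\begin{aligned} \left. \frac{\partial w(e_{c},e_{l})}{\partial e_{l}}\right| _{(e_{c}^{*},e_{l}^{*})}=\delta _{l}\phi (x^{*})\left. \frac{\partial x}{\partial e_{l}}\right| _{(e_{c}^{*},e_{l}^{*})}>0, \end{aligned}$$

where the inequality follows from

$$\begin{aligned} \frac{\partial x}{\partial e_{l}}=\frac{(1-a)\overline{R}}{2(1-c)\sigma }(e_{c}+e_{l})^{-2}[1+(e_{c}+e_{l})^{-1}]^{-1/2}>0. \end{aligned}$$

Hence, along the reaction function of the local supervisor,

$$\begin{aligned} \left. \frac{dw(e_{c},r_{l}(e_{c}))}{de_{c}}\right| _{e_{c}=e_{c}^{*}}=\left. \frac{\partial w}{\partial e_{c}}\right| _{(e_{c}^{*},e_{l}^{*})}+\left. \frac{\partial w}{\partial e_{l}}\right| _{(e_{c}^{*},e_{l}^{*})}r_{l}^{\prime }(e_{c}^{*})=\delta _{l}\phi (x^{*})\left. \frac{\partial x}{\partial e_{l}}\right| _{(e_{c}^{*},e_{l}^{*})}r_{l}^{\prime }(e_{c}^{*})<0, \end{aligned}$$

since the first term is zero and \(r_{l}^{\prime }(e_{c}^{*})<0.\) A marginal reduction in \(e_{c}\) below \(e_{c}^{*}\) therefore raises social welfare, and the size of the gain is proportional to the bias \(\delta _{l}\) of the local supervisor.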

Fig. 10 Limiting the size of the central supervisor

Hence, the commitment to reduce the effort of the central supervisor in the hierarchical supervision setup is welfare improving. The intuition is that a reduction in the effort of the central supervisor forces the local supervisor to increase its (cheaper) effort. However, the model assumes that the efforts of the supervisors are not verifiable, so there is an issue about how this could be implemented. One obvious way to move in this direction would be to limit the size of the central supervisor.

6 Concluding remarks

This paper constructs a model of hierarchical supervision in which a central and a local supervisor independently choose supervisory efforts which determine the precision of a signal of the solvency of a local bank. The central supervisor uses this information to decide on the bank’s early liquidation. Importantly, the local supervisor is assumed to be characterized by a lower cost of effort (due to proximity) and a higher cost of liquidating the bank (due to capture). The model of hierarchical supervision is compared in terms of welfare with two alternative arrangements, namely decentralized supervision, where the local supervisor is fully in charge, and centralized supervision, where the central supervisor is fully in charge. The results show that moving supervision from local to hierarchical to central is more likely to be optimal when the cost of getting local knowledge is sufficiently low and/or the possible capture of the local supervisor is a significant concern.

To the extent that supervisory capture may be more relevant for large banks, the results provide support for the design of the Single Supervisory Mechanism of the European Central Bank (ECB), in which significant banks are supervised by the ECB in collaboration with the National Competent Authorities (NCAs), and less significant banks are supervised by the NCAs.

I would like to conclude with a few remarks. First, the model assumes that the liquidation cost is fully driven by capture. However, it may be the case that (at least part of) the liquidation cost is also a social cost that is internalized by the local supervisor but not by the central supervisor. In this case, having a more biased (local) supervisor could be better than having a less biased (central) supervisor, so the region in the parameter space where decentralized supervision dominates becomes larger.

Second, the model assumes that the bank is completely passive. As in Carletti et al. (2016), it would be interesting to endogenize the bank’s choice of risk under different supervisory arrangements. Moreover, it would also be interesting to introduce bank capital regulation, and to analyze the possible trade-offs between regulation and supervision. In particular, one may conjecture that the tougher the regulation, and hence the lower the bank’s risk-taking, the less valuable supervision will be, which as noted above reduces the region of the parameter space where hierarchical supervision dominates.

Third, the model is static, but one could easily think of some dynamic implications. For example, it may be the case that in good times supervisors gradually reduce their capabilities, so they may not be able to increase them quickly when bad times arrive, in which case we could end up having (involuntary) supervisory forbearance.

Finally, the model focuses on supervision of a domestic bank. It would be interesting to explore the case of an international bank that operates across several local jurisdictions. In fact, one could argue that the rationale for the creation of the Single Supervisory Mechanism of the ECB was the fact that in a multi-country system, like the euro area, the supervisory decisions of a biased national authority may entail negative externalities on other countries. But these potential externalities also appear in the simple setup of this paper. In particular, since deposits are assumed to be insured, a bank failure may impact on the solvency of the country in which the bank is located, and through a fiscal channel affect other countries in the area.