Mean field game model of corruption

A simple model of corruption is developed that takes into account the effect of the interaction of a large number of agents, acting both by rational decision making and by myopic behavior. Its stationary version turns out to be a rare example of an exactly solvable model of mean-field-game type. The results show clearly how the presence of interaction (including social norms) influences the spread of corruption.


Introduction
Analysis of the spread of corruption in bureaucracies is a well-recognized area of application of game theory, which has attracted the attention of many researchers. General surveys can be found in [2], [26], [38]. In his Prize lecture [25], L. Hurwicz gives a nice introduction, in layman's terms, to the various problems arising in attempts to find out 'who will guard the guardians?' and which mechanisms can be exploited to enforce legal behavior. In a series of papers [34], [35], the authors analyze a dynamic game in which entrepreneurs have to apply to a set of bureaucrats (in a prescribed order) in order to obtain permission for their business projects; for an approval the bureaucrats ask for bribes, the amounts of which constitute the bureaucrats' strategies. The existence of an intermediary undertaking the contacts with bureaucrats for a fee may moderate the outcomes of this game, which is referred to as petty corruption, since each bureaucrat is assumed to ask for a small bribe, so that the large bureaucratic losses of entrepreneurs arise from the large number of bureaucrats involved. This is a kind of extension of the classical ultimatum game: if an entrepreneur declines to pay the required graft, the game stops. In the series of works [46], [47], [41], the authors develop a hierarchical model of corruption in which the inspectors of each level audit the inspectors of the previous level and report their findings to the inspector of the next level up; for a bribe they may choose to submit a falsified report. The inspector of the highest level is assumed to be honest but very costly.

The model and the objectives of analysis
An agent is supposed to be in one of three states: honest H, corrupted C, and reserved R, where R is a reserved job of low salary that an agent receives as a punishment if her corrupt behavior is discovered.
The change between H and C is subject to the decisions of the agents (though the precise time of execution of their intent is noisy); the change from C to R is random, with a distribution depending on the level of effort b (say, a budget) that the principal (a government representative) invests in chasing corrupt behavior; and the change from R to H (so to say, a new recruitment) is included as a random event with a certain rate.
Let n_H, n_C, n_R denote the numbers of agents in the corresponding states, with N = n_H + n_C + n_R the total number of agents. By a state of the system we shall mean either the 3-vector n = (n_H, n_C, n_R) or its normalized version x = (x_H, x_C, x_R) = n/N. The control parameter u of each player in states H or C may take two values, 0 and 1, meaning that the player is happy with her current state (H or C) or prefers to switch from one to the other; there is no control in the state R. When the updating decision 1 is made, the updating effectively occurs with a certain rate λ. The recovery rate, that is, the rate of change from R to H (we assume that once recruited the agents start by being honest), is a given constant r.
Apart from taking a rational decision to swap H and C, an honest agent can be pushed to become corrupt by her corrupt peers, the effect being proportional to the fraction of corrupted agents with a certain coefficient q_inf, analogous to the infection rate in epidemiological models. On the other hand, honest agents can contribute to chasing and punishing corrupt behavior, this effect of a desirable social norm being proportional to the fraction of honest agents with a certain coefficient q_soc. The presence of the coefficients q_inf and q_soc, which reflect the social interaction, makes the dynamics of individual agents dependent on the distribution of the other agents, thus bringing the model into the setting of mean-field games. Our major concern is to find out how the presence of interaction influences the spread of corruption.
Thus, if all agents use the strategy u_H, u_C ∈ {0, 1} and the effort of the principal is b, the evolution of the state x is clearly given by the ODE
$$
\dot x_H = r x_R + \lambda u_C x_C - \lambda u_H x_H - q_{inf} x_C x_H,
$$
$$
\dot x_C = \lambda u_H x_H - \lambda u_C x_C + q_{inf} x_C x_H - x_C (b + q_{soc} x_H), \qquad (1)
$$
$$
\dot x_R = x_C (b + q_{soc} x_H) - r x_R.
$$
It is instructive to see how this ODE can be rigorously deduced from the Markov model of interaction. Namely, if all agents use the strategy u_H, u_C ∈ {0, 1} and the effort of the principal is b, the generator of the Markov evolution on the states n is (where the unchanged values in the arguments of F on the r.h.s. are omitted)
$$
L_N F(n_H, n_C, n_R) = n_C \Bigl(b + q_{soc} \frac{n_H}{N}\Bigr) [F(n_C - 1, n_R + 1) - F] + n_R r [F(n_R - 1, n_H + 1) - F]
$$
$$
+\; n_H \Bigl(\lambda u_H + q_{inf} \frac{n_C}{N}\Bigr) [F(n_H - 1, n_C + 1) - F] + n_C \lambda u_C [F(n_C - 1, n_H + 1) - F].
$$
For any N, this generator describes a Markov chain on the finite state space {n = (n_H, n_C, n_R) : n_H + n_C + n_R = N}, where any agent, independently of the others, can be recruited with rate r (if in state R) or can change from C to H or vice versa if desired (with rate λ), and where the changes of state due to binary interactions are taken into account by the terms containing q_soc and q_inf. In terms of x, the generator L_N F takes the form
$$
L_N F(x) = N x_C (b + q_{soc} x_H) [F(x + (e_R - e_C)/N) - F(x)] + N x_R r [F(x + (e_H - e_R)/N) - F(x)]
$$
$$
+\; N x_H (\lambda u_H + q_{inf} x_C) [F(x + (e_C - e_H)/N) - F(x)] + N x_C \lambda u_C [F(x + (e_H - e_C)/N) - F(x)],
$$
where {e_j} is the standard basis in R^3. If F is a differentiable function, the generator L_N turns to
$$
LF(x) = x_C (b + q_{soc} x_H) \Bigl(\frac{\partial F}{\partial x_R} - \frac{\partial F}{\partial x_C}\Bigr) + r x_R \Bigl(\frac{\partial F}{\partial x_H} - \frac{\partial F}{\partial x_R}\Bigr) + x_H (\lambda u_H + q_{inf} x_C) \Bigl(\frac{\partial F}{\partial x_C} - \frac{\partial F}{\partial x_H}\Bigr) + \lambda u_C x_C \Bigl(\frac{\partial F}{\partial x_H} - \frac{\partial F}{\partial x_C}\Bigr)
$$
in the limit N → ∞. This is a first order partial differential operator, and its characteristics are given by the ODE (1). This Markov model is important not only as a tool to derive (1), but also because it specifies the dynamics of individual players (corresponding, in statistical mechanics terms, to the so-called tagged particles), which are central to a mean-field-game analysis of agents trying to deviate from the behavior of the crowd. Namely, if x(t) and b(t) are given, the dynamics of each individual player is the Markov chain on the three states with the generator
$$
L_{ind}\, g(R) = r\,(g(H) - g(R)), \qquad L_{ind}\, g(H) = (\lambda u_H^{ind} + q_{inf} x_C)\,(g(C) - g(H)),
$$
$$
L_{ind}\, g(C) = \lambda u_C^{ind}\,(g(H) - g(C)) + (b + q_{soc} x_H)\,(g(R) - g(C)), \qquad (4)
$$
depending on the individual control u^{ind} ∈ {0, 1}, so that ġ = L_{ind} g is the Kolmogorov backward equation of this chain.
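For a quick numerical look at these dynamics, the kinetic equations can be sketched as follows. The transition rates mirror the verbal description given above (decision rate λ, infection coefficient q_inf, social-norm coefficient q_soc, detection budget b, recruitment rate r); all numeric parameter values are illustrative assumptions, not taken from the analysis in this paper.

```python
# Minimal sketch of the mean-field kinetic equations (1). Rates follow the
# verbal description in the text: H -> C at rate lam*u_H + q_inf*x_C
# (own decision plus "infection" by corrupt peers), C -> H at rate lam*u_C,
# C -> R at rate b + q_soc*x_H (detection aided by the social norm),
# R -> H at rate r (recruitment). Parameter values are illustrative.

def rhs(x, u_H, u_C, b, lam=1.0, r=0.5, q_inf=0.8, q_soc=0.8):
    """Right-hand side of the ODE for the state x = (x_H, x_C, x_R)."""
    x_H, x_C, x_R = x
    infect = q_inf * x_C * x_H            # peer-pressure flow H -> C
    detect = x_C * (b + q_soc * x_H)      # detection flow C -> R
    dx_H = r * x_R + lam * u_C * x_C - lam * u_H * x_H - infect
    dx_C = lam * u_H * x_H - lam * u_C * x_C + infect - detect
    dx_R = detect - r * x_R
    return (dx_H, dx_C, dx_R)

def integrate(x0, u_H, u_C, b, T=50.0, dt=1e-3):
    """Crude explicit Euler integration, enough to locate an equilibrium."""
    x = x0
    for _ in range(int(T / dt)):
        d = rhs(x, u_H, u_C, b)
        x = tuple(xi + dt * di for xi, di in zip(x, d))
    return x

# With everyone playing the 'corrupt' strategy (u_C = 0, u_H = 1) the
# trajectory settles at an interior rest point of the simplex.
x_star = integrate((1.0, 0.0, 0.0), u_H=1, u_C=0, b=0.3)
```

The total mass x_H + x_C + x_R is conserved by construction, so the trajectory stays on the simplex, and the rest point found this way is a candidate for the stationary points analyzed in the following sections.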
Assume that an employed agent receives a wage w_H per unit of time and, if corrupted, an average payoff w_C (which includes w_H plus some additional illegal reward); she has to pay a fine f when her illegal behavior is discovered; the reserved wage for fired agents is w_R. If the distribution of the other players is x(t) = (x_R, x_H, x_C)(t), the optimal payoff g = g_t of an agent (starting at time t, with time horizon T) satisfies the HJB equation (5). Therefore, starting with some common control u^{com} used by all players, we can find the dynamics x(t) from equation (1) (with u^{com} used for u). Then each individual should solve the Markov control problem (5), thus finding the individually optimal strategy u^{ind}. The basic MFG consistency equation can now be explicitly written as u^{ind} = u^{com}. Instead of analyzing this rather complicated dynamic problem, we shall look at the simpler and practically more relevant problem of consistent stationary strategies.
There are two standard stationary problems arising from HJB (5): the search for the average payoff for long-period games, and the search for the discounted optimal payoff. The first is governed by solutions of the HJB equation of the form (T − t)µ + g, linear in t (with µ describing the optimal average payoff), so that g satisfies the stationary HJB equation (7); the discounted optimal payoff (with discounting coefficient δ) satisfies the stationary HJB equation (8). The analysis of these two settings is mostly analogous (they are in some sense equivalent, see e.g. [43]). We shall concentrate on the first one.
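For orientation, the stationary HJB system (7) can be written out from the individual dynamics and the payoffs just described. The following is our sketch, under the reading that the fine f is charged at the moment of detection, on top of the drop to the value g(R):

```latex
\begin{aligned}
\mu &= w_R + r\bigl(g(H)-g(R)\bigr),\\
\mu &= w_H + \max_{u\in\{0,1\}}\lambda u\bigl(g(C)-g(H)\bigr)
        + q_{\mathrm{inf}}\,x_C\bigl(g(C)-g(H)\bigr),\\
\mu &= w_C + \max_{u\in\{0,1\}}\lambda u\bigl(g(H)-g(C)\bigr)
        + \bigl(b+q_{\mathrm{soc}}\,x_H\bigr)\bigl(g(R)-f-g(C)\bigr).
\end{aligned}
```

Normalizing g(R) = 0 and w_R = 0 reduces the first equation to µ = r g(H), which matches the reduction used at the beginning of the proof of Theorem 3.1.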
For a fixed b, the stationary MFG consistency problem consists in finding a triple (x, u_C, u_H) = (x, u_C(x), u_H(x)), where x solves the stationary (fixed-point) version (9) of equation (1), and u_C(x), u_H(x) give the maximum in the solution to (7). Thus x is a fixed point of the limiting dynamics of the distribution of a large number of agents, such that the corresponding stationary control is individually optimal subject to this distribution. Remark 1. Notice that our stationary MFG consistency problem is close to the concept of Wardrop equilibria, see e.g. [21], but is nevertheless quite different.
Fixed points can practically model a stationary behavior only if they are stable. Thus we are interested in stable solutions (x, u C , u H ) = (x, u C (x), u H (x)) to the stationary MFG consistency problem, where a solution is stable if the corresponding stationary distribution x = (x R , x H , x C ) is a stable equilibrium to (1) (with u C , u H fixed by this solution). As mentioned above, our major concern is to find out how the presence of interaction (specified by the coefficients q soc , q inf ) affects the stable equilibria.

Results
Our first result describes explicitly all solutions to the stationary MFG consistency problem stated above; the second result deals with the stability of these solutions.
We shall say that in a solution to the stationary MFG consistency problem the optimal individual behavior is corruption if u_C = 0, u_H = 1 (if you are corrupt, stay corrupt; if you are honest, start corrupt behavior as soon as possible), and that the optimal individual behavior is honesty if u_C = 1, u_H = 0 (if you are honest, stay honest; if you are involved in corruption, try to clean yourself of corruption as soon as possible).
The basic assumptions on our coefficients are collected in (10). The key parameter of the model turns out to be the quantity
$$
\bar x = \frac{1}{q_{soc}} \left( \frac{w_C - w_H}{f + (w_H - w_R)/r} - b \right) \qquad (11)
$$
(which can take the values ±∞ if q_soc = 0).

Theorem 3.1. Assume (10).

(i) If x̄ > 1, there exists a unique solution x* = (x*_H, x*_C, x*_R) to the stationary MFG consistency problem (9), (7), where
$$
x^*_C = \frac{r (1 - x^*_H)}{r + b + q_{soc} x^*_H}, \qquad x^*_R = 1 - x^*_H - x^*_C, \qquad (12)
$$
and x*_H is the unique solution on the interval (0, 1) of the quadratic equation Q(x_H) = 0, where
$$
Q(x_H) = \lambda x_H (r + b + q_{soc} x_H) + q_{inf}\, r\, x_H (1 - x_H) - r (1 - x_H)(b + q_{soc} x_H). \qquad (13)
$$
Under this solution the optimal individual behavior is corruption: u_C = 0, u_H = 1.
(ii) If x̄ < 1, there may be one, two, or three solutions to the stationary MFG problem (9), (7). Namely, the point x_H = 1, x_C = x_R = 0 is always a solution, under which the optimal individual behavior is honesty; if (14) holds, there is another solution with the optimal individual behavior being honest, that is u_C = 1, u_H = 0, with x_H = x**_H given by (15); and if (16) holds, there is a solution with corrupt optimal behavior of the same structure as in (i), that is, with x*_H the unique solution to Q(x_H) = 0 on (0, x̄] and x*_C given by (12).

Remark 2.
As seen by inspection, Q[(b + λ)/(q_inf − q_soc)] > 0 (if q_inf − q_soc > 0), so that for x̄ slightly less than x**_H = (b + λ)/(q_inf − q_soc) one also has Q(x̄) > 0, in which case one really has three equilibrium points, x*_H, x**_H and x_H = 1, with 0 < x*_H < x̄ < x**_H < 1.

Remark 3. In the case of the stationary problem arising from the discounted payoff, that is, from equation (8), the role of the classifying parameter x̄ from (11) is played by an analogous quantity.

Theorem 3.2. Assume (10).
(i) The solution x* = (x*_R, x*_C, x*_H) (given by Theorem 3.1) with individually optimal behavior being corruption is stable whenever the sufficient condition (18) holds.

(ii) If (14) does not hold, then x_H = 1 is the unique stationary MFG solution with individually optimal strategy being honest, and this solution is stable. If (14) holds, there are two stationary MFG solutions with individually optimal strategy being honest, one with x_H = 1 and another with x_H = x**_H given by (15); the first solution is unstable and the second is stable.

We are not presenting a necessary and sufficient condition for the stability of solutions with corrupt optimal behavior. Condition (18) is only sufficient, but it covers a reasonable range of parameters where the 'epidemic' spread of corruption and social cleaning are effects of comparable order.
As a trivial consequence of our theorems, we can conclude that in the absence of interaction, that is, for q_inf = q_soc = 0, corruption is individually optimal if condition (19) holds, and honesty is individually optimal otherwise (which is of course a reformulation of the standard result for a basic model of corruption, see e.g. [2]). In the first case the unique equilibrium is
$$
x_H = \frac{br}{br + \lambda r + \lambda b}, \qquad x_C = \frac{\lambda r}{br + \lambda r + \lambda b}, \qquad x_R = \frac{\lambda b}{br + \lambda r + \lambda b},
$$
and in the second case the unique equilibrium is x_H = 1. Both equilibria are stable.
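This no-interaction dichotomy can be checked numerically. The sketch below solves the three linear stationary equations in the corrupt regime for q_inf = q_soc = 0 and inspects the sign of g(C) − g(H); the payoff structure assumed here (wage flows, fine f charged at the moment of detection, w_R normalized to 0) is our reading of the model, and the numeric parameter values are illustrative.

```python
# Numeric sanity check of the no-interaction case (q_inf = q_soc = 0),
# under one plausible reading of the stationary HJB: an agent in C earns
# w_C, is detected at rate b, then pays the fine f and drops to R
# (normalized so that w_R = 0 and g(R) = 0). Parameters are illustrative.

def corruption_gain(w_H, w_C, f, b, lam=1.0, r=0.5):
    """Return g(C) - g(H) under the 'corrupt' strategy u_C = 0, u_H = 1.

    Solves the linear system
        mu = r * g_H,
        mu = w_H + lam * (g_C - g_H),
        mu = w_C - b * (f + g_C)
    for (mu, g_H, g_C). Corruption is self-consistent only if the
    returned value is nonnegative.
    """
    mu = (w_H + lam * (w_C - b * f) / b) / (1.0 + lam / b + lam / r)
    g_H = mu / r
    g_C = (w_C - b * f - mu) / b
    return g_C - g_H

# A large illegal reward with a small fine should favor corruption,
# while a heavy fine should make honesty preferable.
tempting = corruption_gain(w_H=1.0, w_C=5.0, f=0.1, b=0.3)
deterred = corruption_gain(w_H=1.0, w_C=5.0, f=50.0, b=0.3)
```

Raising the fine f (or the detection budget b) pushes the gain g(C) − g(H) down through the threshold, reproducing the switch between the two regimes described above.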

Discussion
The results above show clearly how the presence of interaction (including social norms) influences the spread of corruption. When q inf = q soc = 0, one has one equilibrium that corresponds to corrupted or honest behavior depending on a certain relation (19) between the parameters of the game. If social norms or 'epidemic' myopic behavior are allowed in the model, which is quite natural for a realistic process, the situation becomes much more complicated. In particular, in a certain range of parameters, one has two stable equilibria, one corresponding to an optimally honest and another to an optimally corrupted behavior. This means in particular that similar strategies of a principal (defined by the choice of parameters b, f, w H ) can lead to quite different outcomes depending on the initial distributions of honest-corrupted agents or even on the small random fluctuations in the process of evolution.
The coefficients b and f enter exogenously in our system and can be used as tools for shifting the (precalculated) stable equilibria in the desired direction. These coefficients are not chosen strategically, which is an appropriate assumption for situations when the principal may have only poor information about the overall distribution of states of the agents. It is of course natural to extend the model by treating the principal as a strategic optimizer who chooses b (or even can choose f ) in each state to optimize certain payoff. This would place the model in the group of MFG models with a major player, which is actively studied in the current literature.
Classifying agents as corrupted or honest only is a strong simplification of reality. In the spirit of [41] and [29], it is natural to consider a hierarchy i = 1, · · · , n of possible positions of agents on a bureaucratic staircase, with both the basic wages w^i_H and the illegal payoffs w^i_C in the corresponding states H_i and C_i increasing with i. Once the corrupt behavior of an agent in state C_i is detected, she is supposed to be downgraded to the reserved state R = H_0, while upgrading from i to i + 1 can be modeled as a random event with a given rate. Such a multi-layer model of corruption could bring insights into the spread of corruption among the representatives of different levels of power.
Theoretically, the main questions left open by our analysis are the precise link between the stationary and dynamic MFG solutions and the precise statement of the law of large numbers. Namely: (i) Can we solve the dynamic MFG consistency problem (6), and will its solutions approach the solutions of the stationary problems described by our theorems? (ii) Considering a stochastic game of N players in the Markov model, where each player evolves according to (4) with chosen controls u_C, u_H and the distribution x_t reflects the aggregate distribution so obtained, do our stationary MFG solutions represent approximate Nash equilibria for this game? This latter question is an MFG version of the well-known problem of evolutionary game theory concerning the correspondence between the results of taking the limits N → ∞ and t → ∞ in different orders, where rather deep results have been obtained, see e.g. [10] and references therein.

Proof of Theorem 3.1
Clearly solutions to (7) are defined up to an additive constant. Thus we can and will assume that g(R) = 0. Moreover, we can reduce the analysis to the case w_R = 0 by subtracting w_R from all equations of (7), thus shifting the values w_H, w_C, µ by w_R. Under these simplifications, the first equation of (7) is µ = rg(H), so that (7) becomes the system (21) for the pair (g(H), g(C)), with µ = rg(H).
Assuming g(C) ≥ g(H), that is u_C = 0, u_H = 1, so that corrupt behavior is optimal, system (21) turns into a system of two linear equations for (g(H), g(C)). Solving this system, we find that the consistency requirement g(C) ≥ g(H) is equivalent to condition (23), which (after restoring w_R, that is, shifting w_C and w_H by w_R) states that x_H does not exceed the threshold x̄ from (11). Since x_H ∈ (0, 1), this is automatically satisfied if x̄ > 1, that is, under the assumption of (i). On the other hand, it definitely cannot hold if x̄ < 0.
Assuming g(C) ≤ g(H), that is u_C = 1, u_H = 0, so that honest behavior is optimal, system (21) turns into another pair of linear equations. Solving this pair, we find that g(C) ≤ g(H) is equivalent to the inverse of condition (23).
If g(C) ≥ g(H), that is u_C = 0, u_H = 1, the fixed point equation (9) becomes
$$
x_C (b + q_{soc} x_H) = r x_R, \qquad r x_R = x_H (\lambda + q_{inf} x_C), \qquad x_H (\lambda + q_{inf} x_C) = x_C (b + q_{soc} x_H).
$$
Since x_R = 1 − x_H − x_C, the third equation is a consequence of the first two, which yields the system
$$
x_C (b + q_{soc} x_H) = r (1 - x_H - x_C), \qquad x_H (\lambda + q_{inf} x_C) = x_C (b + q_{soc} x_H). \qquad (27)
$$
From the first equation we have
$$
x_C = \frac{r (1 - x_H)}{r + b + q_{soc} x_H}. \qquad (28)
$$
From this it is seen that if x_H ∈ (0, 1) (as it should be), then also x_C ∈ (0, 1) and x_R ∈ (0, 1).
Plugging x C in the second equation of (27) we find for x H the quadratic equation Q(x H ) = 0 with Q given by (13).
Since Q(0) < 0 and Q(1) > 0, the equation Q(x_H) = 0 has exactly one root x*_H on the interval (0, 1). Hence x*_H satisfies (23) if and only if either x̄ > 1 (that is, we are under the assumption of (i)) or condition (16) holds, proving the last statement of (ii).
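This root can also be located numerically. The sketch below eliminates x_C via the stationary balance equations implied by the transition rates (detection flow = recruitment flow = corruption flow) and finds the root of the remaining scalar equation by bisection; all parameter values are illustrative assumptions.

```python
# Sketch: compute the corrupt-regime equilibrium (u_C = 0, u_H = 1) from
# the stationary balance equations implied by the transition rates:
# recruitment r*x_R equals the corruption flow x_H*(lam + q_inf*x_C),
# which equals the detection flow x_C*(b + q_soc*x_H), with
# x_R = 1 - x_H - x_C. Parameter values are illustrative.

lam, r, q_inf, q_soc, b = 1.0, 0.5, 0.8, 0.8, 0.3

def x_C_of(x_H):
    # From r*(1 - x_H - x_C) = x_C*(b + q_soc*x_H).
    return r * (1.0 - x_H) / (r + b + q_soc * x_H)

def balance(x_H):
    # Remaining equation: corruption flow minus detection flow.
    x_C = x_C_of(x_H)
    return x_H * (lam + q_inf * x_C) - x_C * (b + q_soc * x_H)

# balance(0) < 0 and balance(1) > 0, so bisection finds the unique root.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if balance(mid) < 0:
        lo = mid
    else:
        hi = mid
x_H_star = 0.5 * (lo + hi)
x_C_star = x_C_of(x_H_star)
x_R_star = 1.0 - x_H_star - x_C_star
```

The sign change at the endpoints reflects exactly the properties Q(0) < 0 and Q(1) > 0 used in the proof, so the bisection is guaranteed to converge to the unique interior root.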
If g(C) ≤ g(H), that is u_C = 1, u_H = 0, the fixed point equation (9) becomes
$$
x_C (b + q_{soc} x_H) = r x_R, \qquad r x_R + \lambda x_C = q_{inf} x_C x_H, \qquad q_{inf} x_C x_H = x_C (\lambda + b + q_{soc} x_H).
$$
Again x_R = 1 − x_H − x_C here, and the third equation is a consequence of the first two, which yields the system
$$
x_C (b + q_{soc} x_H) = r (1 - x_H - x_C), \qquad q_{inf} x_C x_H = x_C (\lambda + b + q_{soc} x_H).
$$
From the first equation we again get (28). Plugging this x_C into the second equation, we find the equation
$$
\frac{r (1 - x_H)}{r + b + q_{soc} x_H} \bigl[(q_{inf} - q_{soc}) x_H - (\lambda + b)\bigr] = 0
$$
with the two explicit solutions x_H = 1 and x_H = x**_H = (λ + b)/(q_inf − q_soc), yielding the first and the second statements of (ii).
Proof of Theorem 3.2
(i) When the individually optimal behavior is to be corrupt, that is u_C = 0, u_H = 1, system (1) written in terms of (x_H, x_C) (with x_R = 1 − x_H − x_C) becomes
$$
\dot x_H = r (1 - x_H - x_C) - \lambda x_H - q_{inf} x_C x_H, \qquad \dot x_C = \lambda x_H + q_{inf} x_C x_H - x_C (b + q_{soc} x_H).
$$
Written in terms of y = x_H − x*_H, z = x_C − x*_C, it takes the form
$$
\dot y = -y (r + \lambda + q_{inf} x^*_C) - z (r + q_{inf} x^*_H) - q_{inf} y z,
$$
$$
\dot z = y [\lambda + (q_{inf} - q_{soc}) x^*_C] + z [(q_{inf} - q_{soc}) x^*_H - b] + (q_{inf} - q_{soc}) y z.
$$
The condition of stability is the requirement that both eigenvalues of the linear approximation around the fixed point have negative real parts, or equivalently, that the trace of the linear approximation is negative and its determinant is positive; this yields the sufficient condition (18). When the individually optimal behavior is honesty, that is u_C = 1, u_H = 0, system (1) in terms of (x_H, x_C) becomes
$$
\dot x_H = r (1 - x_H - x_C) + \lambda x_C - q_{inf} x_C x_H, \qquad \dot x_C = x_C [(q_{inf} - q_{soc}) x_H - \lambda - b]. \qquad (35)
$$
To analyze the stability of the fixed point x_H = 1, x_C = 0, we write it in terms of x_C and y = 1 − x_H:
$$
\dot y = -r y + x_C (r - \lambda + q_{inf}) - q_{inf} y x_C, \qquad \dot x_C = x_C [q_{inf} - q_{soc} - \lambda - b - (q_{inf} - q_{soc}) y].
$$
According to the linear approximation, the fixed point y = 0, x_C = 0 of this system is stable if q_inf − q_soc − λ − b < 0, proving the first statement in (ii).
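The trace/determinant test is easy to verify numerically. The sketch below locates the corrupt-regime fixed point by integrating the reduced two-dimensional system and then evaluates the linearization written out above; parameter values are illustrative assumptions.

```python
# Numeric check of the stability criterion for the 'corrupt' equilibrium:
# trace of the linearization negative, determinant positive. The fixed
# point is located by integrating the reduced system with u_C = 0,
# u_H = 1 and x_R = 1 - x_H - x_C. Parameter values are illustrative.

import cmath

lam, r, q_inf, q_soc, b = 1.0, 0.5, 0.8, 0.8, 0.3

# Locate the interior fixed point (x_H*, x_C*) by crude Euler integration.
x_H, x_C = 1.0, 0.0
dt = 1e-3
for _ in range(100_000):
    detect = x_C * (b + q_soc * x_H)
    infect = q_inf * x_C * x_H
    dx_H = r * (1 - x_H - x_C) - lam * x_H - infect
    dx_C = lam * x_H + infect - detect
    x_H, x_C = x_H + dt * dx_H, x_C + dt * dx_C

# Linearization around (x_H*, x_C*), matching the system in the proof:
#   dy/dt = -y (r + lam + q_inf x_C*) - z (r + q_inf x_H*) + h.o.t.
#   dz/dt =  y (lam + (q_inf - q_soc) x_C*) + z ((q_inf - q_soc) x_H* - b) + h.o.t.
a11 = -(r + lam + q_inf * x_C)
a12 = -(r + q_inf * x_H)
a21 = lam + (q_inf - q_soc) * x_C
a22 = (q_inf - q_soc) * x_H - b

trace, det = a11 + a22, a11 * a22 - a12 * a21
disc = cmath.sqrt(trace * trace - 4 * det)
eigs = ((trace + disc) / 2, (trace - disc) / 2)
stable = trace < 0 and det > 0   # equivalently: both eigenvalues have Re < 0
```

For these parameter values the trace is negative and the determinant positive, so both eigenvalues have negative real parts and the corrupt equilibrium is stable, in line with the discussion of condition (18).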
Assume (14) holds. To analyze the stability of the fixed point x**_H, we write system (35) in terms of the variables y = x_H − x**_H, z = x_C − x**_C. The characteristic equation of the matrix of the linear approximation is seen to be
$$
\xi^2 + \xi\,(r + q_{inf} x^{**}_C) + (q_{inf} - q_{soc})\, x^{**}_C\, (r - \lambda + q_{inf} x^{**}_H) = 0.
$$
Under (14) both the free term and the coefficient at ξ are positive. Hence both roots have negative real parts, implying stability.