1 Introduction

Toy models for the mean-field games of corruption and botnet defense in cyber-security were developed in Kolokoltsov and Malafeyev (2015) and Kolokoltsov and Bensoussan (2015). These were games with three and four states of the agents respectively. Here we develop a more general mean-field-game model with 2d states, \(d\in \mathbf {N}\), that extends the models of Kolokoltsov and Malafeyev (2015) and Kolokoltsov and Bensoussan (2015). In order to tackle the new technical difficulties arising from a larger state space, we introduce new asymptotic regimes: small discount and small interaction asymptotics. Hence the properties that we obtain for the new model do not cover the more precise results of Kolokoltsov and Malafeyev (2015) and Kolokoltsov and Bensoussan (2015) (with the full classification of the bifurcation points), but they capture the main qualitative and quantitative features of those models and provide regular solutions away from the points of bifurcation. Apart from the new modeling, this paper contributes to one of the key questions in the modern study of mean-field games, namely, the precise link between stationary and time-dependent solutions. This problem is resolved here for a concrete model, but the method can certainly be used in more general situations.

On the one hand, our model is a realization of the general pressure-and-resistance-game framework of Kolokoltsov (2014) and of the nonlinear Markov battles of Kolokoltsov (2012); on the other hand, it represents a simple example of mean-field- and evolutionary-game modeling of networks. Initiating the development of the latter, we stress already here that two-dimensional arrays of states arise naturally in many situations, one of the dimensions being controlled mostly by the decisions of the agents (say, the level of tax evasion in the context of inspection games) and the other one by a principal (major player) or by evolutionary interactions (say, the level of agents on the bureaucratic staircase, the type of computer virus used by a botnet herder, etc.).

We shall dwell upon two basic interpretations of our model: corrupted bureaucrats playing against the principal (say, a governmental representative, also referred to in the literature as a benevolent dictator), or computer owners playing against a botnet herder (who then takes the role of the principal and tries to infect the computers with viruses). Other interpretations are possible, for instance, in the framework of inspection games (inspector and taxpayers), of disease spreading in epidemiology (among animals or humans), or of the defense against a biological weapon. Here we shall keep the principal in the background, concentrating on the behavior of the small players (corrupted bureaucrats or computer owners), whom we shall refer to as agents or players.

The paper is organized as follows. In the next section we introduce our model, specifying in this context the basic notions of the mean-field-game (MFG) consistency problem in its dynamic and stationary versions. We also introduce our basic asymptotic regimes of fast execution of personal decisions, small discounting and small interactions.

In Sect. 3 we calculate explicitly all non-degenerate solutions of the stationary MFG problem in our chosen asymptotic regimes, showing also that all these solutions are stable points of the corresponding dynamics. The main technical tool here is the stability analysis of dynamical systems around an equilibrium.

Section 4 contains our main result, which shows how, from a stationary solution, one can construct a class of full time-dependent solutions of the forward–backward system of MFG equations satisfying the so-called turnpike property around the stationary one; that is, solutions with a large horizon spend most of the time near a stationary solution, apart from short periods near the initial and terminal times. According to Basna et al. (2014), solutions to MFG equations represent symmetric \(\epsilon \)-Nash equilibria for the corresponding N-player game with a finite state space.

We complete this introductory section with short bibliographical notes on closely related papers.

Analysis of the spread of corruption in bureaucracy is a well-recognized area of application of game theory, which has attracted the attention of many researchers. General surveys can be found in Aidt (2009), Jain (2001), Levin and Tsirik (1998). More recent literature is reviewed in Kolokoltsov and Malafeyev (2015) and Katsikas et al. (2016); see also Malafeyev et al. (2014) and Alferov et al. (2015) for electric and engineering interpretations of corruption games.

The use of game theory in modeling attacker–defender interactions has been extensively adopted in the computer-security domain recently; see Bensoussan et al. (2010), Li et al. (2009) and Lye and Wing (2005) and the bibliographies therein for more details.

Mean-field games are a rapidly developing area of game theory. Their study was initiated by Lasry and Lions (2006) and Huang et al. (2006) and has been developing quickly since then; see Bardi et al. (2013), Bensoussan et al. (2013), Gomes and Saude (2014), Caines (2014) for recent surveys, and Cardaliaguet et al. (2013), Carmona and Delarue (2013), Gomes and Saude (2014) specifically for long-time behavior, probabilistic interpretation and finite-state games.

2 The model

We assume that any agent in the group of N agents has 2d states: iI and iS, where \(i\in \{1, \ldots , d\}\) is referred to as a strategy. In the first interpretation the letters S and I designate the senior or initial position of a bureaucrat on the hierarchical staircase, and i designates the level or type of corruptive behavior (say, the level of bribes one asks from customers or, more generally, the level of illegal profit he/she aims at). In the literature on corruption the state I is often denoted by R and is referred to as the reserved state. It is interpreted as a job of the lowest salary given to untrustworthy bureaucrats. In the second interpretation the letters S and I designate the susceptible and infected states of computers, and i denotes the level or type of defense system available on the market.

We assume that the choice of a strategy depends exclusively on the decision of the agent. The control parameter u of each player may take d values, denoting the strategy the agent prefers at the given moment. As long as this preferred strategy coincides with the current one, no updating occurs. Once the decision to change i to j is made, the actual updating is supposed to occur at a certain rate \(\lambda \). Following Kolokoltsov and Bensoussan (2015), we shall be mostly interested in the asymptotic regime of fast execution of individual decisions, that is, \(\lambda \rightarrow \infty \).

The change between S and I may have two causes: the action of the principal (pressure-game component) and that of the peers (evolutionary component). In the first interpretation the principal can promote the bureaucrats from the initial to the senior position or degrade them to the reserved initial position whenever their illegal behavior is discovered. The peers can also take part in this process, contributing to the degrading of corrupted bureaucrats, for instance, when they trespass certain social norms. In the second interpretation the principal, the botnet herder, infects computers with the virus by direct attacks, turning S to I, and the virus then spreads through the network of computers by pairwise interaction. The recovery change from I to S is due to some system of repairs, which can differ across the protection levels i.

Let \(q^i_+\) denote the recovery rates of upgrading from iI to iS and \(q^i_-\) the rates of degrading (punishment or infection) from iS to iI, which are independent of the states of the other agents (pressure component), and let \(\beta _{ij}/N\) denote the rate at which an agent in state iI stimulates the degrading (punishment or infection) of another agent from jS to jI (evolutionary component). For simplicity we ignore here the possibility of upgrading changes from jI to jS due to the interaction with peers.

In the detailed description of our model its states are N-tuples \(\{\omega _1, \ldots , \omega _N\}\), with each \(\omega _l\) being one of iI or iS, describing the positions of all N players of the game. If each player \(l\in \{1, \ldots , N\}\) has a strategy \(u_t\), the system evolves according to the continuous-time Markov chain with transitions occurring at the rates specified above. When the fees \(w^i_I\) and \(w_S^i\) for staying in the corresponding states per unit of time are specified, together with the terminal payments \(g_T(iI),g_T(iS)\) at each state at the terminal time T, we are in the setting of a stochastic dynamic game of N players. In the time-dependent mean-field-game approach one is interested in the approximate (for large N) symmetric Nash equilibria of this game. Alternatively, in the stationary mean-field-game approach one looks for approximate stationary symmetric Nash equilibria (with time-independent controls) in an infinite-horizon version of the game, where the cost can be taken either as an average per unit of time or, if discounting is present, as the total cost of the whole infinite-time game.
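The transition structure just described can be sketched as a Gillespie-style simulation of the N-player chain under a common stationary strategy. This is an illustrative sketch only: the dimension d = 2, the rates and the strategy below are made-up assumptions, not parameters from the paper.

```python
import random

def simulate_chain(N=200, T=10.0, d=2,
                   lam=5.0, q_plus=(1.0, 1.5), q_minus=(0.5, 0.7),
                   beta=((0.2, 0.1), (0.1, 0.3)),
                   u_I=0, u_S=0, seed=1):
    """Gillespie simulation of the N-player chain under the common
    stationary strategy u(jI) = u_I, u(jS) = u_S (illustrative rates)."""
    rng = random.Random(seed)
    # n[i][0] = number of agents in state iI, n[i][1] = number in iS
    n = [[0, 0] for _ in range(d)]
    n[0][1] = N  # start with everybody susceptible, strategy 0
    t = 0.0
    while t < T:
        events = []  # pairs (rate, (source i, source state, target i, target state))
        for j in range(d):
            nI, nS = n[j]
            if j != u_I and nI > 0:   # execution of decision: jI -> (u_I)I
                events.append((lam * nI, (j, 0, u_I, 0)))
            if j != u_S and nS > 0:   # execution of decision: jS -> (u_S)S
                events.append((lam * nS, (j, 1, u_S, 1)))
            if nI > 0:                # recovery jI -> jS at rate q_+^j per agent
                events.append((q_plus[j] * nI, (j, 0, j, 1)))
            if nS > 0:                # pressure jS -> jI at rate q_-^j per agent
                events.append((q_minus[j] * nS, (j, 1, j, 0)))
            for i in range(d):        # peer infection jS -> jI at rate beta_ij/N per pair
                rate = beta[i][j] * n[i][0] * nS / N
                if rate > 0:
                    events.append((rate, (j, 1, j, 0)))
        total = sum(r for r, _ in events)
        if total == 0:
            break
        t += rng.expovariate(total)   # exponential waiting time
        r = rng.uniform(0, total)     # choose an event proportionally to its rate
        acc = 0.0
        for rate, (j, sj, k, sk) in events:
            acc += rate
            if r <= acc:
                n[j][sj] -= 1
                n[k][sk] += 1
                break
    return n
```

Each event moves exactly one agent, so the total number of agents is conserved along the whole trajectory.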

In a symmetric Nash equilibrium, all players are supposed to play the same strategy. In this case players become indistinguishable, leading to a reduced description of the game, where the state-space is the set \(\mathbf {Z}_+^{2d}\) of vectors

$$\begin{aligned} n=(n_{\{iI\}},n_{\{iS\}})=(n_{1I}, \ldots , n_{dI},n_{1S}, \ldots , n_{dS}) \end{aligned}$$

with coordinates presenting the number of agents in the corresponding states. Alternatively, in the normalized version, the state-space becomes the subset of the standard simplex \(\Sigma ^N_{2d}\) in \(\mathbf {R}^{2d}\) consisting of vectors

$$\begin{aligned} x=(x_I,x_S)=(x_{1I}, \ldots , x_{dI}, x_{1S}, \ldots , x_{dS})=n/N, \end{aligned}$$

with \(N=n_{1S}+ n_{1I}+ \cdots + n_{dS}+ n_{dI}\) the total number of agents. The functions f on \(\mathbf {Z}_+^{2d}\) and F on \(\Sigma ^N_{2d}\) are supposed to be linked by the scaling transformation: \(f(n)=F(n/N)\).

Assuming that all players have the same strategy \(u^{com}_t=\{u^{com}_t(iS), u^{com}_t(iI)\}\), the Markov chain introduced above reduces to a time-nonhomogeneous Markov chain on \(\mathbf {Z}_+^{2d}\) or \(\Sigma ^N_{2d}\), which can be described by its (time-dependent) generator. Omitting the unchanged values in the arguments of f and F on the r.h.s., this generator can be written as

$$\begin{aligned} L_N^tf(n)= & {} \lambda \sum _{j,i} n_{jI}\mathbf {1}(u_t^{com}(jI)=i)[f(n_{jI}-1, n_{iI}+1)-f(n)]\\&+\,\lambda \sum _{j,i} n_{jS}\mathbf {1}(u_t^{com}(jS)=i)[f(n_{jS}-1, n_{iS}+1)-f(n)] \\&+\sum _j n_{jI}q_j^+ [f(n_{jI}-1, n_{jS}+1)-f(n)]\\&+\sum _j n_{jS}q_j^- [f(n_{jS}-1, n_{jI}+1)-f(n)] \\&+\,\frac{1}{N}\sum _{j,i} n_{iI}n_{jS}\beta _{ij} [f(n_{jS}-1, n_{jI}+1)-f(n)], \end{aligned}$$

for the Markov chain on \(\mathbf {Z}_+^{2d}\), and as

$$\begin{aligned} L_N^tF(x)= & {} \lambda N\sum _{j,i} x_{jI}\mathbf {1}(u_t^{com}(jI)=i)[F(x-e_{jI}/N+e_{iI}/N)-F(x)] \\&+\,\lambda N\sum _{j,i} x_{jS}\mathbf {1}(u_t^{com}(jS)=i)[F(x-e_{jS}/N+e_{iS}/N)-F(x)] \\&+\,N\sum _j x_{jI}q_j^+ [F(x-e_{jI}/N+e_{jS}/N)-F(x)]\\&+\,N\sum _j x_{jS}q_j^- [F(x-e_{jS}/N+e_{jI}/N)-F(x)] \\&+\,N\sum _{j,i} x_{iI}x_{jS}\beta _{ij} [F(x-e_{jS}/N+e_{jI}/N)-F(x)], \end{aligned}$$

for the Markov chain on \(\Sigma ^N_{2d}\). Here and below \(\mathbf {1}(M)\) denotes the indicator function of a set M and \(\{e_{iI},e_{iS}\}\) is the standard orthonormal basis in \(\mathbf {R}^{2d}\).

Remark 1

These generators can be considered as an alternative, analytic, definition of our Markov evolution (depending on controls \(u_t\)) described above probabilistically in terms of the transition rates.

We have written the generator of the Markov chain arising from common (symmetric) controls of the players, and it is already complicated enough. Of course, one could also write down the generator of the Markov chain arising from different individual strategies, which would look much more awkward. Any attempt to solve the game (say, to find Nash equilibria) working with a concrete large N necessarily leads to tremendous problems (even when numerical solutions are sought), commonly referred to as the 'curse of dimensionality'. The basic idea of the mean-field-game approach (or of alternative approaches based on the law of large numbers) is to turn the curse of dimensionality into the 'blessing of dimensionality' by passing to the limit \(N\rightarrow \infty \).

With this idea in mind, assuming F is differentiable and expanding it in a Taylor series, one finds that, as \(N\rightarrow \infty \), these generators tend to

$$\begin{aligned} L^tF(x)= & {} \lambda \sum _{j,i} x_{jI}\mathbf {1}(u_t^{com}(jI)=i)\left[ \frac{\partial F}{\partial x_{iI}}-\frac{\partial F}{\partial x_{jI}}\right] \nonumber \\&+\,\lambda \sum _{j,i} x_{jS}\mathbf {1}(u_t^{com}(jS)=i)\left[ \frac{\partial F}{\partial x_{iS}}-\frac{\partial F}{\partial x_{jS}}\right] \\&+\sum _j x_{jI}q_j^+ \left[ \frac{\partial F}{\partial x_{jS}}-\frac{\partial F}{\partial x_{jI}}\right] +\sum _j x_{jS}q_j^- \left[ \frac{\partial F}{\partial x_{jI}}-\frac{\partial F}{\partial x_{jS}}\right] \\&+\sum _{j,i} x_{iI}x_{jS}\beta _{ij} \left[ \frac{\partial F}{\partial x_{jI}}-\frac{\partial F}{\partial x_{jS}}\right] . \end{aligned}$$

This operator \(L^t\) is a first-order partial differential operator and therefore generates a deterministic Markov process, whose dynamics is given by the system of characteristics arising from \(L^t\):

$$\begin{aligned} \dot{x}_{iI}= & {} \lambda \sum _{j\ne i} x_{jI} \mathbf {1}(u^{com}(jI)=i)-\lambda \sum _{j\ne i} x_{iI} \mathbf {1}(u^{com}(iI)=j)\nonumber \\&+\, x_{iS} q_-^i -x_{iI} q_+^i +\sum _j x_{iS}x_{jI} \beta _{ji}, \nonumber \\ \dot{x}_{iS}= & {} \lambda \sum _{j\ne i} x_{jS} \mathbf {1}(u^{com}(jS)=i)- \lambda \sum _{j\ne i} x_{iS} \mathbf {1}(u^{com}(iS)=j)\nonumber \\&-\, x_{iS} q_-^i +x_{iI} q_+^i -\sum _j x_{iS}x_{jI} \beta _{ji}, \end{aligned}$$
(1)

for all \(i=1, \ldots ,d\) (of course all \(x_{iS},x_{iI}\) are functions of time t).
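The kinetic equations (1) are straightforward to integrate numerically. The sketch below implements the right-hand side of (1) for a common stationary control together with a forward Euler step; all parameter values in the usage example are illustrative assumptions, not data from the paper.

```python
# A minimal Euler integration of the kinetic equations (1) under a common
# stationary control u(jI) = u_I, u(jS) = u_S.

def mfg_rhs(x_I, x_S, u_I, u_S, lam, q_plus, q_minus, beta):
    """Right-hand side of (1); beta[j][i] plays the role of beta_ji."""
    d = len(x_I)
    dx_I, dx_S = [0.0] * d, [0.0] * d
    for i in range(d):
        # pressure, recovery and peer-infection terms of (1)
        flow = (x_S[i] * q_minus[i] - x_I[i] * q_plus[i]
                + sum(x_S[i] * x_I[j] * beta[j][i] for j in range(d)))
        dx_I[i] += flow
        dx_S[i] -= flow
        # execution of individual decisions at rate lambda
        if i == u_I:
            dx_I[i] += lam * sum(x_I[j] for j in range(d) if j != i)
        else:
            dx_I[i] -= lam * x_I[i]
        if i == u_S:
            dx_S[i] += lam * sum(x_S[j] for j in range(d) if j != i)
        else:
            dx_S[i] -= lam * x_S[i]
    return dx_I, dx_S

def euler(x_I, x_S, T, dt, *rhs_args):
    """Integrate (1) on [0, T] with step dt."""
    for _ in range(int(T / dt)):
        dI, dS = mfg_rhs(x_I, x_S, *rhs_args)
        x_I = [a + dt * b for a, b in zip(x_I, dI)]
        x_S = [a + dt * b for a, b in zip(x_S, dS)]
    return x_I, x_S
```

Since the right-hand sides of (1) sum to zero over all coordinates, the scheme preserves the total mass \(\sum _i (x_{iI}+x_{iS})=1\) up to floating-point error.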

Remark 2

We have sketched the derivation of system (1) as the dynamic law of large numbers for the Markov chain specified by the generator \(L_N^t\) of our initial model. The details of the rigorous derivation (showing that the Markov chains themselves converge to the deterministic limit given by Eq. (1), and not just the formal expressions for their generators) can be found e.g. in Kolokoltsov (2012) or (2014). Since system (1) clearly reflects the intuitive meaning of the rates of change involved in the process, in the applied literature one usually writes down systems of this kind directly, without explicit reference to the corresponding Markov chains.

As already mentioned, the optimal behavior of agents depends on the payoffs in the different states, the terminal payoff and possibly the costs of transitions. For simplicity we shall ignore the latter here. When talking about corrupted agents it is natural to speak of maximizing profit, while for infected computers it is natural to speak of minimizing costs. To unify the exposition we shall deal with the minimization of costs, which is equivalent to the maximization of their opposite values.

Recall that \(w_I^i\) and \(w_S^i\) denote the costs per time-unit of staying in iI and iS respectively. According to our interpretation of S as a better state, \(w^i_S<w^i_I\) for all i.

Given the evolution of the states \(x=x(s)\) of the whole system on a time interval \([t,T]\), the individually optimal costs g(iI) and g(iS) and the individually optimal controls \(u^{ind}_s(iI)\) and \(u^{ind}_s(iS)\) of an arbitrary agent can be found from the HJB equation

$$\begin{aligned}&{\dot{g}}_t(iI)+\lambda \min _u \sum _{j=1}^d \mathbf {1}(u(iI)=j)(g_t(jI)-g_t(iI))+q^i_+(g_t(iS)-g_t(iI)) \nonumber \\&\quad +\,w_I^i=0, \nonumber \\&{\dot{g}}_t(iS)+\lambda \min _u \sum _{j=1}^d \mathbf {1}(u(iS)=j)(g_t(jS)-g_t(iS)) +q^i_- (g_t(iI)-g_t(iS)) \nonumber \\&\quad + \sum _{j=1}^d \beta _{ji} x_{jI}(t)(g_t(iI)-g_t(iS))+w_S^i=0, \end{aligned}$$
(2)

which holds for all i and is complemented by the terminal conditions \(g_T(iI),g_T(iS)\).

Remark 3

Equation (2) is derived in a standard manner by the following argument. Assuming g is differentiable in t and \(\tau \) is small, one represents the value of the optimal payoff \(g_t(iI)\) in the state iI using the optimality principle as

$$\begin{aligned} g_t(iI)= & {} w^i_I\tau +\min _u\Bigg [q^i_+\tau g_{t+\tau }(iS) +\tau \lambda \sum _j \mathbf {1}(u(iI)=j)g_{t+\tau }(jI)\\&+\,\Bigg (1-q^i_+\tau -\lambda \tau \sum _j\mathbf {1}(u(iI)=j)\Bigg )g_{t+\tau }(iI)\Bigg ]. \end{aligned}$$

Expanding the last \(g_{t+\tau }\) in a Taylor series and cancelling first \(g_t(iI)\) and then \(\tau \) yields

$$\begin{aligned}&{\dot{g}}_t(iI) +w^i_I+q^i_+ (g_{t+\tau }(iS)-g_t(iI)) + \lambda \min _u\\&\quad \times \sum _j \mathbf {1}(u(iI)=j)(g_{t+\tau }(jI) -g_t(iI))=o(1), \end{aligned}$$

with \(o(1)\rightarrow 0\), as \(\tau \rightarrow 0\). Passing to the limit \(\tau \rightarrow 0\) in this equation yields the first equation of (2). The second one is obtained analogously.

The basic MFG consistency equation for a time interval \([t,T]\) can now be written as \(u_s^{com}=u_s^{ind}\).

Remark 4

The reasonableness of this condition in the setting of a large number of players is more or less obvious. In fact, in many situations it has been proved rigorously that its solutions represent \(\epsilon \)-Nash equilibria for the corresponding Markov model of N players, with \(\epsilon \rightarrow 0\) as \(N\rightarrow \infty \); see e.g. Basna et al. (2014) for the finite-state models considered here.

In this paper we shall mostly work with the discounted payoff with discounting coefficient \(\delta >0\), in which case the HJB equation for the discounted optimal payoff \(e^{-t\delta }g_t\) of an individual player with any time horizon T can be written (substituting \(e^{-t\delta }g_t\) for \(g_t\) in (2)) as

$$\begin{aligned}&{\dot{g}}_t(iI)+\lambda \min _u \sum _{j=1}^d \mathbf {1}(u(iI)=j)(g_t(jI)-g_t(iI)) +q^i_+(g_t(iS)-g_t(iI))\nonumber \\&\quad +\,w_I^i=\delta g_t(iI), \nonumber \\&{\dot{g}}_t(iS)+\lambda \min _u \sum _{j=1}^d \mathbf {1}(u(iS)=j)(g_t(jS)-g_t(iS)) +q^i_- (g_t(iI)-g_t(iS)) \nonumber \\&\quad + \sum _{j=1}^d \beta _{ji} x_{jI}(t)(g_t(iI)-g_t(iS))+w_S^i=\delta g_t(iS), \end{aligned}$$
(3)

which holds for all i and is complemented by certain terminal conditions \(g_T(iI),g_T(iS)\).

Notice that, since this is an equation in a Euclidean space with Lipschitz-continuous coefficients, it has a unique solution for \(s\le T\), for any given boundary condition g at time T and any bounded measurable functions \(x_{iI}(s)\).

For the discounted payoff the basic MFG consistency equation \(u_s^{com}=u_s^{ind}\) for a time interval \([t,T]\) can be reformulated by saying that x, u, g solve the coupled forward–backward system (1), (3), so that the controls \(u_s^{com}\) used in (1) coincide with the minimizers in (3). The main objective of the paper is to provide a general class of solutions of the discounted MFG consistency equation with stationary (time-independent) controls \(u^{com}\).
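One naive way to look for solutions of the coupled forward–backward system (1), (3) numerically is to iterate: integrate (1) forward under the current control, integrate (3) backward recording the minimizers, and repeat until the control stabilizes. The sketch below does this for d = 2 under assumed parameters (made-up rates and fees, zero terminal condition); it is not the construction used in the paper, and such a naive iteration need not converge in general.

```python
# Naive fixed-point iteration over the control for the forward-backward
# MFG system (1), (3); all numerical values are illustrative assumptions.
import numpy as np

d, lam, delta, T, K = 2, 5.0, 0.3, 5.0, 400
dt = T / K
q_plus, q_minus = np.array([1.0, 1.5]), np.array([0.5, 0.7])
beta = np.array([[0.2, 0.1], [0.1, 0.3]])    # beta[j][i] plays the role of beta_ji
w_I, w_S = np.array([2.0, 2.5]), np.array([1.0, 1.2])

def forward(x0_I, x0_S, uI, uS):
    """Euler scheme for (1) under piecewise-constant common controls."""
    xI, xS = [x0_I], [x0_S]
    for k in range(K):
        a, b = xI[-1], xS[-1]
        flow = b * q_minus - a * q_plus + b * (beta.T @ a)
        dI, dS = flow - lam * a, -flow - lam * b
        dI[uI[k]] += lam * a.sum()           # inflow to the chosen I-strategy
        dS[uS[k]] += lam * b.sum()           # inflow to the chosen S-strategy
        xI.append(a + dt * dI)
        xS.append(b + dt * dS)
    return np.array(xI), np.array(xS)

def backward(xI):
    """Euler scheme for (3) backwards in time with zero terminal condition;
    records the minimizing strategies at each time step."""
    gI, gS = np.zeros(d), np.zeros(d)
    uI, uS = np.zeros(K, dtype=int), np.zeros(K, dtype=int)
    for k in reversed(range(K)):
        uI[k], uS[k] = gI.argmin(), gS.argmin()
        inf = beta.T @ xI[k]                 # sum_j beta_ji x_jI
        dgI = lam * (gI.min() - gI) + q_plus * (gS - gI) + w_I - delta * gI
        dgS = lam * (gS.min() - gS) + (q_minus + inf) * (gI - gS) + w_S - delta * gS
        gI, gS = gI + dt * dgI, gS + dt * dgS
    return uI, uS

uI, uS = np.zeros(K, dtype=int), np.zeros(K, dtype=int)
x0_I, x0_S = np.array([0.3, 0.2]), np.array([0.3, 0.2])
for _ in range(50):                          # consistency iteration u^com = u^ind
    xI, xS = forward(x0_I, x0_S, uI, uS)
    new_uI, new_uS = backward(xI)
    if np.array_equal(new_uI, uI) and np.array_equal(new_uS, uS):
        break
    uI, uS = new_uI, new_uS
```

The consistency requirement appears here as the stopping criterion: the controls fed into the forward equation must coincide with the minimizers recovered from the backward equation.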

As a first step to this objective we shall analyse the fully stationary solutions, when the evolution (1) is replaced by the corresponding fixed point condition:

$$\begin{aligned}&\lambda \sum _{j\ne i} x_{jI} \mathbf {1}(u^{com}(jI)=i)-\lambda \sum _{j\ne i} x_{iI} \mathbf {1}(u^{com}(iI)=j)\nonumber \\&\quad +\,x_{iS} q_-^i -x_{iI} q_+^i +\sum _j x_{iS}x_{jI} \beta _{ji}=0, \nonumber \\&\lambda \sum _{j\ne i} x_{jS} \mathbf {1}(u^{com}(jS)=i)-\lambda \sum _{j\ne i} x_{iS} \mathbf {1}(u^{com}(iS)=j)\nonumber \\&\quad -\,x_{iS} q_-^i +x_{iI} q_+^i -\sum _j x_{iS}x_{jI} \beta _{ji}=0. \end{aligned}$$
(4)

There are two standard stationary optimization problems naturally linked with the dynamic one: the search for the long-run average payoff, and the search for the discounted optimal payoff. The first is governed by solutions of the HJB equation of the form \((T-t)\mu +g\) (with g not depending on time), that is, linear in time t. Then \(\mu \) describes the optimal average payoff and g satisfies the stationary HJB equation:

$$\begin{aligned}&\lambda \min _u \sum _{j=1}^d \mathbf {1}(u(iI)=j)(g(jI)-g(iI)) +q^i_+(g(iS)-g(iI)) +w_I^i =\mu , \nonumber \\&\lambda \min _u \sum _{j=1}^d \mathbf {1}(u(iS)=j)(g(jS)-g(iS)) +q^i_- (g(iI)-g(iS)) \nonumber \\&\qquad \qquad \qquad \qquad \qquad + \sum _{j=1}^d \beta _{ji} x_{jI}(g(iI)-g(iS))+w_S^i =\mu . \end{aligned}$$
(5)

In the second problem, if the discounting coefficient is \(\delta \), the stationary discounted optimal payoff g satisfies the stationary version of (3):

$$\begin{aligned}&\lambda \min _u \sum _{j=1}^d \mathbf {1}(u(iI)=j)(g(jI)-g(iI)) +q^i_+(g(iS)-g(iI)) +w_I^i =\delta g(iI), \nonumber \\&\lambda \min _u \sum _{j=1}^d \mathbf {1}(u(iS)=j)(g(jS)-g(iS)) +q^i_- (g(iI)-g(iS)) \nonumber \\&\qquad \qquad \qquad \qquad \qquad + \sum _{j=1}^d \beta _{ji} x_{jI}(g(iI)-g(iS))+w_S^i =\delta g(iS). \end{aligned}$$
(6)

In Kolokoltsov and Malafeyev (2015) and Kolokoltsov and Bensoussan (2015) we concentrated on the first approach, and here we shall concentrate on the second one, with a discounted payoff. The stationary MFG consistency condition is the coupled system of Eqs. (4) and (6), so that the individually optimal stationary control \(u^{ind}\) found from (6) coincides with the common stationary control \(u^{com}\) from (4).

For simplicity we shall be interested in non-degenerate controls \(u^{ind}\) characterized by the condition that the minimum in (6) is always attained on a single value of u.

Remark 5

(i) The non-degeneracy assumption is common in MFG modeling, since the minimizing control provides the coupling with the forward equation on the states, where non-uniqueness creates the nontrivial question of choosing a representative. (ii) In our case non-degeneracy is a very natural 'general position' assumption. It is equivalent to the assumption that, for a solution g, the minimum of g(iI) is achieved at only one i and the minimum of g(jS) is achieved at only one j. Any 'strange' coincidence of two values of g can be removed by arbitrarily small changes in the parameters of the model. (iii) If non-degeneracy is relaxed, we get of course much more complicated solutions, with the support of the equilibrium distributed between the states providing the minima of g(iI) and g(iS).

A technical novelty compared with Kolokoltsov and Bensoussan (2015) and Kolokoltsov and Malafeyev (2015) is the systematic use of the asymptotic regimes of small discount \(\delta \) and small interaction coefficients \(\beta _{ij}\). This approach leads to more or less explicit calculations of stationary MFG solutions and to their further justification.

Remark 6

In Kolokoltsov and Bensoussan (2015) and Kolokoltsov and Malafeyev (2015) we managed to obtain explicit solutions to the models with three and four states relying less strongly on the asymptotic approach [only the 'large \(\lambda \)' regime was already introduced in Kolokoltsov and Bensoussan (2015)]. The solutions thus obtained were already complicated enough, but they told us what general features one can expect in more general situations. Even if some explicit formulas were possible in the present case, they would be extremely lengthy and would reveal no clear insights. Searching for an appropriate small parameter is the most fundamental approach in all natural sciences. In particular, a nonlinearity is usually analysed as a small perturbation of a linear problem. Along the same lines is our 'small \(\beta _{ij}\)' assumption. The large-\(\lambda \) limit is also very natural: why should one wait a long time to execute one's own decisions? Finally, our 'small \(\delta \)' asymptotics means in practical terms that the planning horizon is not very large, which is quite common in everyday reasoning.

3 Stationary MFG problem

We start by identifying all possible stationary non-degenerate controls that can occur as solutions of (6). Let [i(I), k(S)] denote the following strategy: switch to strategy i when in I and to k when in S, that is, \(u(jI)=i\) and \(u(jS)=k\) for all j.

Proposition 3.1

Non-degenerate controls solving (6) can only be of the type [i(I), k(S)].

Proof

Let i be the unique minimum point of \(g(\cdot I)\) and k the unique minimum point of \(g(\cdot S)\). Then the minimum in (6) is attained at \(u(jI)=i\) and \(u(jS)=k\) for all j, so the optimal strategy is [i(I), k(S)]. \(\square \)

Let us consider first the control [i(I), i(S)] denoting it by \(\hat{u}^i\):

$$\begin{aligned} \hat{u}^i(jS)=\hat{u}^i(jI)=i, \quad j=1, \ldots , d. \end{aligned}$$

We shall refer to the control \(\hat{u}^i\) as the one with the strategy i individually optimal.

The control \(\hat{u}^i\) and the corresponding distribution x solve the stationary MFG problem if they solve the corresponding HJB (6), that is

$$\begin{aligned} \left\{ \begin{aligned}&q^i_+(g(iS)-g(iI)) +w^i_I =\delta g(iI), \\&q^i_-(g(iI)-g(iS)) + \sum _{k}\beta _{ki} x_{kI}(g(iI)-g(iS))+w_S^i =\delta g(iS), \\&\lambda (g(iI)-g(jI)) +q^j_+(g(jS)-g(jI)) +w_I^j =\delta g(jI), \quad j\ne i, \\&\lambda (g(iS)-g(jS)) +q^j_- (g(jI)-g(jS)) \\&\quad + \sum _k\beta _{kj} x_{kI}(g(jI)-g(jS))+w_S^j =\delta g(jS), \quad j\ne i, \end{aligned} \right. \end{aligned}$$
(7)

where for all \(j\ne i\)

$$\begin{aligned} g(iI) \le g(jI), \quad g(iS) \le g(jS), \end{aligned}$$
(8)

and x is a fixed point of the evolution (4) with \(u^{com}=\hat{u}^i\), that is

$$\begin{aligned} \left\{ \begin{aligned}&x_{iS} q_-^i -x_{iI} q_+^i +\sum _j x_{iS}x_{jI} \beta _{ji} +\lambda \sum _{j\ne i} x_{jI}=0, \\&-x_{iS} q_-^i +x_{iI} q_+^i -\sum _j x_{iS}x_{jI} \beta _{ji} +\lambda \sum _{j\ne i} x_{jS}=0, \\&x_{jS} q_-^j -x_{jI} q_+^j +\sum _k x_{jS}x_{kI} \beta _{kj} -\lambda x_{jI}=0, \quad j\ne i,\\&-x_{jS} q_-^j +x_{jI} q_+^j -\sum _k x_{jS}x_{kI} \beta _{kj} -\lambda x_{jS}=0, \quad j\ne i. \end{aligned} \right. \end{aligned}$$
(9)

This solution \((\hat{u}^i,x)\) is stable if x is a stable fixed point of the evolution (1) with \(u^{com}=\hat{u}^i\), that is, of the evolution

$$\begin{aligned} \left\{ \begin{aligned}&\dot{x}_{iI}=x_{iS} q_-^i -x_{iI} q_+^i +\sum _j x_{iS}x_{jI} \beta _{ji} +\lambda \sum _{j\ne i} x_{jI}, \\&\dot{x}_{iS} =-x_{iS} q_-^i +x_{iI} q_+^i -\sum _j x_{iS}x_{jI} \beta _{ji} +\lambda \sum _{j\ne i} x_{jS}, \\&\dot{x}_{jI} = x_{jS} q_-^j -x_{jI} q_+^j +\sum _k x_{jS}x_{kI} \beta _{kj} -\lambda x_{jI}, \quad j\ne i,\\&\dot{x}_{jS} = -x_{jS} q_-^j +x_{jI} q_+^j -\sum _k x_{jS}x_{kI} \beta _{kj} -\lambda x_{jS}, \quad j\ne i. \end{aligned} \right. \end{aligned}$$
(10)

Adding together the last two equations of (9) we find that \(x_{jI}=x_{jS}=0\) for \(j\ne i\), as one could expect. Consequently, the whole system (9) reduces to the single equation

$$\begin{aligned} x_{iS}q^i_- +x_{iI} \beta _{ii} x_{iS} -x_{iI} q^i_+ =0, \end{aligned}$$

which, for \(y=x_{iI}\), \(1-y=x_{iS}\), yields the quadratic equation

$$\begin{aligned} Q(y)= \beta _{ii}y^2+y(q^i_+ -\beta _{ii} +q^i_-) -q^i_-=0, \end{aligned}$$

with the unique solution on the interval (0, 1):

$$\begin{aligned} x^*=\frac{1}{2\beta _{ii}}\left[ \beta _{ii}-q^i_+-q^i_- +\sqrt{(\beta _{ii} +q^i_-)^2+(q^i_+)^2-2 q^i_+ (\beta _{ii} -q^i_-)}\right] . \end{aligned}$$
(11)
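Formula (11) is easy to sanity-check numerically: since \(Q(0)=-q^i_-<0\) and \(Q(1)=q^i_+>0\), Q has a unique root in (0, 1), and (11) should reproduce it. The rates used in the test below are illustrative assumptions.

```python
# Numerical check that formula (11) gives the root of
# Q(y) = beta_ii y^2 + y(q_+^i - beta_ii + q_-^i) - q_-^i  in (0, 1).
import math

def x_star(beta_ii, q_plus_i, q_minus_i):
    """The candidate root from formula (11)."""
    disc = (beta_ii + q_minus_i) ** 2 + q_plus_i ** 2 \
        - 2 * q_plus_i * (beta_ii - q_minus_i)
    return (beta_ii - q_plus_i - q_minus_i + math.sqrt(disc)) / (2 * beta_ii)

def Q(y, beta_ii, q_plus_i, q_minus_i):
    """The quadratic whose root in (0, 1) is the stationary x_iI."""
    return beta_ii * y ** 2 + y * (q_plus_i - beta_ii + q_minus_i) - q_minus_i
```

The expression under the square root coincides with the usual discriminant \((q^i_+ +q^i_- -\beta _{ii})^2+4\beta _{ii}q^i_-\), which is manifestly positive.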

To analyze the stability of the fixed point \(x_{iI}=x^*, x_{iS}=1-x^*\) and \(x_{jI}=x_{jS}=0\) for \(j\ne i\), we introduce the variable \(y=x_{iI}-x^*\). In terms of y and \(x_{jI},x_{jS}\) with \(j\ne i\), system (10) rewrites as

$$\begin{aligned} \left\{ \begin{aligned}&\dot{y} =\left[ 1-x^*-y-\sum _{j\ne i} (x_{jI}+x_{jS})\right] \left[ q_-^i +\sum _{k\ne i} x_{kI} \beta _{ki}+(y+x^*)\beta _{ii}\right] -(y+x^*) q^i_+ +\lambda \sum _{j\ne i} x_{jI}, \\&\dot{x}_{jI} =x_{jS} \left[ q_-^j +\sum _{k\ne i}x_{kI} \beta _{kj}+(y+x^*) \beta _{ij}\right] -x_{jI} q_+^j -\lambda x_{jI}, \quad j\ne i,\\&\dot{x}_{jS} =-x_{jS} \left[ q_-^j +\sum _{k\ne i}x_{kI} \beta _{kj}+(y+x^*) \beta _{ij}\right] +x_{jI} q_+^j -\lambda x_{jS}, \quad j\ne i. \end{aligned} \right. \end{aligned}$$
(12)

Its linearized version around the fixed point zero is

$$\begin{aligned} \left\{ \begin{aligned}&\dot{y} =(1-x^*)\left( \sum _{k\ne i} x_{kI} \beta _{ki}+y\beta _{ii}\right) -\left[ y+\sum _{k\ne i} (x_{kI}+x_{kS})\right] (q_-^i +x^*\beta _{ii}) -y q^i_+ +\sum _{k\ne i} \lambda x_{kI}, \\&\dot{x}_{jI} =x_{jS} (q_-^j +x^* \beta _{ij}) -x_{jI} q_+^j -\lambda x_{jI}, \quad j\ne i,\\&\dot{x}_{jS} =-x_{jS} (q_-^j +x^* \beta _{ij}) +x_{jI} q_+^j -\lambda x_{jS}, \quad j\ne i. \end{aligned} \right. \end{aligned}$$

Since the equations for \(x_{jI}, x_{jS}\), \(j\ne i\), contain neither y nor the variables with other indices, the eigenvalues of this linear system are

$$\begin{aligned} \xi _i=(1-2x^*)\beta _{ii} -q^i_- -q^i_+, \end{aligned}$$

and \((d-1)\) pairs of eigenvalues arising from \((d-1)\) systems

$$\begin{aligned} \left\{ \begin{aligned}&\dot{x}_{jI} =x_{jS} (q_-^j +x^* \beta _{ij}) -x_{jI} q_+^j -\lambda x_{jI}, \quad j\ne i,\\&\dot{x}_{jS} =-x_{jS} (q_-^j +x^* \beta _{ij}) +x_{jI} q_+^j -\lambda x_{jS}, \quad j\ne i, \end{aligned} \right. \end{aligned}$$

that is

$$\begin{aligned} \left\{ \begin{aligned}&\xi _1^j = -\lambda -(q_+^j+q^j_-+x^* \beta _{ij}), \\&\xi _2^j = -\lambda . \end{aligned} \right. \end{aligned}$$

These eigenvalues being always negative, the condition of stability is reduced to the negativity of the first eigenvalue \(\xi _i\):

$$\begin{aligned} 2x^*> 1-\frac{q^i_+ +q^i_-}{\beta _{ii}}. \end{aligned}$$

But this is true due to (11), implying that this fixed point is always stable (by the Grobman–Hartman theorem).
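A direct computation shows that \(\xi _i=-Q'(x^*)\), and \(Q'(x^*)>0\) because \(x^*\) is the larger root of the convex parabola Q. The sketch below verifies this identity and the negativity of \(\xi _i\) for some made-up rates.

```python
# Check that the leading eigenvalue xi_i = (1 - 2x*) beta_ii - q_-^i - q_+^i
# is negative and equals -Q'(x*), for illustrative (assumed) rates.
import math

def x_star(beta_ii, q_plus_i, q_minus_i):
    """Formula (11) for the stationary fraction of infected agents."""
    disc = (beta_ii + q_minus_i) ** 2 + q_plus_i ** 2 \
        - 2 * q_plus_i * (beta_ii - q_minus_i)
    return (beta_ii - q_plus_i - q_minus_i + math.sqrt(disc)) / (2 * beta_ii)

def xi(beta_ii, q_plus_i, q_minus_i):
    """Leading eigenvalue of the linearization at the fixed point."""
    x = x_star(beta_ii, q_plus_i, q_minus_i)
    return (1 - 2 * x) * beta_ii - q_minus_i - q_plus_i

def Q_prime(y, beta_ii, q_plus_i, q_minus_i):
    """Derivative of Q(y) = beta y^2 + y(q_+ - beta + q_-) - q_-."""
    return 2 * beta_ii * y + q_plus_i - beta_ii + q_minus_i
```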

Next, the HJB equation (7) takes the form

$$\begin{aligned} \left\{ \begin{aligned}&q^i_+(g(iS)-g(iI)) +w^i_I =\delta g(iI), \\&q^i_-(g(iI)-g(iS)) + \beta _{ii} x^*(g(iI)-g(iS))+w_S^i =\delta g(iS), \\&\lambda (g(iI)-g(jI)) +q^j_+(g(jS)-g(jI)) +w_I^j =\delta g(jI), \quad j\ne i, \\&\lambda (g(iS)-g(jS)) +q^j_- (g(jI)-g(jS)) + \beta _{ij} x^*(g(jI)-g(jS))\\&\quad +w_S^j =\delta g(jS), \quad j\ne i, \end{aligned} \right. \end{aligned}$$
(13)

Subtracting the first equation from the second one yields

$$\begin{aligned} g(iI)-g(iS)=\frac{w^i_I-w^i_S}{q_-^i+q_+^i+\beta _{ii}x^* +\delta }. \end{aligned}$$
(14)

In particular, \(g(iI)>g(iS)\) always, as expected. Next, by the first equation of (13),

$$\begin{aligned} \delta g(iI)=w^i_I- \frac{q^i_+(w^i_I-w^i_S)}{q_-^i+q_+^i+\beta _{ii}x^* +\delta }. \end{aligned}$$
(15)

Consequently,

$$\begin{aligned} \delta g(iS)=w^i_I- \frac{(q^i_++\delta )(w^i_I-w^i_S)}{q_-^i+q_+^i+\beta _{ii}x^* +\delta } =w^i_S+ \frac{(q^i_-+\beta _{ii}x^*)(w^i_I-w^i_S)}{q_-^i+q_+^i+\beta _{ii}x^* +\delta }. \end{aligned}$$
(16)
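The two expressions in (16) must agree identically, since their difference is \((w^i_I-w^i_S)-(g(iI)-g(iS))(q^i_-+q^i_++\beta _{ii}x^* +\delta )=0\) by (14). The snippet below checks this, and the positivity of \(g(iI)-g(iS)\), on illustrative numbers (all values below are assumptions).

```python
# Sanity check of formulas (14)-(16) for the stationary discounted payoffs.
def payoffs(w_I_i, w_S_i, q_plus_i, q_minus_i, beta_ii, x_s, delta):
    """Returns g(iI) and the two expressions for g(iS) from (15), (16)."""
    D = q_minus_i + q_plus_i + beta_ii * x_s + delta
    gap = (w_I_i - w_S_i) / D                      # g(iI) - g(iS), formula (14)
    g_iI = (w_I_i - q_plus_i * gap) / delta        # formula (15)
    g_iS_1 = (w_I_i - (q_plus_i + delta) * gap) / delta   # first form of (16)
    g_iS_2 = (w_S_i + (q_minus_i + beta_ii * x_s) * gap) / delta  # second form
    return g_iI, g_iS_1, g_iS_2
```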

Subtracting the third equation of (13) from the fourth one yields

$$\begin{aligned} (\lambda +q^j_++q^j_- +\beta _{ij}x^*+\delta )(g(jI)-g(jS))-\lambda (g(iI)-g(iS))=w^j_I-w^j_S, \end{aligned}$$

implying

$$\begin{aligned} g(jI)-g(jS)= & {} \frac{w^j_I-w^j_S+\lambda (g(iI)-g(iS))}{\lambda +q^j_++q^j_-+\beta _{ij}x^*+\delta }=g(iI)-g(iS)\nonumber \\&+\,[(w^j_I-w^j_S)-(g(iI)-g(iS))(q^j_++q^j_-\nonumber \\&+\,\beta _{ij}x^*+\delta )]\lambda ^{-1} +O(\lambda ^{-2}). \end{aligned}$$
(17)

From the third equation of (13) it now follows that

$$\begin{aligned} (\delta +\lambda ) g(jI)=w^j_I-q^j_+(g(jI)-g(jS))+\lambda g(iI), \end{aligned}$$

so that

$$\begin{aligned} g(jI)=g(iI)+[w^j_I-q^j_+(g(iI)-g(iS)) -\delta g(iI)]\lambda ^{-1} +O(\lambda ^{-2}). \end{aligned}$$
(18)

Consequently,

$$\begin{aligned} g(jS)= & {} g(jI)-(g(jI)-g(jS))=g(iS)+[w^j_S+(q^j_- +\beta _{ij}x^*)\nonumber \\&\times (g(iI)-g(iS)) -\delta g(iS)]\lambda ^{-1} +O(\lambda ^{-2}). \end{aligned}$$
(19)

Thus, to the main order as \(\lambda \rightarrow \infty \), conditions (8) become

$$\begin{aligned}&w^j_I-q^j_+(g(iI)-g(iS)) -\delta g(iI)\ge 0, \\&w^j_S+(q^j_- +\beta _{ij}x^*)(g(iI)-g(iS)) -\delta g(iS) \ge 0, \end{aligned}$$

or equivalently

$$\begin{aligned} w^j_I-w^i_I\ge & {} \frac{(q^j_+ -q^i_+)(w^i_I-w^i_S)}{q_-^i+q_+^i+\beta _{ii}x^* +\delta }, \quad \nonumber \\ w^j_S-w^i_S\ge & {} \frac{[q^i_- -q^j_- +(\beta _{ii}-\beta _{ij})x^*](w^i_I-w^i_S)}{q_-^i+q_+^i+\beta _{ii}x^* +\delta }. \end{aligned}$$
(20)

To first order in small \(\beta _{ij}\), this takes the simpler form, independent of \(x^*\):

$$\begin{aligned} \frac{w^j_I-w^i_I}{w^i_I-w^i_S} \ge \frac{q^j_+ -q^i_+}{q_-^i+q_+^i +\delta }, \quad \frac{w^j_S-w^i_S}{w^i_I-w^i_S} \ge \frac{q^i_- -q^j_-}{q_-^i+q_+^i +\delta }. \end{aligned}$$
(21)

Conversely, if the inequalities in (21) are strict, then for sufficiently small \(\beta _{ij}\) inequalities (20) also hold in the strict form, and consequently so do (8).

Summarizing, we proved the following.

Proposition 3.2

If (21) holds for all \(j\ne i\) with strict inequality, then for sufficiently large \(\lambda \) and sufficiently small \(\beta _{ij}\) there exists a unique solution to the stationary MFG consistency problem (4) and (6) with the optimal control \(\hat{u}^i\); the stationary distribution is \(x_i^I=x^*\), \(x_i^S=1-x^*\), with \(x^*\) given by (11), and it is stable; the optimal payoffs are given by (15), (16), (18), (19). Conversely, if for all sufficiently large \(\lambda \) there exists a solution to the stationary MFG consistency problem (4) and (6) with the optimal control \(\hat{u}^i\), then (20) holds.

Remark 7

Notice that condition (21) states, roughly, that the losses arising from switching from i to j due to everyday fees are larger than the gains that may arise from the better promotion climate in j as compared with i.

Let us turn to the control [i(I), k(S)] with \(k\ne i\), denoting it by \(\hat{u}^{i,k}\):

$$\begin{aligned} \hat{u}^{i,k}(jS)=k, \quad \hat{u}^{i,k}(jI)=i, \quad j=1, \ldots , d. \end{aligned}$$

The fixed point condition under \(u^{com}=\hat{u}^{i,k}\) takes the form

$$\begin{aligned} \left\{ \begin{aligned}&x_{iS} q_-^i -x_{iI} q_+^i +\sum _j x_{iS}x_{jI} \beta _{ji} +\lambda \sum _{j\ne i} x_{jI} =0 \\&-x_{iS} q_-^i +x_{iI} q_+^i -\sum _j x_{iS}x_{jI} \beta _{ji} -\lambda x_{iS}=0 \\&x_{kS} q_-^k -x_{kI} q_+^k +\sum _j x_{kS}x_{jI} \beta _{jk} -\lambda x_{kI} =0 \\&-x_{kS} q_-^k +x_{kI} q_+^k -\sum _j x_{kS}x_{jI} \beta _{jk} +\lambda \sum _{j\ne k} x_{jS}=0 \\&x_{lS} q_-^l -x_{lI} q_+^l +\sum _j x_{lS}x_{jI} \beta _{jl} -\lambda x_{lI} =0 \\&-x_{lS} q_-^l +x_{lI} q_+^l -\sum _j x_{lS}x_{jI} \beta _{jl} -\lambda x_{lS} =0, \end{aligned} \right. \end{aligned}$$
(22)

where \(l\ne i,k\).

Adding the last two equations yields \(x_{lI}+x_{lS}=0\), and hence \(x_{lI}=x_{lS}=0\) for all \(l \ne i,k\), as one could expect. Consequently, for the indices i, k the system takes the form

$$\begin{aligned} \left\{ \begin{aligned}&x_{iS} q_-^i -x_{iI} q_+^i + x_{iS}x_{iI} \beta _{ii} + x_{iS}x_{kI} \beta _{ki} +\lambda x_{kI} =0 \\&-x_{iS} q_-^i +x_{iI} q_+^i -x_{iS}x_{iI} \beta _{ii} - x_{iS}x_{kI} \beta _{ki} -\lambda x_{iS}=0 \\&x_{kS} q_-^k -x_{kI} q_+^k + x_{kS}x_{kI} \beta _{kk} + x_{kS}x_{iI} \beta _{ik} -\lambda x_{kI} =0 \\&-x_{kS} q_-^k +x_{kI} q_+^k - x_{kS}x_{kI} \beta _{kk} - x_{kS}x_{iI} \beta _{ik} +\lambda x_{iS} =0 \end{aligned} \right. \end{aligned}$$
(23)

Adding the first two equations (or the last two equations) yields \(x_{kI}=x_{iS}\). Since by normalization

$$\begin{aligned} x_{kS}=1- x_{iS}-x_{kI}-x_{iI}= 1-x_{iI}-2x_{kI}, \end{aligned}$$

we are left with two equations only:

$$\begin{aligned} \left\{ \begin{aligned}&x_{kI} q_-^i -x_{iI} q_+^i + x_{kI}x_{iI} \beta _{ii} + x^2_{kI} \beta _{ki} +\lambda x_{kI} =0 \\&(1-x_{iI}-2x_{kI}) (q_-^k +x_{kI} \beta _{kk} + x_{iI} \beta _{ik}) -(\lambda +q^k_+)x_{kI} =0. \end{aligned} \right. \end{aligned}$$
(24)

From the first equation we obtain

$$\begin{aligned} x_{iI}=\frac{ \lambda x_{kI}+\beta _{ki}x_{kI}^2+q^i_-x_{kI}}{q^i_+-x_{kI}\beta _{ii}}=\frac{ \lambda x_{kI}}{q^i_+-x_{kI}\beta _{ii}}(1+O(\lambda ^{-1})). \end{aligned}$$

Hence \(x_{kI}\) is of order \(1/\lambda \), and therefore

$$\begin{aligned} x_{iI}=\frac{ \lambda x_{kI}}{q^i_+}(1+O(\lambda ^{-1})) \Longleftrightarrow x_{kI}=\frac{x_{iI} q^i_+}{\lambda }(1+O(\lambda ^{-1})). \end{aligned}$$
(25)

To leading order in the large-\(\lambda \) asymptotics, the second equation of (24) yields

$$\begin{aligned} (1-x_{iI})(q^k_- +\beta _{ik} x_{iI})-q^i_+x_{iI}=0 \end{aligned}$$

or for \(y=x_{iI}\)

$$\begin{aligned} Q(y)= \beta _{ik}y^2+y(q^i_+ -\beta _{ik} +q^k_-) -q^k_-=0, \end{aligned}$$

which is effectively the same equation as the one that appeared in the analysis of the control [i(I), i(S)]. It has a unique solution on the interval (0, 1):

$$\begin{aligned} x_{iI}^*=\frac{1}{2\beta _{ik}}\left[ \beta _{ik}-q^i_+-q^k_- +\sqrt{(\beta _{ik} +q^k_-)^2+(q^i_+)^2-2 q^i_+ (\beta _{ik} -q^k_-)}\right] . \end{aligned}$$
(26)

Let us note that for small \(\beta _{ik}\) it expands as

$$\begin{aligned} x_{iI}^*=\frac{q^k_-}{q^k_-+q^i_+}+O(\beta )=\frac{q^k_-}{q^k_-+q^i_+}+\frac{q^k_- q^i_+ }{(q^k_-+q^i_+)^3}\beta +O(\beta ^2). \end{aligned}$$
(27)
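The large-\(\lambda\) analysis can be confirmed numerically; the sketch below (with illustrative rates, not values from the paper) solves the two equations of (24) by bisection and compares the result with the asymptotic relation (25), the root (26), and the expansion (27).

```python
# Numerical cross-check of (24)-(27): solve system (24) by bisection in
# x_kI, then test the asymptotic relations.  All rates are illustrative.
import math

qp_i, qm_i = 0.3, 0.2                 # q^i_+, q^i_-
qp_k, qm_k = 0.4, 0.25                # q^k_+, q^k_-
b_ii, b_ki, b_kk, b_ik = 0.05, 0.04, 0.03, 0.06
lam = 1000.0

def x_iI_of(x_kI):
    # first equation of (24) solved for x_iI
    return x_kI * (lam + b_ki * x_kI + qm_i) / (qp_i - b_ii * x_kI)

def F(x_kI):
    # left-hand side of the second equation of (24)
    x_iI = x_iI_of(x_kI)
    return ((1 - x_iI - 2 * x_kI) * (qm_k + b_kk * x_kI + b_ik * x_iI)
            - (lam + qp_k) * x_kI)

lo, hi = 1e-12, 0.999 * qp_i / lam    # F(lo) > 0 > F(hi)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if F(mid) > 0 else (lo, mid)
x_kI = 0.5 * (lo + hi)
x_iI = x_iI_of(x_kI)

# (25): x_kI is of order 1/lambda
assert abs(x_kI - x_iI * qp_i / lam) / x_kI < 0.01

# (26): the root of Q(y) = 0 on (0,1), approached as lambda -> infinity
b = b_ik
root = (b - qp_i - qm_k
        + math.sqrt((b + qm_k) ** 2 + qp_i ** 2 - 2 * qp_i * (b - qm_k))) / (2 * b)
assert 0 < root < 1
assert abs(b * root ** 2 + root * (qp_i - b + qm_k) - qm_k) < 1e-12  # Q(root)=0
assert abs(x_iI - root) < 0.01

# (27): first-order expansion in small beta
approx = qm_k / (qm_k + qp_i) + qm_k * qp_i / (qm_k + qp_i) ** 3 * b
assert abs(root - approx) < 10 * b ** 2
print("x_iI =", x_iI, "root (26) =", root)
```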

Similar (though slightly lengthier) calculations as for the control [i(I), i(S)] show that the obtained fixed point of evolution (1) is always stable. We omit the details, as they are the same as those given in Kolokoltsov and Bensoussan (2015) for the case \(d=2\).

Let us turn to the HJB equation (7), which under control [i(I), k(S)] takes the form

$$\begin{aligned} \left\{ \begin{aligned}&q^i_+(g(iS)-g(iI)) +w^i_I =\delta g(iI), \\&\lambda (g(kS)-g(iS))+{\tilde{q}}^i_-(g(iI)-g(iS))+w_S^i =\delta g(iS), \\&\lambda (g(iI)-g(kI)) +q^k_+(g(kS)-g(kI)) +w_I^k =\delta g(kI), \\&{\tilde{q}}^k_- (g(kI)-g(kS))+w^k_S=\delta g(kS) \\&\lambda (g(iI)-g(jI)) +q^j_+(g(jS)-g(jI)) +w_I^j =\delta g(jI), \quad j\ne i,k, \\&\lambda (g(kS)-g(jS)) +{\tilde{q}}^j_- (g(jI)-g(jS))+w_S^j =\delta g(jS), \quad j\ne i,k, \end{aligned} \right. \end{aligned}$$
(28)

supplemented by the consistency condition

$$\begin{aligned} g(iI) \le g(jI), \quad g(kS) \le g(jS), \end{aligned}$$
(29)

for all j, where we introduced the notation

$$\begin{aligned} {\tilde{q}}^j_- ={\tilde{q}}^j_-(i,k)= q^j_- + \beta _{ij}x_{iI}+\beta _{kj}x_{kI}. \end{aligned}$$
(30)

The first four equations do not depend on the rest of the system and can be solved independently. To begin with, we use the first and the fourth equation to find

$$\begin{aligned} g(iS)= g(iI)+\frac{\delta g(iI)-w^i_I}{q^i_+}, \quad g(kI)= g(kS)+\frac{\delta g(kS)-w^k_S}{{\tilde{q}}^k_-}. \end{aligned}$$
(31)

Then the second and the third equations can be written as the system for the variables g(kS) and g(iI):

$$\begin{aligned} \left\{ \begin{aligned}&\lambda g(kS)-(\lambda +\delta )g(iI) -(\lambda +\delta +{\tilde{q}}^i_-)\frac{\delta g(iI)-w^i_I}{q^i_+} +w_S^i =0 \\&\lambda g(iI)- (\lambda +\delta ) g(kS)-(\lambda +\delta +q^k_+)\frac{\delta g(kS)-w^k_S}{{\tilde{q}}^k_-} +w_I^k =0, \end{aligned} \right. \end{aligned}$$

or simpler as

$$\begin{aligned} \left\{ \begin{aligned}&\lambda q^i_+ g(kS)-[\lambda (q^i_+ +\delta ) +\delta (q^i_++{\tilde{q}}^i_-+\delta )]g(iI) = -w^i_I(\lambda +\delta +{\tilde{q}}^i_-)-w_S^i q^i_+ \\&[\lambda ({\tilde{q}}^k_- +\delta ) +\delta ({\tilde{q}}^k_-+q^k_++\delta )] g(kS) -\lambda {\tilde{q}}^k_- g(iI) =w^k_I {\tilde{q}}^k_-+w^k_S(\lambda +\delta +q^k_+). \end{aligned} \right. \end{aligned}$$
(32)

Let us find the asymptotic behavior of the solution for large \(\lambda \). To this end let us write

$$\begin{aligned} g(iS)=g^0(iS)+\frac{g^1(iS)}{\lambda } + O(\lambda ^{-2}) \end{aligned}$$

with similar notations for other values of g. Dividing (32) by \(\lambda \) and preserving only the leading terms in \(\lambda \) we get the system

$$\begin{aligned} \left\{ \begin{aligned}&q^i_+ g^0(kS)-(q^i_+ +\delta ) g^0(iI) = -w^i_I, \\&({\tilde{q}}^k_- +\delta ) g^0(kS) -{\tilde{q}}^k_- g^0(iI) =w^k_S. \end{aligned} \right. \end{aligned}$$
(33)

Solving this system and using (31) to find the corresponding leading terms \(g^0(iS)\), \(g^0(kI)\) yields

$$\begin{aligned} \begin{aligned}&g^0(iS)=g^0(kS)=\frac{1}{\delta } \frac{{\tilde{q}}^k_-w^i_I+q^i_+w^k_S+\delta w^k_S}{{\tilde{q}}^k_-+q^i_+ +\delta }, \\&g^0(kI)=g^0(iI)=\frac{1}{\delta } \frac{{\tilde{q}}^k_-w^i_I+q^i_+w^k_S+\delta w^i_I}{{\tilde{q}}^k_-+q^i_+ +\delta }. \end{aligned} \end{aligned}$$
(34)

The remarkable equalities \(g^0(iS)=g^0(kS)\) and \(g^0(kI)=g^0(iI)\) arising from the calculations have a natural interpretation: under instantaneous execution of personal decisions, discrimination between the strategies i and k is not possible. Thus, to get the conditions ensuring (29), we have to look at the next order of the expansion in \(\lambda \).
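A quick numerical confirmation of (34) (with illustrative parameter values): solve the \(2\times 2\) system (33) directly and check the closed-form expressions, together with the equalities \(g^0(iS)=g^0(kS)\) and \(g^0(kI)=g^0(iI)\) recovered via (31).

```python
# Verification of (34): solve (33) by Cramer's rule and compare with the
# closed-form expressions; then recover g^0(iS), g^0(kI) via (31).
# Parameters are illustrative.
qp_i = 0.3               # q^i_+
qt_k = 0.35              # tilde q^k_-
delta = 0.05
wI_i, wS_k = 2.0, 1.0    # w^i_I, w^k_S

# q^i_+ g0(kS) - (q^i_+ + delta) g0(iI) = -w^i_I
# (qt_k + delta) g0(kS) - qt_k g0(iI)  =  w^k_S
a11, a12 = qp_i, -(qp_i + delta)
a21, a22 = qt_k + delta, -qt_k
det = a11 * a22 - a12 * a21              # equals delta*(qt_k + qp_i + delta)
g0_kS = (-wI_i * a22 - a12 * wS_k) / det
g0_iI = (a11 * wS_k + wI_i * a21) / det

D = qt_k + qp_i + delta
assert abs(g0_kS - (qt_k * wI_i + qp_i * wS_k + delta * wS_k) / (delta * D)) < 1e-9
assert abs(g0_iI - (qt_k * wI_i + qp_i * wS_k + delta * wI_i) / (delta * D)) < 1e-9

# via (31): g0(iS) and g0(kI) coincide with g0(kS) and g0(iI) respectively
g0_iS = g0_iI + (delta * g0_iI - wI_i) / qp_i
g0_kI = g0_kS + (delta * g0_kS - wS_k) / qt_k
assert abs(g0_iS - g0_kS) < 1e-9 and abs(g0_kI - g0_iI) < 1e-9
print("formulas (34) confirmed")
```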

Keeping in (32) the terms of zero-order in \(1/\lambda \) yields the system

$$\begin{aligned} \left\{ \begin{aligned}&q^i_+ g^1(kS)-(q^i_+ +\delta ) g^1(iI) =\delta (q^i_++{\tilde{q}}^i_-+\delta )g^0(iI) -w^i_I(\delta +{\tilde{q}}^i_-)-w_S^i q^i_+ \\&({\tilde{q}}^k_- +\delta ) g^1(kS) -{\tilde{q}}^k_- g^1(iI)= -\delta ({\tilde{q}}^k_-+q^k_++\delta ) g^0(kS) +w^k_I {\tilde{q}}^k_- +w^k_S(\delta +q^k_+). \end{aligned} \right. \end{aligned}$$
(35)

Taking into account (34), the conditions \(g(iI)\le g(kI)\) and \(g(kS)\le g(iS)\) become

$$\begin{aligned} {\tilde{q}}^k_- g^1(iI)\le g^1(kS)({\tilde{q}}^k_- + \delta ), \quad q^i_+ g^1(kS)\le g^1(iI) (q^i_+ +\delta ). \end{aligned}$$
(36)

Solving (35) we obtain

$$\begin{aligned}&g^1(kS)\delta ({\tilde{q}}^k_-+q^i_+ +\delta ) ={\tilde{q}}^k_- [ q^i_+ w^i_S + (q^i_+ +\delta ) w^k_I + ({\tilde{q}}^i_- -{\tilde{q}}^k_- -q^i_+ -\delta )w^i_I ] \nonumber \\&\quad +[q^i_+(q^k_+ -{\tilde{q}}^k_- -q^i_+)+\delta (q^k_+ -q^i_+)]w^k_S, \nonumber \\&g^1(iI)\delta ({\tilde{q}}^k_-+q^i_+ +\delta ) =q^i_+ [ {\tilde{q}}^k_- w^k_I + ({\tilde{q}}^k_- +\delta ) w^i_S +(q^k_+ -q^i_+ -{\tilde{q}}^k_- -\delta )w^k_S ] \nonumber \\&\quad +[{\tilde{q}}^k_-({\tilde{q}}^i_--q^i_+ -{\tilde{q}}^k_-)+\delta ({\tilde{q}}^i_- -{\tilde{q}}^k_-)]w^i_I. \end{aligned}$$
(37)

We can now check conditions (36). Remarkably enough, the l.h.s. and r.h.s. of both inequalities always coincide for \(\delta =0\), so that the actual conditions arise from comparing the higher-order terms in \(\delta \). To first order in the expansion in small \(\delta \), conditions (36) turn out to take the following simple form:

$$\begin{aligned} {\tilde{q}}^k_- (w^k_I-w^i_I) +w^k_S(q^k_+-q^i_+) \ge 0, \quad q^i_+ (w^i_S-w^k_S) +w^i_I({\tilde{q}}^i_- -{\tilde{q}}^k_-) \ge 0. \end{aligned}$$
(38)
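The degeneracy at \(\delta =0\) can also be observed numerically. The sketch below (illustrative rates and fees) solves (35) by Cramer's rule, with \(g^0\) taken from (34), and checks that for a small \(\delta\) the two sides of each inequality in (36) agree in the leading order, so that the effective conditions indeed come from the next order in \(\delta\).

```python
# Solve (35) for a small discount delta and exhibit the degeneracy of
# conditions (36) at delta = 0.  All rates and fees are illustrative.
qp_i, qt_i = 0.3, 0.2      # q^i_+, tilde q^i_-
qp_k, qt_k = 0.4, 0.35     # q^k_+, tilde q^k_-
wI_i, wS_i, wI_k, wS_k = 2.0, 1.0, 1.8, 0.9
delta = 1e-6

D = qt_k + qp_i + delta
g0_iI = (qt_k * wI_i + qp_i * wS_k + delta * wI_i) / (delta * D)  # (34)
g0_kS = (qt_k * wI_i + qp_i * wS_k + delta * wS_k) / (delta * D)  # (34)

# right-hand sides of (35)
R1 = delta * (qp_i + qt_i + delta) * g0_iI - wI_i * (delta + qt_i) - wS_i * qp_i
R2 = -delta * (qt_k + qp_k + delta) * g0_kS + wI_k * qt_k + wS_k * (delta + qp_k)

# Cramer's rule for (35); its determinant equals delta*D
g1_kS = ((qp_i + delta) * R2 - qt_k * R1) / (delta * D)
g1_iI = (qp_i * R2 - (qt_k + delta) * R1) / (delta * D)

# both sides of each inequality in (36) coincide in the leading order
assert abs(qt_k * g1_iI / ((qt_k + delta) * g1_kS) - 1) < 1e-3
assert abs(qp_i * g1_kS / ((qp_i + delta) * g1_iI) - 1) < 1e-3
print("g1(iI) =", g1_iI, " g1(kS) =", g1_kS)
```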

From the last two equations of (28) we can find g(jS) and g(jI) for \(j\ne i,k\) yielding

$$\begin{aligned}&g(jI) =g(iI) +\frac{1}{\lambda } [w^j_I-\delta g(iI)+q^j_+ (g(iI)-g(kS))] +O(\lambda ^{-2}), \nonumber \\&g(jS) =g(kS) +\frac{1}{\lambda } [w^j_S-\delta g(kS)+{\tilde{q}}^j_- (g(iI)-g(kS))] +O(\lambda ^{-2}). \end{aligned}$$
(39)

From these equations we can derive the rest of the conditions (29), namely that \(g(iI) \le g(jI)\) for \(j\ne k\) and \(g(kS) \le g(jS)\) for \(j\ne i\). In the first order in the small \(\delta \) expansion they become

$$\begin{aligned} q^j_+(w^i_I-w^k_S)+w^j_I ({\tilde{q}}^k_-+q^i_+) \ge 0, \quad {\tilde{q}}^j_-(w^i_I-w^k_S)+w^j_S ({\tilde{q}}^k_-+q^i_+) \ge 0. \end{aligned}$$
(40)

Since for small \(\beta _{ij}\) the difference \({\tilde{q}}^j_- -q^j_-\) is small, we have proved the following result.

Proposition 3.3

Assume

$$\begin{aligned}&q^j_+(w^i_I-w^k_S)+w^j_I (q^k_-+q^i_+)> 0, \quad j\ne k, \nonumber \\&q^j_-(w^i_I-w^k_S)+w^j_S (q^k_-+q^i_+)> 0, \quad j\ne i, \nonumber \\&q^k_- (w^k_I-w^i_I) +w^k_S(q^k_+-q^i_+)> 0, \quad q^i_+ (w^i_S-w^k_S) +w^i_I(q^i_- -q^k_-) > 0. \end{aligned}$$
(41)

Then for sufficiently large \(\lambda \), small \(\delta \) and small \(\beta _{ij}\) there exists a unique solution to the stationary MFG consistency problem (4) and (6) with the optimal control \(\hat{u}^{i,k}\); the stationary distribution is concentrated on strategies i and k, with \(x_{iI}^*\) given by (26) or (27) up to terms of order \(O(\lambda ^{-1})\), and it is stable; the optimal payoffs are given by (34), (37), (39).

Conversely, if for all sufficiently large \(\lambda \) and small \(\delta \) there exists a solution to the stationary MFG consistency problem (4) and (6) with the optimal control \(\hat{u}^{i,k}\), then (38) and (40) hold.

4 Main result

By the general result already mentioned above, see Basna et al. (2014), a solution of the MFG consistency problem constructed above, considered on a finite time horizon, defines an \(\epsilon \)-Nash equilibrium for the corresponding game with a finite number of players. However, the solutions given by Propositions 3.2 and 3.3 work only when the initial distribution and the terminal payoff are exactly those given by the stationary solution. Of course, it is natural to ask what happens for other initial conditions. The stability results of Propositions 3.2 and 3.3 represent only a step in the right direction here, as they ensure stability only under the assumption that all (or almost all) players use the corresponding stationary control from the very beginning, which might not be the case. To analyse the stability properly, we have to consider the full time-dependent problem. For a possibly time-varying evolution x(t) of the distribution, the time-dependent HJB equation for the discounted optimal payoff \(e^{-t\delta }g\) of an individual player with any time horizon T has the form (3).

In order to have a solution with a stationary u, we have to show that solving the linear equation obtained from (3) by fixing this control is consistent, in the sense that this control actually gives the minimum in (3) at all times.

For definiteness, let us concentrate on the stationary control \(\hat{u}^i\); the corresponding linear equation takes the form

$$\begin{aligned} \left\{ \begin{aligned}&{\dot{g}}(iI) + q^i_+(g(iS)-g(iI)) +w^i_I =\delta g(iI), \\&{\dot{g}}(iS) + q^i_-(g(iI)-g(iS)) + \sum _{k}\beta _{ki} x_{kI}(t)(g(iI)-g(iS))+w_S^i =\delta g(iS), \\&{\dot{g}}(jI) + \lambda (g(iI)-g(jI)) +q^j_+(g(jS)-g(jI)) +w_I^j =\delta g(jI), \quad j\ne i, \\&{\dot{g}}(jS) + \lambda (g(iS)-g(jS)) +q^j_- (g(jI)-g(jS)) \\&\quad + \sum _k\beta _{kj} x_{kI}(t)(g(jI)-g(jS))+w_S^j =\delta g(jS), \quad j\ne i, \end{aligned} \right. \end{aligned}$$
(42)

(the dependence of g on t is omitted for brevity) with the supplementary requirement (8), which now has to hold for the time-dependent solution g.

Theorem 4.1

Assume the strengthened form of (21) holds, that is

$$\begin{aligned} \frac{w^j_I-w^i_I}{w^i_I-w^i_S}> \frac{q^j_+ -q^i_+}{q_-^i+q_+^i +\delta }, \quad \frac{w^j_S-w^i_S}{w^i_I-w^i_S} > \frac{q^i_- -q^j_-}{q_-^i+q_+^i +\delta } \end{aligned}$$
(43)

for all \(j\ne i\). Assume moreover that

$$\begin{aligned} w^j_I-w^i_I\ge 0, \quad w^j_S-w^i_S\ge 0, \end{aligned}$$
(44)

for all \(j\ne i\). Then for any \(\lambda >0\) and all sufficiently small \(\beta _{ij}\) the following holds. For any \(T >t\), any initial distribution x(t), and any terminal values \(g_T\) such that \(g_T(jI)-g_T(jS) \ge 0\) for all j, \(g_T(iI)-g_T(iS)\) is sufficiently small, and

$$\begin{aligned} g_T(iI) \le g_T(jI) \quad \text {and} \quad g_T(iS) \le g_T(jS), \quad j\ne i, \end{aligned}$$
(45)

there exists a unique solution to the discounted MFG consistency equation such that u is stationary and equals \(\hat{u}^i\) everywhere. Moreover, this solution is such that, for large \(T-t\), x(s) tends to the fixed point of Proposition 3.2 as \(s\rightarrow T\), and \(g_s\) stays near the stationary solution of Proposition 3.2 for almost all time, apart from a small initial period around t and some final period around T.

Remark 8

(i) The last property of our solution can be expressed by saying that the stationary solution provides the so-called turnpike for the time-dependent solution; see e.g. Kolokoltsov and Yang (2012) and Zaslavski (2006) for reviews in the stochastic and deterministic settings. (ii) Condition (44) is natural: having better fees in a non-optimal state may create instability. In fact, with the terminal \(g_T\) vanishing, it is seen from (46) below that if \(w^j_I-w^i_I< 0\), then the solution will be kicked straight out of the region \(g(iI)\le g(jI)\), so that the stability of this region would be destroyed. It is an interesting question what kind of solutions to the forward–backward system one could construct whenever \(w^j_I-w^i_I< 0\) occurs. (iii) A similar time-dependent class of turnpike solutions can be constructed from the stationary control of Proposition 3.3.

Proof

To show that, starting with a terminal condition belonging to the cone specified by (45), we stay in this cone for all \(t\le T\), it is sufficient to prove that at any boundary point of this cone that can be reached by the evolution, the inverted tangent vector of system (42) is not directed outside of the cone. This (more or less obvious) observation is an instance of the general result of Bony; see e.g. Redheffer (1972). From (42) we find that

$$\begin{aligned} {\dot{g}}(jI)-{\dot{g}}(iI)= & {} (\lambda +\delta ) (g(jI)-g(iI))+q^j_+(g(jI)-g(jS))\nonumber \\&-\,q^i_+(g(iI)-g(iS)) -(w_I^j-w^i_I). \end{aligned}$$
(46)

Therefore, the condition for staying inside the cone (45) at a boundary point with \(g(jI)=g(iI)\) reads \({\dot{g}}(jI)-{\dot{g}}(iI)\le 0\), or

$$\begin{aligned} (w_I^j-w^i_I) \ge q^j_+(g(jI)-g(jS))-q^i_+(g(iI)-g(iS)). \end{aligned}$$
(47)

Since \(g(jI)= g(iI)\),

$$\begin{aligned} 0 \le g(jI)-g(jS) \le g(iI)-g(iS). \end{aligned}$$

Therefore, if \(q_+^i\ge q^j_+\), the r.h.s. of (47) is non-positive, and hence (47) holds by the first assumption of (44). Thus we can assume further that \(q_+^i< q^j_+\).

Again by

$$\begin{aligned} 0 \le g(jI)-g(jS) \le g(iI)-g(iS), \end{aligned}$$

a simpler sufficient condition for (47) is

$$\begin{aligned} (w_I^j-w^i_I) \ge (q^j_+ -q^i_+)(g(iI)-g(iS)). \end{aligned}$$
(48)

Subtracting the first two equations of (42) we find that

$$\begin{aligned} {\dot{g}}(iI) -{\dot{g}}(iS) =a(t) (g(iI)-g(iS)) -( w^i_I -w^i_S) \end{aligned}$$

with

$$\begin{aligned} a(t) =q^i_+ +q^i_- +\delta + \sum _{k}\beta _{ki} x_{kI}(t). \end{aligned}$$

Consequently,

$$\begin{aligned} g_t(iI) -g_t(iS)= & {} \exp \left\{ -\int _t^T a(s) \, ds\right\} (g_T(iI) -g_T(iS))\nonumber \\&+\,( w^i_I -w^i_S)\int _t^T \exp \left\{ -\int _t^s a(\tau ) \, d\tau \right\} ds. \end{aligned}$$
(49)
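Formula (49) can be checked by direct numerical integration of the linear equation for \(g(iI)-g(iS)\); in the sketch below the trajectory \(x_{kI}(t)\) entering a(t), like all the rates, is an arbitrary illustrative choice.

```python
# Sanity check of (49): integrate phi' = a(t) phi - (w^i_I - w^i_S)
# backward from phi(T) = g_T(iI) - g_T(iS) and compare with the
# closed-form expression.  All parameters are illustrative.
import math

qp, qm, delta, beta = 0.3, 0.2, 0.05, 0.1
w = 1.0                         # w^i_I - w^i_S
t0, T, phi_T = 0.0, 5.0, 0.2

def a(t):
    # a(t) = q^i_+ + q^i_- + delta + sum_k beta_{ki} x_kI(t); one k here,
    # with a made-up trajectory x_kI(t)
    return qp + qm + delta + beta * (0.3 + 0.1 * math.sin(t))

# RK4 integration of phi' = a(t) phi - w, backward from T to t0
n = 20000
h = (T - t0) / n
f = lambda t, p: a(t) * p - w
phi = phi_T
for i in range(n):
    t = T - i * h
    k1 = f(t, phi)
    k2 = f(t - h / 2, phi - h / 2 * k1)
    k3 = f(t - h / 2, phi - h / 2 * k2)
    k4 = f(t - h, phi - h * k3)
    phi -= h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# closed form (49), with A(s) = int_t0^s a(tau) dtau on a grid
m = 20000
hs = (T - t0) / m
A = [0.0] * (m + 1)
for j in range(m):
    A[j + 1] = A[j] + hs * (a(t0 + j * hs) + a(t0 + (j + 1) * hs)) / 2
integral = hs * sum((math.exp(-A[j]) + math.exp(-A[j + 1])) / 2 for j in range(m))
closed = math.exp(-A[m]) * phi_T + w * integral

assert abs(phi - closed) < 1e-5
print("g_t(iI) - g_t(iS) =", phi)
```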

Therefore, condition (48) will be fulfilled for all sufficiently small \(g_T(iI) -g_T(iS)\) whenever

$$\begin{aligned} (w_I^j-w^i_I) > (q^j_+ -q^i_+)( w^i_I -w^i_S)\int _t^T \exp \left\{ -\int _t^s a(\tau ) \, d\tau \right\} ds. \end{aligned}$$
(50)

But since \(a(t) \ge q^i_+ +q^i_- +\delta \), we have

$$\begin{aligned} \exp \left\{ -\int _t^s a(\tau ) \, d\tau \right\} \le \exp \{-(s-t)(q^i_+ +q^i_- +\delta )\}, \end{aligned}$$

so that (48) holds if

$$\begin{aligned} \frac{w_I^j-w^i_I}{w^i_I -w^i_S} \ge \frac{q^j_+ -q^i_+}{q^i_+ +q^i_- +\delta } \left( 1- \exp \{ -(T-t)(q^i_+ +q^i_- +\delta )\}\right) , \end{aligned}$$
(51)

which is true under the first assumption of (43), because \(q_+^i< q^j_+\).
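The elementary integral estimate used to pass from (50) to (51) can be confirmed numerically:

```python
# With the constant lower bound c = q^i_+ + q^i_- + delta for a(t), the
# integral in (50) is dominated by int_t^T exp(-(s-t)c) ds, which equals
# (1 - exp(-(T-t)c))/c.  Quick numeric confirmation; values illustrative.
import math

c, t, T = 0.55, 0.0, 5.0
n = 100_000
h = (T - t) / n
# midpoint rule for int_t^T exp(-(s-t)c) ds
num = h * sum(math.exp(-((j + 0.5) * h) * c) for j in range(n))
assert abs(num - (1 - math.exp(-(T - t) * c)) / c) < 1e-8
print("integral identity in (51) confirmed")
```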

Similarly, to study a boundary point with \(g(jS)=g(iS)\) we find that

$$\begin{aligned} {\dot{g}}(jS)-{\dot{g}}(iS)= & {} (\lambda +\delta ) (g(jS)-g(iS)) -\left( q^j_- + \sum _k\beta _{kj} x_{kI}\right) (g(jI)-g(jS))\\&+\left( q^i_- +\sum _{k}\beta _{ki} x_{kI}\right) (g(iI)-g(iS)) -(w_S^j-w^i_S). \end{aligned}$$

Therefore, the condition for staying inside the cone (45) for a boundary point with \(g(jS)=g(iS)\) reads out as

$$\begin{aligned} (w_S^j-w^i_S)\ge & {} \left( q^i_- +\sum _{k}\beta _{ki} x_{kI}\right) (g(iI)-g(iS))\nonumber \\&-\left( q^j_- + \sum _k\beta _{kj} x_{kI}\right) (g(jI)-g(jS)). \end{aligned}$$
(52)

Now \(0 \le g(iI)-g(iS) \le g(jI)-g(jS)\), so that (52) is fulfilled if

$$\begin{aligned} (w_S^j-w^i_S) \ge \left( q^i_- +\sum _{k}\beta _{ki} x_{kI} -q^j_- - \sum _k\beta _{kj} x_{kI}\right) (g(iI)-g(iS)) \end{aligned}$$
(53)

for all times. Taking into account the requirement that all \(\beta _{ij}\) are sufficiently small, we find as above that it holds under the second assumptions of (43) and (44).

Similarly (in fact, even more simply) one shows that the condition \(g_t(jI)-g_t(jS) \ge 0\) remains valid for all times and all j.

The last statement of the theorem concerning x(s) follows from the observation that the eigenvalues of the linearized evolution of x(s) are negative and well separated from zero, implying the global stability of the fixed point of the evolution for sufficiently small \(\beta \). The last statement concerning g(s) follows by a similar stability argument for the linear evolution (42), taking into account that away from the initial point t the trajectory x(t) stays arbitrarily close to its fixed point. \(\square \)