Continuous-Time Mean Field Games with Finite State Space and Common Noise

We formulate and analyze a mathematical framework for continuous-time mean field games with finitely many states and common noise, including a rigorous probabilistic construction of the state process and existence and uniqueness results for the resulting equilibrium system. The key insight is that we can circumvent the master equation and reduce the mean field equilibrium to a system of forward-backward (random) ordinary differential equations by conditioning on common noise events. In the absence of common noise, our setup reduces to that of Gomes, Mohr and Souza (Appl Math Optim 68(1):99–143, 2013) and Cecchin and Fischer (Appl Math Optim 81(2):253–300, 2020).

Mean field games constitute a class of dynamic, multi-player stochastic differential games with identical agents. The key characteristic of the mean field approach is that (i) the payoff and state dynamics of each agent depend on other agents' decisions only through an aggregate statistic (typically, the aggregate distribution of states); and (ii) no individual agent's actions can change the aggregate outcome. Thus, in solving an individual agent's optimization problem, the feedback effect of his own actions on the aggregate outcome can be discarded, breaking the notorious vicious circle ("the optimal strategy depends on the aggregate outcome, which depends on the strategy, which depends …"). This significantly facilitates the identification of rational expectations equilibria. A standard assumption that further simplifies the analysis is that randomness is idiosyncratic (equivalently, there is no common noise), i.e. that the random variables appearing in one agent's optimization are independent of those in any other's. As a result, all randomness is "averaged out" in the aggregation of individual decisions, and the equilibrium dynamics of the aggregate distribution are deterministic.
In the literature, mean field games are most often studied in settings with a continuous state space and deterministic or diffusive dynamics, i.e. stochastic differential equations (SDEs) driven by Brownian motion. The corresponding dynamic programming equations thus become parabolic partial differential equations, and the aggregate dynamics are represented by a flow of Borel probability measures; see, e.g., the monographs [4] and [9] and the references therein. Formally, the mean field game is typically formulated in terms of a controlled McKean-Vlasov SDE, where the coefficients depend on the current state and control and the distribution of the solution; intuitively, these McKean-Vlasov dynamics codify the dynamics that pertain to a representative agent. The mathematical link to N -player games is subsequently made through suitable propagation of chaos results in the mean field limit N → ∞; see, e.g., [14,25,28,42,43]. In this context, the analysis of McKean-Vlasov SDEs has also seen significant progress recently; see, e.g., [6,8,19,48]. In the presence of common noise, i.e. sources of risk that affect all agents and do not average out in the mean field limit, the mathematical analysis becomes even more involved as the dynamics of the aggregate distribution become stochastic, leading to conditional McKean-Vlasov dynamics; see, e.g., [1,12,21,51]. We refer to [10] for background and further references on continuous-state mean field games with common noise.
There is also a strand of literature on mean field games with finite state spaces, including [2,15,18,24,30,31,34,49] as well as [9, §7.2]. In a recent article, [22] provide an extension of [31] to mean field interactions that occur not only through the agents' states, but also through their controls. To the best of our knowledge, however, to date there has been no extension of these results to settings that include common noise. In the context of finite-state mean field games, we are only aware of two contributions that include common stochasticity (both via the master equation and with a different focus/setting than this paper): [5] analyze the master equation for finite-state mean field games with common noise, and [3] include a common continuous-time Gaussian noise in the aggregate distribution dynamics.
In this article, we set up a mathematical framework for finite-state mean field games with common noise. Our setup extends that of [31] and [15] by including common noise events at fixed points in time. We provide a rigorous formulation of the underlying stochastic dynamics, and we establish a verification theorem for the optimal strategy and an aggregation theorem to determine the resulting aggregate distribution. This leads to a characterization of the mean field equilibrium in terms of a system of (random) forward-backward differential equations. The key insight is that, after conditioning on common noise configurations, we obtain classical piecewise dynamics subject to jump conditions at common noise times.
The remainder of this article is organized as follows: In Sect. 2 we set up the mathematical model, provide a probabilistic construction of the state dynamics, and formulate the agent's optimization problem. In Sect. 3 we state the dynamic programming equation and establish a verification theorem for the agent's optimization, given an ex ante aggregate distribution (Theorem 6). Section 4 provides the dynamics of the ex post distribution (Theorem 9) and, on that basis, a system of random forward-backward ODEs for the mean field equilibrium (Definition 10) as well as corresponding existence and uniqueness results (Theorems 13 and 16). In Sect. 5 we showcase our results in two benchmark applications: agricultural production and infection control. The Appendix provides the proofs of Theorems 13 and 16.

Mean Field Model
We first provide an informal description of the individual agents' state dynamics, optimization problem, and the resulting mean field equilibrium. The agent's state process $X = \{X_t\}$ takes values in the finite set $S$. Between common noise events, transitions from state $i$ to state $j$ occur with intensity $Q_{ij}(t, W_t, M_t, \nu_t)$, where $W_t$ represents the common noise events that have occurred up to time $t$; $M_t$ the time-$t$ aggregate distribution of agents; and $\nu_t$ the agent's control. In addition, upon the realization of a common noise event $W_k$ at time $T_k$, the state jumps from $X_{T_k-}$ to $X_{T_k} = J_{X_{T_k-}}(T_k, W_{T_k}, M_{T_k-})$. With this, the agent aims to maximize his expected reward, where $\psi$ and the terminal reward are suitable reward functions and the aggregate distribution process $M = \{M_t\}$ is determined by the function $\mu$. Here $\mu$ represents the aggregate distribution of states as a function of the common noise factors. We obtain a rational expectations equilibrium by determining $\mu$ such that the representative agent's ex ante expectations equal the ex post aggregate distribution resulting from all agents' optimal decisions, where $\nu$ and $\mu$ denote the equilibrium strategy and the equilibrium aggregate distribution. In the remainder of this section, we provide a rigorous mathematical formulation of this model.

Probabilistic Setting and Common Noise
Throughout, we fix a time horizon $T > 0$ and a finite set $\mathcal{W}$ and work on a probability space $(\Omega, \mathcal{A}, P)$ that carries a finite sequence $W_1, \dots, W_n$ of i.i.d. random variables that are uniformly distributed on $\mathcal{W}$. We refer to $W_1, \dots, W_n$ as common noise factors and to $P$ as the reference probability. The common noise factor $W_k$ is revealed at time $T_k$, where $0 = T_0 < T_1 < \dots < T_n < T_{n+1} = T$. Both $n$ and the common noise times $T_0, T_1, \dots, T_{n+1}$ are fixed and deterministic. The piecewise constant filtration $\mathbb{G} = \{\mathcal{G}_t\}$ generated by common noise events is given by $\mathcal{G}_t = \sigma(W_k : k \in [1:n],\, T_k \le t) \vee \mathcal{N}$, where $\mathcal{N}$ denotes the set of $P$-null sets. For each configuration of common noise factors $w \in \mathcal{W}^n$ we write $w_t := (w_1, \dots, w_k)$ for $t \in [T_k, T_{k+1})$, $k \in [0:n]$, and $w_T := w$. With this convention, $W = \{W_t\}$ represents a piecewise constant, $\mathbb{G}$-adapted process.
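The piecewise-constant revelation of common noise factors can be sketched in a few lines of Python. The following snippet is purely illustrative: the choices $\mathcal{W} = \{0, 1\}$, $n = 2$, the noise times, and the realized configuration are assumptions, not values from the paper.

```python
# Sketch: the piecewise-constant common-noise process W_t.
# Assumptions (illustrative only): W = {0, 1}, n = 2 factors revealed at
# T_1 = 1.0 and T_2 = 2.0, on the horizon [0, T] with T = 3.0.
import bisect

T = 3.0
noise_times = [1.0, 2.0]            # T_1, ..., T_n  (T_0 = 0, T_{n+1} = T)
factors = (1, 0)                    # a realized configuration (W_1, W_2)

def W_t(t):
    """Return the tuple of common-noise factors revealed up to time t."""
    k = bisect.bisect_right(noise_times, t)   # number of times T_k <= t
    return factors[:k]

# On [0, T_1) no factor is revealed; on [T_1, T_2) only W_1; on [T_2, T] both.
```

Note that `bisect_right` makes the process right-continuous at the noise times, matching the convention that $W_k$ is revealed *at* $T_k$.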
With a slight abuse of notation, if $f : [0, T] \times \mathcal{W}^n \to \mathbb{R}^m$ is non-anticipative, we write $f(t, W_t)$ in place of $f(t, W_1, \dots, W_n)$. Note that for $f$ regular, the one-sided limits $f(T_k-, w) := \lim_{t \uparrow T_k} f(t, w)$ exist for all $k \in [1:n]$, $w \in \mathcal{W}^n$.

Optimization Problem
The agent's state and action spaces are given by the finite set $S$ (with $|S| = d$) and a set $U$, respectively, and we identify the space of aggregate distributions on $S$ with the space $\mathcal{M}$ of probability vectors. The coefficients in the state dynamics and payoff functional are bounded and Borel measurable functions such that $Q(\cdot, \cdot, m, u)$, $\psi(\cdot, \cdot, m, u)$ and $J(\cdot, \cdot, m)$ are non-anticipative for all fixed $m \in \mathcal{M}$ and $u \in U$; $Q$ satisfies the intensity matrix conditions $Q_{ij}(t, w, m, u) \ge 0$ for $i, j \in S$, $i \ne j$, and $\sum_{j \in S} Q_{ij}(t, w, m, u) = 0$ for $i \in S$, for all $(t, w, m, u) \in [0, T] \times \mathcal{W}^n \times \mathcal{M} \times U$; and for each $k \in [1:n]$ the kernel $\kappa_k$ is Borel measurable with $\sum_{w_k \in \mathcal{W}} \kappa_k(w_k \mid w_1, \dots, w_{k-1}, m) = 1$ for all $w_1, \dots, w_{k-1} \in \mathcal{W}$ and $m \in \mathcal{M}$. We further suppose that $(\Omega, \mathcal{A}, P)$ supports, for each $i, j \in S$, $i \ne j$, a standard (i.e., unit intensity) Poisson process $N^{ij} = \{N^{ij}_t\}$ and an $S$-valued initial state $X_0$. The corresponding full filtration $\mathbb{F} = \{\mathcal{F}_t\}$ is generated by $X_0$, the Poisson processes and the common noise. Note that $\mathcal{G}_t \subseteq \mathcal{F}_t$ for all $t \in [0, T]$, that both $\mathbb{G}$ and $\mathbb{F}$ satisfy the usual conditions, and that $N^{ij}$ is a standard $(\mathbb{F}, P)$-Poisson process for $i, j \in S$, $i \ne j$. Given a regular, non-anticipative function $\mu$, the $\mathbb{G}$-adapted, $\mathcal{M}$-valued ex ante aggregate distribution $M = \{M_t\}$ is given by $M_t := \mu(t, W_t)$, and the agent's optimization problem (P$^\mu$) is to maximize his expected reward over the class $\mathcal{A}$ of admissible closed-loop strategies. Note that $\mathcal{A}$ subsumes the class of Markovian feedback controls considered in, e.g., [31] or [34], and that each $\nu \in \mathcal{A}$ canonically induces an $\mathbb{F}$-adapted, $U$-valued process. Here $\mathbb{E}^\nu$ denotes the expectation operator with respect to the probability measure $P^\nu$, and the agent's state process $X$ evolves by jumps driven by the Poisson processes, subject to jump conditions at the common noise times. Here $N^{ij}$ triggers transitions from state $i$ to state $j$, and $P^\nu$ is defined in such a way that $N^{ij}$ has $P^\nu$-intensity $Q_{ij}(t, W_t, M_t, \nu_t)$; see Lemma 2 below.
In summary, in order to formulate a mean field model within the above setting, it suffices to specify
• the agent's state space $S$, action space $U$ and the common noise space $\mathcal{W}$,
• the transition intensities $Q(t, w, m, u)$, transition kernels $\kappa_k(w_k \mid w_1, \dots, w_{k-1}, m)$ and common noise jumps $J(t, w, m)$, and finally
• the running reward $\psi(t, w, m, u)$ and the terminal reward, a function of $(w, m)$.
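The list of primitives above can be bundled into a single container. The following Python sketch is an illustration only: the two-state toy model, its intensities, and its rewards are invented for demonstration and are not part of the paper's specification.

```python
# Sketch: the model primitives of a finite-state MFG with common noise,
# bundled in one container. All concrete values below are illustrative
# assumptions (a toy two-state model), not the paper's specification.
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class FiniteStateMFG:
    S: Sequence[int]        # state space
    U: Sequence[float]      # action space
    W: Sequence[int]        # common-noise space
    Q: Callable             # Q(t, w, m, u) -> intensity matrix
    kappa: Callable         # kappa_k(w_k | w_1..w_{k-1}, m)
    J: Callable             # common-noise jump map J(t, w, m)
    psi: Callable           # running reward psi(t, w, m, u)
    terminal: Callable      # terminal reward, a function of (w, m)

# Toy two-state model: the action u is the rate of moving 0 -> 1.
toy = FiniteStateMFG(
    S=(0, 1), U=(0.0, 1.0), W=(0, 1),
    Q=lambda t, w, m, u: [[-u, u], [1.0, -1.0]],
    kappa=lambda k, w_hist, m: 0.5,          # uniform toy kernel
    J=lambda t, w, m: {0: 0, 1: 1},          # identity jump map
    psi=lambda t, w, m, u: m[1] - 0.5 * u**2,
    terminal=lambda w, m: 0.0,
)

# Intensity-matrix sanity check: off-diagonal >= 0, rows sum to zero.
Qm = toy.Q(0.0, (), (0.5, 0.5), 1.0)
```

The intensity-matrix conditions from the text (nonnegative off-diagonal entries, zero row sums) translate directly into the sanity check on `Qm`.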

State Dynamics
In what follows, we show that the preceding construction implies the dynamics described informally above.
Lemma 2 ($P^\nu$-dynamics) For each admissible strategy $\nu \in \mathcal{A}$, $P^\nu$ is a well-defined probability measure on $(\Omega, \mathcal{A})$, absolutely continuous with respect to $P$, and satisfies $P^\nu = P$ on $\sigma(X_0)$.

Moreover, $N^{ij}$ is a counting process with $P^\nu$-intensity $Q_{ij}(t, W_t, M_t, \nu_t)$.

Finally, for all $k \in [1:n]$ the conditional distribution of $W_k$ under $P^\nu$ is determined by the kernel $\kappa_k$, where $\mathbb{G}$ denotes the common noise filtration.

Proof We fix $\nu \in \mathcal{A}$ and split the proof into four steps.
Step 1: $P^\nu$ is well-defined by (1). Since $N^{ij}$ is a standard Poisson process under $P$, the compensated process $\tilde N^{ij}$ is an $(\mathbb{F}, P)$-martingale; moreover, $\vartheta$ is an $(\mathbb{F}, P)$-martingale. Indeed, for each $k \in [0:n]$ we have $\vartheta_t = \vartheta_{T_k}$ for $t \in [T_k, T_{k+1})$, and the martingale property follows using that $W_k$ is independent of $\mathcal{F}_{T_k-}$ and uniformly distributed on $\mathcal{W}$ under $P$. Hence the Doléans-Dade exponential $\mathcal{E}[\vartheta]$ is a local $(\mathbb{F}, P)$-martingale on $[0, T]$. Since $\Delta N^{ij}_{T_k} = 0$ for all $i, j \in S$, $i \ne j$, and $k \in [1:n]$ a.s., we have $[\theta^\nu, \vartheta] = 0$, and thus the process $Z^\nu$ is a local $(\mathbb{F}, P)$-martingale. By the bound (8) it follows that $\sup_{t \in [0,T]} |Z^\nu_t|$ is $P$-integrable, so $Z^\nu$ is in fact an $(\mathbb{F}, P)$-martingale. Since $Z^\nu$ is non-negative with $Z^\nu_0 = 1$ by construction, we conclude that $P^\nu$ is a well-defined probability measure on $(\Omega, \mathcal{A})$, absolutely continuous with respect to $P$, with density process $Z^\nu$. Step 2: $P^\nu$-intensity of $N^{ij}$. Let $i, j \in S$ with $i \ne j$. Since $P^\nu \ll P$ it is clear that $N^{ij}$ is a $P^\nu$-counting process, so it suffices to show that the corresponding compensated process is a local $(\mathbb{F}, P^\nu)$-martingale. To show this, by Step 1 it suffices to demonstrate that the product of $Z^\nu$ with the compensated process is a local $(\mathbb{F}, P)$-martingale; this follows using integration by parts. Step 3: $P^\nu = P$ on $\sigma(X_0)$. For any function $g : S \to \mathbb{R}$ we have $\mathbb{E}^\nu[g(X_0)] = \mathbb{E}[Z^\nu_T g(X_0)] = \mathbb{E}[g(X_0)]$ by the $(\mathbb{F}, P)$-martingale property of $Z^\nu$.
Step 4: Distribution of $W_k$ under $P^\nu$. Let $k \in [1:n]$ and $w_1, \dots, w_k \in \mathcal{W}$. Since $\mathcal{E}[\theta^\nu]_{T_k} = \mathcal{E}[\theta^\nu]_{T_k-}$ a.s. and $W_k$ is uniformly distributed on $\mathcal{W}$ and independent of $\mathcal{F}_{T_k-}$ under $P$, iterated conditioning yields the claimed conditional distribution, and the proof is complete.
Lemma 2 implies in particular that $P^\nu(\Delta N^{ij}_t \ne 0) = 0$ for every $t \in [0, T]$. As a consequence, the agent's ex ante beliefs concerning the common noise factors are the same irrespective of his control.

Solution of the Optimization Problem
In the following, we solve the agent's maximization problem (P μ ) using the associated dynamic programming equation (DPE). This is the same methodology as in [31] and [15]; see [22] for an alternative approach (to extended mean field games, but without common noise) based on backward SDEs.
The DPE for the value function of the agent's optimization problem (P$^\mu$) reads, for $i \in S$, as a system of backward equations, subject to suitable consistency conditions for $t = T_k$, $k \in [1:n]$, and a terminal condition at $t = T$.

Assumption 3 There exists a Borel measurable function $h$ attaining the supremum in the DPE.
Assumption 3 is satisfied, e.g., if $U$ is compact and $Q$ and $\psi$ are continuous with respect to $u \in U$. Note that, since $\psi_i(\cdot, \cdot, m, u)$ and $Q_{i\cdot}(\cdot, \cdot, m, u)$ are non-anticipative for $m \in \mathcal{M}$, $u \in U$, we can assume without loss of generality that $h$ is non-anticipative, and thus obtain the following reduced-form DPE, which we use in the following: $v$ is non-anticipative and satisfies the ordinary differential equation (DP$^\mu$). Observe that (DP$^\mu$) represents a system of (random) ODEs, coupled via $w \in \mathcal{W}^n$. The ODEs run backward in time on each segment $[T_k, T_{k+1}) \times \mathcal{W}^n$, $k \in [0:n]$, and their terminal conditions for $t \uparrow T_{k+1}$ are specified by (TC$^\mu$) for $k = n$ and by (CC$^\mu$) for $k < n$. Note that for $t \in [T_k, T_{k+1})$ the relevant common noise factors $W_1, \dots, W_k$ are known.
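On each segment, the reduced-form DPE is a backward ODE whose right-hand side is a pointwise maximization over actions. The following Python sketch integrates one such segment with explicit Euler; the two-state Hamiltonian (running rewards, intensities, action grid) is entirely invented for illustration and is not the paper's model.

```python
# Numerical sketch: solving the reduced-form DPE backward on one segment
# [T_k, T_{k+1}) for a toy two-state model. All coefficients below are
# illustrative assumptions. The Hamiltonian
#   H_i(v) = sup_u { psi_i(u) + sum_j Q_ij(u) v_j }
# is maximized over a finite action grid.

def hamiltonian(i, v, actions):
    # toy data: psi_i(u) = (1 if i == 1 else 0) - 0.5*u*u;
    # Q row for i=0: (-u, u); for i=1: (1, -1) (u only controls 0 -> 1).
    best = -float("inf")
    for u in actions:
        psi = (1.0 if i == 1 else 0.0) - 0.5 * u * u
        drift = (-u * v[0] + u * v[1]) if i == 0 else (v[0] - v[1])
        best = max(best, psi + drift)
    return best

def solve_segment_backward(v_terminal, t0, t1, steps=1000):
    """Explicit Euler for dv_i/dt = -H_i(v), run from t1 down to t0."""
    actions = [k / 10.0 for k in range(11)]   # grid on U = [0, 1]
    dt = (t1 - t0) / steps
    v = list(v_terminal)
    for _ in range(steps):
        v = [v[i] + dt * hamiltonian(i, v, actions) for i in range(2)]
    return v

v = solve_segment_backward([0.0, 0.0], 0.0, 1.0)
```

Here state 1 earns a running reward, so its value exceeds that of state 0, and the optimal entry effort from state 0 grows with the value gap $v_1 - v_0$.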

Remark 5
While the significance of the DPE (DP$^\mu$) and the terminal condition (TC$^\mu$) is clear, the consistency conditions (CC$^\mu$) warrant a brief comment: For $i \in S$, $k \in [1:n]$ and $w \in \mathcal{W}^n$ the state process jumps from state $i$ to state $J_i(T_k, w, \mu(T_k-, w))$ at time $T_k$, and the consistency conditions align the value function with these jumps.

We next link the solution of the DPE to the underlying stochastic control problem.

Theorem 6 (Verification) Suppose $\mu : [0, T] \times \mathcal{W}^n \to \mathcal{M}$ is regular and non-anticipative and $v$ is a solution of (DP$^\mu$) subject to (CC$^\mu$) and (TC$^\mu$). Then $v$ is the agent's value function for problem (P$^\mu$), and an optimal control is given by the maximizer $h$ from Assumption 3 evaluated along the state process.

Proof Let $\nu \in \mathcal{A}$ be an admissible strategy. Until further notice we fix $k \in [0:n]$.
Step 1: Dynamics between common noise times. Step 2: Jump dynamics at $T_k$. We recall from Lemma 2 that the common noise transitions are governed by the kernels $\kappa_k$. In view of the jump dynamics (3) and the consistency condition (CC$^\mu$), we thus obtain (12). Step 3: Optimality. Combining (11) and (12) for $k \in [1:n]$ and using (TC$^\mu$) yields (13). Since $N^{ij}$ is a counting process with bounded compensator and $v$ and $Q$ are bounded, $M^{ij}$ is in fact an $(\mathbb{F}, P^\nu)$-martingale. Hence, taking $P^\nu$-expectations in (13), using the tower property of conditional expectation and the fact that $P^\nu$ and $P$ coincide on $\sigma(X_0)$ by Lemma 2, and finally that $v$ solves the DPE, we obtain (14). If we replace $\nu$ by the optimal strategy from the statement of the theorem, the same argument applies with equality in (14); we thus conclude that $v$ is the value function of (P$^\mu$), and that this strategy is optimal.
The optimal strategy is Markovian in the agent's state; this is unsurprising given the literature, see e.g. [31, Theorem 1], [22, Proposition 3.9] and [15, Theorem 4]. Note, however, that the time-$t$ optimal strategy may depend on all common noise events that have occurred up to time $t$, as $W_t = (W_1, \dots, W_k)$ for $t \in [T_k, T_{k+1})$. In the following, we denote by $\hat P := P^{\hat\nu}$ the probability measure associated with the optimal control $\hat\nu$ specified in Theorem 6. It follows from Lemma 2 that the distribution of the common noise factors under $\hat P$ is governed by the kernels $\kappa_k$.

Equilibrium
Having solved the agent's optimization problem for a given ex ante function $\mu$, we now turn to the resulting mean field equilibrium. We first identify the aggregate distribution resulting from the optimal control.

Remark 7 This paper generally adopts a "representative agent" point of view; an alternative justification of mean field equilibrium is via convergence of Nash equilibria of symmetric $N$-player games in the limit $N \to \infty$; see, among others, [2,14,15,18,20,22,24,28]. In the setting of this article (albeit under additional regularity conditions) a mean field limit justification can be provided along the lines of the proof of Theorem 7 in [31] by conditioning on common noise configurations, similarly as in the proof of Theorem 9 below.

Aggregation
Given an ex ante aggregate distribution specified in terms of a regular, non-anticipative function $\mu$ and a corresponding solution $v$ of (DP$^\mu$) subject to (CC$^\mu$), (TC$^\mu$), Theorem 6 yields an optimal strategy $\hat\nu$ for the agent's optimization problem (P$^\mu$). With $\hat P$ denoting the probability measure associated with $\hat\nu$, the resulting ex post aggregate distribution is given by the $\mathcal{M}$-valued, $\mathbb{G}$-adapted process $\hat M = \{\hat M_t\}$, $\hat M_t := \hat P(X_t \in \cdot \mid \mathcal{G}_t)$, where $\mathbb{G}$ denotes the common noise filtration. We note that $\hat M$ is càdlàg since $\mathbb{G}$ is piecewise constant and $X$ is càdlàg. Equilibrium obtains if $\hat M_t = \mu(t, W_t)$ for all $t \in [0, T]$. To proceed, we aim for a more explicit description of $\hat M$ and, in particular, its dynamics. Thus we define, for $k \in [1:n]$, the jump matrices $P^k : \mathcal{W}^n \times \mathcal{M} \to \{0, 1\}^{d \times d}$ via $P^k_{ij}(w, m) := 1_{\{J_i(T_k, w, m) = j\}}$, and we set $m_0 := \hat P(X_0 \in \cdot) = P(X_0 \in \cdot) \in \mathcal{M}$.
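The $\{0,1\}$-valued matrix $P^k$ simply encodes the jump map $J$ row by row, and pushing a distribution through the common-noise jump amounts to a vector-matrix product. The following Python sketch illustrates this; the three-state toy jump map is an assumption for demonstration only.

```python
# Sketch: the {0,1}-valued matrix P^k encoding the common-noise jump map
# J at time T_k: row i carries a single 1 in column J_i(T_k, w, m).
# The toy jump map (states 0 and 1 reset to 0, state 2 kept) is an
# illustrative assumption.

def jump_matrix(J_at_Tk, d):
    """Build P^k in {0,1}^{d x d} from the map i -> J_i(T_k, w, m)."""
    P = [[0] * d for _ in range(d)]
    for i in range(d):
        P[i][J_at_Tk(i)] = 1
    return P

P = jump_matrix(lambda i: 0 if i < 2 else 2, 3)

def apply_jump(m, P):
    """Push a row distribution m through P: m_{T_k} = m_{T_k-} * P^k."""
    d = len(m)
    return [sum(m[i] * P[i][j] for i in range(d)) for j in range(d)]

m_post = apply_jump([0.5, 0.3, 0.2], P)
```

Since every row of $P^k$ sums to one, total mass is preserved across the jump, which is exactly why $\hat M$ remains $\mathcal{M}$-valued at the common noise times.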
Lemma 8 Suppose $y : [0, T] \times \mathcal{W}^n \to \mathbb{R}^{1 \times d}$ is regular and non-anticipative and satisfies the consistency conditions at the common noise times. Then the process $Y := \{y(t, W_T)\}$ is $\mathbb{G}$-adapted.

Proof
Step 1: Existence and uniqueness of Carathéodory solutions. For each $k \in [0:n]$ and $w \in \mathcal{W}^n$, since $\mu$ and $v$ are regular and $Q$ is bounded, the relevant vector field is measurable in the first and Lipschitz continuous in the second argument. Thus, using that $\mu$, $v$ and $Q$ are non-anticipative, a classical result, see [36, Theorem I.5.3], implies that for each initial condition $y \in \mathbb{R}^{1 \times d}$ there exists a unique Carathéodory solution $\varphi$. Step 2: From uniqueness in Step 1 it follows that we have the desired representation of $Y$.

Theorem 9 Suppose $v$ is a solution of (DP$^\mu$) subject to (CC$^\mu$), (TC$^\mu$), and the agent implements his optimal strategy $\hat\nu$ as defined in Theorem 6. Then the aggregate distribution $\hat M$ has the $\hat P$-dynamics (M) and satisfies the initial condition (M$_0$) and the jump conditions (M$_k$).

Proof Let $w \in \mathcal{W}^n$ be a common noise configuration. Since $X$ is defined path by path, see (2) and (3), we first note that $X = X^w$ on $\{W_T = w\}$, where $X^w$ satisfies (2) and the corresponding jump conditions. Using analogous arguments as in Step 1 of the proof of Lemma 2 (see in particular (4) and (7)), it follows that there exists a probability measure $P^w$ with density process $\zeta(w)$ relative to the filtration $\mathbb{H} = \{\mathcal{H}_t\}$. Furthermore, in view of (4) and (15) we have the required compatibility. Step 1: Conditional Kolmogorov dynamics. Throughout Step 1, we fix a common noise configuration $w \in \mathcal{W}^n$.
It follows exactly as in the proof of Lemma 2 (with $P^w$ in place of $P^\nu$) that $P^w \ll P$, $P^w = P$ on $\sigma(X_0)$, and that for $i, j \in S$, $i \ne j$, the process $N^{ij}$ is a counting process with the corresponding $(\mathbb{H}, P^w)$-intensity. Boundedness of $Q$ implies that for each $z \in \mathbb{R}^d$ the associated exponential process is a true martingale. Using Itô's lemma, taking expectations with respect to $P^w$ and using Fubini's theorem yields (20). It follows from (20) that $\eta(w)$ satisfies the corresponding segment dynamics for all $i \in S$ and $k \in [0:n]$. Moreover, since $P^w = P$ on $\sigma(X_0)$ and $X^w_0 = X_0$, $\eta(w)$ satisfies the initial condition. Finally, consider a common noise time $t = T_k$ and note that for all $i \in S$ the jump condition (18) implies the corresponding consistency condition. Since $\eta(W_T) = \sum_{w \in \mathcal{W}^n} 1_{\{W_T = w\}}\, \eta(w)$, in view of (22), (23) and (24) it follows from Lemma 8 that the process $\eta(W_T)$ is $\mathbb{G}$-adapted.
Step 2: Identification of $\eta(W_T)$. Recall that $\mathcal{G}_T = \sigma(W_T) \vee \mathcal{N}$ and let $w \in \mathcal{W}^n$. For $t \in [0, T]$ and $i \in S$ we have, by (6) and (19), a chain of identities in which the first is due to Lemma 2 and the $P$-independence of $(\zeta(w), X^w)$ and $\mathcal{G}_T$, and the second is due to (21) and the fact that $P(W_T = w) = 1/|\mathcal{W}|^n$. Step 3: Dynamics of $\hat M$. By Step 2 and the tower property of conditional expectation, we find that the claimed dynamics hold for each $i \in S$ and $t \in [0, T]$, where the final identity is due to the fact that $\eta(W_T)$ is $\mathbb{G}$-adapted by Step 1 and $\mathbb{E}$ denotes $P$-expectation. As a by-product, the preceding proof yields the alternative representation $\hat M_t = \hat P(X_t \in \cdot \mid \mathcal{G}_T)$ for $t \in [0, T]$, $\hat P$-a.s.
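Between common noise times, the aggregate distribution evolves according to Kolmogorov forward dynamics, $\dot{M}_t = M_t \, Q$. The following Python sketch integrates these dynamics with explicit Euler for a constant two-state intensity matrix; this matrix is an illustrative assumption standing in for the optimally controlled $Q$.

```python
# Numerical sketch of the forward Kolmogorov dynamics between common
# noise times: d/dt M_t = M_t * Q, solved by explicit Euler. The constant
# two-state intensity matrix below is an illustrative assumption.

def kolmogorov_forward(m0, Q, t0, t1, steps=2000):
    d = len(m0)
    dt = (t1 - t0) / steps
    m = list(m0)
    for _ in range(steps):
        m = [m[j] + dt * sum(m[i] * Q[i][j] for i in range(d))
             for j in range(d)]
    return m

Q = [[-1.0, 1.0], [2.0, -2.0]]       # intensity matrix: rows sum to zero
m = kolmogorov_forward([1.0, 0.0], Q, 0.0, 5.0)
# The stationary distribution of this Q is (2/3, 1/3).
```

Because the rows of $Q$ sum to zero, each Euler step preserves total mass exactly, mirroring the fact that $\hat M$ stays in $\mathcal{M}$ between jumps.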

Mean Field Equilibrium System
As discussed above, equilibrium obtains if the agents' ex ante beliefs coincide with the ex post outcome. This holds if and only if the ex post aggregate distribution process $\hat M$ from (M) satisfies the forward-backward equations (E1)-(E2), the consistency conditions (E3)-(E4) for $k \in [1:n]$, and the initial/terminal conditions (E5)-(E6). We also refer to (E1)-(E6) as the equilibrium system.
In combination, Theorem 6 and Theorem 9 demonstrate that, given a solution $(\mu, v)$ of the equilibrium system, $v$ is the value function of the agent's optimization problem (P$^\mu$) with ex ante aggregate distribution $\mu$, and the ex post distribution resulting from the corresponding optimal strategy is given by $\mu$ itself. Thus we can identify a mean field equilibrium with common noise by producing a solution of the equilibrium system (E1)-(E6); we provide some illustrations in Sect. 5. In (25), the constant $v_{\max}$ is defined as the bound on the terminal reward plus $T \cdot \psi_{\max} \cdot e^{Q_{\max} \cdot T}$.
Note that these constants depend only on the underlying model coefficients.

Assumption 11 (i) The reduced-form running reward function ψ satisfies
(iii) The terminal reward function is continuous with respect to $m$, i.e. for every $w \in \mathcal{W}^n$ the terminal reward, as a function of $m$, is continuous.
(iv) For each $k \in [1:n]$ and all $i \in S$, $w \in \mathcal{W}^n$ and $v \in \mathbb{R}^d$ with $\|v\| \le v_{\max}$, the corresponding map is continuous in $m$.
(v) For each $k \in [1:n]$ and $w \in \mathcal{W}^n$ the map $P^k(w, \cdot)$ is continuous.
Since all norms on $\mathbb{R}^d$ are equivalent, the concrete specification is immaterial for Assumption 11. For the sake of convenience, in the following we use the maximum norm on $\mathbb{R}^d$ and a compatible matrix norm on $\mathbb{R}^{d \times d}$; moreover, we suppose that (ii) holds for both $Q$ and its transpose $Q^{\mathsf T}$.

Remark 12
Sufficient conditions for Assumptions 11(i)-(ii) in terms of the model's primitives can be found in, e.g., [31] or [15]. Furthermore, in the special case where the jump map J is independent of m ∈ M, Assumption 11(v) is trivially satisfied, and continuity of the transition kernels κ k with respect to m is sufficient for Assumption 11(iv) to hold.

Theorem 13 (Existence of Equilibria) If Assumption 11 holds, then there exists a solution of the equilibrium system (E1)-(E6).
Proof See Appendix A.
The reduced-form Hamiltonian $H$ collects the maximized drift and running reward from (DP$^\mu$).

Assumption 14 Let Assumptions 11(i) and (ii) hold, and suppose that:
(i) The terminal payoff function is monotone with respect to $m \in \mathcal{M}$.
(iii) The reduced-form Hamiltonian $H$ satisfies a uniform monotonicity condition with respect to $m \in \mathcal{M}$, i.e. there exist $\alpha, \gamma > 0$ such that the corresponding inequality holds uniformly.
(iv) For $k \in [1:n]$ the maps $\kappa_k$ and $J$ satisfy monotonicity conditions in $m \in \mathcal{M}$ for all $w \in \mathcal{W}^n$ and $m_1, m_2 \in \mathcal{M}$.
The constant $v_{\max} > 0$ in 14(ii)-(iv) is defined in (25). Conditions 14(i)-(iii) are standard given the literature; see, e.g., Assumptions 1-3 in [31].

Applications
Before we illustrate our results in two showcase examples, we briefly discuss our numerical approach to the equilibrium system (E1)-(E6). (E1)-(E2) is a forward-backward system of $2d$ ODEs with boundary conditions (E3)-(E6), coupled through the parameter $w \in \mathcal{W}^n$ representing common noise configurations. The special case $n = 0$ (no common noise) corresponds to the setting of [31] and [15], with the equilibrium system reducing to a single $2d$-dimensional forward-backward ODE. For $n \ge 1$, the consistency conditions (E3)-(E4) specify initial conditions for $\mu$ on $[T_k, T_{k+1})$ and terminal conditions for $v$ on $[T_{k-1}, T_k)$, $k \in [1:n]$; since these conditions are interconnected, there is in general no segment $[T_k, T_{k+1}) \times \mathcal{W}^n$ on which the equilibrium system yields both an explicit initial condition for $\mu$ and an explicit terminal condition for $v$, so we cannot simply split the problem into subintervals. Rather, the equilibrium system can be regarded as a multi-point boundary value problem where for each of the $|\mathcal{W}|^k$ conceivable combinations of common noise factors on $[T_k, T_{k+1})$, $k \in [0:n]$, we have to solve a coupled forward-backward system of ODEs in $2d$ dimensions, resulting in a tree of such systems whose size grows geometrically in $n$. Our approach to solving (E1)-(E6) numerically is to rely on the probabilistic interpretation as a fixed-point system, based on Theorem 13. Thus, starting from an initial flow of probability weights $\mu^0(t, w)$, $(t, w) \in [0, T] \times \mathcal{W}^n$, with $\mu^0(0, w) = m_0$ for all $w \in \mathcal{W}^n$, we solve (DP$^\mu$) subject to (TC$^\mu$) and (CC$^\mu$) backward in time for all non-negligible common noise configurations $w \in \mathcal{W}^n$ to obtain the value $v^0(t, w)$, $(t, w) \in [0, T] \times \mathcal{W}^n$, of the agents' optimal response to the given belief $\mu^0$. This, in turn, is used to solve (M) subject to (M$_0$) and (M$_k$) forward in time. As a result, we obtain an ex post aggregate distribution $\mu^1(t, w)$, $(t, w) \in [0, T] \times \mathcal{W}^n$; we then iterate this procedure with $\mu^1$ in place of $\mu^0$, and so on.
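The fixed-point iteration just described has a simple structural skeleton: alternate a backward solve given the current belief with a forward solve given the resulting value, until the belief stops changing. The following Python sketch shows only that skeleton; the backward and forward solvers are toy scalar stand-ins (damped affine maps), chosen so the iteration pattern and its termination test are visible, and are not the paper's solvers.

```python
# Structural sketch of the fixed-point iteration mu^0 -> v^0 -> mu^1 -> ...
# The two "solvers" below are toy stand-ins for the backward DPE solve
# and the forward aggregation solve; both are illustrative assumptions.

def backward_solve(mu):
    # stand-in for solving (DP_mu) with (TC_mu), (CC_mu) given belief mu
    return [0.5 * x for x in mu]

def forward_solve(v):
    # stand-in for solving (M) with (M_0), (M_k) given the value v
    return [0.2 + 0.5 * x for x in v]

def equilibrium_iteration(mu0, tol=1e-10, max_iter=100):
    mu = list(mu0)
    for it in range(max_iter):
        v = backward_solve(mu)
        mu_next = forward_solve(v)
        if max(abs(a - b) for a, b in zip(mu, mu_next)) < tol:
            return mu_next, it
        mu = mu_next
    return mu, max_iter

mu_star, iters = equilibrium_iteration([1.0, 1.0])
# fixed point of the composite map x -> 0.2 + 0.25 x is x = 0.2 / 0.75
```

In this toy the composite map is a contraction, so the iteration converges geometrically; in the actual equilibrium system convergence is a substantive question addressed by Theorems 13 and 16.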

A Decentralized Agricultural Production Model
As a first (stylized) example we consider a mean field game of agents, each of whom owns (an infinitesimal amount of) land of identical size and quality within a given area. If it is farmed, each field has a productivity $f(w_k) > 0$ depending on the common weather condition $w_k$. We assume that weather is either good, bad or catastrophic, so $w_k \in \mathcal{W} := \{\uparrow, \downarrow, \text{↯}\}$, and changes at given common noise times $T_1, \dots, T_n$. Each agent is in exactly one state $i \in S := \{0, 1\}$ depending on whether he grows crops on his field ($i = 1$, the agent is a farmer) or not ($i = 0$). The selling price $p$ for his harvest depends on aggregate production, and thus in particular on the proportion $m_1 \in [0, 1]$ of farmers; the mean field interaction is transmitted through the market price of the crop. We assume that $p$ is a strictly decreasing function of overall production $f(w_k) \cdot m_1$; see Fig. 1 for an illustration.
We assume that $f(\uparrow) \ge f(\downarrow) = f(\text{↯}) \ge 0$. Moreover, on the catastrophic event $\{W_k = \text{↯}\}$ all agents are reduced to being non-farmers, so the common noise jump map sends both states to $i = 0$. Each agent can make an effort $u \in U := [0, \infty)$ to become a farmer; the intensity matrix for state transitions is specified in terms of given transition rates $q_{\mathrm{entry}}, q_{\mathrm{exit}} \ge 0$. The running rewards capture the fact that both efforts to build up farming capacities and production itself are costly, while revenues from the sales of the crop generate profits; here $w_0 := {\uparrow}$ and $c_{\mathrm{entry}}, c_{\mathrm{prod}} \ge 0$. The terminal reward is zero. It follows that the maximizer $h_0$ in Assumption 3 is unique; a specification of $h_1$ is immaterial. (Fig. 1 shows the price function $p$ with parameters as in Table 1.) We choose $m^1_0 := 10\%$ for the initial proportion of farmers, and report the relevant coefficients in Table 1.
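The mean field interaction runs entirely through the price of the crop. The following Python sketch illustrates a strictly decreasing price function; the functional form $p(x) = p_{\max}/(1+x)$ and all numerical values are hypothetical assumptions chosen for illustration (the paper's actual parameters are in its Table 1).

```python
# Illustrative price function for the agricultural model: the crop price
# is strictly decreasing in aggregate production f(w) * m_1. The form
# p(x) = p_max / (1 + x) and all numbers are hypothetical assumptions.

def price(production, p_max=2.0):
    return p_max / (1.0 + production)

f = {"good": 1.5, "bad": 0.5, "cat": 0.5}   # productivity per weather state
m1 = 0.10                                   # initial proportion of farmers

p_good = price(f["good"] * m1)
p_bad = price(f["bad"] * m1)
# Worse weather -> less production -> higher equilibrium price.
```

This captures the two channels discussed below: a direct price jump when productivity changes at a common noise time, and an indirect effect through the proportion $m_1$ of farmers.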
Our results for the evolution of the mean field equilibrium are shown in Figs. 2 and 3 for various common noise configurations $w \in \mathcal{W}^n$ and two baseline models. To illustrate the effect of uncertainty about future weather conditions we also show, for each common noise configuration, the theoretical perfect-foresight equilibria that would pertain if future weather conditions were known; these are plotted using dashed lines in Figs. 3, 4 and 5, and the subscript $\bullet$ indicates the relevant deterministic common noise path. Equilibrium prices are stochastically modulated by the prevailing weather conditions, both directly and indirectly: First, prices jump at common noise times due to weather-related changes in productivity. Second, weather conditions indirectly affect market prices through their effect on the proportion of farming agents. Thus, with consistently good weather conditions, agents are strongly incentivized to become farmers, see Fig. 4; the fraction of farmers increases, see Fig. 3; and hence increased production drives down prices, see Fig. 5. By contrast, under bad weather conditions, incentives are weaker and prices remain higher. Both effects are dampened if a catastrophic event may occur. In addition, efforts tend to decrease between common noise times; this is due to the uncertainty about future weather conditions, and the effect is more pronounced in the presence of catastrophic events.

An SIR Model with Random One-Shot Vaccination
Our second application is a mean field game of agents that are confronted with the spread of an infectious disease. Our main focus is to illustrate the qualitative effects of common noise on the equilibrium behavior of the system. We consider a classical susceptible-infected-recovered (SIR) setting with $S := \{S, I, R\}$, where $q_{IR} \ge 0$ denotes the recovery rate of infected agents and the infection rate is specified in terms of a given maximum rate $q_{SI} \ge 0$. The running reward penalizes both protection efforts and time spent in the infected state; it is specified in terms of constants $c_P, \psi_I \ge 0$. In addition, we include the possibility of a one-shot vaccination that becomes available, simultaneously to all agents, at a random point of time $\tau \in \{T_1, \dots, T_n\} \subset (0, T)$. We set $\mathcal{W} := \{0, 1\}$ and identify the $k$th unit vector $e_k = (\delta_{kj})_{j \in [1:n]} \in \mathcal{W}^n$, $k \in [1:n]$, with the indicator of the event $\{\tau = T_k\}$. The event that no vaccine is available until $T$ is represented by $0 \in \mathcal{W}^n$; we set $\tau := +\infty$ in this case. If and when it is available, all susceptible agents are vaccinated instantaneously, rendering them immune to infection; thus the common noise jump map sends state $S$ to state $R$ at time $\tau$. The probability of vaccination becoming available is proportional to the percentage of agents that have already recovered from the disease: for $k \in [1:n]$, $w_1, \dots, w_k \in \mathcal{W}$ and $m \in \mathcal{M}$ we specify $\kappa_k(1 \mid w_1, \dots, w_{k-1}, m)$ proportionally to $m_R$ with factor $\alpha \in (0, 1]$. As a consequence, a maximizer as required in Assumption 3 exists for all $(i, t, w, m, v) \in S \times [0, T] \times \mathcal{W} \times \mathcal{M} \times \mathbb{R}^3$.

Remark 17 (SIR Models in the Literature) Note that, given the above specification of the transition matrix $Q$, the forward dynamics (E1) within the equilibrium system (E1)-(E6) take the form of controlled SIR dynamics. Disregarding common noise, these constitute a ramification of the classical SIR dynamics, which are a basic building block of numerous compartmental epidemic models in the literature; see, among others, [32,37,38,47] and the references therein.
The SIR mean field game with controlled infection rates, albeit without common noise, has recently been studied in the independent article [26]; we also refer to [46] and [23] for mean field models with controlled vaccination rates. Mathematically similar contagion mechanisms also appear in, e.g., [40,41], §7.2.3 in [9], §7.1.10 in [10], or §4.4 in [52].
While Theorem 13 guarantees existence of a mean field equilibrium for (a variant of) the SIR model, the monotonicity conditions of Theorem 16 do not hold in this setup. Nevertheless, our numerical results reliably yield consistent equilibria. For our illustrations, the initial distribution of agents is given by $m_0 := (0.995, 0.005, 0.000)$, and the model coefficients are reported in Table 2. Note that there are $n = 1999$ common noise times $T_k = k \cdot 0.01$, $k = 1, \dots, 1999$, at which a vaccine can be administered. The specifications of $q_{SI}$ and $q_{IR}$ imply a basic reproduction number $R_0 := q_{SI}/q_{IR} = 15$ in the absence of vaccination and protection efforts.
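To make the scale of this parameterization concrete, the following Python sketch simulates the uncontrolled SIR forward dynamics with $R_0 = 15$. The multiplicative form of the effort discount $(1-u)$ on the infection intensity is an illustrative assumption, as are the horizon and step size; the sketch disregards vaccination.

```python
# Sketch of SIR forward dynamics: susceptibles are infected at rate
# q_SI * m_I * (1 - u) and recover at rate q_IR. The effort discount
# (1 - u), the horizon, and the step size are illustrative assumptions;
# parameters mimic R_0 = q_SI / q_IR = 15. Vaccination is disregarded.

def sir_step(m, u, q_SI, q_IR, dt):
    mS, mI, mR = m
    lam = q_SI * mI * (1.0 - u)          # controlled infection intensity
    return (mS - dt * lam * mS,
            mI + dt * (lam * mS - q_IR * mI),
            mR + dt * q_IR * mI)

def simulate(m0, u, q_SI=15.0, q_IR=1.0, T=20.0, steps=20000):
    m = m0
    for _ in range(steps):
        m = sir_step(m, u, q_SI, q_IR, T / steps)
    return m

m_end = simulate((0.995, 0.005, 0.0), u=0.0)
```

With $R_0 = 15$ and no protection effort, essentially the whole population passes through the infected state, consistent with the convergence to the stationary distribution $(0, 0, 1)$ observed below.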
Our results for the mean field equilibrium distributions of agents $\mu$ and the corresponding optimal protection efforts of susceptible agents $h_S$ are displayed in Figs. 7, 8, and 9 for different common noise configurations, i.e. vaccination times $\tau$. As in Sect. 5.1, we also display the corresponding (theoretical) perfect-foresight equilibria, marked by the subscript $\bullet$.
Note that an agent's running reward is the same in state $S$ with zero protection effort and in state $R$; agents are penalized relative to these in state $I$ and hence aim to avoid that state. Susceptible agents can reach the state $R$ of immunity in two ways: First, they can become infected and overcome the disease; second, they can be vaccinated and jump instantly from state $S$ to state $R$. While the first alternative is painful, the second comes at no cost and is hence clearly preferable. However, as the availability of a vaccine cannot be directly controlled by the agents, they can only protect themselves against infection at a certain running cost until the vaccine becomes available. Figures 7, 8, and 9 demonstrate that the possibility of vaccination as a common noise event can dampen the spread of the disease and lower the peak infection rate. This is due to an increase in agents' protection efforts during the time period when the proportion of infected agents is high. By contrast, in the perfect-foresight equilibria where the vaccination date is known, agents do not make substantial protection efforts until the vaccination date is imminent, see Figs. 8 and 9; in the scenario without vaccination, see Fig. 7, protection efforts are only ever made by a very small fraction of the population. With perfect foresight, the agents' main rationale is to avoid being in state $I$ when the vaccine becomes available. This highlights the importance of being able to model the vaccination date as a (random) common noise event. Finally, observe that our numerical results indicate convergence to the stationary distribution $\mu = (0, 0, 1) \in \mathcal{M}$, showing that the model is able to capture the entire evolution of an epidemic.
Funding Open Access funding enabled and organized by Projekt DEAL.

A Appendix: Proof of Theorem 13
Let E ⊆ R^d and define the space for k ∈ [0 : n], and for k ∈ [1 : n], where for all w_1, …, w_{k−1} ∈ W the family {γ_k(w_{−k}, w_k)}_{w_k ∈ W} consists of probability weights on W. Then we have where C := ρ + αT + (n + 1)ϑ · e^{βT}.
By construction, v̄(·, w) solves (E2) on [T_k, T_{k+1}] and does not depend on w_{k+1}, …, w_n. Having constructed v̄ on [T_k, T_{k+1}] × W^n, we use (E4) and define for w ∈ W^n. By (10) and the fact that μ and J are non-anticipative, it follows that this definition does not depend on w_k, …, w_n. Consequently, the above construction can be iterated, and hence we obtain v̄ as the unique solution of (E2) subject to (E4) and (E6). By definition, v̄ is non-anticipative and regular, i.e. v̄ ∈ Reg(R^d).

Lemma A.3 Suppose that Assumption 11 is satisfied and let
Then there is a unique solution μ̄ of (E1) subject to (E3) and (E5), and we have μ̄ ∈ Reg(M).
Proof The proof is analogous to (but somewhat simpler than) that of Lemma A.2.

Proof of Theorem 13
We divide the proof into four steps: Step 1: Solution operators. We define the operator χ_1 : D(M) → Reg(R^d), μ ↦ v̄, where v̄ ∈ Reg(R^d) is the unique solution of (E2) subject to (E4) and (E6) given μ ∈ D(M); χ_1 is well-defined by Lemma A.2. Moreover, let χ_2 : D(R^d) → Reg(M), v ↦ μ̄, where μ̄ ∈ Reg(M) is the unique solution of (E1) subject to (E3) and (E5) given v ∈ D(R^d); χ_2 is well-defined by Lemma A.3.
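As a schematic summary of this step, the two solution operators and their composition can be displayed as follows; the labels χ_1 and χ_2 are ours, chosen for exposition, and need not match the paper's original typography:

```latex
% Sketch of the fixed-point scheme; the labels \chi_1, \chi_2 are
% expository and not necessarily the original notation.
\[
  \chi_1 \colon D(\mathcal{M}) \to \operatorname{Reg}(\mathbb{R}^d),
  \quad \mu \mapsto \bar v,
  \qquad
  \chi_2 \colon D(\mathbb{R}^d) \to \operatorname{Reg}(\mathcal{M}),
  \quad v \mapsto \bar\mu,
\]
\[
  \chi := \chi_2 \circ \chi_1 \colon
  D(\mathcal{M}) \to \operatorname{Reg}(\mathcal{M}),
  \qquad
  \chi[\mu] = \mu
  \;\Longrightarrow\;
  (\mu, v) \text{ with } v := \chi_1[\mu] \text{ solves (E1)--(E6)}.
\]
```

A fixed point of the composed map χ thus delivers a consistent pair of equilibrium distribution and value function.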
Step 4: Construction of the fixed point. Let χ : D(M) → Reg(M), χ := χ_2 ∘ χ_1, and note that χ is continuous with respect to ‖·‖_sup by Steps 2 and 3. For k ∈ [0 : n] and w ∈ W^n we define the sequence {μ_ℓ^{(k,w)}}_{ℓ∈ℕ} ⊆ C([T_k, T_{k+1}]; M) and note that by the Arzelà-Ascoli theorem it contains a uniformly convergent subsequence. Taking sub-subsequences for k ∈ [0 : n] and w ∈ W^n, we obtain a subsequence {μ_{ℓ_ν}}_{ν∈ℕ} such that ‖μ_{ℓ_ν} − μ‖_sup → 0 as ν → ∞ for some μ ∈ D(M). It is easy to see that μ ∈ Lip(M), and thus Lip(M) is indeed compact. Now Schauder's fixed point theorem implies that the continuous map χ : Lip(M) → Lip(M) has a fixed point μ ∈ Lip(M); upon setting v := χ_1[μ] ∈ Reg(R^d) it follows that μ = χ_2[v] ∈ Reg(M) and that (μ, v) is a solution of (E1)-(E6).