1 Introduction

Since the seminal contributions of Lasry and Lions [44] and Huang, Malhamé and Caines [39], mean field games have become an active field of mathematical research with a wide range of applications, including economics [13, 16, 27, 33, 41, 50], sociology [35], finance [17, 45], epidemiology [23, 26, 46] and computer science [40]; see also the overview article [29] and the monograph [9].

Mean field games constitute a class of dynamic, multi-player stochastic differential games with identical agents. The key characteristic of the mean field approach is that (i) the payoff and state dynamics of each agent depend on other agents’ decisions only through an aggregate statistic (typically, the aggregate distribution of states); and (ii) no individual agent’s actions can change the aggregate outcome. Thus, in solving an individual agent’s optimization problem, the feedback effect of his own actions on the aggregate outcome can be discarded, breaking the notorious vicious circle (“the optimal strategy depends on the aggregate outcome, which depends on the strategy, which depends ...”). This significantly facilitates the identification of rational expectations equilibria. A standard assumption that further simplifies the analysis is that randomness is idiosyncratic (equivalently, there is no common noise), i.e. that the random variables appearing in one agent’s optimization are independent of those in any other’s. As a result, all randomness is “averaged out” in the aggregation of individual decisions, and the equilibrium dynamics of the aggregate distribution are deterministic.

In the literature, mean field games are most often studied in settings with a continuous state space and deterministic or diffusive dynamics, i.e. stochastic differential equations (SDEs) driven by Brownian motion. The corresponding dynamic programming equations thus become parabolic partial differential equations, and the aggregate dynamics are represented by a flow of Borel probability measures; see, e.g., the monographs [4] and [9] and the references therein. Formally, the mean field game is typically formulated in terms of a controlled McKean-Vlasov SDE, where the coefficients depend on the current state and control and the distribution of the solution; intuitively, these McKean-Vlasov dynamics codify the dynamics that pertain to a representative agent. The mathematical link to N-player games is subsequently made through suitable propagation of chaos results in the mean field limit \(N\rightarrow \infty \); see, e.g., [14, 25, 28, 42, 43]. In this context, the analysis of McKean-Vlasov SDEs has also seen significant progress recently; see, e.g., [6, 8, 19, 48]. In the presence of common noise, i.e. sources of risk that affect all agents and do not average out in the mean field limit, the mathematical analysis becomes even more involved as the dynamics of the aggregate distribution become stochastic, leading to conditional McKean-Vlasov dynamics; see, e.g., [1, 12, 21, 51]. We refer to [10] for background and further references on continuous-state mean field games with common noise.

There is also a strand of literature on mean field games with finite state spaces, including [2, 15, 18, 24, 30, 31, 34, 49] as well as [9, §7.2]. In a recent article, [22] provide an extension of [31] to mean field interactions that occur not only through the agents’ states, but also through their controls. To the best of our knowledge, however, to date there has been no extension of these results to settings that include common noise. In the context of finite-state mean field games, we are only aware of two contributions that include common stochasticity (both via the master equation and with a different focus/setting than this paper): [5] analyze the master equation for finite-state mean field games with common noise, and [3] include a common continuous-time Gaussian noise in the aggregate distribution dynamics.

In this article, we set up a mathematical framework for finite-state mean field games with common noise. Our setup extends that of [31] and [15] by allowing for common noise events at fixed points in time. We provide a rigorous formulation of the underlying stochastic dynamics, and we establish a verification theorem for the optimal strategy and an aggregation theorem to determine the resulting aggregate distribution. This leads to a characterization of the mean field equilibrium in terms of a system of (random) forward-backward differential equations. The key insight is that, after conditioning on common noise configurations, we obtain classical piecewise dynamics subject to jump conditions at common noise times.

The remainder of this article is organized as follows: In Sect. 2 we set up the mathematical model, provide a probabilistic construction of the state dynamics, and formulate the agent’s optimization problem. In Sect. 3 we state the dynamic programming equation and establish a verification theorem for the agent’s optimization, given an ex ante aggregate distribution (Theorem 6). Section 4 provides the dynamics of the ex post distribution (Theorem 9) and, on that basis, a system of random forward-backward ODEs for the mean field equilibrium (Definition 10) as well as corresponding existence and uniqueness results (Theorems 13 and 16). In Sect. 5 we showcase our results in two benchmark applications: agricultural production and infection control. The Appendix provides the proofs of Theorems 13 and 16.

2 Mean Field Model

We first provide an informal description of the individual agents’ state dynamics, optimization problem, and the resulting mean field equilibrium. The agent’s state process \(X=\{X_t\}\) takes values in the finite set \({\mathbb {S}}\). Between common noise events, transitions from state i to state j occur with intensity \(Q^{ij}(t,W_t,M_t,\nu _t)\), where \(W_t\) represents the common noise events that have occurred up to time t; \(M_t\) the time-t aggregate distribution of agents; and \(\nu _t\) the agent’s control. In addition, upon the realization of a common noise event \(W_k\) at time \(T_k\), the state jumps from \(X_{T_k-}\) to \(X_{T_k}=J^{X_{T_k-}}(T_k,W_{T_k},M_{T_k-})\). With this, the agent aims to maximize

$$\begin{aligned} {\mathbb {E}}^\nu \Bigl [ \int _0^T \psi ^{X_t}(t,W_t,M_t,\nu _t)\mathrm {d}t + \Psi ^{X_T}(W_T,M_T) \Bigr ] \end{aligned}$$

where \(\psi \) and \(\Psi \) are suitable reward functions and the aggregate distribution process \(M=\{M_t\}\) is given by

$$\begin{aligned} M_t\triangleq \mu (t,W_t)\quad \text {for }t\in [0,T]. \end{aligned}$$

Here \(\mu \) represents the aggregate distribution of states as a function of the common noise factors. We obtain a rational expectations equilibrium by determining \(\mu \) such that the representative agent’s ex ante expectations equal the ex post aggregate distribution resulting from all agents’ optimal decisions, i.e.

$$\begin{aligned} {\mathbb {P}}^{{{\widehat{\nu }}}}(X_t\in \,\cdot \,\,|\,W_t) = {{\widehat{\mu }}}(t,W_t)\quad \text {for all }t\in [0,T], \end{aligned}$$

where \({{\widehat{\nu }}}\) and \({{\widehat{\mu }}}\) denote the equilibrium strategy and the equilibrium aggregate distribution. In the remainder of this section, we provide a rigorous mathematical formulation of this model.

2.1 Probabilistic Setting and Common Noise

Throughout, we fix a time horizon \(T>0\) and a finite set \({\mathbb {W}}\) and work on a probability space \((\Omega ,{\mathfrak {A}},{\mathbb {P}})\) that carries a finite sequence \(W_1,\ldots ,W_n\) of i.i.d. random variables that are uniformly distributed on \({\mathbb {W}}\). We refer to \(W_1,\dots ,W_n\) as common noise factors and to \({\mathbb {P}}\) as the reference probability. The common noise factor \(W_k\) is revealed at time \(T_k\), where

$$\begin{aligned} 0\triangleq T_0< T_1< T_2< \cdots< T_n < T_{n+1}\triangleq T. \end{aligned}$$

Both n and the common noise times \(T_0,T_1,\ldots ,T_{n+1}\) are fixed and deterministic. The piecewise constant filtration \({\mathfrak {G}}=\{{\mathfrak {G}}_t\}\) generated by common noise events is given by

$$\begin{aligned} {\mathfrak {G}}_t\triangleq \sigma \bigl ( W_k\, :\, k\in [1:n],\ T_k\le t \bigr )\vee {\mathfrak {N}}\quad \text {for }t\in [0,T] \end{aligned}$$

where \({\mathfrak {N}}\) denotes the set of \({\mathbb {P}}\)-null sets. For each configuration of common noise factors \(w\in {\mathbb {W}}^n\) we write

$$\begin{aligned} w_t\triangleq (w_1,\ldots ,w_k)\ \text {for }t\in [T_k,T_{k+1}\rangle ,\ k\in [0:n], \end{aligned}$$

where for \(0\le s\le t\le T\) we set \([s,t\rangle \triangleq [s,t)\) if \(t<T\) and \([s,T\rangle \triangleq [s,T]\). With this convention, \(W=\{W_t\}\) represents a piecewise constant, \({\mathfrak {G}}\)-adapted process.
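For illustration, suppose \(n=2\): then \(w_t\) is the empty vector for \(t\in [0,T_1)\), \(w_t=(w_1)\) for \(t\in [T_1,T_2)\) and \(w_t=(w_1,w_2)\) for \(t\in [T_2,T]\), so that \(W_t\) is constant on each of these three segments.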

Definition 1

A function \(f:\ [0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {R}}^m\) is non-anticipative if for all \(t\in [0,T]\)

$$\begin{aligned} f(t,w) = f(t,{{\bar{w}}})\quad \text {whenever }w,{{\bar{w}}}\in {\mathbb {W}}^n\ \text {are such that }w_t={{\bar{w}}}_t. \end{aligned}$$

Moreover, f is regular if \(f(\,\cdot \,,w)\) is absolutely continuous on \([T_k,T_{k+1}\rangle \) for all \(k\in [0:n]\). \(\square \)

With a slight abuse of notation, if \(f:\ [0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {R}}^m\) is non-anticipative, we write

$$\begin{aligned} f(t,w_t) \triangleq f(t,w)\quad \text {for }w\in {\mathbb {W}}^n,\ t\in [0,T]. \end{aligned}$$

Note that for f regular, the one-sided limits \(f(T_k-,w)\triangleq \lim _{t\uparrow T_k}f(t,w)\) exist for all \(k\in [1:n]\), \(w\in {\mathbb {W}}^n\).

2.2 Optimization Problem

The agent’s state and action spaces are given by

$$\begin{aligned} {\mathbb {S}}\triangleq [1:d]\qquad \text {and}\qquad {\mathbb {U}}\subseteq {\mathbb {R}}^k,\qquad \text {where }d,k\in {\mathbb {N}}\text { and }{\mathbb {U}}\ne \varnothing , \end{aligned}$$

and we identify the space of aggregate distributions on \({\mathbb {S}}\) with the space of probability vectors

$$\begin{aligned} {\mathbb {M}}\triangleq \Bigl \{ m\in [0,\infty )^{1\times d}\, :\, \sum _{i=1}^d m^i = 1 \Bigr \}. \end{aligned}$$

The coefficients in the state dynamics and payoff functional are bounded and Borel measurable functions

$$\begin{aligned} Q:\ [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {U}}&\rightarrow {\mathbb {R}}^{d\times d}&J:\ [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}&\rightarrow {\mathbb {S}}^d\\ \psi :\ [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {U}}&\rightarrow {\mathbb {R}}^d&\Psi :\ {\mathbb {W}}^n\times {\mathbb {M}}&\rightarrow {\mathbb {R}}^d \end{aligned}$$

such that \(Q(\,\cdot \,,\,\cdot \,,m,u)\), \(\psi (\,\cdot \,,\,\cdot \,,m,u)\) and \(J(\,\cdot \,,\,\cdot \,,m)\) are non-anticipative for all fixed \(m\in {\mathbb {M}}\) and \(u\in {\mathbb {U}}\); Q satisfies the intensity matrix conditions \(Q^{ij}(t,w,m,u)\ge 0\), \(i,j\in {\mathbb {S}}\), \(i\ne j\) and \(\sum _{j\in {\mathbb {S}}}Q^{ij}(t,w,m,u)=0\), \(i\in {\mathbb {S}}\), for \((t,w,m,u)\in [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {U}}\); and for each \(k\in [1:n]\) the function

$$\begin{aligned} \kappa _k:\ {\mathbb {W}}^k\times {\mathbb {M}}\rightarrow [0,1],\qquad (w_k,w_1,\ldots ,w_{k-1},m)\mapsto \ \kappa _k(w_k|w_1,\ldots ,w_{k-1},m), \end{aligned}$$

is Borel measurable with \(\sum _{{{\bar{w}}}_k\in {\mathbb {W}}}\kappa _k({{\bar{w}}}_k|w_1,\ldots ,w_{k-1},m)=1\) for all \(w_1,\ldots ,w_{k-1}\in {\mathbb {W}}\) and \(m\in {\mathbb {M}}\).

We further suppose that \((\Omega ,{\mathfrak {A}},{\mathbb {P}})\) supports, for each \(i,j\in {\mathbb {S}}\), \(i\ne j\), a standard (i.e., unit intensity) Poisson process \(N^{ij}=\{N^{ij}_t\}\) and an \({\mathbb {S}}\)-valued random variable \(X_0\) such that

$$\begin{aligned} X_0\qquad \text {and}\qquad N^{ij},\ i,j\in {\mathbb {S}},\ i\ne j\qquad \text {and}\qquad W_1,\ldots ,W_n\qquad \text {are independent.} \end{aligned}$$

The corresponding full filtration \({\mathfrak {F}}=\{{\mathfrak {F}}_t\}\) is given by

$$\begin{aligned} {\mathfrak {F}}_t\triangleq \sigma \bigl ( X_0,\, W_s,\, N^{ij}_s\, :\, s\in [0,t];\ i,j\in {\mathbb {S}},\ i\ne j \bigr )\vee {\mathfrak {N}}\quad \text {for }t\in [0,T]. \end{aligned}$$

Note that \({\mathfrak {G}}_t\subseteq {\mathfrak {F}}_t\) for all \(t\in [0,T]\), that both \({\mathfrak {G}}\) and \({\mathfrak {F}}\) satisfy the usual conditions, and that \(N^{ij}\) is a standard \(({\mathfrak {F}},{\mathbb {P}})\)-Poisson process for \(i,j\in {\mathbb {S}}\), \(i\ne j\). Given a regular, non-anticipative function \(\mu \), the \({\mathfrak {G}}\)-adapted, \({\mathbb {M}}\)-valued ex ante aggregate distribution \(M=\{M_t\}\) is given by

$$\begin{aligned} M_t\triangleq \mu (t,W_t)\quad \text {for }t\in [0,T] \end{aligned}$$

and the agent’s optimization problem reads

$$\begin{aligned} \sup _{\nu \in {\mathcal {A}}}\ {\mathbb {E}}^\nu \Bigl [ \int _0^T \psi ^{X_t}(t,W_t,M_t,\nu _t)\mathrm {d}t + \Psi ^{X_T}(W_T,M_T) \Bigr ] \end{aligned}$$
(P\({}_\mu \))

where the class of admissible strategies for (P\({}_\mu \)) is given by the set of closed-loop controls

$$\begin{aligned} {\mathcal {A}}\triangleq \bigl \{ \nu :\ [0,T]\times {\mathbb {S}}^{[0,T]}\times {\mathbb {W}}^n\rightarrow {\mathbb {U}}\ :\,&\nu \ \text {is Borel measurable and }\\&\quad \nu (\,\cdot \,,x,\,\cdot \,)\ \text {is non-anticipative for all }x\in {\mathbb {S}}^{[0,T]} \bigr \}. \end{aligned}$$

Note that \({\mathcal {A}}\) subsumes the class of Markovian feedback controls considered in, e.g., [31] or [34], and that each \(\nu \in {\mathcal {A}}\) canonically induces an \({\mathfrak {F}}\)-adapted \({\mathbb {U}}\)-valued process via

$$\begin{aligned} \nu _t\triangleq \nu \bigl (t,X_{(\,\cdot \,\wedge t)-},W_t\bigr )\quad \text {for }t\in [0,T]. \end{aligned}$$

\({\mathbb {E}}^\nu [\,\cdot \,]\) denotes the expectation operator with respect to the probability measure \({\mathbb {P}}^\nu \) given by

$$\begin{aligned} \frac{\mathrm {d}{\mathbb {P}}^\nu }{\mathrm {d}{\mathbb {P}}}&= \prod \limits _{\begin{array}{c} i,j\in {\mathbb {S}},\\ i\ne j \end{array}}\left( \exp \left\{ \int _0^T \bigl (1-Q^{ij}(t,W_t,M_t,\nu _t)\bigr )\mathrm {d}t\right\} \cdot \, \prod \limits _{\begin{array}{c} t\in (0,T],\\ \Delta N^{ij}_t\ne 0 \end{array}} \, Q^{ij}(t,W_t,M_t,\nu _t)\right) \nonumber \\&\quad \times \ |{\mathbb {W}}|^n\cdot \prod _{k=1}^n\kappa _k\bigl (W_k|W_1,\ldots ,W_{k-1},M_{T_k-}\bigr ); \end{aligned}$$
(1)

and the agent’s state process X is given by

$$\begin{aligned} \mathrm {d}X_t = \sum \limits _{\begin{array}{c} i,j\in {\mathbb {S}},\\ i\ne j \end{array}} \mathbb {1}_{\{X_{t-}=i\}} (j-i) \mathrm {d}N^{ij}_t\quad \text {for }t\in [T_k,T_{k+1}\rangle ,\ k\in [0:n], \end{aligned}$$
(2)

subject to the jump conditions

$$\begin{aligned} X_{T_k}=J^{X_{T_k-}}\bigl (T_k,W_{T_k},M_{T_k-}\bigr )\ \text {for\ } k\in [1:n]. \end{aligned}$$
(3)

Here \(N^{ij}\) triggers transitions from state i to state j, and \({\mathbb {P}}^\nu \) is defined in such a way that \(N^{ij}\) has \({\mathbb {P}}^\nu \)-intensity \(Q^{ij}(t,W_t,M_t,\nu _t)\); see Lemma 2 below. In summary, in order to formulate a mean field model within the above setting, it suffices to specify the following (a toy instance is given after the list):

  • the agent’s state space \({\mathbb {S}}\), action space \({\mathbb {U}}\) and the common noise space \({\mathbb {W}}\),

  • the transition intensities \(Q(t,w,m,u)\), transition kernels \(\kappa _k(w_k|w_1,\ldots ,w_{k-1},m)\) and common noise jumps \(J(t,w,m)\), and finally

  • the reward functions \(\psi (t,w,m,u)\) and \(\Psi (w,m)\).
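For illustration (a toy specification chosen here purely as an example, not one of the benchmark applications of Sect. 5), one could take \(d=2\), \({\mathbb {U}}=[0,1]\), \({\mathbb {W}}=\{\uparrow ,\downarrow \}\) and \(n=1\), together with

$$\begin{aligned} Q(t,w,m,u)\triangleq \begin{pmatrix} -u & u \\ m^1 & -m^1 \end{pmatrix},\qquad \kappa _1(w_1|m)\triangleq \tfrac{1}{2},\qquad J(T_1,(\uparrow ),m)\triangleq (1,2),\qquad J(T_1,(\downarrow ),m)\triangleq (1,1), \end{aligned}$$

and the rewards \(\psi ^i(t,w,m,u)\triangleq -\tfrac{1}{2}u^2\) and \(\Psi (w,m)\triangleq (0,1)\): each agent controls the rate at which it leaves state 1, returns to state 1 at a rate equal to the fraction \(m^1\) of agents currently in state 1, and is reset to state 1 if the adverse common noise event \(\downarrow \) is realized at time \(T_1\).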

2.3 State Dynamics

In what follows, we show that the preceding construction implies the dynamics described informally above.

Lemma 2

(\({\mathbb {P}}^\nu \)-dynamics) For each admissible strategy \(\nu \in {\mathcal {A}}\), \({\mathbb {P}}^\nu \) is a well-defined probability measure on \((\Omega ,{\mathfrak {A}})\), absolutely continuous with respect to \({\mathbb {P}}\), and satisfies

$$\begin{aligned} {\mathbb {P}}^\nu = {\mathbb {P}}\quad \text {on }\sigma (X_0). \end{aligned}$$

Moreover, \(N^{ij}\) is a counting process with \(({\mathfrak {F}},{\mathbb {P}}^\nu )\)-intensity \(\lambda ^{ij}=\{\lambda ^{ij}_t\}\), where

$$\begin{aligned} \lambda ^{ij}_t \triangleq Q^{ij}\left( t,W_t,M_t,\nu _t\right) \quad \text {for }t\in [0,T]\ \text {and }i,j\in {\mathbb {S}},\ i\ne j. \end{aligned}$$

Finally, for all \(k\in [1:n]\) we have

$$\begin{aligned} {\mathbb {P}}^\nu \bigl (W_k=w_k|{\mathfrak {G}}_{T_k-}\bigr ) = \kappa _k(w_k|W_1,\ldots ,W_{k-1},M_{T_k-})\quad \text {for all }w_1,\ldots ,w_k\in {\mathbb {W}}\end{aligned}$$

where \({\mathfrak {G}}\) denotes the common noise filtration and, in particular,

$$\begin{aligned} {\mathbb {P}}^{\nu _1} = {\mathbb {P}}^{\nu _2}\quad \text {on }{\mathfrak {G}}_T\quad \text {for all admissible strategies }\nu _1,\nu _2\in {\mathcal {A}}. \end{aligned}$$

Proof

We fix \(\nu \in {\mathcal {A}}\) and split the proof into four steps.

Step 1: \({\mathbb {P}}^\nu \) is well-defined by (1). Since \(N^{ij}\) is a standard Poisson process under \({\mathbb {P}}\), the compensated process \({\bar{N}}^{ij}_t\triangleq N^{ij}_t-t\), \(t\ge 0\), is an \(({\mathfrak {F}},{\mathbb {P}})\)-martingale for all \(i,j\in {\mathbb {S}}\), \(i\ne j\). We define \(\theta ^\nu =\{\theta ^\nu _t\}\) via

$$\begin{aligned} \theta ^\nu _t \triangleq \sum \limits _{\begin{array}{c} i,j\in {\mathbb {S}},\\ i\ne j \end{array}}\int _0^t\bigl (Q^{ij}\bigl (s,W_s,\mu (s,W_s),\nu _s\bigr )-1\bigr )\mathrm {d}{\bar{N}}^{ij}_s,\quad t\in [0,T], \end{aligned}$$

and observe that the Doléans-Dade exponential \({\mathcal {E}}[\theta ^\nu ]\) is a local \(({\mathfrak {F}},{\mathbb {P}})\)-martingale with

$$\begin{aligned} {\mathcal {E}}[\theta ^\nu ]_t = \prod \limits _{\begin{array}{c} i,j\in {\mathbb {S}},\\ i\ne j \end{array}}\left( \exp \left\{ \int _0^t\bigl (1-Q^{ij}\bigl (s,W_s,\mu (s,W_s),\nu _s\bigr )\bigr )\mathrm {d}s\right\} \cdot \prod \limits _{\begin{array}{c} s\in (0,t],\\ \Delta N^{ij}_s\ne 0 \end{array}}Q^{ij}(s,W_s,\mu (s,W_s),\nu _s)\right) \end{aligned}$$
(4)

for \(t\in [0,T]\). Next, we define \(\vartheta =\{\vartheta _t\}\) via

$$\begin{aligned} \vartheta _t\triangleq \sum _{\begin{array}{c} k\in [1:n],\\ T_k\le t \end{array}}\Bigl (|{\mathbb {W}}|\cdot \kappa _k\bigl (W_k|W_1,\ldots ,W_{k-1},\mu (T_k-,W_{T_k-})\bigr )-1\Bigr ),\quad t\in [0,T], \end{aligned}$$

and note that \(\vartheta \) is an \(({\mathfrak {F}},{\mathbb {P}})\)-martingale. Indeed, for each \(k\in [0:n]\) we have \(\vartheta _t=\vartheta _{T_k}\) for \(t\in [T_k,T_{k+1}\rangle \) and, using that \(W_k\) is independent of \({\mathfrak {F}}_{T_k-}\) and uniformly distributed on \({\mathbb {W}}\) under \({\mathbb {P}}\), it follows that

$$\begin{aligned} {\mathbb {E}}\bigl [\vartheta _{T_k}|{\mathfrak {F}}_{T_k-}\bigr ]&= \vartheta _{T_k-}+{\mathbb {E}}\bigl [|{\mathbb {W}}|\cdot \kappa _k\bigl (W_k|W_1,\ldots ,W_{k-1},\mu (T_k-,W_{T_k-})\bigr )-1\big |{\mathfrak {F}}_{T_k-}\bigr ]\\&= \vartheta _{T_k-} - 1 + |{\mathbb {W}}|\cdot \sum _{w_k\in {\mathbb {W}}}{\mathbb {P}}\bigl (W_k=w_k|W_1,\ldots ,W_{k-1},\mu (T_k-,W_{T_k-})\bigr )\\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \times \kappa _k\bigl (w_k|W_1,\ldots ,W_{k-1},\mu (T_k-,W_{T_k-})\bigr )\\&= \vartheta _{T_k-} - 1 + |{\mathbb {W}}|\cdot \sum _{w_k\in {\mathbb {W}}}\frac{1}{|{\mathbb {W}}|}\cdot \kappa _k\bigl (w_k|W_1,\ldots ,W_{k-1},\mu (T_k-,W_{T_k-})\bigr ) = \vartheta _{T_k-}. \end{aligned}$$

Hence the Doléans-Dade exponential \({\mathcal {E}}[\vartheta ]\) is a local \(({\mathfrak {F}},{\mathbb {P}})\)-martingale, and we have

$$\begin{aligned} {\mathcal {E}}[\vartheta ]_t = \prod _{s\in (0,t]}(1+\Delta \vartheta _s) = \prod _{\begin{array}{c} k\in [1:n],\\ T_k\le t \end{array}} \Bigl (|{\mathbb {W}}|\cdot \kappa _k\bigl (W_k|W_1,\ldots ,W_{k-1},\mu (T_k-,W_{T_k-})\bigr )\Bigr )\nonumber \\ \end{aligned}$$
(5)

for \(t\in [0,T]\). Since \(\Delta N^{ij}_{T_k}=0\) for all \(i,j\in {\mathbb {S}}\), \(i\not =j\), and \(k\in [1:n]\) a.s., we have \([\theta ^\nu ,\vartheta ]=0\), and thus the process \(Z^\nu \triangleq {\mathcal {E}}[\theta ^\nu +\vartheta ]={\mathcal {E}}[\theta ^\nu ]\cdot {\mathcal {E}}[\vartheta ]\), i.e.

$$\begin{aligned} Z^\nu _t&= \prod \limits _{\begin{array}{c} i,j\in {\mathbb {S}},\\ i\ne j \end{array}}\left( \exp \left\{ \int _0^t\bigl (1-Q^{ij}\bigl (s,W_s,\mu (s,W_s),\nu _s\bigr )\bigr )\mathrm {d}s\right\} \cdot \prod \limits _{\begin{array}{c} s\in (0,t],\\ \Delta N^{ij}_s\ne 0 \end{array}}Q^{ij}(s,W_s,\mu (s,W_s),\nu _s)\right) \nonumber \\&\quad \times \prod _{\begin{array}{c} k\in [1:n],\\ T_k\le t \end{array}} \Bigl (|{\mathbb {W}}|\cdot \kappa _k\bigl (W_k|W_1,\ldots ,W_{k-1},\mu (T_k-,W_{T_k-})\bigr )\Bigr ) \end{aligned}$$
(6)

is a local \(({\mathfrak {F}},{\mathbb {P}})\)-martingale. Since

$$\begin{aligned} \sup _{t\in [0,T]}|{\mathcal {E}}[\theta ^\nu ]_t| \le \mathrm {e}^{d^2 T}\cdot \ell ^{Y} \end{aligned}$$
(7)

where \(\ell \triangleq \max _{i,j\in {\mathbb {S}},\,i\ne j}\Vert Q^{ij}\Vert _\infty \) and \(Y\triangleq \sum _{i,j\in {\mathbb {S}},\,i\ne j}N_T^{ij}\sim _{{\mathbb {P}}}\mathsf {Poisson}(d(d-1)T)\) and

$$\begin{aligned} \sup _{t\in [0,T]}|{\mathcal {E}}[\vartheta ]_t| \le |{\mathbb {W}}|^n \end{aligned}$$
(8)

it follows that \(\sup _{t\in [0,T]}|Z^\nu _t|\) is \({\mathbb {P}}\)-integrable, so \(Z^\nu \) is in fact an \(({\mathfrak {F}},{\mathbb {P}})\)-martingale. Since \(Z^\nu \) is non-negative with \(Z^\nu _0=1\) by construction, we conclude that \({\mathbb {P}}^\nu \) is a well-defined probability measure on \({\mathfrak {A}}\), absolutely continuous with respect to \({\mathbb {P}}\), with density process

$$\begin{aligned} \left. \frac{\mathrm {d}{\mathbb {P}}^\nu }{\mathrm {d}{\mathbb {P}}}\right| _{{\mathfrak {F}}_t} = Z^\nu _t,\quad t\in [0,T]. \end{aligned}$$

Step 2: \({\mathbb {P}}^{\nu }\)-intensity of \(N^{ij}\). Let \(i,j\in {\mathbb {S}}\) with \(i\ne j\). Since \({\mathbb {P}}^\nu \ll {\mathbb {P}}\) it is clear that \(N^{ij}\) is a \({\mathbb {P}}^\nu \)-counting process, so it suffices to show that the process given by

$$\begin{aligned} N^{ij}_t - \int _0^t \lambda ^{ij}_s\,\mathrm {d}s,\quad t\in [0,T], \end{aligned}$$
(9)

is a local \(({\mathfrak {F}},{\mathbb {P}}^\nu )\)-martingale. To show this, by Step 1 it suffices to demonstrate that \(Z^\nu \cdot \bigl (N^{ij}-\int _0^{\,\cdot \,}\lambda ^{ij}_s\mathrm {d}s\bigr )\) is a local \(({\mathfrak {F}},{\mathbb {P}})\)-martingale. Noting that

  • \([N^{k\ell },N^{ij}]=\sum \limits _{s\in (0,\,\cdot \,]}\Delta N^{k\ell }_s\cdot \Delta N^{ij}_s=0\) whenever \(k,\ell \in {\mathbb {S}}\) and \((k,\ell )\ne (i,j)\),

  • \(\mathrm {d}Z^\nu _t=Z^\nu _{t-}\mathrm {d}\theta ^\nu _t + Z^\nu _{t-}\mathrm {d}\vartheta _t = \sum \limits _{\begin{array}{c} k,\ell \in {\mathbb {S}}, k\ne \ell \end{array}}Z^\nu _{t-}\left( Q^{k\ell }(t,W_t,\mu (t,W_t),\nu _t)-1\right) \mathrm {d}{\bar{N}}^{k\ell }_t + Z^\nu _{t-}\mathrm {d}\vartheta _t\),

  • \(\mathrm {d}\bigl (N^{ij}_t-\int _0^t\lambda ^{ij}_s\mathrm {d}s\bigr ) = \mathrm {d}{\bar{N}}^{ij}_t + \bigl (1-\lambda ^{ij}_t\bigr )\mathrm {d}t\) with \(\lambda ^{ij}_t=Q^{ij}\bigl (t,W_t,\mu (t,W_t),\nu _t\bigr )\),

and using integration by parts, the local martingale property follows since

$$\begin{aligned} \mathrm {d}\Bigl (Z^\nu _t\Bigl (N^{ij}_t-\int _0^t\lambda ^{ij}_s\mathrm {d}s\Bigr )\Bigr )&= Z^\nu _{t-}\,\mathrm {d}{\bar{N}}^{ij}_t + Z^\nu _{t-}\bigl (1-\lambda ^{ij}_t\bigr )\mathrm {d}t + \Bigl (N^{ij}_{t-}-\int _0^t\lambda ^{ij}_s\mathrm {d}s\Bigr )\mathrm {d}Z^\nu _t + Z^\nu _{t-}\bigl (\lambda ^{ij}_t-1\bigr )\mathrm {d}N^{ij}_t\\&= Z^\nu _{t-}\,\lambda ^{ij}_t\,\mathrm {d}{\bar{N}}^{ij}_t + \Bigl (N^{ij}_{t-}-\int _0^t\lambda ^{ij}_s\mathrm {d}s\Bigr )\mathrm {d}Z^\nu _t. \end{aligned}$$

Step 3: \({\mathbb {P}}^\nu ={\mathbb {P}}\) on \(\sigma (X_0)\). For any function \(g:\ {\mathbb {S}}\rightarrow {\mathbb {R}}\) we have

$$\begin{aligned} {\mathbb {E}}^\nu [g(X_0)] = {\mathbb {E}}[g(X_0)\cdot Z_T^\nu ] = {\mathbb {E}}\bigl [g(X_0)\cdot {\mathbb {E}}[Z_T^\nu |{\mathfrak {F}}_0]\bigr ] = {\mathbb {E}}[g(X_0)\cdot Z^\nu _0] = {\mathbb {E}}[g(X_0)] \end{aligned}$$

by the \(({\mathfrak {F}},{\mathbb {P}})\)-martingale property of \(Z^\nu \).

Step 4: Distribution of \(W_k\) under \({\mathbb {P}}^\nu \). Let \(k\in [1:n]\) and \(w_1,\ldots ,w_k\in {\mathbb {W}}\). Since \({\mathcal {E}}[\theta ^\nu ]_{T_k}={\mathcal {E}}[\theta ^\nu ]_{T_k-}\) a.s. and \(W_k\) is uniformly distributed on \({\mathbb {W}}\) and independent of \({\mathfrak {F}}_{T_k-}\) under \({\mathbb {P}}\), iterated conditioning yields

$$\begin{aligned}&{\mathbb {P}}^\nu \bigl (W_1=w_1,\ldots ,W_k=w_k\bigr ) = {\mathbb {E}}\bigl [Z^\nu _{T_k} \cdot \mathbb {1}_{\{W_k=w_k\}} \cdot \mathbb {1}_{\{W_1=w_1,\ldots ,W_{k-1}=w_{k-1}\}}\bigr ]\\&= {\mathbb {E}}\bigl [ Z^\nu _{T_k-} \cdot |{\mathbb {W}}| \cdot \kappa _k(W_k|W_1,\ldots ,W_{k-1},\mu (T_k-,W_{T_k-})) \cdot \mathbb {1}_{\{W_k=w_k\}} \cdot \mathbb {1}_{\{W_1=w_1,\ldots ,W_{k-1}=w_{k-1}\}}\bigr ]\\&= |{\mathbb {W}}| \cdot \kappa _k(w_k|w_1,\ldots ,w_{k-1},\mu (T_k-,w_{T_k-})) \cdot {\mathbb {E}}\bigl [ Z^\nu _{T_k-} \cdot \mathbb {1}_{\{W_1=w_1,\ldots ,W_{k-1}=w_{k-1}\}} \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \cdot {\mathbb {P}}(W_k=w_k|{\mathfrak {F}}_{T_k-})\bigr ]\\&= \kappa _k(w_k|w_1,\ldots ,w_{k-1},\mu (T_k-,w_{T_k-})) \cdot {\mathbb {P}}^\nu \bigl (W_1=w_1,\ldots ,W_{k-1}=w_{k-1}\bigr ). \end{aligned}$$

Thus we have \({\mathbb {P}}^\nu (W_k=w_k|{\mathfrak {G}}_{T_k-}) = \kappa _k(w_k|W_1,\ldots ,W_{k-1},M_{T_k-})\) and the proof is complete. \(\square \)

Lemma 2 implies in particular that \({\mathbb {P}}^\nu (\Delta N^{ij}_t\ne 0)=0\) for every \(t\in [0,T]\), so as a consequence we have

$$\begin{aligned} \Delta X_t=0\quad {\mathbb {P}}^\nu \text {-a.s.\ for all }t\in [0,T]\setminus \{T_1,\ldots ,T_n\}. \end{aligned}$$

Moreover, since \({\mathbb {P}}^{\nu _1}={\mathbb {P}}^{\nu _2}\) on \({\mathfrak {G}}_T\) for all admissible controls \(\nu _1,\nu _2\in {\mathcal {A}}\) and \(M_t=\mu (t,W_t)\) for \(t\in [0,T]\), the agent’s ex ante beliefs concerning the common noise factors are the same, irrespective of his control.

3 Solution of the Optimization Problem

In the following, we solve the agent’s maximization problem (P\({}_\mu \)) using the associated dynamic programming equation (DPE). This is the same methodology as in [31] and [15]; see [22] for an alternative approach (to extended mean field games, but without common noise) based on backward SDEs.

The DPE for the value function of the agent’s optimization problem (P\({}_\mu \)) reads

$$\begin{aligned} 0 = \sup \limits _{u\in {\mathbb {U}}}\, \biggl \{ \frac{\partial v^i}{\partial t}(t,w) + \psi ^i\bigl (t,w,\mu (t,w),u\bigr ) + Q^{i\cdot }\bigl (t,w,\mu (t,w),u\bigr )\cdot v(t,w) \biggr \} \end{aligned}$$

for \(i\in {\mathbb {S}}\), subject to suitable consistency conditions for \(t=T_k\), \(k\in [1:n]\), and the terminal condition

$$\begin{aligned} v(T,w) = \Psi \bigl (w,\mu (T,w)\bigr )\quad \text {for all }w\in {\mathbb {W}}^n. \end{aligned}$$

Assumption 3

There exists a Borel measurable function \(h:\ [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {R}}^d\rightarrow {\mathbb {U}}^d\) such that for every \(i\in {\mathbb {S}}\) and all \((t,w,m,v)\in [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {R}}^d\) we have

$$\begin{aligned} h^i(t,w,m,v) \in \mathop {{{\,\mathrm{arg\,max}\,}}}\limits _{u\in {\mathbb {U}}} \bigl \{ \psi ^i(t,w,m,u) + Q^{i\cdot }(t,w,m,u)\cdot v \bigr \}. \end{aligned}$$
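For instance, in a two-state model with \({\mathbb {U}}=[0,{\bar{u}}]\) for some \({\bar{u}}>0\), \(\psi ^1(t,w,m,u)=-\tfrac{1}{2}u^2\) and \(Q^{12}(t,w,m,u)=u=-Q^{11}(t,w,m,u)\), the expression to be maximized in state 1 reads

$$\begin{aligned} \psi ^1(t,w,m,u) + Q^{1\cdot }(t,w,m,u)\cdot v = -\tfrac{1}{2}u^2 + u\bigl (v^2-v^1\bigr ), \end{aligned}$$

so that \(h^1(t,w,m,v)=\min \bigl \{{\bar{u}},\max \{0,v^2-v^1\}\bigr \}\) is an explicit Borel measurable maximizer.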

Assumption 3 is satisfied e.g. if \({\mathbb {U}}\) is compact and Q and \(\psi \) are continuous with respect to \(u\in {\mathbb {U}}\). Note that, since \(\psi ^i(\,\cdot \,,\,\cdot \,,m,u)\) and \(Q^{i\cdot }(\,\cdot \,,\,\cdot \,,m,u)\) are non-anticipative for \(m\in {\mathbb {M}}\), \(u\in {\mathbb {U}}\), we can assume without loss of generality that \(h(\,\cdot \,,\,\cdot \,,m,v)\) is non-anticipative for \(m\in {\mathbb {M}}\), \(v\in {\mathbb {R}}^d\). With this, we define

$$\begin{aligned}&{\widehat{Q}}:\ [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {R}}^d\rightarrow {\mathbb {R}}^{d\times d}, \quad&{\widehat{Q}}^{ij}(t,w,m,v)\triangleq Q^{ij}\bigl ( t, w, m, h^i(t,w,m,v) \bigr ),\\&{\widehat{\psi }}:\ [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {R}}^d\rightarrow {\mathbb {R}}^d, \quad&{\widehat{\psi }}^i(t,w,m,v)\triangleq \psi ^i\bigl ( t, w, m, h^i(t,w,m,v) \bigr ) \end{aligned}$$

and thus obtain the following reduced-form DPE, which we use in what follows:

Definition 4

Let \(\mu :\ [0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {M}}\) be regular and non-anticipative. A function \(v:[0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {R}}^d\) is called a solution of (DP\({}_\mu \)) subject to (CC\({}_\mu \)), (TC\({}_\mu \)) if v is non-anticipative and satisfies the ordinary differential equation (ODE)

$$\begin{aligned} 0 = {\dot{v}}^i(t,w) + {\widehat{\psi }}^i\bigl (t,w,\mu (t,w),v(t,w)\bigr ) + {\widehat{Q}}^{i\cdot }\bigl (t,w,\mu (t,w),v(t,w)\bigr )\cdot v(t,w),\quad i\in {\mathbb {S}}, \end{aligned}$$
(DP\({}_\mu \))

for \(t\in [T_k,T_{k+1}\rangle \), \(k\in [0:n]\), subject to the consistency and terminal conditions

$$\begin{aligned} v(T_k-,w) = \Psi _k\bigl (w,\mu (T_k-,w),v(T_k,\,\cdot \,)\bigr ) \end{aligned}$$
(CC\({}_\mu \))
$$\begin{aligned} v(T,w) = \Psi \bigl (w,\mu (T,w)\bigr ) \end{aligned}$$
(TC\({}_\mu \))

for \(k\in [1:n]\) and all \(w\in {\mathbb {W}}^n\). Here, for \(k\in [1:n]\), the jump operator \(\Psi _k\) is defined via

$$\begin{aligned} \Psi ^i_k(w,m,{{\bar{v}}}) \triangleq \sum _{{{\bar{w}}}_k\in {\mathbb {W}}} \kappa _k\bigl ({{\bar{w}}}_k|w_1,\ldots ,w_{k-1},m\bigr ) \cdot {{\bar{v}}}^{J^i(T_k,(w_{-k},{{\bar{w}}}_k),m)}(w_{-k},{{\bar{w}}}_k),\ i\in {\mathbb {S}},\nonumber \\ \end{aligned}$$
(10)

where \({{\bar{v}}}:\, {\mathbb {W}}^n\rightarrow {\mathbb {R}}^d\) and \((w_{-k},{{\bar{w}}}_k)\triangleq (w_1,\ldots ,w_{k-1},{{\bar{w}}}_k,w_{k+1},\ldots ,w_n)\) for \({{\bar{w}}}_k\in {\mathbb {W}}\), \(w\in {\mathbb {W}}^n\). \(\square \)
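For example, in the simplest case \(n=1\) with \({\mathbb {W}}=\{\uparrow ,\downarrow \}\), the jump operator (10) reduces to

$$\begin{aligned} \Psi ^i_1(w,m,{{\bar{v}}}) = \kappa _1(\uparrow |m)\cdot {{\bar{v}}}^{J^i(T_1,(\uparrow ),m)}(\uparrow ) + \kappa _1(\downarrow |m)\cdot {{\bar{v}}}^{J^i(T_1,(\downarrow ),m)}(\downarrow ), \end{aligned}$$

i.e. the \(\kappa _1\)-weighted average of the post-jump values over the two possible common noise scenarios.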

Observe that (DP\({}_\mu \)) represents a system of (random) ODEs, coupled via \(w\in {\mathbb {W}}^n\). The ODEs run backward in time on each segment \([T_{k},T_{k+1}\rangle \times {\mathbb {W}}^n\), \(k\in [0:n]\), and their terminal conditions for \(t\uparrow T_{k+1}\) are specified by (TC\({}_\mu \)) for \(k=n\) and by (CC\({}_\mu \)) for \(k<n\). Note that for \(t\in [T_k,T_{k+1}\rangle \) the relevant common noise factors \(W_1,\ldots ,W_k\) are known.

Remark 5

While the significance of the DPE (DP\({}_\mu \)) and the terminal condition (TC\({}_\mu \)) is clear, the consistency conditions (CC\({}_\mu \)) warrant a brief comment: For \(i\in {\mathbb {S}}\), \(k\in [1:n]\) and \(w\in {\mathbb {W}}^n\) the state process jumps from state i to state \(j\triangleq J^i(T_k,(w_{-k},W_k),\mu (T_k-,w_{T_k-}))\) on \(\{X_{T_k-}=i\}\cap \{W_{T_k-}=w_{T_k-}\}\) when the common noise factor \(W_k\) is revealed at time \(T_k\). Averaging the post-jump values over the possible realizations of \(W_k\) with the weights \(\kappa _k\), as in (10), thus yields the pre-jump value \(v^i(T_k-,w)\) prescribed by (CC\({}_\mu \)).\(\square \)

We next link the solution of the DPE to the underlying stochastic control problem.

Theorem 6

(Verification) Suppose \(\mu :\ [0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {M}}\) is regular and non-anticipative and v is a solution of (DP\({}_\mu \)) subject to (CC\({}_\mu \)) and (TC\({}_\mu \)). Then v is the agent’s value function for problem (P\({}_\mu \)), i.e.

$$\begin{aligned} \sum _{i\in {\mathbb {S}}} {\mathbb {P}}(X_0=i)v^i(0) = \sup _{\nu \in {\mathcal {A}}}{\mathbb {E}}^\nu \Bigl [ \int _0^T \psi ^{X_t}(t,W_t,M_t,\nu _t)\mathrm {d}t + \Psi ^{X_T}(W_T,M_T) \Bigr ], \end{aligned}$$

and an optimal control is given by \({\widehat{\nu }}\in {\mathcal {A}}\) with

$$\begin{aligned} {\widehat{\nu }}\left( t,X_{(\,\cdot \,\wedge t)-},W_t\right) = h^{X_{t-}}\bigl (t,W_t,\mu (t,W_t),v(t,W_t)\bigr )\quad \text {for }t\in [0,T]. \end{aligned}$$

Proof

Let \(\nu \in {\mathcal {A}}\) be an admissible strategy. Until further notice we fix \(k\in [0:n]\).

Step 1: Dynamics on \([T_k,T_{k+1}\rangle \). From Itô’s lemma, applicable due to regularity of v, we obtain

$$\begin{aligned} v^{X_{T_{k+1}-}}\bigl (T_{k+1}-,W_{T_{k+1}-}\bigr )&= v^{X_{T_k}}\bigl (T_k,W_{T_k}\bigr ) + \sum \limits _{\begin{array}{c} i,j\in {\mathbb {S}},\\ i\ne j \end{array}}\bigl (M^{ij}_{T_{k+1}-}-M^{ij}_{T_k}\bigr )\nonumber \\&\quad +\int _{T_k}^{T_{k+1}}\sum _{i=1}^d\mathbb {1}_{\left\{ X_s=i\right\} }\left( {\dot{v}}^i(s,W_s)+Q^{i\cdot }\left( s,W_s,\mu (s,W_s),\nu _s\right) \cdot v(s,W_s)\right) \mathrm {d}s. \end{aligned}$$
(11)

Step 2: Jump dynamics at \(T_k\). We recall from Lemma 2 that

$$\begin{aligned} {\mathbb {P}}^\nu (W_k={{\bar{w}}}_k|X_{T_k-},W_1,\ldots ,W_{k-1})&= {\mathbb {P}}^\nu (W_k={{\bar{w}}}_k|W_1,\ldots ,W_{k-1})\\&= \kappa _k\bigl ({{\bar{w}}}_k|W_1,\ldots ,W_{k-1},\mu (T_k-,W_{T_k-})\bigr ). \end{aligned}$$

In view of the jump dynamics (3) and the consistency condition (CC\({}_\mu \)), we thus obtain

$$\begin{aligned}&{\mathbb {E}}^{\nu }\bigl [v^{X_{T_k}}\bigl (T_k,W_{T_k}\bigr )\big |\sigma \bigl (X_{T_k-},W_{T_k-}\bigr )\bigr ]\nonumber \\&\quad = {\mathbb {E}}^{\nu }\bigl [v^{J^{X_{T_k-}}(T_k,(W_{T_k-},W_k),\mu (T_k-,W_{T_k-}))}\bigl (T_k,(W_{T_k-},W_k)\bigr )\big |X_{T_k-},W_{T_k-}\bigr ]\nonumber \\&\quad = \sum _{{{\bar{w}}}_k\in {\mathbb {W}}} \kappa _k\bigl ({{\bar{w}}}_k|W_{T_k-},\mu (T_k-,W_{T_k-})\bigr )v^{J^{X_{T_k-}}(T_k,(W_{T_k-},{{\bar{w}}}_k),\mu (T_k-,W_{T_k-}))}\bigl (T_k,(W_{T_k-},{{\bar{w}}}_k)\bigr )\nonumber \\&\quad = \Psi ^{X_{T_k-}}_k\bigl (W_{T_k-}, \mu (T_k-,W_{T_k-}), v(T_k,\,\cdot \,)\bigr ) = v^{X_{T_k-}}(T_k-,W_{T_k-}). \end{aligned}$$
(12)

Step 3: Optimality. Combining (11) and (12) for \(k\in [1:n]\) and using (TC\({}_\mu \)) yields

(13)

where for \(i,j\in {\mathbb {S}}\), \(i\ne j\) the local \(({\mathfrak {F}},{\mathbb {P}}^\nu )\)-martingale \(M^{ij}\) is given by

$$\begin{aligned} M^{ij}_t \triangleq \int _0^t \mathbb {1}_{\{X_{s-}=i\}}\bigl (v^j(s,W_s)-v^i(s,W_s)\bigr )\bigl (\mathrm {d}N^{ij}_s-Q^{ij}\bigl (s,W_s,\mu (s,W_s),\nu _s\bigr )\mathrm {d}s\bigr ),\quad t\in [0,T]. \end{aligned}$$

Since \(N^{ij}-\int _0^{\,\cdot \,}Q^{ij}\bigl (s,W_s,\mu (s,W_s),\nu _s\bigr )\mathrm {d}s\) is a compensated counting process and v and Q are bounded, \(M^{ij}\) is in fact an \(({\mathfrak {F}},{\mathbb {P}}^\nu )\)-martingale. Hence taking \({\mathbb {P}}^\nu \)-expectations in (13), using the tower property of conditional expectation and the fact that \({\mathbb {P}}^{\nu }\) and \({\mathbb {P}}\) coincide on \(\sigma (X_0)\) by Lemma 2, and finally that v solves the DPE, we obtain

$$\begin{aligned} \sum _{i\in {\mathbb {S}}} {\mathbb {P}}(X_0=i)v^i(0)&= {\mathbb {E}}\bigl [v^{X_0}(0)\bigr ] = {\mathbb {E}}^{\nu }\bigl [v^{X_0}(0)\bigr ]\nonumber \\&= {\mathbb {E}}^\nu \bigg [ \Psi ^{X_T}\left( W_T,\mu \left( T,W_T\right) \right) \nonumber \\&\quad -\int _0^{T}\sum _{i=1}^d\mathbb {1}_{\left\{ X_s=i\right\} }\left( {\dot{v}}^i(s,W_s)+Q^{i\cdot }\left( s,W_s,\mu (s,W_s),\nu _s\right) \cdot v(s,W_s)\right) \mathrm {d}s\bigg ]\nonumber \\&\ge {\mathbb {E}}^\nu \bigg [\Psi ^{X_T}\left( W_T,M_T\right) +\int _0^T\psi ^{X_s}\left( s,W_s,M_s,\nu _s\right) \mathrm {d}s\bigg ]. \end{aligned}$$
(14)

If we replace \(\nu \) with \({\widehat{\nu }}\), the same argument applies with equality in (14); we thus conclude that v is the value function of (P\({}_\mu \)), and that the strategy \({\widehat{\nu }}\) is optimal. \(\square \)

The optimal strategy is Markovian in the agent’s state; this is unsurprising given the literature, see e.g. [31, Theorem 1] or [22, Proposition 3.9] and [15, Theorem 4]. Note, however, that the time-t optimal strategy may depend on all common noise events that have occurred up to time t, as \(W_t=(W_1,\ldots ,W_k)\) for \(t\in [T_k,T_{k+1}\rangle \). In the following, we denote by \(\widehat{{\mathbb {P}}}\) the probability measure

$$\begin{aligned} \widehat{{\mathbb {P}}}\triangleq {\mathbb {P}}^{{\widehat{\nu }}} \end{aligned}$$

where \({\widehat{\nu }}\) is the optimal control specified in Theorem 6. It follows from Lemma 2 that \(N^{ij}\) has \(\widehat{{\mathbb {P}}}\)-intensity \({\widehat{\lambda }}^{ij}=\{{\widehat{\lambda }}^{ij}_t\}\) for \(i,j\in {\mathbb {S}}\), \(i\ne j\), where

$$\begin{aligned} {\widehat{\lambda }}^{ij}_t\triangleq Q^{ij}\bigl (t,W_t,\mu (t,W_t),h^{X_{t-}}(t,W_t,\mu (t,W_t),v(t,W_t))\bigr )\quad \text {for }t\in [0,T]. \end{aligned}$$
(15)

4 Equilibrium

Having solved the agent’s optimization problem for a given ex ante function \(\mu \), we now turn to the resulting mean field equilibrium. We first identify the aggregate distribution resulting from the optimal control.

Remark 7

This paper generally adopts a “representative agent” point of view; an alternative justification of mean field equilibrium is via convergence of Nash equilibria of symmetric N-player games in the limit \(N\rightarrow \infty \); see, among others, [2, 14, 15, 18, 20, 22, 24, 28]. In the setting of this article (albeit under additional regularity conditions) a mean field limit justification can be provided along the lines of the proof of Theorem 7 in [31] by conditioning on common noise configurations, similarly to the proof of Theorem 9 below.\(\square \)

4.1 Aggregation

Given an ex ante aggregate distribution specified in terms of a regular, non-anticipative function \(\mu \) and a corresponding solution v of (DP\({}_\mu \)) subject to (CC\({}_\mu \)), (TC\({}_\mu \)), Theorem 6 yields an optimal strategy \({\widehat{\nu }}\) for the agent’s optimization problem (P\({}_\mu \)). With \(\widehat{{\mathbb {P}}}\) denoting the probability measure associated with \({\widehat{\nu }}\), the resulting ex post aggregate distribution is given by the \({\mathbb {M}}\)-valued, \({\mathfrak {G}}\)-adapted process \({\widehat{M}}=\{{\widehat{M}}_t\}\),

$$\begin{aligned} {\widehat{M}}_t\triangleq \widehat{{\mathbb {P}}}(X_t\in \,\cdot \,\,|\,{\mathfrak {G}}_t)\quad \text {for }t\in [0,T] \end{aligned}$$

where \({\mathfrak {G}}\) denotes the common noise filtration. We note that \({\widehat{M}}\) is càdlàg since \({\mathfrak {G}}\) is piecewise constant and X is càdlàg. Equilibrium obtains if \({\widehat{M}}_t=\mu (t,W_t)\) for all \(t\in [0,T]\). To proceed, we aim for a more explicit description of \({\widehat{M}}\) and, in particular, its dynamics. Thus we define for \(k\in [1:n]\)

$$\begin{aligned} \Phi _k:\ {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {M}}\rightarrow {\mathbb {M}},\quad \Phi _k(w,m,{{\bar{m}}})\triangleq m\cdot P_k(w,{{\bar{m}}}), \end{aligned}$$
(16)

where \(P_k:\ {\mathbb {W}}^n\times {\mathbb {M}}\rightarrow \{0,1\}^{d\times d}\) is given by

$$\begin{aligned} P_k^{ij}(w,{{\bar{m}}}) \triangleq \mathbb {1}_{\left\{ J^i(T_k,w_1,\ldots ,w_{k},{{\bar{m}}})=j\right\} }\quad \text {for }i,j\in {\mathbb {S}}\end{aligned}$$

and we set

$$\begin{aligned} m_0\triangleq {\mathbb {P}}(X_0\in \,\cdot \,)=\widehat{{\mathbb {P}}}(X_0\in \,\cdot \,)\in {\mathbb {M}}. \end{aligned}$$
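To illustrate the aggregate jump, suppose \(d=3\) and \(J^{\cdot }(T_k,w_{T_k},{{\bar{m}}})=(1,1,3)\), i.e. the common noise event sends agents from state 2 to state 1 and leaves states 1 and 3 unaffected; then

$$\begin{aligned} P_k(w,{{\bar{m}}}) = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \text {and}\qquad \Phi _k(w,m,{{\bar{m}}}) = \bigl (m^1+m^2,\,0,\,m^3\bigr ), \end{aligned}$$

so \(\Phi _k\) transports the aggregate mass in accordance with the jump map J.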

Lemma 8

Let \(\mu : [0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {M}}\) and \(v: [0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {R}}^d\) be regular and non-anticipative, and suppose that \(Y=\{Y_t\}\) is an \({\mathbb {M}}\)-valued stochastic process with dynamics

$$\begin{aligned}&Y_0=m_0,\quad Y_t = Y_{T_k} + \int _{T_k}^t Y_s \cdot {\widehat{Q}}\bigl (s,W_s,\mu (s,W_s),v(s,W_s)\bigr )\mathrm {d}s\nonumber \\&\quad \text {for } t\in [T_k,T_{k+1}\rangle ,\, k\in [0:n] \end{aligned}$$
(17)

that satisfies the consistency conditions

$$\begin{aligned} Y_{T_k}=\Phi _k\bigl ( W_{T_k}, Y_{T_k-}, \mu (T_k-,W_{T_k-}) \bigr )\quad \text {for }k\in [1:n]. \end{aligned}$$

Then Y is \({\mathfrak {G}}\)-adapted.

Proof

Step 1: Existence and uniqueness of Carathéodory solutions. For each \(k\in [0:n]\) and \(w\in {\mathbb {W}}^n\), since \(\mu \) and v are regular and Q is bounded, the function

$$\begin{aligned} f:\ [T_k,T_{k+1}]\times {\mathbb {R}}^{1\times d}\rightarrow {\mathbb {R}}^{1\times d},\quad f(t,y) \triangleq y\cdot {\widehat{Q}}\bigl (t,w,\mu (t,w),v(t,w)\bigr ) \end{aligned}$$

is measurable in the first and Lipschitz continuous in the second argument. Thus, using that \(\mu \), v and \({\widehat{Q}}\) are non-anticipative, a classical result, see [36, Theorem I.5.3], implies that for each initial condition \(y\in {\mathbb {R}}^{1\times d}\) there exists a unique Carathéodory solution \(\varphi _k^{y,w_{T_k}}:\ [T_k,T_{k+1}\rangle \rightarrow {\mathbb {R}}^{1\times d}\) of

$$\begin{aligned} \dot{y}(t) = y(t)\cdot {\widehat{Q}}\bigl (t,w_{T_k},\mu (t,w_{T_k}),v(t,w_{T_k})\bigr )\text { for } t\in [T_k,T_{k+1}\rangle ,\qquad y(T_k) = y. \end{aligned}$$

Step 2: Y is \({\mathfrak {G}}\)-adapted. First note that \(Y_0=m_0\) is clearly \({\mathfrak {G}}_0\)-measurable. Next, suppose that \(Y_{T_k}\) is \({\mathfrak {G}}_{T_k}\)-measurable, and note that for \(t\in [T_k,T_{k+1}\rangle \) we have \(W_t=W_{T_k}\), so

$$\begin{aligned} Y_t = Y_{T_k}+\int _{T_k}^t Y_s\cdot {\widehat{Q}}\bigl (s,W_{T_k},\mu (s,W_{T_k}),v(s,W_{T_k}) \bigr )\mathrm {d}s. \end{aligned}$$

Thus from uniqueness in Step 1 it follows that we have the representation

$$\begin{aligned} Y_t = \varphi _k^{Y_{T_k}, W_{T_k}}(t)\quad \text {for } t\in [T_k,T_{k+1}\rangle . \end{aligned}$$

Hence \(Y_t\) is \({\mathfrak {G}}_{T_k}\)-measurable for all \(t\in [T_k,T_{k+1}\rangle \). Finally, for all \(k\in [0:(n-1)]\) the consistency condition implies that \( Y_{T_{k+1}} = \Phi _{k+1}( W_{T_{k+1}}, Y_{T_{k+1}-}, \mu (T_{k+1}-,W_{T_{k+1}-}))\) is \({\mathfrak {G}}_{T_{k+1}}\)-measurable, so the claim follows by induction on \(k\in [0:n]\). \(\square \)

Theorem 9

(Aggregation) Let \(\mu :\ [0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {M}}\) be regular and non-anticipative with \(\mu (0)=m_0\). Suppose v is a solution of (DP\({}_\mu )\) subject to (CC\({}_\mu )\), (TC\({}_\mu )\), and the agent implements his optimal strategy \({\widehat{\nu }}\) as defined in Theorem  6. Then the aggregate distribution \({\widehat{M}}\) has the \(\widehat{{\mathbb {P}}}\)-dynamics

$$\begin{aligned} \mathrm {d}{\widehat{M}}_t = {\widehat{M}}_t \cdot {\widehat{Q}}\bigl ( t, W_t, \mu (t,W_t), v(t,W_t) \bigr )\mathrm {d}t\quad \text {for }t\in [T_k,T_{k+1}\rangle ,\ k\in [0:n], \end{aligned}$$
(M)

and satisfies the initial condition

$$\begin{aligned} {\widehat{M}}_0 = m_0 \end{aligned}$$
(\(\text {M}_0\))

and the jump conditions

$$\begin{aligned} {\widehat{M}}_{T_k} = \Phi _k\bigl ( W_{T_k}, {\widehat{M}}_{T_k-}, \mu (T_k-,W_{T_k-}) \bigr )\quad \text {for }k\in [1:n]. \end{aligned}$$
(\(\text {M}_k\))

Proof

Let \(w\in {\mathbb {W}}^n\) be a common noise configuration. Since X is defined path by path, see (2) and (3), we first note that \(X=X^w\) on \(\{W_T=w\}\), where \(X^w\) satisfies (2) and

$$\begin{aligned} X^w_{T_k}=J^{X^w_{T_k-}}\bigl (T_k,w_{T_k},\mu (T_k-,w_{T_k-})\bigr )\ \text {for }k\in [1:n]. \end{aligned}$$
(18)

We define \(\zeta (w)=\{\zeta (w)_t\}\) via

$$\begin{aligned} \zeta (w)_t&\triangleq \prod \limits _{\begin{array}{c} i,j\in {\mathbb {S}},\\ i\ne j \end{array}}\biggl (\exp \Bigl \{\int _0^t\Bigl (1-Q^{ij}\bigl (s,w_s,\mu (s,w_s),h^{X^w_{s-}}(s,w_s,\mu (s,w_s),v(s,w_s))\bigr )\Bigr )\mathrm {d}s\Bigr \}\nonumber \\&\quad \times \prod \limits _{\begin{array}{c} s\in (0,t],\\ \Delta N^{ij}_s\ne 0 \end{array}}Q^{ij}\bigl (s,w_s,\mu (s,w_s),h^{X^w_{s-}}(s,w_s,\mu (s,w_s),v(s,w_s))\bigr )\biggr ). \end{aligned}$$

Using arguments analogous to those in Step 1 of the proof of Lemma 2 (see in particular (4) and (7)), it follows that there exists a probability measure \(\widehat{{\mathbb {P}}}^w\) with density process

$$\begin{aligned} \frac{\mathrm {d}\widehat{{\mathbb {P}}}^w}{\mathrm {d}{\mathbb {P}}}\bigg |_{{\mathfrak {H}}_t} \triangleq \zeta (w)_t \quad \text {for } t\in [0,T], \end{aligned}$$

where the filtration \({\mathfrak {H}}=\{{\mathfrak {H}}_t\}\) is given by

$$\begin{aligned} {\mathfrak {H}}_t\triangleq \sigma \bigl (X_0,\, N^{ij}_s\, :\, s\in [0,t];\ i,j\in {\mathbb {S}},\ i\ne j \bigr )\vee {\mathfrak {N}}\quad \text {for }t\in [0,T]. \end{aligned}$$

Furthermore, in view of (4) and (15) we have

$$\begin{aligned} \zeta (w) = {\mathcal {E}}[\theta ^{{\widehat{\nu }}}] \quad \text {on } \{W_T=w\}. \end{aligned}$$
(19)

Step 1: Conditional Kolmogorov dynamics. Throughout Step 1, we fix a common noise configuration \(w\in {\mathbb {W}}^n\). It follows exactly as in the proof of Lemma 2 (with \(\widehat{{\mathbb {P}}}^w\) in place of \(\widehat{{\mathbb {P}}}\)) that

$$\begin{aligned} \widehat{{\mathbb {P}}}^w\ll {\mathbb {P}},\qquad \widehat{{\mathbb {P}}}^w={\mathbb {P}}\quad \text {on }\sigma (X_0), \end{aligned}$$

and that for \(i,j\in {\mathbb {S}}\), \(i\ne j\), the process \(N^{ij}\) is a counting process with \(({\mathfrak {H}},\widehat{{\mathbb {P}}}^w)\)-intensity

$$\begin{aligned} Q^{ij}\bigl (t,w_t,\mu (t,w_t),h^{X^w_{t-}}(t,w_t,\mu (t,w_t),v(t,w_t))\bigr )\quad \text {for }t\in [0,T]. \end{aligned}$$

Boundedness of Q implies that for each \(z\in {\mathbb {R}}^d\) the process \(L^w[z]=\{L^w_t[z]\}\),

$$\begin{aligned} L^w_t[z] \triangleq \sum \limits _{\begin{array}{c} i,j\in {\mathbb {S}},\\ i\ne j \end{array}}\int _0^t\mathbb {1}_{\{X^w_{s-}=i\}}\bigl (z^j-z^i\bigr )\,\mathrm {d}{\widetilde{N}}^{ij,w}_s,\quad t\in [0,T], \end{aligned}$$

is an \(({\mathfrak {H}},\widehat{{\mathbb {P}}}^w)\)-martingale, where the compensated counting process \({\widetilde{N}}^{ij,w}=\{{\widetilde{N}}^{ij,w}_t\}\) is given by

$$\begin{aligned} {\widetilde{N}}^{ij,w}_t \triangleq N^{ij}_t - \int _0^t Q^{ij}\bigl (s,w_s,\mu (s,w_s),h^{X^w_{s-}}(s,w_s,\mu (s,w_s),v(s,w_s))\bigr )\mathrm {d}s,\quad t\in [0,T]. \end{aligned}$$

Using Itô’s lemma and the fact that \({\widehat{\lambda }}^{ij}_t = {\widehat{Q}}^{ij}(t,W_t,\mu (t,W_t),v(t,W_t))\) on \(\{X_{t-}=i\}\), \(t\in [0,T]\), by (15), we have for each \(z\in {\mathbb {R}}^d\), \(k\in [0:n]\) and \(t\in [T_k,T_{k+1}\rangle \)

$$\begin{aligned} z^{X^w_t} = z^{X^w_{T_k}}+L^w_t[z]-L^w_{T_k}[z]+\sum _{i=1}^d\int _{T_k}^t\mathbb {1}_{\{X_s^w=i\}}\cdot {\widehat{Q}}^{i\cdot }(s,w_s,\mu (s,w_s),v(s,w_s))\cdot z\ \mathrm {d}s. \end{aligned}$$

Taking expectations with respect to \(\widehat{{\mathbb {P}}}^w\) and using Fubini’s theorem yields

$$\begin{aligned} \widehat{{\mathbb {E}}}^w\bigl [z^{X^w_t}\bigr ]=\widehat{{\mathbb {E}}}^w\bigl [z^{X^w_{T_k}}\bigr ]+\sum _{i=1}^d\int _{T_k}^t\widehat{{\mathbb {P}}}^w(X^w_s=i)\cdot {\widehat{Q}}^{i\cdot }(s,w_s,\mu (s,w_s),v(s,w_s))\cdot z\ \mathrm {d}s, \end{aligned}$$

so with \(z=e_i\), \(i\in {\mathbb {S}}\), we get

$$\begin{aligned} \widehat{{\mathbb {P}}}^w(X_t^w=i) = \widehat{{\mathbb {P}}}^w(X^w_{T_k}=i)+\sum _{j=1}^d\int _{T_k}^t\widehat{{\mathbb {P}}}^w(X^w_s=j)\cdot {\widehat{Q}}^{ji}(s,w_s,\mu (s,w_s),v(s,w_s))\mathrm {d}s.\nonumber \\ \end{aligned}$$
(20)

It follows from (20) that \(\eta (w)=\{\eta (w)_t\}\),

$$\begin{aligned} \eta (w)_t \triangleq \ \widehat{{\mathbb {P}}}^w(X_t^w\in \,\cdot \,),\quad t\in [0,T] \end{aligned}$$
(21)

satisfies, for all \(i\in {\mathbb {S}}\) and \(k\in [0:n]\),

$$\begin{aligned} \eta (w)^i_t = \eta (w)^i_{T_k} + \int _{T_k}^t\eta (w)_s\cdot {\widehat{Q}}^{\cdot i}\left( s,w_s,\mu (s,w_s),v(s,w_s)\right) \mathrm {d}s\quad \text {for }t\in [T_k,T_{k+1}\rangle .\nonumber \\ \end{aligned}$$
(22)

Moreover, since \(\widehat{{\mathbb {P}}}^w={\mathbb {P}}\) on \(\sigma (X_0)\) and \(X_0^w=X_0\), \(\eta (w)\) satisfies the initial condition

$$\begin{aligned} \eta (w)_0 = \widehat{{\mathbb {P}}}^w(X^w_0\in \,\cdot \,)={\mathbb {P}}(X^w_0\in \,\cdot \,)={\mathbb {P}}(X_0\in \,\cdot \,)=m_0. \end{aligned}$$
(23)

Finally, consider a common noise time \(t=T_k\) and note that for all \(i\in {\mathbb {S}}\) the jump condition (18) implies

$$\begin{aligned} \eta (w)^i_{T_k}&= \widehat{{\mathbb {P}}}^w\bigl (X^w_{T_k}=i\bigr ) = \widehat{{\mathbb {P}}}^w\bigl (J^{X^w_{T_k-}}(T_k,w_{T_k},\mu (T_k-,w_{T_k-}))=i\bigr )\nonumber \\&= \sum _{j=1}^d\widehat{{\mathbb {P}}}^w\bigl (J^j(T_k,w_{T_k},\mu (T_k-,w_{T_k-}))=i\bigr |X^w_{T_k-}=j\bigr ) \cdot \widehat{{\mathbb {P}}}^w(X^w_{T_k-}=j)\nonumber \\&= \sum _{j=1}^d\mathbb {1}_{\left\{ J^j(T_k,w_{T_k},\mu (T_k-,w_{T_k-}))=i\right\} }\cdot \widehat{{\mathbb {P}}}^w(X^w_{T_k-}=j)\nonumber \\&= \sum _{j=1}^d P_k^{ji}(w_{T_k}, \mu (T_k-,w_{T_k-}))\cdot \eta (w)^j_{T_k-} = \Phi _k^i\bigl (w_{T_k},\eta (w)_{T_k-}, \mu (T_k-,w_{T_k-})\bigr ). \end{aligned}$$
(24)

Since \(\eta (W_T)= \sum _{w\in {\mathbb {W}}^n}\mathbb {1}_{\{W_T=w\}} \cdot \eta (w)\), in view of (22), (23) and (24) it follows from Lemma 8 that the process \(\eta (W_T)\) is \({\mathfrak {G}}\)-adapted.

Step 2: Identification of \(\eta (W_T)\). Recall that \({\mathfrak {G}}_T=\sigma (W_T)\vee {\mathfrak {N}}\) and let \(w\in {\mathbb {W}}^n\). For \(t\in [0,T]\) and \(i\in {\mathbb {S}}\) we have by (6) and (19)

$$\begin{aligned}&\widehat{{\mathbb {E}}}\bigl [\mathbb {1}_{\{W_T=w\}} \cdot \mathbb {1}_{\{X_t=i\}}\bigr ] = {\mathbb {E}}\bigl [\mathbb {1}_{\{W_T=w\}} \cdot \mathbb {1}_{\{X^w_t=i\}} \cdot Z^{{\widehat{\nu }}}_T\bigr ] = {\mathbb {E}}\bigl [\mathbb {1}_{\{W_T=w\}} \cdot \mathbb {1}_{\{X^w_t=i\}} \cdot \zeta (w)_T \cdot {\mathcal {E}}[\vartheta ]_T\bigr ]\\&= \prod _{k=1}^n\bigl (|{\mathbb {W}}|\cdot \kappa _k(w_k|w_1,\dots ,w_{k-1},\mu (T_k-,w_{T_k-}))\bigr ) \cdot {\mathbb {E}}\bigl [\mathbb {1}_{\{W_T=w\}} \cdot \mathbb {1}_{\{X^w_t=i\}} \cdot \zeta (w)_T\bigr ]\\&= |{\mathbb {W}}|^n \cdot \widehat{{\mathbb {P}}}(W_T=w) \cdot {\mathbb {P}}(W_T=w) \cdot \widehat{{\mathbb {P}}}^w(X_t^w=i) = \widehat{{\mathbb {E}}}\bigl [\mathbb {1}_{\{W_T=w\}} \cdot \eta (W_T)^i_t\bigr ], \end{aligned}$$

where in the final line the first identity is due to Lemma 2 and \({\mathbb {P}}\)-independence of \((\zeta (w),X^w)\) and \({\mathfrak {G}}_T\); and the second is due to (21) and the fact that \({\mathbb {P}}(W_T=w)=1/|{\mathbb {W}}|^n\). Thus

$$\begin{aligned} \widehat{{\mathbb {P}}}(X_t\in \,\cdot \,|{\mathfrak {G}}_T) = \eta (W_T)_t \quad \widehat{{\mathbb {P}}}\text {-a.s.~for } t\in [0,T]. \end{aligned}$$

Step 3: Dynamics of \({\widehat{M}}\). By Step 2 and the tower property of conditional expectation, we find that for each \(i\in {\mathbb {S}}\) and \(t\in [0,T]\)

$$\begin{aligned} {\widehat{M}}^i_t = \widehat{{\mathbb {P}}}(X_t=i|{\mathfrak {G}}_t) = \widehat{{\mathbb {E}}}\bigl [\widehat{{\mathbb {E}}}[\mathbb {1}_{\left\{ X_t=i\right\} }|{\mathfrak {G}}_T]|{\mathfrak {G}}_t\bigr ] = \widehat{{\mathbb {E}}}\bigl [\eta (W_T)^i_t|{\mathfrak {G}}_t\bigr ] = \eta (W_T)^i_t \quad \widehat{{\mathbb {P}}}\text {-a.s.}, \end{aligned}$$

where the final identity is due to the fact that \(\eta (W_T)\) is \({\mathfrak {G}}\)-adapted by Step 1 and \(\widehat{{\mathbb {E}}}\) denotes \(\widehat{{\mathbb {P}}}\)-expectation. Since both \({\widehat{M}}\) and \(\eta (W_T)\) are càdlàg, it follows that \({\widehat{M}}=\eta (W_T)\) \(\widehat{{\mathbb {P}}}\)-a.s., and (M), (\(\text {M}_0\)) and (\(\text {M}_k\)) follow from (22), (23) and (24). \(\square \)

As a by-product, the preceding proof yields the alternative representation

$$\begin{aligned} {\widehat{M}}_t = \widehat{{\mathbb {P}}}(X_t\in \,\cdot \,\,|\,{\mathfrak {G}}_T)\quad \text {for }t\in [0,T],\ \widehat{{\mathbb {P}}}\text {-a.s.} \end{aligned}$$

4.2 Mean Field Equilibrium System

As discussed above, equilibrium obtains if the agents’ ex ante beliefs coincide with the ex post outcome. This holds if and only if the ex post aggregate distribution process \({\widehat{M}}\) from (M) satisfies

$$\begin{aligned} \widehat{{\mathbb {P}}}(X_t\in \,\cdot \,|{\mathfrak {G}}_t) = {\widehat{M}}_t \overset{!}{= }M_t = \mu (t,W_t)\quad \text {for all }t\in [0,T]. \end{aligned}$$

Definition 10

(Equilibrium System). A pair \((\mu ,v)\) of regular and non-anticipative functions

$$\begin{aligned} \mu :\ [0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {M}}\qquad \text {and}\qquad v:\ [0,T]\times {\mathbb {W}}^n\rightarrow {\mathbb {R}}^d \end{aligned}$$

is called a rational expectations equilibrium, or briefly an equilibrium, if for all \(w\in {\mathbb {W}}^n\)

$$\begin{aligned} \dot{\mu }(t,w) = \mu (t,w)\cdot {\widehat{Q}}\bigl (t,w,\mu (t,w),v(t,w)\bigr ) \end{aligned}$$
(E1)
$$\begin{aligned} 0 = {\dot{v}}^i(t,w) + {\widehat{\psi }}^i\bigl (t,w,\mu (t,w),v(t,w)\bigr ) + {\widehat{Q}}^{i\cdot }\bigl (t,w,\mu (t,w),v(t,w)\bigr )\cdot v(t,w),\quad i\in {\mathbb {S}}, \end{aligned}$$
(E2)

for \(t\in [T_k,T_{k+1}\rangle \), \(k\in [0:n]\), subject to the consistency conditions

$$\begin{aligned} \mu (T_k,w) = \Phi _k\bigl (w,\mu (T_k-,w),\mu (T_k-,w)\bigr ) \end{aligned}$$
(E3)
$$\begin{aligned} v(T_k-,w) = \Psi _k\bigl (w,\mu (T_k-,w),v(T_k,\,\cdot \,)\bigr ) \end{aligned}$$
(E4)

for \(k\in [1:n]\), and the initial/terminal conditions

$$\begin{aligned} \mu (0,w) = m_0 \end{aligned}$$
(E5)
$$\begin{aligned} v(T,w) = \Psi \bigl (w,\mu (T,w)\bigr ). \end{aligned}$$
(E6)

We also refer to (E1)-(E6) as the equilibrium system. \(\square \)

In combination, Theorem 6 and Theorem 9 demonstrate that, given a solution \((\mu ,v)\) of the equilibrium system, v is the value function of the agent’s optimization problem (P\({}_\mu \)) with ex ante aggregate distribution \(\mu \); and the ex post distribution resulting from the corresponding optimal strategy is given by \(\mu \) itself. Thus we can identify a mean field equilibrium with common noise by producing a solution of the equilibrium system (E1)-(E6). We provide some illustrations in Sect. 5. Theorems 13 and 16 below ensure that this is feasible by showing that, under suitable continuity and monotonicity conditions, there exists a unique solution of the equilibrium system. The proofs are adaptations of classical arguments, based on Schauder’s fixed point theorem and monotonicity arguments, respectively.

We set

$$\begin{aligned}&Q_{\max } \triangleq \sup _{\begin{array}{c} t\in [0,T],\, w\in {\mathbb {W}}^n\\ m\in {\mathbb {M}},\, u\in {\mathbb {U}} \end{array}} \bigl \Vert Q(t,w,m,u)\bigr \Vert , \quad \psi _{\max } \triangleq \sup _{\begin{array}{c} t\in [0,T],\, w\in {\mathbb {W}}^n\\ m\in {\mathbb {M}},\, u\in {\mathbb {U}} \end{array}} \bigl \Vert \psi (t,w,m,u)\bigr \Vert , \\&\Psi _{\max } \triangleq \sup _{\begin{array}{c} m\in {\mathbb {M}}\\ w\in {\mathbb {W}}^n \end{array}}\Vert \Psi (w,m)\Vert \end{aligned}$$

and

$$\begin{aligned} v_{\max } \triangleq \bigl (\Psi _{\max } + T \cdot \psi _{\max }\bigr ) \cdot \mathrm {e}^{Q_{\max } \cdot T}. \end{aligned}$$
(25)

Note that these constants depend only on the underlying model coefficients.

Assumption 11

  (i)

    The reduced-form running reward function \({\widehat{\psi }}\) satisfies

    $$\begin{aligned} \Vert {\widehat{\psi }}(t,w,m_1,v_1)-{\widehat{\psi }}(t,w,m_2,v_2)\Vert \le L_{{\widehat{\psi }}} \cdot \bigl (\Vert m_1-m_2\Vert + \Vert v_1-v_2\Vert \bigr ) \end{aligned}$$

    for all \(t\in [0,T]\), \(w\in {\mathbb {W}}^n\), \(m_1,m_2\in {\mathbb {M}}\) and \(v_1,v_2\in {\mathbb {R}}^d\) with \(\Vert v_1\Vert ,\Vert v_2\Vert \le v_{\max }\), for some \(L_{{\widehat{\psi }}}>0\).

  (ii)

    The reduced-form intensity matrix function \({\widehat{Q}}\) satisfies

    $$\begin{aligned} \bigl \Vert {\widehat{Q}}(t,w,m_1,v_1)-{\widehat{Q}}(t,w,m_2,v_2)\bigr \Vert \le L_{{\widehat{Q}}} \cdot \bigl (\Vert m_1-m_2\Vert + \Vert v_1-v_2\Vert \bigr ) \end{aligned}$$

    for all \(t\in [0,T]\), \(w\in {\mathbb {W}}^n\), \(m_1,m_2\in {\mathbb {M}}\) and \(v_1,v_2\in {\mathbb {R}}^d\) with \(\Vert v_1\Vert ,\Vert v_2\Vert \le v_{\max }\), for some \(L_{{\widehat{Q}}}>0\).

  (iii)

    The terminal reward function \(\Psi \) is continuous with respect to m, i.e. for every \(w\in {\mathbb {W}}^n\) the map \(\Psi (w,\,\cdot \,)\) is continuous.

  (iv)

    For each \(k\in [1:n]\) and all \(i\in {\mathbb {S}}\), \(w\in {\mathbb {W}}^n\) and \(v\in {\mathbb {R}}^d\) with \(\Vert v\Vert \le v_{\max }\), the map

    $$\begin{aligned} {\mathbb {M}}\ni m \mapsto \sum _{{{\bar{w}}}_k\in {\mathbb {W}}} \kappa _k\bigl ({{\bar{w}}}_k|w_1,\ldots ,w_{k-1},m\bigr ) v^{J^i(T_k,(w_{-k},{\bar{w}}_k),m)} \in {\mathbb {R}}\quad \text {is continuous.} \end{aligned}$$
  (v)

    For each \(k\in [1:n]\) and \(w\in {\mathbb {W}}^n\) the map \(\Phi _k(w,\,\cdot \,)\) is continuous. \(\square \)

Since all norms on \({\mathbb {R}}^d\) are equivalent, the concrete specification is immaterial for Assumption 11. For the sake of convenience, in the following we use the maximum norm on \({\mathbb {R}}^d\) and a compatible matrix norm on \({\mathbb {R}}^{d\times d}\); moreover, we suppose that (ii) holds for both \({\widehat{Q}}\) and \({\widehat{Q}}^{^{{\mathsf {T}}}}\).

Remark 12

Sufficient conditions for Assumptions 11(i)-(ii) in terms of the model’s primitives can be found in, e.g., [31] or [15]. Furthermore, in the special case where the jump map J is independent of \(m\in {\mathbb {M}}\), Assumption 11(v) is trivially satisfied, and continuity of the transition kernels \(\kappa _k\) with respect to m is sufficient for Assumption 11(iv) to hold.\(\square \)

Theorem 13

(Existence of Equilibria) If Assumption 11 holds, then there exists a solution of the equilibrium system (E1)–(E6).

Proof

See Appendix A. \(\square \)

The reduced-form Hamiltonian \(\widehat{{\mathcal {H}}}:\, [0,T] \times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {R}}^d\rightarrow {\mathbb {R}}^d\) is defined via

$$\begin{aligned} \widehat{{\mathcal {H}}}^i(t,w,m,v)&\triangleq \sup _{u\in {\mathbb {U}}}\,\bigl \{ \psi ^i(t,w,m,u)+Q^{i\cdot }(t,w,m,u)\cdot v \bigr \} \\&={\widehat{\psi }}^i(t,w,m,v)+{\widehat{Q}}^{i\cdot }(t,w,m,v)\cdot v. \end{aligned}$$

Assumption 14

Let Assumptions 11(i) and (ii) hold, and suppose that:

  (i)

    The terminal payoff function \(\Psi \) is monotone with respect to \(m\in {\mathbb {M}}\), i.e.

    $$\begin{aligned} (m_1 - m_2) \cdot \bigl [ \Psi (w,m_1) - \Psi (w,m_2) \bigr ] \le 0 \quad \text {for all } w\in {\mathbb {W}}^n,\ m_1,m_2\in {\mathbb {M}}. \end{aligned}$$
  (ii)

    The reduced-form Hamiltonian \(\widehat{{\mathcal {H}}}\) is convex with respect to v, i.e. for all \(i\in {\mathbb {S}}\), \(t\in [0,T]\), \(w\in {\mathbb {W}}^n\), \(m\in {\mathbb {M}}\) and \(v_1,v_2\in {\mathbb {R}}^d\) satisfying \(\Vert v_1\Vert , \Vert v_2\Vert \le v_{\max }\) we have

    $$\begin{aligned} \widehat{{\mathcal {H}}}^i(t,w,m,v_2) - \widehat{{\mathcal {H}}}^i(t,w,m,v_1) - {\widehat{Q}}^{i\cdot }(t,w,m,v_1) \cdot (v_2-v_1) \ge 0. \end{aligned}$$
  (iii)

    The reduced-form Hamiltonian \(\widehat{{\mathcal {H}}}\) satisfies a uniform monotonicity condition with respect to \(m\in {\mathbb {M}}\), i.e. there exist \(\alpha ,\gamma >0\) such that

    $$\begin{aligned}&m_1 \cdot \bigl [ \widehat{{\mathcal {H}}}(t,w,m_2,v_2) - \widehat{{\mathcal {H}}}(t,w,m_1,v_2) \bigr ] + m_2 \cdot \bigl [ \widehat{{\mathcal {H}}}(t,w,m_1,v_1) - \widehat{{\mathcal {H}}}(t,w,m_2,v_1) \bigr ] \\&\quad \ge \gamma \cdot \Vert m_1-m_2\Vert ^\alpha \end{aligned}$$

    for all \(t\in [0,T]\), \(w\in {\mathbb {W}}^n\), \(m_1,m_2\in {\mathbb {M}}\) and \(v_1,v_2\in {\mathbb {R}}^d\) with \(\Vert v_1\Vert , \Vert v_2\Vert \le v_{\max }\).

  (iv)

    For \(k\in [1:n]\) the maps \(\kappa _k\) and J satisfy the following monotonicity conditions in \(m\in {\mathbb {M}}\): For all \(w\in {\mathbb {W}}^n\), \(m_1,m_2\in {\mathbb {M}}\) and \(v_1,v_2\in {\mathbb {R}}^d\) satisfying \(\Vert v_1\Vert ,\Vert v_2\Vert \le v_{\max }\) as well as

    $$\begin{aligned} \bigl [ \Phi _k(w,m_1) - \Phi _k(w,m_2) \bigr ] \cdot (v_1-v_2) \le 0 \end{aligned}$$

    we have

    $$\begin{aligned}&\bigl [ \kappa _k(w_k|w_1,\ldots ,w_{k-1},m_1) - \kappa _k(w_k|w_1,\ldots ,w_{k-1},m_2) \bigr ] \nonumber \\&\quad \cdot \bigl ( m_2\cdot v_1^{J^{\cdot }(T_k,w,m_2)} - m_1\cdot v_1^{J^{\cdot }(T_k,w,m_1)} \nonumber \\&\quad + m_2\cdot v_2^{J^{\cdot }(T_k,w,m_2)} - m_1\cdot v_2^{J^{\cdot }(T_k,w,m_1)} \bigr ) \ge 0 \end{aligned}$$
    (26)

    and

    $$\begin{aligned}&\kappa _k(w_k|w_1,\ldots ,w_{k-1},m_2) \cdot m_1 \cdot \bigl ( v_2^{J^{\cdot }(T_k,w,m_2)} - v_2^{J^{\cdot }(T_k,w,m_1)} \bigr ) \\&\quad + \kappa _k(w_k|w_1,\ldots ,w_{k-1},m_1) \cdot m_2 \cdot \bigl ( v_1^{J^{\cdot }(T_k,w,m_1)} - v_1^{J^{\cdot }(T_k,w,m_2)} \bigr ) \ge 0. \end{aligned}$$
    (27)

The constant \(v_{\max }>0\) in 14(ii)-(iv) is defined in (25). Conditions 14(i)-(iii) are standard in the literature; see, e.g., Assumptions 1-3 in [31].
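
As a simple illustration of the monotonicity condition 14(i) (a hypothetical specification, not one used in the examples below), consider a terminal payoff that penalizes crowding in the agent’s own state, \(\Psi ^i(w,m)\triangleq -\theta \, m^i\) for some \(\theta \ge 0\); then

$$\begin{aligned} (m_1-m_2)\cdot \bigl [ \Psi (w,m_1) - \Psi (w,m_2) \bigr ] = -\theta \sum _{i\in {\mathbb {S}}} \bigl (m_1^i-m_2^i\bigr )^2 \le 0 \quad \text {for all } w\in {\mathbb {W}}^n,\ m_1,m_2\in {\mathbb {M}}. \end{aligned}$$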

Remark 15

Assumption 14 simplifies if some model coefficients do not depend on the mean field parameter \(m\in {\mathbb {M}}\):

  1. (a)

    If \({\widehat{Q}}\) is independent of m, 14(iii) reduces to a monotonicity condition for \({\widehat{\psi }}\).

  2. (b)

    In 14(iv), (26) is trivially satisfied if the probability weights \(\kappa _k\) do not depend on m.

  3. (c)

    In 14(iv), (27) is trivially satisfied if the jump map J is independent of m.\(\square \)

Theorem 16

(Uniqueness of Equilibria) Under the monotonicity conditions stated in Assumption 14, the equilibrium system (E1)–(E6) possesses at most one solution.

Proof

See Appendix B. \(\square \)

5 Applications

Before we illustrate our results in two showcase examples, we briefly discuss our numerical approach to the equilibrium system (E1)-(E6). Equations (E1)-(E2) constitute a forward-backward system of ODEs in 2d dimensions with boundary conditions (E3)-(E6), parametrized by the common noise configurations \(w\in {\mathbb {W}}^n\) and coupled across them. The special case \(n=0\) (no common noise) corresponds to the setting of [31] and [15], where the equilibrium system reduces to a single 2d-dimensional forward-backward ODE. For \(n\ge 1\), the consistency conditions (E3)-(E4) specify initial conditions for \(\mu \) on \([T_k,T_{k+1}\rangle \) and terminal conditions for v on \([T_{k-1},T_k\rangle \), \(k\in [1:n]\). Since these conditions are interconnected, there is in general no segment \([T_k,T_{k+1}\rangle \times {\mathbb {W}}^n\) on which the equilibrium system provides both an explicit initial condition for \(\mu \) and an explicit terminal condition for v, so the problem cannot simply be split into subintervals. Rather, the equilibrium system can be regarded as a multi-point boundary value problem: for each of the \(|{\mathbb {W}}|^k\) possible combinations of common noise factors on \([T_k,T_{k+1}\rangle \), \(k\in [0:n]\), we have to solve a coupled forward-backward system of ODEs in 2d dimensions, resulting in a tree of such systems of size

$$\begin{aligned} \sum _{k=0}^n|{\mathbb {W}}|^k=\frac{|{\mathbb {W}}|^{n+1}-1}{|{\mathbb {W}}|-1}\in {\mathcal {O}}\bigl (|{\mathbb {W}}|^n\bigr ). \end{aligned}$$
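
As a quick sanity check on this count, the following snippet enumerates all common noise histories segment by segment for the hypothetical values \(|{\mathbb {W}}|=3\) and \(n=4\):

```python
from itertools import product

def tree_size(card_W, n):
    # one forward-backward subsystem per history (w_1, ..., w_k)
    # on each segment [T_k, T_{k+1}), k = 0, ..., n
    return sum(card_W**k for k in range(n + 1))

W = ("up", "down", "storm")          # hypothetical common noise alphabet
histories = [h for k in range(5) for h in product(W, repeat=k)]
assert len(histories) == tree_size(len(W), 4) == (3**5 - 1) // (3 - 1)  # = 121
```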

Our approach to solving (E1)-(E6) numerically relies on its probabilistic interpretation as a fixed-point system, based on Theorem 13. Starting from an initial flow of probability weights \(\mu _0(t,w)\), \((t,w)\in [0,T]\times {\mathbb {W}}^n\), with \(\mu _0(0,w)=m_0\) for all \(w\in {\mathbb {W}}^n\), we solve (DP\({}_\mu \)) subject to (TC\({}_\mu \)) and (CC\({}_\mu \)) backward in time for all non-negligible common noise configurations \(w\in {\mathbb {W}}^n\) to obtain the value \(v_0(t,w)\), \((t,w)\in [0,T]\times {\mathbb {W}}^n\), of the agents’ optimal response to the given belief \(\mu _0\). This, in turn, is used to solve (M) subject to (\(\text {M}_0\)) and (\(\text {M}_k\)) forward in time, which yields an ex post aggregate distribution \(\mu _1(t,w)\), \((t,w)\in [0,T]\times {\mathbb {W}}^n\). We then iterate this procedure with \(\mu _1\) in place of \(\mu _0\), and so on.
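
The iteration just described can be summarized by the following schematic loop; `solve_value` and `solve_distribution` stand for the backward and forward solvers sketched above, and the damping parameter is an optional stabilization (`damping=1.0` recovers the plain iteration). This is an illustrative sketch, not the exact implementation used for the figures below.

```python
import numpy as np

def solve_equilibrium(solve_value, solve_distribution, mu0,
                      damping=1.0, tol=1e-8, max_iter=500):
    """Fixed-point (Picard) iteration for the equilibrium system.

    solve_value(mu)       -> v : backward pass (dynamic programming) given belief mu
    solve_distribution(v) -> mu: forward pass (aggregate dynamics) given response v
    mu0                   : initial flow of probability weights, e.g. an array
                            indexed by (time step, common noise configuration, state)
    """
    mu = np.asarray(mu0, dtype=float)
    for _ in range(max_iter):
        v = solve_value(mu)                    # agents' best response to the belief
        mu_new = solve_distribution(v)         # ex post aggregate distribution
        gap = np.max(np.abs(mu_new - mu))      # consistency gap in sup-norm
        mu = (1.0 - damping) * mu + damping * mu_new
        if gap < tol:                          # stop once belief and outcome agree
            break
    return mu, v
```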

5.1 A Decentralized Agricultural Production Model

As a first (stylized) example we consider a mean field game of agents, each of whom owns (an infinitesimal amount of) land of identical size and quality within a given area. If farmed, each field has a productivity \(f(w_k)>0\) that depends on the common weather condition \(w_k\). We assume that the weather is either good, bad or catastrophic, so \(w_k\in {\mathbb {W}}\triangleq \{\uparrow ,\downarrow ,\lightning \}\), and that it changes at the given common noise times \(T_1,\ldots ,T_n\).

Each agent is in exactly one state \(i\in {\mathbb {S}}\triangleq \{0,1\}\), depending on whether he grows crops on his field (\(i=1\), the agent is a farmer) or not (\(i=0\)). The selling price p of his harvest depends on aggregate production, and thus in particular on the proportion \(m^1\in [0,1]\) of farmers; the mean field interaction is transmitted through the market price of the crop. We assume that p is a strictly decreasing function of overall production \(f(w_k)\cdot m^1\); see Fig. 1 for an illustration.

Fig. 1 Price function p (parameters as in Table 1)

We assume that \(f(\uparrow )\ge f(\downarrow )=f(\lightning )\ge 0\). Moreover, on the catastrophic event \(\{W_k=\lightning \}\) all agents are reduced to being non-farmers, and thus

$$\begin{aligned} J^i(t,w,m) \triangleq {\left\{ \begin{array}{ll} 0 &{} \text {if } t\in \{T_1,\ldots ,T_n\},\, t=T_k,\, w_k=\lightning ,\\ i &{} \text {else} \end{array}\right. } \end{aligned}$$

for \((i,t,w,m)\in {\mathbb {S}}\times [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\). Each agent can make an effort \(u\in {\mathbb {U}}\triangleq [0,\infty )\) to become a farmer; the intensity matrix for state transitions is given by

$$\begin{aligned} Q(t,w,m,u) = \left[ \begin{array}{cc} -u \cdot q_{\mathrm {entry}} &{} u \cdot q_{\mathrm {entry}} \\ q_{\mathrm {exit}} &{} -q_{\mathrm {exit}} \end{array} \right] \quad \text {for } (t,w,m,u)\in [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {U}}, \end{aligned}$$

where \(q_{\mathrm {entry}},q_{\mathrm {exit}}\ge 0\) are given transition rates. The running rewards capture the fact that both the effort to build up farming capacity and production itself are costly, while revenues from selling the crop generate profits; thus

$$\begin{aligned} \psi ^0(t,w,m,u) = -\frac{1}{2}c_{\mathrm {entry}} \cdot u^2\quad \text {and}\quad \psi ^1(t,w,m,u) = p\bigl (f(w_k) \cdot m^1\bigr ) \cdot f(w_k) - c_{\mathrm {prod}} \end{aligned}$$

for \(t\in [T_k,T_{k+1}\rangle \), \(k\in [0:n]\), where \(w_0\triangleq \ \uparrow \) and \(c_{\mathrm {entry}},c_{\mathrm {prod}}\ge 0\). The terminal reward is zero. It follows that the maximizer \(h^0\) in Assumption 3 is unique and given by

$$\begin{aligned} h^0(t,w,m,v) = \frac{q_{\mathrm {entry}}}{c_{\mathrm {entry}}}(v^1-v^0)^+; \end{aligned}$$

a specification of \(h^1\) is immaterial. We choose \(m^1_0\triangleq 10\%\) for the initial proportion of farmers, and report the relevant coefficients in Table 1.
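
For completeness, here is the short first-order argument behind the formula for \(h^0\) above: for state \(i=0\), the map to be maximized in the Hamiltonian is

$$\begin{aligned} u\mapsto \psi ^0(t,w,m,u)+Q^{0\cdot }(t,w,m,u)\cdot v = -\tfrac{1}{2}c_{\mathrm {entry}}\cdot u^2 + u\cdot q_{\mathrm {entry}}\cdot (v^1-v^0), \end{aligned}$$

which is concave in u with unconstrained maximizer \(u^*=\frac{q_{\mathrm {entry}}}{c_{\mathrm {entry}}}(v^1-v^0)\) (assuming \(c_{\mathrm {entry}}>0\)); projecting onto \({\mathbb {U}}=[0,\infty )\) yields the positive part \((v^1-v^0)^+\).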

Table 1 Coefficients in the agricultural production model

Our results for the evolution of the mean field equilibrium are shown in Figs. 2 and 3 for various common noise configurations \(w\in {\mathbb {W}}^n\) and the following two baseline models:

\((\mathrm {nC})\):

Catastrophic weather conditions do not occur; we use

$$\begin{aligned} \kappa _k(\uparrow |w_1,\ldots ,w_{k-1},m) = \kappa _k(\downarrow |w_1,\ldots ,w_{k-1},m) = 0.5 \end{aligned}$$

for all \(w\in {\mathbb {W}}^n\) and \(m\in {\mathbb {M}}\).

\((\mathrm {C})\):

Catastrophic events are likely; we use

$$\begin{aligned} \kappa _k(\uparrow |w_1,\ldots ,w_{k-1},m)&= 0.25,\qquad \kappa _k(\downarrow |w_1,\ldots ,w_{k-1},m) = 0.25, \\ \kappa _k(\, \lightning \, |w_1,\ldots ,w_{k-1},m)&= 0.5 \end{aligned}$$

for all \(w\in {\mathbb {W}}^n\) and \(m\in {\mathbb {M}}\).

Fig. 2 Proportion of farmers in model \((\mathrm {C})\) for all possible common noise configurations \(w\in {\mathbb {W}}^n\)

Fig. 3 Proportion of farmers in models \((\mathrm {nC})\) and \((\mathrm {C})\)

The model specified above satisfies both Assumption 11 and Assumption 14, so Theorems 13 and 16 guarantee the existence of a unique mean field equilibrium characterized by (E1)-(E6). Figure 2 illustrates the tree of all possible equilibrium evolutions in model \((\mathrm {C})\). Figures 3, 4 and 5 illustrate the resulting equilibrium proportions of farmers, optimal actions, and market prices for selected common noise configurations. To illustrate the effect of uncertainty about future weather conditions, we also show, for each common noise configuration, the theoretical perfect-foresight equilibria that would pertain if future weather conditions were known; these are plotted as dashed lines in Figs. 3, 4 and 5, and the subscript \(\circ \) indicates the relevant deterministic common noise path. Equilibrium prices are stochastically modulated by the prevailing weather conditions, both directly and indirectly: First, prices jump at common noise times due to weather-related changes in productivity. Second, weather conditions affect market prices indirectly through their effect on the proportion of farming agents. Thus, with consistently good weather conditions, agents are strongly incentivized to become farmers, see Fig. 4; the fraction of farmers increases, see Fig. 3; and the resulting increase in production drives down prices, see Fig. 5. By contrast, under bad weather conditions, incentives are weaker and prices remain higher. Both effects are dampened if a catastrophic event may occur. In addition, efforts tend to decrease between common noise times owing to the uncertainty about future weather conditions; this effect is more pronounced in the presence of possible catastrophic events.

Fig. 4 Optimal action \(h^0\) of non-farmers in models \((\mathrm {nC})\) and \((\mathrm {C})\)

Fig. 5 Equilibrium market prices in models \((\mathrm {nC})\) and \((\mathrm {C})\)

5.2 An \(\mathrm {SIR}\) Model with Random One-Shot Vaccination

Our second application is a mean field game of agents that are confronted with the spread of an infectious disease. Our main focus is to illustrate the qualitative effects of common noise on the equilibrium behavior of the system. We consider a classical \(\mathrm {SIR}\) model setup with \({\mathbb {S}}=\{\mathrm {S},\mathrm {I},\mathrm {R}\}\): Each agent can be either susceptible to infection (\(\mathrm {S}\)), infected and simultaneously infectious to other agents (\(\mathrm {I}\)), or recovered and thus immune to (re-)infection (\(\mathrm {R}\)); see Fig. 6.

Fig. 6 State space and transitions in the \(\mathrm {SIR}\) model

The infection rate is proportional to the prevalence of the disease, i.e. the percentage of currently infected agents. Susceptible agents can make individual protection efforts of size \(u\in {\mathbb {U}}\triangleq [0,1]\) to reduce their infection intensity. The transition intensities are given by

$$\begin{aligned} Q(t,w,m,u) \triangleq \left[ \begin{array}{ccc} -q_{\mathrm {inf}}(t,w,m,u) &{} q_{\mathrm {inf}}(t,w,m,u) &{} 0\\ 0 &{} -q_{\mathrm {I}\mathrm {R}} &{} q_{\mathrm {I}\mathrm {R}} \\ 0 &{} 0 &{} 0 \end{array} \right] \end{aligned}$$

for \((t,w,m,u)\in [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {U}}\), where \(q_{\mathrm {I}\mathrm {R}}\ge 0\) denotes the recovery rate of infected agents and the infection rate is given by

$$\begin{aligned} q_{\mathrm {inf}}(t,w,m,u) \triangleq q_{\mathrm {S}\mathrm {I}}\cdot m^{\mathrm {I}}\cdot (1-u)\cdot \mathbb {1}_{\{t<\tau ^{\star }\}}(w) \end{aligned}$$

with a given maximum rate \(q_{\mathrm {S}\mathrm {I}}\ge 0\). The running reward penalizes both protection efforts and time spent in the infected state; with \(c_{\mathrm {P}},\psi _{\mathrm {I}}\ge 0\) we set

$$\begin{aligned} \psi ^{\mathrm {S}}(t,w,m,u) \triangleq -c_{\mathrm {P}} \frac{u}{1-u},\qquad \psi ^{\mathrm {I}}(t,w,m,u) \triangleq -\psi _{\mathrm {I}},\qquad \psi ^{\mathrm {R}}(t,w,m,u) \triangleq 0. \end{aligned}$$

In addition, we include the possibility of a one-shot vaccination that becomes available, simultaneously to all agents, at a random time \(\tau ^{\star }\in \{T_1,\ldots ,T_n\}\subset (0,T)\). We set \({\mathbb {W}}\triangleq \{0,1\}\) and identify the \(k^{\text {th}}\) unit vector \(e_k=(\delta _{kj})_{j\in [1:n]}\in {\mathbb {W}}^n\), \(k\in [1:n]\), with the indicator of the event \(\{\tau ^{\star }=T_k\}\). The event that no vaccine becomes available by time T is represented by \(0\in {\mathbb {W}}^n\); in this case we set \(\tau ^{\star }\triangleq +\infty \). If and when the vaccine becomes available, all susceptible agents are vaccinated instantaneously, rendering them immune to infection; thus

$$\begin{aligned} J^{\mathrm {S}}(t,w,m) \triangleq {\left\{ \begin{array}{ll} \mathrm {R}&{} \text {if } t\in \{T_1,\ldots ,T_n\},\, t = T_k = \tau ^{\star }, \\ \mathrm {S}&{} \text {otherwise} \end{array}\right. } \quad \text {and} \quad J^i(t,w,m) \triangleq i \quad \text {for } i\in \{\mathrm {I},\mathrm {R}\}. \end{aligned}$$

The probability of vaccination becoming available is proportional to the percentage of agents that have already recovered from the disease. Thus for \(k\in [1:n]\), \(w_1,\ldots ,w_k\in {\mathbb {W}}\) and \(m\in {\mathbb {M}}\) we set

$$\begin{aligned} \kappa _k(1\, |w_1,\ldots ,w_{k-1},m) \triangleq {\left\{ \begin{array}{ll} \alpha \cdot m^{\mathrm {R}} &{} \text {if } w_1,\ldots ,w_{k-1}=0,\\ 0 &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

and \(\kappa _k(0\, |w_1,\ldots ,w_{k-1},m) \triangleq 1-\kappa _k(1\, |w_1,\ldots ,w_{k-1},m)\), where \(\alpha \in (0,1]\). As a consequence, for all \((i,t,w,m,v)\in {\mathbb {S}}\times [0,T]\times {\mathbb {W}}^n\times {\mathbb {M}}\times {\mathbb {R}}^3\), a maximizer as required in Assumption 3 is given by

$$\begin{aligned} h^{\mathrm {S}}(t,w,m,v) \triangleq {\left\{ \begin{array}{ll} \Bigl [1-\sqrt{\frac{c_{\mathrm {P}}}{q_{\mathrm {S}\mathrm {I}}\cdot m^{\mathrm {I}}\cdot (v^{\mathrm {S}}-v^{\mathrm {I}})}}\Bigr ]^+ &{} \text {if } v^{\mathrm {S}}>v^{\mathrm {I}},\ m^{\mathrm {I}}>0 \text { and }{t<\tau ^\star },\\ \qquad \qquad 0 &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

and \(h^i(t,w,m,v)\triangleq 0\) for \(i\in \{\mathrm {I},\mathrm {R}\}\).
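
The expression for \(h^{\mathrm {S}}\) follows from a first-order condition: for \(t<\tau ^{\star }\), \(m^{\mathrm {I}}>0\) and \(v^{\mathrm {S}}>v^{\mathrm {I}}\), the map to be maximized is

$$\begin{aligned} u\mapsto \psi ^{\mathrm {S}}(t,w,m,u)+Q^{\mathrm {S}\cdot }(t,w,m,u)\cdot v = -c_{\mathrm {P}}\,\frac{u}{1-u} - q_{\mathrm {S}\mathrm {I}}\cdot m^{\mathrm {I}}\cdot (1-u)\cdot \bigl (v^{\mathrm {S}}-v^{\mathrm {I}}\bigr ), \end{aligned}$$

which is concave on \([0,1)\) with derivative \(-c_{\mathrm {P}}/(1-u)^2 + q_{\mathrm {S}\mathrm {I}}\cdot m^{\mathrm {I}}\cdot (v^{\mathrm {S}}-v^{\mathrm {I}})\); setting the derivative to zero and projecting onto \({\mathbb {U}}=[0,1]\) yields the expression above. In all other cases the map is non-increasing in u, so zero effort is optimal.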

Remark 17

(\(\mathrm {SIR}\) Models in the Literature). Note that, given the above specification of the transition matrix Q, the forward dynamics (E1) within the equilibrium system (E1)-(E6) read as follows:

$$\begin{aligned} \left\{ \begin{aligned} {\dot{\mu }}^{\mathrm {S}}(t,w)&= -q_{\mathrm {SI}}\cdot \mu ^{\mathrm {I}}(t,w)\cdot \bigl (1-h^{\mathrm {S}}(t,w,\mu (t,w),v(t,w))\bigr )\cdot \mathbb {1}_{\{t<\tau ^{\star }\}}(w)\cdot \mu ^{\mathrm {S}}(t,w)\\ {\dot{\mu }}^{\mathrm {I}}(t,w)&= q_{\mathrm {SI}}\cdot \mu ^{\mathrm {I}}(t,w)\cdot \bigl (1-h^{\mathrm {S}}(t,w,\mu (t,w),v(t,w))\bigr )\cdot \mathbb {1}_{\{t<\tau ^{\star }\}}(w)\cdot \mu ^{\mathrm {S}}(t,w)- q_{\mathrm {IR}}\cdot \mu ^{\mathrm {I}}(t,w)\\ {\dot{\mu }}^{\mathrm {R}}(t,w)&= \ q_{\mathrm {IR}}\cdot \mu ^{\mathrm {I}}(t,w). \end{aligned} \right. \end{aligned}$$

Disregarding common noise, these constitute a controlled variant of the classical \(\mathrm {SIR}\) dynamics, which are a basic building block of numerous compartmental epidemic models in the literature; see, among others, [32, 37, 38, 47] and the references therein. The \(\mathrm {SIR}\) mean field game with controlled infection rates, albeit without common noise, has recently been studied in the independent article [26]; we also refer to [46] and [23] for mean field models with controlled vaccination rates. Mathematically similar contagion mechanisms also appear in, e.g., [40, 41], §7.2.3 in [9], §7.1.10 in [10], or §4.4 in [52].\(\square \)
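
To make the remark concrete, the following sketch integrates these dynamics by forward Euler for a single common noise configuration without vaccination (so the indicator equals one throughout), with the protection effort passed in as a function. The rates are hypothetical and merely chosen so that \(q_{\mathrm {S}\mathrm {I}}/q_{\mathrm {I}\mathrm {R}}=15\) as in the text; they are not the values of Table 2, and the horizon T = 20 is likewise an assumption.

```python
import numpy as np

def simulate_sir(mu0, effort, q_SI, q_IR, T=20.0, dt=0.01):
    """Forward Euler for the controlled SIR dynamics of Remark 17
    (single common noise configuration, no vaccination jump)."""
    steps = int(round(T / dt))
    mu = np.zeros((steps + 1, 3))              # columns: (S, I, R)
    mu[0] = mu0
    for k in range(steps):
        t = k * dt
        mu_S, mu_I = mu[k, 0], mu[k, 1]
        flow_SI = q_SI * mu_I * (1.0 - effort(t, mu[k])) * mu_S   # new infections
        flow_IR = q_IR * mu_I                                     # recoveries
        mu[k + 1] = mu[k] + dt * np.array([-flow_SI, flow_SI - flow_IR, flow_IR])
    return mu

# Hypothetical illustration: constant protection effort of 30 %
trajectory = simulate_sir(mu0=(0.995, 0.005, 0.0),
                          effort=lambda t, m: 0.3,
                          q_SI=1.5, q_IR=0.1)
```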

While Theorem 13 guarantees the existence of a mean field equilibrium for (a variant of) the \(\mathrm {SIR}\) model, the monotonicity conditions of Theorem 16 do not hold in this setup. Nevertheless, our numerical results reliably yield consistent equilibria. For our illustrations, the initial distribution of agents is given by \(m_0\triangleq (0.995,0.005,0.00)\), and the model coefficients are reported in Table 2. Note that there are \(n=1999\) common noise times \(T_k=k\cdot 0.01\), \(k=1,\ldots ,1999\), at which a vaccine can be administered. The specifications of \(q_{\mathrm {S}\mathrm {I}}\) and \(q_{\mathrm {I}\mathrm {R}}\) imply a basic reproduction number \(R_0\triangleq q_{\mathrm {S}\mathrm {I}}/q_{\mathrm {I}\mathrm {R}}=15\) in the absence of vaccination and protection efforts.

Table 2 Coefficients in the \(\mathrm {SIR}\) model

Our results for the mean field equilibrium distributions of agents \(\mu \) and the corresponding optimal protection efforts of susceptible agents \(h^{\mathrm {S}}\) are displayed in Figs. 7, 8, and 9 for different common noise configurations, i.e. vaccination times \(\tau ^{\star }\). As in Sect. 5.1, we also display the corresponding (theoretical) perfect-foresight equilibria, marked by the subscript \(\circ \).

Fig. 7 Equilibrium distribution and protection effort for \(\tau ^{\star }=+\infty \): Mean field game with common noise (top) and corresponding perfect-foresight equilibrium (bottom)

Fig. 8 Equilibrium distribution and protection effort for \(\tau ^{\star }=2.5\): Mean field game with common noise (top) and corresponding perfect-foresight equilibrium (bottom)

Fig. 9 Equilibrium distribution and protection effort for \(\tau ^{\star }=5\): Mean field game with common noise (top) and corresponding perfect-foresight equilibrium (bottom)

Note that an agent’s running reward is the same in state \(\mathrm {S}\) with zero protection effort as in state \(\mathrm {R}\); relative to these, agents are penalized in state \(\mathrm {I}\) and hence aim to avoid that state. Susceptible agents can reach the immune state \(\mathrm {R}\) in two ways: First, they can become infected and overcome the disease; second, they can be vaccinated and jump instantly from state \(\mathrm {S}\) to state \(\mathrm {R}\). While the first alternative is painful, the second comes at no cost and is hence clearly preferable. However, as the availability of a vaccine cannot be controlled by the agents, all they can do is protect themselves against infection, at a running cost, until the vaccine becomes available.

Figures 7, 8, and 9 demonstrate that the possibility of vaccination as a common noise event can dampen the spread of the disease and lower the peak infection rate. This is due to an increase in agents’ protection efforts during the time period when the proportion of infected agents is high. By contrast, in the perfect-foresight equilibria where the vaccination date is known, agents do not make substantial protection efforts until the vaccination date is imminent, see Figs. 8 and 9; in the scenario without vaccination, see Fig. 7, protection efforts are only ever made by a very small fraction of the population. With perfect foresight, the agents’ main rationale is to avoid being in state \(\mathrm {I}\) when the vaccine becomes available. This highlights the importance of being able to model the vaccination date as a (random) common noise event. Finally, observe that our numerical results indicate convergence to the stationary distribution \({{\bar{\mu }}}=(0,0,1)\in {\mathbb {M}}\), showing that the model is able to capture the entire evolution of an epidemic.