1 Introduction

A stylized yet highly general model for the spread of wealth over a population of agents could encompass the following elements. First, the agents receive an external inflow of wealth, for instance representing salaries paid to them by agents outside the population. Second, there are transactions: agents may purchase commodities or services from other agents, thus redistributing the wealth. Third, to make the model more realistic, one could impose Markov modulation on the system: all parameters involved are affected by an exogenously evolving Markovian process. This Markovian background process could for instance represent the state of the economy, alternating between economic growth and recession.

The main contributions of this paper are the following. We develop a highly general dissemination model that keeps track of the stochastic evolution of the distribution of wealth over a set of agents, incorporating the three elements mentioned above (i.e., external inflow, redistribution and Markov modulation). A main asset of the model is that it is broadly applicable while at the same time allowing closed-form analysis.

For the resulting model, we succeed in deriving a system of coupled differential equations that describes the joint transient probability generating function of the agents’ wealth levels, jointly with the state of the Markovian background process. When focusing on the corresponding means and (co-)variances, this system takes on a more convenient form, in that it becomes a system of linear differential equations, thus allowing for straightforward numerical evaluation. In passing, we also consider the model’s stationary behavior, in particular establishing a stability condition.

The broad applicability of the model is illustrated through a series of examples. As our terminology suggests, with wealth distributed over agents, the model can be used to analyze the evolution (in time) of a wealth vector. Another example concerns the dissemination of opinions over a population, where all agents influence one another. Our modelling framework extends existing opinion dynamics models in the way we incorporate stochasticity, and the Markovian background process is a novel element as well. In a last example we consider a file storage system, where files of clients are periodically copied to one or multiple central storage units. Our model can be used to assess the efficacy of policies intended to strike a proper balance between storage cost on the one hand and the risk of data loss on the other.

The dissemination model analyzed in this paper can be seen as a next step in a long tradition of queueing and population models. In most existing queueing network models, the dynamics are such that the number of clients per queue changes by one at a time; see e.g. the accounts in Kelly (1979) and Serfozo (1999). A relatively small branch of the queueing literature considers queues with batch arrivals and batch services; in this respect we refer to e.g. Coyle et al. (1995), Henderson and Taylor (1990) and Mitrofanov et al. (2015), where product-form results are obtained. The setup that is probably closest to the one we consider in this paper is that of Fiems et al. (2018), where the transaction events correspond to the network population vector undergoing a (deterministic) linear transformation. In population-process theory and epidemics there is a strong emphasis on deterministic models to describe the dynamics of the sizes of various subpopulations (e.g. age groups, infected individuals, etc.); for more background, see for instance the monograph Renshaw (1991). These models’ stochastic counterparts have been studied as well; a general framework has been presented, and analyzed, in Kurtz (1981). Importantly, to the best of our knowledge, none of the earlier works covers our highly general redistribution mechanism.

This paper is organized as follows. The model and notation are introduced in Section 2. Then subsequently the transient joint probability generating function (Section 3), the first moments (Section 4), and second moments — and hence also variances and covariances — (Section 5) are analyzed; the section on first moments in addition establishes the model’s stability criterion. Then there are three sections with illustrative examples, focusing on wealth redistribution (Section 6), on opinion dynamics (Section 7) and on storage sharing systems (Section 8). Section 9 concludes.

2 Model and Notation

In our model we study the stochastic behavior of \({\varvec{M}}(t)\equiv (M_1(t),\ldots ,M_I(t))\), where \(M_i(t)\) denotes the “wealth” of agent i at time t, with \(i = 1, \ldots , I\), for some \(I\in {\mathbb N}\); in this paper we follow the convention that bold symbols are used to represent vectors. We recall that, as pointed out in the introduction, “wealth” is to be interpreted in the broad sense; as we will extensively argue, the setup considered can also be used e.g. in the context of opinion spreading dynamics, or in the context of file storage systems.

To make our model as rich as possible, we let its dynamics be affected by an autonomously evolving Markovian background (or regime-switching) process. In the (economic) context of wealth being spread over a population of individuals, the background process could reflect the state of the economy (e.g. alternating between economic peaks and periods of recession). Let this regime-switching process be modelled by the continuous-time Markov process \((X(t))_{t\geqslant 0}\) on the state space \(\{1, \ldots , d\}\), for some \(d\in {\mathbb N}\). This process, which is assumed to be irreducible, is governed by the transition rate matrix \(Q = \{q_{ij}\}^{d}_{i,j=1}\) (with all non-diagonal elements being non-negative and row sums equal to 0), so that (see Asmussen 2004; Norris 1997)

$$\begin{aligned} {\mathbb P}(X(t) =\ell \,|\,X(0)=k) = \big (e^{Qt}\big )_{k,\ell }. \end{aligned}$$
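For concreteness, the transition probabilities above can be evaluated numerically with a matrix-exponential routine; a minimal Python sketch, assuming a hypothetical two-state generator Q:

```python
# Minimal sketch (hypothetical 2-state generator): computing
# P(X(t) = l | X(0) = k) = (e^{Qt})_{k,l} via the matrix exponential.
import numpy as np
from scipy.linalg import expm

# Off-diagonal entries non-negative, row sums equal to 0.
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])

t = 0.5
P_t = expm(Q * t)   # entry (k, l): probability of state l at time t, given state k at time 0
```

Each row of `P_t` is a probability distribution over the d states, consistent with Q having zero row sums.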

We proceed by describing the dynamics of the wealth process, given the background process is in state \(k\in \{1,\ldots ,d\}.\) We distinguish two types of events.

  • In the first place there are “external arrivals” of wealth. Concretely, for \(j\in \{1,\ldots ,J\}\) with \(J\in {\mathbb N}\), external arrivals of the j-th type occur at Poisson epochs with rate \(\lambda _{jk}>0\), each leading to an increase of the wealth of all agents \(i\in {S_j}\subseteq \{1,\ldots ,I\}\) by one unit.

  • In the second place there are “shocks”, arriving at the system according to a Poisson process with rate \(\gamma _k>0\). At such a shock, “transactions” take place, which concretely means that each of the \(M_i(t)\) wealth units of agent i contributes \(W_{ijk} \in \mathbbm {N}_0\) wealth units to agent j. The precise mechanism is described more formally as follows. Suppose a shock happens at time \(t>0\); then the number of wealth units after the shock at agent j is, conditional on \({\varvec{M}}(t-) = (m_1, \ldots , m_I)^\top \) being the wealth vector just prior to time t, given by

    $$\begin{aligned} \sum _{i=1}^I \sum _{n=1}^{m_i} W_{ijkn}, \end{aligned}$$

    with \((W_{ijkn})_{n\in {\mathbb N}}\) denoting a sequence of independent and identically distributed random variables, all of them distributed as the discrete, non-negative random variable \(W_{ijk}\). The random variables \(W_{ijk}\) (with \(i=1,\ldots ,I\) and \(k=1,\ldots ,d\)) are assumed independent; importantly, throughout we do allow dependence in j. We define, for \({\varvec{z}}\) such that \(\max \{|z_1|,\ldots ,|z_I|\}\leqslant {1}\), the associated probability generating function (pgf) by

    $$\begin{aligned} g_{ik}({\varvec{z}}) = {\mathbb E}\Big [\prod _{j=1}^I z_j^{W_{ijk}} \Big ]. \end{aligned}$$

Our primary aim is to establish a unique characterization of the transient wealth vector \({\varvec{M}}(t)\), jointly with the state of the background process X(t). To this end we work with the corresponding multivariate time-dependent joint pgf. Concretely, our analysis aims at identifying the following key object of study, for \({\varvec{z}}\) such that \(\max \{|z_1|,\ldots ,|z_I|\}\leqslant {1}\):

$$\begin{aligned} f_k(\varvec{z},t) := {\mathbb E}\Big [\prod _{j=1}^{I}z_j^{M_j(t)} \mathbbm {1}_{\{X(t)=k\}}\Big ], \end{aligned}$$
(1)

which uniquely defines the distribution of \(({\varvec{M}}(t),X(t))\in {\mathbb N}^I\times \{1,\ldots ,d\}.\) In the remainder of the paper we use, as systematically as possible, \(i,j\in \{1,\ldots ,I\}\) to denote agent indices, \(k,\ell \in \{1,\ldots ,d\}\) states of the background process, and \(n{\in {\mathbb N}}\) the index of the sampled random variable.

3 Derivation of the Differential Equation for the Joint PGF

The main objective of this section is to establish a system of coupled differential equations (in t) for the time-dependent joint pgfs \(f_k(\varvec{z},t)\), as defined in (1). We do so relying on a standard argument: we relate their values at time \(t+\Delta t\) to their values at time t, with the aim of setting up a system of differential equations. To this end, the underlying idea is to distinguish the three types of events that can occur in an interval of length \(\Delta t\): a transition of the background process, external arrivals, and shocks (and, evidently, there is in addition the event that none of these three types of events occurs). Following this line of reasoning, we obtain for the time-dependent joint pgf at time \(t+\Delta t\) that, as \(\Delta t\downarrow 0\),

$$\begin{aligned} \begin{aligned} f_k(\varvec{z},t+\Delta t)&= \sum _{\ell =1,\ell \not = k}^dq_{\ell k}\,\Delta t \,{\mathbb E}\Big [\prod _{j=1}^{I}z_j^{M_j(t)} \mathbbm {1}_{\{X(t)=\ell \}}\Big ]\,\\ {}&+ \sum _{j=1}^J \lambda _{jk}\,\Delta t \left( \prod _{i\in S_j} z_i\right) {\mathbb E}\Big [\prod _{i=1}^{I}z_i^{M_i(t)} \mathbbm {1}_{\{X(t)=k\}}\Big ]\,\\ {}&+ \gamma _k\, \Delta t\sum _{{\varvec{m}}\in {\mathbb N}^I} {\mathbb E}\Big [\prod _{j=1}^{I}z_j^{M_j(t+\Delta t)} \mathbbm {1}_{\{X(t+\Delta t)=k\}}\,\Big |\,{\varvec{M}}(t)= {\varvec{m}},{\mathscr {E}}_k(t)\Big ]\\ {}&\qquad {\mathbb P}({\varvec{M}}(t)= {\varvec{m}})\,\\ {}&+ \Big (1-\sum _{\ell =1,\ell \not = k}^dq_{\ell k}\Delta t-\sum _{j=1}^J\lambda _{jk}\Delta t-\gamma _k\Delta t\Big ) f_k(\varvec{z},t)+o(\Delta t), \end{aligned} \end{aligned}$$
(2)

with \({\mathscr {E}}_{k}(t)\) denoting the event of a shock between times t and \(t+\Delta t\) (evidently, while the background state is k). Above we used the Landau notation \(o(\Delta t)\): stating that a function \(F(\cdot )\) is \(o(\Delta t)\) as \(\Delta t\downarrow 0\) means that \(F(\Delta t)/\Delta t \rightarrow 0\) as \(\Delta t\downarrow 0\).

The right-hand side of Equality (2) can be interpreted and rewritten as follows.

  • The first term, which considers the scenario that the background process was in a state \(\ell \not = k\) at time t and makes a transition to k between t and \(t+\Delta t\), equals by definition

    $$\begin{aligned} \sum _{\ell =1,\ell \not = k}^dq_{\ell k}\,\Delta t \,f_\ell (\varvec{z},t). \end{aligned}$$
  • The second term represents the contributions of the external arrivals: if it is of the j-th type, then it increases (by 1) the wealth values of all agents i such that \(i\in S_j\). It reads

    $$\begin{aligned} \sum _{j=1}^J\lambda _{jk}\,\Delta t \left( \prod _{i\in S_j} z_i\right) f_k(\varvec{z},t). \end{aligned}$$
  • The third term describes the effect of the shocks. The claim is that we can express it in terms of the pgf \(f_k(\varvec{z},t)\), evaluated not in the argument \({\varvec{z}}\) but in the different argument \({\varvec{h}}_k({\varvec{z}})\). Indeed, defining

    $$\begin{aligned} {\varvec{h}}_k({\varvec{z}}) := \big (g_{1k}({\varvec{z}}),\ldots ,g_{Ik}({\varvec{z}})\big ), \end{aligned}$$

    again up to \(\Delta t\)-terms, due to the shock that occurs between times t and \(t+\Delta t\),

    $$\begin{aligned} \begin{aligned} \sum _{{\varvec{m}}\in {\mathbb N}^I}{\mathbb E}&\Big [\prod _{j=1}^{I}z_j^{M_j(t+\Delta t)} \mathbbm {1}_{\{X(t+\Delta t)=k\}}\,\Big |\,{\varvec{M}}(t)= {\varvec{m}},{\mathscr {E}}_{k}(t)\Big ]\,{\mathbb P}({\varvec{M}}(t)= {\varvec{m}}) \\&=\sum _{{\varvec{m}}\in {\mathbb N}^I} \prod _{j =1}^I {\Big (g_{jk}({\varvec{z}})\Big )^{m_j}} \,{\mathbb P}({\varvec{M}}(t)= {\varvec{m}},X(t)=k)\\&={\mathbb E}\Big [\prod _{j=1}^{I}g_{jk}({\varvec{z}})^{M_j(t)} \mathbbm {1}_{\{X(t)=k\}}\Big ] = f_k({\varvec{h}}_k({\varvec{z}}),t). \end{aligned} \end{aligned}$$
  • The fourth term corresponds to the scenario of no transition of the background process, no external arrivals and no shocks, leaving the wealth process unchanged.

Observe that we have succeeded in expressing \(f_k(\varvec{z},t+\Delta t)\) in terms of quantities of the same type, as well as quantities of the type \(f_\ell ({\varvec{\varphi }}({\varvec{z}}),t)\) for known functions \({\varvec{\varphi }}({\varvec{z}}):[-1,1]^I\rightarrow [-1,1]^I.\) The next step is to subtract \(f_k(\varvec{z},t)\) from both sides of the equation, divide by \(\Delta t\), and let \(\Delta t\downarrow 0\), so as to obtain a system of differential equations in t. Indeed, we obtain, for \(t\geqslant 0\), the following result, where we have used that the row sums of Q are equal to 0.

Proposition 3.1

For any \(t\geqslant 0\), \(f_k(\varvec{z},t)\) satisfies the system of differential equations

$$\begin{aligned} \begin{aligned} \frac{\partial }{\partial t} f_k(\varvec{z},t) = \sum _{\ell =1}^d&q_{\ell k}f_\ell (\varvec{z},t) \,+\\ \sum _{j=1}^J&\lambda _{jk}\left( \prod _{i\in S_j}z_i - 1\right) f_k(\varvec{z},t) + \gamma _k \big (f_k({\varvec{h}}_{k}(\varvec{z}),t) - f_k(\varvec{z},t) \big ). \end{aligned} \end{aligned}$$
(3)

If \({\varvec{M}}(0) = {\varvec{m}}_0\) and \(X(0)=k_0\), then the initial condition is

$$\begin{aligned} f_k({\varvec{z}},0)= \mathbbm {1}_{\{k=k_0\}} \prod _{i=1}^I z_i^{m_{0,i}}. \end{aligned}$$

Using the obvious property that \({\varvec{h}}_k({\varvec{1}}) = {\varvec{1}}\), it is readily seen from this proposition that \(f_k(\varvec{1},t)= {\mathbb P}\big (X(t) = k \big )\), as it should.

The system of differential Eq. (3) can be solved numerically. Importantly, viewed as a system of ordinary differential equations in t for fixed \({\varvec{z}}\), it is not of the standard linear form: in one term on the right-hand side of (3) the pgf has the argument \({\varvec{h}}_{k}(\varvec{z})\) rather than \({\varvec{z}}\). As we will notice in the next sections, however, when considering the computation of time-dependent moments (rather than the full time-dependent pgf) we do obtain a reduction to systems of differential equations that are linear. These can be solved at relatively low numerical effort using standard computational software (e.g. deSolve in R, dsolve in Matlab). Alternatively, since these moment systems are linear, one can use an implementation of the matrix exponential (expm in both R and Matlab). In the next two sections we subsequently concentrate on the evaluation of the first and second moments.

Remark 3.2

If one is interested in moments only, then one could in principle also set up systems of differential equations just for those moments, rather than setting up a system of differential equations for the pgf and then performing a differentiation (as will be done in the next sections); cf. for instance the procedure followed in Léveillé and Garrido (2001). We however prefer to present the system of coupled ordinary differential equations of Proposition 3.1, as these characterize the full distribution of \({\varvec{M}}(t)\) (jointly with the state of the background process X(t)). These coupled differential equations do not allow an analytical solution, but can be solved numerically relying on standard computational software packages (e.g. the deSolve package in R, or dsolve in Matlab). It is also noted that repeated differentiation with respect to entries of \({\varvec{z}}\) and inserting \({\varvec{z}}={\varvec{0}}\) (i.e., not \({\varvec{z}}={\varvec{1}}\)) yields the joint probability distribution; for instance, for a single-dimensional random variable X attaining non-negative integer values, we have

$$\begin{aligned} {\mathbb P}(X=k) = \frac{1}{k!}\left. \frac{\mathrm d^k}{{\mathrm d}z^k} {\mathbb E} z^X\right| _{z=0}, \end{aligned}$$

for \(k\in {\mathbb N}\).
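The inversion formula above can be illustrated symbolically; a sketch, assuming (purely for illustration) a Poisson random variable with pgf \({\mathbb E}z^X = e^{\mu (z-1)}\):

```python
# Sketch: recovering P(X = k) from a pgf by differentiating k times at z = 0,
# illustrated for an assumed Poisson(mu) pgf E[z^X] = exp(mu (z - 1)).
import sympy as sp

z = sp.symbols('z')
mu = sp.Rational(3, 2)        # hypothetical parameter
pgf = sp.exp(mu * (z - 1))

k = 2
p_k = sp.diff(pgf, z, k).subs(z, 0) / sp.factorial(k)
```

Here `p_k` indeed coincides with the Poisson probability \(\mu ^k e^{-\mu }/k!\).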

4 Derivation of First Moments, Stability

In principle all moments of the components of \({\varvec{M}}(t)\), as well as all mixed moments, can be derived from the differential equations (3) by repeated differentiation and plugging in \(\varvec{z}=\varvec{1}\). This typically leads to a recursion from which the joint moments of order n are expressed in terms of their counterparts of order \(0,1,\ldots ,n-1\); cf. Boxma et al. (2019), Léveillé and Garrido (2001), Starreveld et al. (2018). In this section we discuss this widely applied procedure to determine the time-dependent first moments. In the next section we then move to second moments, but in principle this procedure extends to moments of any order. As we point out in the present section, knowledge of the first moments also provides us with a criterion under which the model has a stable stationary version (i.e., does not explode as \(t\rightarrow \infty \)).

4.1 Differential Equations for First Moments

In this subsection we derive a system of linear differential equations that characterizes the expectation of \({\varvec{M}}(t)\). With \(w_{ijk}:={\mathbb E}W_{ijk}\) and \(m_{ik}(t) := {\mathbb E}[M_i(t) \mathbbm {1}_{\{X(t)=k\}}]\), differentiating (3) with respect to \(z_i\) and inserting \(\varvec{z}=\varvec{1}\) yields the following system of coupled ordinary differential equations:

$$\begin{aligned} m'_{ik}(t)= \sum _{\ell =1}^d q_{\ell k} m_{i\ell }(t) + \sum _{j:i\in S_j}\lambda _{jk} \,\pi _k(t)+\gamma _k \left( \sum _{j=1}^I w_{jik}\, m_{jk}(t) - m_{ik}(t)\right) , \end{aligned}$$

with \(\pi _k(t)={\mathbb P}(X(t)=k).\) Here we have used the standard differentiation rule for compositions of functions with vector-valued arguments, i.e.,

$$\begin{aligned} \frac{\partial f_k({\varvec{h}}_{k}(\varvec{z}),t)}{\partial z_i}= \sum _{j=1}^I \frac{\partial f_k({\varvec{x}},t) }{\partial x_j}\Big |_{{\varvec{x}}={\varvec{h}}_k({\varvec{z}})} \frac{\partial (h_k({\varvec{z}}))_j}{\partial z_i}, \end{aligned}$$

and

$$\begin{aligned} \frac{\partial (h_k({\varvec{z}}))_j}{\partial z_i}\Big |_{{\varvec{z}}={\varvec{1}}}=w_{jik}. \end{aligned}$$
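The two differentiation identities above can be verified symbolically for a concrete choice of the transaction weights; a sketch, assuming (hypothetically) independent Poisson-distributed weights, so that \(g_{jk}({\varvec{z}})=\exp \big (\sum _i \mu _{ji}(z_i-1)\big )\) with \(w_{jik}=\mu _{ji}\):

```python
# Sketch: for an assumed pgf with independent Poisson weights
# W_{jik} ~ Poisson(mu_{ji}), so g_{jk}(z) = exp(sum_i mu_{ji} (z_i - 1)),
# the derivative of (h_k(z))_j with respect to z_i at z = 1 equals w_{jik}.
import sympy as sp

z1, z2 = sp.symbols('z1 z2')
mu_11, mu_12 = sp.Rational(1, 2), sp.Rational(3, 10)   # hypothetical means w_{11k}, w_{12k}
g_1k = sp.exp(mu_11 * (z1 - 1) + mu_12 * (z2 - 1))     # pgf of the weights of agent j = 1

w_11k = sp.diff(g_1k, z1).subs({z1: 1, z2: 1})
w_12k = sp.diff(g_1k, z2).subs({z1: 1, z2: 1})
```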

The next step is to compactly write the above system of differential equations in matrix-vector form. We let \({\varvec{m}}(t)\in {\mathbb R}^{dI}\) denote the stacked vector that results from the I vectors \({\varvec{m}}_i(t)\equiv (m_{i1}(t),\ldots , m_{id}(t))^{\top }\), and \({\varvec{\pi }}(t)\in {\mathbb R}^{dI}\) the stacked vector that results from the I (identical) vectors \({\varvec{\pi }}_i(t)\equiv (\pi _{1}(t),\ldots , \pi _{d}(t))^{\top }\), i.e.,

$$\begin{aligned} {\varvec{m}}(t) :=\left( \begin{array}{c}{\varvec{m}}_1(t)\\ \vdots \\ {\varvec{m}}_I(t)\end{array} \right) ,\,\,\,\,{\varvec{\pi }}(t) :=\left( \begin{array}{c}{\varvec{\pi }}_1(t)\\ \vdots \\ {\varvec{\pi }}_I(t)\end{array} \right) . \end{aligned}$$

In addition, \(G_{ji}:=\textrm{diag}\{\gamma _1 w_{ji1},\ldots ,\gamma _d w_{jid}\}-\textrm{diag}\{\gamma _1,\ldots ,\gamma _d\}\mathbbm {1}_{\{i=j\}}\). We then define

$$\begin{aligned} A:=\left( \begin{array}{ccccc} Q^\top +G_{11}&{}G_{21}&{}G_{31}&{}\cdots &{}G_{I1}\\ G_{12}&{}Q^\top +G_{22}&{}G_{{32}}&{}\cdots &{}G_{I2}\\ G_{13}&{}G_{23}&{}Q^\top +G_{33}&{}\cdots &{}G_{I3}\\ \vdots &{}\vdots &{}\vdots &{}\ddots &{}\vdots \\ G_{1I}&{}G_{2I}&{}G_{3I}&{}\cdots &{}Q^\top +G_{II} \end{array}\right) , \end{aligned}$$

and, with \(\Lambda _i:=\textrm{diag}\{\bar{\lambda }_{i1},\ldots ,\bar{\lambda }_{id}\}\) and \(\bar{\lambda }_{ik}:=\sum _{j:i\in S_j}\lambda _{jk}\),

$$\begin{aligned} \Lambda :=\left( \begin{array}{cccc} \Lambda _1&{}0&{}\cdots &{}0\\ 0&{}\Lambda _2&{}\cdots &{}0\\ \vdots &{}\vdots &{}\ddots &{}\vdots \\ 0&{}0&{}\cdots &{}\Lambda _I\end{array}\right) . \end{aligned}$$

We thus end up with a system of dI coupled linear differential equations, as given in the following proposition.

Proposition 4.1

For any \(t\geqslant 0\), \({\varvec{m}}(t)\) satisfies the system of linear differential equations

$$\begin{aligned} {\varvec{m}}'(t) = A\,{\varvec{m}}(t)+\Lambda \,{\varvec{\pi }}(t). \end{aligned}$$

If \({\varvec{M}}(0) = {\varvec{m}}_0\) and \(X(0)=k_0\), then the initial condition is

$$\begin{aligned} m_{ik}(0)= \mathbbm {1}_{\{k=k_0\}} \,{m_{0,i}}. \end{aligned}$$

This non-homogeneous system of linear differential equations can be solved in the standard manner. In the first place, the vector \({\varvec{\pi }}(t)\), corresponding to the transient state probabilities of the background process X(t), satisfies the differential equation

$$\begin{aligned} {\varvec{\pi }}'(t) = ({\mathbb I}_I \otimes Q^{\top }) {\varvec{\pi }}(t), \end{aligned}$$

with \(\otimes \) being the usual notation for the Kronecker product and \({\mathbb I}_I\) an identity matrix of dimension I. This means, with \(\bar{Q}\) denoting the \((dI\times dI\))-dimensional matrix \({\mathbb I}_I \otimes Q^{\top }\), that \({\varvec{\pi }}(t) = e^{\bar{Q} t}{\varvec{\pi }}(0)\), or equivalently,

$$\begin{aligned} \pi _j(t) = \sum _{i=1}^d {\mathbb P}(X(0)=i) \big ( e^{Qt}\big )_{i,j}. \end{aligned}$$

In the second place, the solution for \({\varvec{m}}(t)\) can be written in terms of matrix exponentials, as follows:

$$\begin{aligned} {\varvec{m}}(t) = e^{At}\,{\varvec{m}}(0) + \int _0^t e^{A(t-s)} \,\Lambda \,{\varvec{\pi }}(s)\,\textrm{d}s. \end{aligned}$$

We thus obtain the following result.

Proposition 4.2

For any \(t\geqslant 0\),

$$\begin{aligned} {\varvec{m}}(t) = e^{At}\,{\varvec{m}}(0) + \int _0^t e^{A(t-s)} \,\Lambda \,e^{\bar{Q} s}{\varvec{\pi }}(0)\,\textrm{d}s. \end{aligned}$$
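Numerically, the integral in Proposition 4.2 need not be evaluated by quadrature: since \({\varvec{\pi }}'(t)=\bar{Q}{\varvec{\pi }}(t)\), the pair \(({\varvec{m}}(t),{\varvec{\pi }}(t))\) satisfies a single homogeneous linear system, so one matrix exponential suffices. A sketch with hypothetical parameters (d = 2, I = 1):

```python
# Sketch (hypothetical parameters, d = 2, I = 1): evaluating m(t) of
# Proposition 4.2 with a single matrix exponential, by stacking m and pi
# into the homogeneous linear system d/dt (m, pi) = [[A, Lambda], [0, Qbar]] (m, pi).
import numpy as np
from scipy.linalg import expm

Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])
Qbar = Q.T                           # I = 1, so Qbar = I_1 kron Q^T = Q^T
A = Q.T + np.diag([-0.5, -0.8])      # hypothetical A with negative spectral abscissa
Lam = np.diag([1.0, 2.0])            # hypothetical Lambda

m0 = np.zeros(2)                     # zero initial wealth
pi0 = np.array([1.0, 0.0])           # X(0) = state 1

t = 3.0
B = np.block([[A, Lam],
              [np.zeros((2, 2)), Qbar]])
m_t = (expm(B * t) @ np.concatenate([m0, pi0]))[:2]
```

The first block of the stacked solution reproduces the expression of Proposition 4.2 by variation of constants.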

4.2 Stability Condition

Let \({\varvec{\pi }}:={\varvec{\pi }}(\infty )\) be the unique solution of \({\varvec{\pi }}Q={\varvec{0}}\) such that its entries sum to 1, i.e., the stationary distribution of X(t). In case the underlying model is stable, the above results directly imply that the steady-state mean vector \({\varvec{m}}\) can be written in terms of the steady-state probabilities \({\varvec{\pi }}\), as follows:

$$\begin{aligned} {\varvec{m}} = - A^{-1} \Lambda {\varvec{\pi }}. \end{aligned}$$

The formal stability condition is given in the next statement. We define \(\omega \) as the largest real part among the eigenvalues of A, i.e., the spectral abscissa of A.

Proposition 4.3

The Markov chain \(({\varvec{M}}(t),X(t))_{t\geqslant 0}\) is ergodic if \(\omega <0.\)

Proof. The proof mimics step by step the one of Fiems et al. (2018, Prop. 3); we therefore restrict ourselves to sketching its main steps. The underlying idea is to establish ergodicity of the skeleton Markov chain \(({\varvec{M}}(n\Delta ),X(n\Delta ))_{n\in {\mathbb N}}\), for some \(\Delta > 0\), under \(\omega <0\); here it should be noted that if the skeleton Markov chain is ergodic for some \(\Delta > 0\), then so is \(({\varvec{M}}(t),X(t))_{t\geqslant 0}\) (observing that the mean recurrence time for any state of the skeleton chain is an upper bound for the mean recurrence time of the original process). Then, by Asmussen (2004, Prop. I.5.3), a sufficient condition for ergodicity can be phrased in terms of

$$\begin{aligned} {\mathbb E}\big [\Vert {{\varvec{M}}(\Delta )}\Vert _{1}\,|\,{\varvec{M}}(0)={\varvec{m}}_0, X(0)=k_0\big ] -\Vert {{\varvec{m}}_0}\Vert _1< -\varepsilon , \end{aligned}$$

for some \(\varepsilon >0\), and all \({\varvec{m}}_0\in {\mathbb N}^I\) and \(k_0\in \{1,\ldots ,d\}\). Informally, this criterion entails that the process’ drift is negative, bounded away from zero; cf. Foster’s criterion, Brémaud (1999), Foster (1953). Such a bound can be achieved under \(\omega <0\), with precisely the same argumentation as the one used in the proof of Fiems et al. (2018, Prop. 3), where we rely on Bernstein (2009, Prop. 11.18) to find the required bound on the norm of the matrix exponential.

In case the stability condition \(\omega < 0\) is not fulfilled, the components of \({\varvec{m}}(t)\) typically grow in a fixed proportion; we return to this issue in Section 7.
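In practice the condition of Proposition 4.3 is checked by computing the spectral abscissa numerically; a sketch with hypothetical parameters (d = 2, I = 1), which also evaluates the stationary mean \({\varvec{m}} = - A^{-1} \Lambda {\varvec{\pi }}\) of Section 4.2:

```python
# Sketch (hypothetical parameters, d = 2, I = 1): checking the stability
# condition of Proposition 4.3 via the spectral abscissa of A and, if it
# holds, computing the stationary mean m = -A^{-1} Lambda pi.
import numpy as np

Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])
A = Q.T + np.diag([-0.5, -0.8])      # hypothetical A
Lam = np.diag([1.0, 2.0])            # hypothetical Lambda

omega = max(np.linalg.eigvals(A).real)   # spectral abscissa of A

# Stationary distribution of X: solve pi Q = 0 with entries summing to 1
# (one redundant balance equation is replaced by the normalization).
M = np.vstack([Q.T[:-1], np.ones(2)])
pi = np.linalg.solve(M, np.array([0.0, 1.0]))

m = -np.linalg.solve(A, Lam @ pi) if omega < 0 else None
```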

4.3 Special Cases

In this subsection we provide more explicit results for three special cases: (1) a single fully homogeneous population, (2) a single distinct agent (a “leader”) with \(I-1\) homogeneous other agents (“followers”), (3) two, internally homogeneous, interacting subpopulations.

4.3.1 Homogeneous Population

We denote by \(m_k(t)\) the mean wealth of an arbitrary agent, jointly with the event that the background process is in state k. As a consequence of the fact that we “start symmetrically”, i.e., the initial wealth of all agents is the same, at any point in time the mean wealths of the individual agents coincide. In case we do not “start symmetrically” (for instance with two possible values of the initial wealth), the computations can still be performed, but become less clean.

In the variant that we consider, we let \(J=I\), and we take \(S_j=\{j\}\) and \(\lambda _{jk}=\lambda _k\), for \(j=1,\ldots ,I\). (Here we note that one can construct other fully symmetric external arrival processes, for instance by letting \(J=1\) and \(S_1=\{1,\ldots ,I\}\). Such alternative symmetric variants can be dealt with analogously.) The generic random variable \(W_{ijk}\) now depends on the background state k only (i.e., not on the indices i and j that indicate the agents); we let \(w_k\) be its expected value.

It is directly verified that our earlier results now yield

$$\begin{aligned} m_k'(t) = \sum _{\ell =1}^d q_{\ell k}m_\ell (t) + \lambda _k \pi _k(t) +\gamma _k(Iw_k-1) \,m_k(t), \end{aligned}$$

or, in matrix-vector notation,

$$\begin{aligned} {\varvec{m}}'(t) = A\,{\varvec{m}}(t) + \Lambda \,{\varvec{\pi }(t)}, \end{aligned}$$

where \(A:=Q^\top +\textrm{diag}\{\gamma _1(Iw_1-1),\ldots ,\gamma _d(Iw_d-1)\}\) and \(\Lambda :=\textrm{diag}\{\lambda _1,\ldots ,\lambda _d\}.\) The kth entry of \({\varvec{m}}\) now expresses the agents’ mean wealth when the system is in state k. In steady state, we obtain that the mean wealth vector equals \(-A^{-1}\Lambda {\varvec{\pi }}\), with \({\varvec{\pi }}\) as defined before, provided that the stability condition is fulfilled (i.e., that the spectral abscissa of A is negative).

4.3.2 Leader and Homogeneous Followers

In this model, there is a single leader and \(I-1\) homogeneous followers. We “start symmetrically”, i.e., all followers have the same initial wealth. We let \(m_{\textrm{L},k}(t)\) be the mean wealth of the leader at time t, and \(m_{\textrm{F},k}(t)\) the mean wealth of an arbitrary follower at time t, both jointly with the event that the background process is in state k.

As we did in the case of a homogeneous population, we take \(J=I\) with \(S_j=\{j\}\). We let the external arrival rate of the leader be \(\lambda _{\textrm{L},k}\), and of the followers \(\lambda _{\textrm{F},k}\), both corresponding to the background process being in state k. The wealth redistribution, as taking place at the “shocks”, corresponds to the means (in self-evident notation) \(w_{\textrm{LL},k}\), \(w_{\textrm{LF},k}\), \(w_{\textrm{FL},k}\), and \(w_{\textrm{FF},k}\), again for the background process in state k. We thus obtain

$$\begin{aligned} \begin{aligned} m'_{\textrm{L},k}(t)&=\sum _{\ell =1}^dq_{\ell k}m_{\textrm{L},\ell }(t)+\lambda _{\textrm{L},k}\pi _k(t)+ \gamma _k\big ((w_{\textrm{LL},k}-1)m_{\textrm{L},k}(t)+ (I-1)w_{\textrm{FL},k} m_{\textrm{F},k}(t)\big ), \\ m'_{\textrm{F},k}(t)&=\sum _{\ell =1}^dq_{\ell k}m_{\textrm{F},\ell }(t)+\lambda _{\textrm{F},k}\pi _k(t)+ \gamma _k\big ((w_{\textrm{FF},k}(I-1)-1)m_{\textrm{F},k}(t)+ w_{\textrm{LF},k} m_{\textrm{L},k}(t)\big ). \end{aligned} \end{aligned}$$

The model further simplifies if we consider the setting without modulation. In self-evident notation, we obtain \({\varvec{m}}'(t) = \gamma (\bar{A}-{\mathbb I}_2)\,{\varvec{m}}(t)+{\varvec{\lambda }}\), where \({\mathbb I}_2\) denotes a 2-dimensional identity matrix,

$$\begin{aligned} {\varvec{m}}(t)=\left( \begin{array}{c} m_\textrm{L}(t) \\ m_\textrm{F}(t) \end{array}\right) ,\,\,\,\bar{A}:= \left( \begin{array}{cc} w_\textrm{LL}&{}(I-1)w_\textrm{FL}\\ w_\textrm{LF}&{}(I-1)w_\textrm{FF} \end{array} \right) ,\,\,\,{\varvec{\lambda }}=\left( \begin{array}{c} \lambda _\textrm{L} \\ \lambda _\textrm{F} \end{array}\right) . \end{aligned}$$

4.3.3 Two Internally Homogeneous Interacting Subpopulations

We now consider a generalization of the situation with a leader and homogeneous followers, viz. the situation of \(I_\textrm{A}\) agents of subpopulation A and \(I_\textrm{B}:=I-I_\textrm{A}\) agents of subpopulation B. We let \(m_{\textrm{A},k}(t)\) (respectively \(m_{\textrm{B},k}(t)\)) denote the mean wealth of an arbitrary agent from subpopulation A (respectively subpopulation B) at time t; the agents within each of the two subpopulations “start symmetrically”. The arrival rates \(\lambda _{\textrm{A},k}\) and \(\lambda _{\textrm{B},k}\) are defined in the evident manner, and so are the means \(w_{\textrm{AA},k}\), \(w_{\textrm{AB},k}\), \(w_{\textrm{BA},k}\), and \(w_{\textrm{BB},k}\). Using the same reasoning as above, we obtain

$$\begin{aligned} \begin{aligned} m'_{\textrm{A},k}(t)&=\sum _{\ell =1}^dq_{\ell k}m_{\textrm{A},\ell }(t)+\lambda _{\textrm{A},k}\pi _k(t)+ \gamma _k\big ((I_\textrm{A}w_{\textrm{AA},k}-1)m_{\textrm{A},k}(t)+ I_\textrm{B}w_{\textrm{BA},k} m_{\textrm{B},k}(t)\big ), \\ m'_{\textrm{B},k}(t)&=\sum _{\ell =1}^dq_{\ell k}m_{\textrm{B},\ell }(t)+\lambda _{\textrm{B},k}\pi _k(t)+ \gamma _k\big ((I_\textrm{B}w_{\textrm{BB},k}-1)m_{\textrm{B},k}(t)+ I_\textrm{A}w_{\textrm{AB},k} m_{\textrm{A},k}(t)\big ). \end{aligned} \end{aligned}$$

We get a further simplification in case that there is no modulation. In self-evident notation, we again obtain \({\varvec{m}}'(t) = \gamma (\bar{A}-{\mathbb I}_2)\,{\varvec{m}}(t)+{\varvec{\lambda }}\), but now with

$$\begin{aligned} {\varvec{m}}(t)=\left( \begin{array}{c} m_\textrm{A}(t) \\ m_\textrm{B}(t) \end{array}\right) ,\,\,\,\bar{A}:= \left( \begin{array}{cc} I_\textrm{A}w_\textrm{AA}&{}I_\textrm{B}w_\textrm{BA}\\ I_\textrm{A}w_\textrm{AB}&{} I_\textrm{B}w_\textrm{BB}\end{array} \right) ,\,\,\,{\varvec{\lambda }}=\left( \begin{array}{c} \lambda _\textrm{A} \\ \lambda _\textrm{B} \end{array}\right) . \end{aligned}$$
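As an illustration, the unmodulated two-subpopulation system can be solved in closed form via matrix exponentials; a sketch with hypothetical rates and mean transaction weights (for which \(\gamma (\bar{A}-{\mathbb I}_2)\) happens to be stable):

```python
# Sketch (hypothetical rates and mean transaction weights): solving the
# unmodulated two-subpopulation system m'(t) = gamma (Abar - I) m(t) + lambda
# in closed form via matrix exponentials.
import numpy as np
from scipy.linalg import expm

I_A, I_B = 3, 7
w_AA, w_AB, w_BA, w_BB = 0.2, 0.05, 0.04, 0.1   # assumed means of the W's
gamma = 1.0
lam = np.array([1.0, 0.5])                      # assumed external arrival rates

Abar = np.array([[I_A * w_AA, I_B * w_BA],
                 [I_A * w_AB, I_B * w_BB]])
C = gamma * (Abar - np.eye(2))                  # stable for these parameter values

m0 = np.zeros(2)
t = 10.0
# m(t) = e^{Ct} m0 + C^{-1} (e^{Ct} - I) lambda;  m(infinity) = -C^{-1} lambda
m_t = expm(C * t) @ m0 + np.linalg.solve(C, (expm(C * t) - np.eye(2)) @ lam)
m_inf = -np.linalg.solve(C, lam)
```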

5 Derivation of Second Moments

In this section we focus on characterizing the second moments pertaining to the vector \({\varvec{M}}(t)\). The techniques relied upon resemble those used in the previous section to find the first moments. In particular, the solution again amounts to solving a system of linear differential equations.

5.1 Differential Equations for Second Moments

Concretely, our aim is to provide recipes to evaluate the reduced second moments of \(M_i(t)\), i.e.,

$$\begin{aligned} v_{iik}(t) := {\mathbb E}[M_i(t)(M_i(t)-1)\mathbbm {1}_{\{X(t)=k\}}]=\frac{\partial ^2 f_k(\varvec{z},t)}{\partial z_i^2}\Big |_{{\varvec{z}}={\varvec{1}}}, \end{aligned}$$

as well as the mixed second moments of \(M_i(t)\) and \(M_{i'}(t)\), i.e., for \(i\not =i'\),

$$\begin{aligned} v_{ii'k}(t) := {\mathbb E}[M_i(t)\,M_{i'}(t)\mathbbm {1}_{\{X(t)=k\}}]=\frac{\partial ^2 f_k(\varvec{z},t)}{\partial z_i\,\partial z_{i'}}\Big |_{{\varvec{z}}={\varvec{1}}}. \end{aligned}$$

Again the idea is to set up a system of coupled linear differential equations. In these differential equations both the transient state probabilities \(\pi _k(t)\) and the transient first moments \(m_{ik}(t)\) now feature. With these objects at our disposal (recalling in particular that an expression for \(m_{ik}(t)\) was identified in the previous section), we can determine the corresponding variances and covariances in the evident manner.

In the derivation, we use the identity, for \(i,i'=1,\ldots ,I\),

$$\begin{aligned} \begin{aligned} \frac{\partial ^2 f_k({\varvec{h}}_{k}(\varvec{z}),t)}{\partial z_i \,\partial z_{i'}}&= \frac{\partial }{\partial z_i}\left( \sum _{j'=1}^I \frac{\partial f_k({\varvec{x}},t) }{\partial x_{j' }}\Big |_{{\varvec{x}}={\varvec{h}}_k({\varvec{z}})} \frac{\partial (h_k({\varvec{z}}))_{j' }}{\partial z_{i'}}\right) \\&= \sum _{j=1}^I \sum _{j'=1}^I \frac{\partial ^2 f_k({\varvec{x}},t) }{\partial x_j\,\partial x_{j'}}\Big |_{{\varvec{x}}={\varvec{h}}_k({\varvec{z}})} \frac{\partial (h_k({\varvec{z}}))_j}{\partial z_i}\frac{\partial (h_k({\varvec{z}}))_{j'}}{\partial z_{i'}}\\&\,\,\,\,+\,\sum _{j' =1}^I \frac{\partial f_k({\varvec{x}},t) }{\partial x_{j' }}\Big |_{{\varvec{x}}={\varvec{h}}_k({\varvec{z}})} \frac{\partial ^2 (h_k({\varvec{z}}))_{j' }}{\partial z_i \,\partial z_{i'}}, \end{aligned} \end{aligned}$$

which is based on standard differentiation rules. We thus obtain, by differentiating (3) with respect to \(z_i\) and \(z_{i'}\) and inserting \({\varvec{z}}={\varvec{1}}\),

$$\begin{aligned} \begin{aligned} v_{ii'k}'(t)=&\; \sum _{\ell =1}^d q_{\ell k} v_{ii'\ell }(t)+ \mathbbm {1}_{\{i\not =i'\}}\sum _{j:i,i'\in S_j}\lambda _{jk} \,\pi _{k}(t)+\sum _{j:i\in S_j}\lambda _{jk}m_{i'k}(t) + \sum _{j:i'\in S_j} \lambda _{jk}m_{ik}(t) \\&+ \gamma _k\left( \sum _{j=1}^I \sum _{j'=1}^I v_{jj'k}(t)\,w_{jik}\,w_{j'i'k}+\sum _{j=1}^I m_{jk}(t)\,w^{(2)}_{jii'k} - v_{ii'k}(t)\right) ; \end{aligned} \end{aligned}$$

here

$$\begin{aligned} w^{(2)}_{jii'k}:= \frac{\partial ^2 (h_k({\varvec{z}}))_j}{\partial z_i \,\partial z_{i'}}\Big |_{{\varvec{z}}={\varvec{1}}}, \end{aligned}$$

which equals \({\mathbb E}[W_{jik}(W_{jik}-1)]\) if \(i=i'\) and \({\mathbb E} [W_{jik}W_{ji'k}]\) otherwise.
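For instance, if a weight vector is multinomially distributed with parameters n and \({\varvec{p}}\), then these factorial moments equal \(n(n-1)p_i^2\) (for \(i=i'\)) and \(n(n-1)p_ip_{i'}\) (otherwise), so that they vanish for a single trial. A brief Monte Carlo cross-check of this closed form, with hypothetical parameters:

```python
import numpy as np

# Monte Carlo check of the factorial moments of a multinomial weight
# vector W ~ Multinomial(n, p): E[W_i(W_i-1)] = n(n-1) p_i^2 and
# E[W_i W_j] = n(n-1) p_i p_j for i != j. Parameters are hypothetical.
rng = np.random.default_rng(seed=1)
n, p = 3, np.array([0.5, 0.3, 0.2])
W = rng.multinomial(n, p, size=200_000)

w2_diag = (W[:, 0] * (W[:, 0] - 1)).mean()   # estimates n(n-1) p_0^2 = 1.5
w2_off = (W[:, 0] * W[:, 1]).mean()          # estimates n(n-1) p_0 p_1 = 0.9
```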

5.2 Special Case: Two Subpopulations

We consider the situation of Section 4.3.3 and derive the differential equations for the (reduced) second moments in the case without modulation. Five quantities are to be determined:

$$\begin{aligned} \begin{aligned} v_\textrm{AA}(t)&:= \text {reduced 2nd moment of arbitrary agent in population A,}\\ v_\textrm{BB}(t)&:= \text {reduced 2nd moment of arbitrary agent in population B,}\\ v_\mathrm{AA'}(t)&:= \text {mixed 2nd moment of two arbitrary distinct agents in population A,}\\ v_\mathrm{BB'}(t)&:= \text {mixed 2nd moment of two arbitrary distinct agents in population B,}\\ v_\textrm{AB}(t)&:= \text {mixed 2nd moment of two arbitrary agents in populations A and B,} \end{aligned} \end{aligned}$$

with all quantities on the right-hand side being evaluated at time \(t\geqslant 0.\) The vector \({\varvec{v}}(t)\in {\mathbb R}_+^5\) consists of the above five entries. We can write, with \({\mathbb I}_5\) denoting a 5-dimensional identity matrix,

$$\begin{aligned} {\varvec{v}}'(t) = \gamma (\bar{A}-{\mathbb I}_5)\,{\varvec{v}}(t) + A_m\,{\varvec{m}}(t), \end{aligned}$$

for a suitably chosen \((5\times 5)\)-matrix \(\bar{A}\) and a suitably chosen \((5\times 2)\)-matrix \(A_m\). The matrix \(\bar{A}\) is given by, with \(J_x:=I_{x}(I_x-1)\) for \(x\in \{\textrm{A},\textrm{B}\}\),

$$\begin{aligned} \bar{A}=\left( \begin{array}{ccccc} I_\textrm{A}(w_\textrm{AA})^2&{}I_\textrm{B}(w_\textrm{BA})^2&{}J_\textrm{A}(w_\textrm{AA})^2&{}J_\textrm{B}(w_\textrm{BA})^2&{}2I_\textrm{A}I_\textrm{B}w_\textrm{AA}w_\textrm{BA}\\ I_\textrm{A}(w_\textrm{AB})^2&{}I_\textrm{B}(w_\textrm{BB})^2&{}J_\textrm{A}(w_\textrm{AB})^2&{}J_\textrm{B}(w_\textrm{BB})^2&{}2I_\textrm{A}I_\textrm{B}w_\textrm{AB}w_\textrm{BB}\\ I_\textrm{A}(w_\textrm{AA})^2&{}I_\textrm{B}(w_\textrm{BA})^2&{}J_\textrm{A}(w_\textrm{AA})^2&{}J_\textrm{B}(w_\textrm{BA})^2&{}2I_\textrm{A}I_\textrm{B}w_\textrm{AA}w_\textrm{BA}\\ I_\textrm{A}(w_\textrm{AB})^2&{}I_\textrm{B}(w_\textrm{BB})^2&{}J_\textrm{A}(w_\textrm{AB})^2&{}J_\textrm{B}(w_\textrm{BB})^2&{}2I_\textrm{A}I_\textrm{B}w_\textrm{AB}w_\textrm{BB}\\ I_\textrm{A}w_\textrm{AA} w_\textrm{AB}&{}I_\textrm{B}w_\textrm{BA} w_\textrm{BB}&{}J_\textrm{A}w_\textrm{AA} w_\textrm{AB}&{}J_\textrm{B}w_\textrm{BA} w_\textrm{BB}&{}I_\textrm{A}I_\textrm{B}(w_\textrm{AA}w_\textrm{BB} + w_\textrm{AB}w_\textrm{BA})\end{array} \right) \end{aligned}$$

and

$$\begin{aligned} A_m= \left( \begin{array}{cc} 2\lambda _\textrm{A} +\gamma I_\textrm{A}w_\textrm{AAA}^{(2)}&{}\gamma I_\textrm{B}w_\textrm{BAA}^{(2)}\\ \gamma I_\textrm{A}w_\textrm{ABB}^{(2)}&{} 2\lambda _\textrm{B} +\gamma I_\textrm{B}w_\textrm{BBB}^{(2)}\\ 2\lambda _\textrm{A} +\gamma I_\textrm{A}w_\mathrm{AAA'}^{(2)}&{}\gamma I_\textrm{B}w_\mathrm{BAA'}^{(2)}\\ \gamma I_\textrm{A}w_\mathrm{ABB'}^{(2)}&{} 2\lambda _\textrm{B} +\gamma I_\textrm{B}w_\mathrm{BBB'}^{(2)}\\ \lambda _\textrm{A}+ \gamma I_\textrm{A}w_\textrm{AAB}^{(2)}&{} \lambda _\textrm{B} +\gamma I_\textrm{B}w_\textrm{BAB}^{(2)}\\ \end{array}\right) . \end{aligned}$$

6 Application 1: Wealth Redistribution

The model that we consider in this paper can be interpreted as a simple formalism describing an economy, providing insight into the stochastic evolution of the wealth of each of the individual agents. Indeed, the set of I agents could correspond to the agents of the economic system under study, which is fed by external inflow and in which at random times wealth redistribution occurs. In this section we present an example of such an economic system, with a population consisting of one agent (the “leader”) obtaining income from outside the system, and the other \(I-1\) agents obtaining their income from the leader (the “followers”). The Markovian background process records the state of the economy, in that it alternates between periods of economic growth and periods of recession. Our model allows us to quantify the distribution of the fraction of followers whose income drops below a critical threshold. In this way, we gain insight into the phenomenon of the “poverty trap”, i.e., persistent poverty for the followers (to be interpreted as the segment of the population that does not own resources, and whose income strongly depends on payments from the leader). The model considered is formally described as follows.

  • The background process X(t) has two states, i.e., economic growth (corresponding to state 1) and recession (corresponding to state 2).

  • Regarding the income rates \(\lambda _j\), we consider the situation that \(S_j=\{j\}\), for \(j=1,\ldots ,I\). Only agent 1 (the leader) has external income: we let \(\lambda _{11}=\lambda _1\) and \(\lambda _{12}=\lambda _2\) for rates \(\lambda _1\) and \(\lambda _2\) such that \(0<\lambda _2<\lambda _1\) (i.e., the leader has a higher income during periods of economic growth). The other agents (i.e., the followers) do not have any external income: \(\lambda _{jk}=0\) for \(j=2,\ldots ,I.\)

  • Wealth redistribution takes place at a Poisson rate \(\gamma _k\), with k the state of the background process. The vectors \((W_{11k},\ldots ,W_{1Ik})\), for \(k=1,2\), are multinomially distributed with parameters 1 and \((p_k, r_k,\ldots ,r_k)\). Here \(r_k\leqslant (1-p_k)/(I-1)\), meaning that with probability \(1-p_k-(I-1)r_k\in [0,1]\) the wealth unit of agent 1 leaves the economy. Typically one expects \(p_2>p_1\), as in periods of recession the leader will be inclined to save a larger fraction of their wealth. We let the vectors \((W_{j1k},\ldots ,W_{jIk})\), for \(k=1,2\) and \(j=2,\ldots ,I\), be multinomially distributed with parameters 1 and \((0, s_k,\ldots ,s_k)\). Here \(s_k\leqslant 1/(I-1)\), meaning that with probability \(1-(I-1)s_k\in [0,1]\) the wealth unit of agent j leaves the economy.

In this setup intentionally various symmetries are assumed, so as to keep the model as low-dimensional as possible. Evidently, more general variants can be analyzed as well. For example, we could consider instances such that, for \(j=2,\ldots ,I\) and \(j'\not = j\), the distribution of \(W_{jjk}\) differs from the distribution of \(W_{jj'k}\).

6.1 Means and Variances

It is not hard to verify that

$$\begin{aligned} g_{1k}({\varvec{z}}) = z_1^{p_k}\left( \prod _{j=2}^I z_j\right) ^{r_k},\,\,\,\, g_{jk}({\varvec{z}})= \left( \prod _{j=2}^I z_j\right) ^{s_k}, \end{aligned}$$

thus also defining \(h_k({\varvec{z}})\). For ease, assume that the background process is in stationarity at time 0. We thus obtain the following differential equations for the transient means (in self-evident notation):

$$\begin{aligned} \begin{aligned} m_{\textrm{L},k}'(t)&= \sum _{\ell =1}^2q_{\ell k}m_{\textrm{L},\ell }(t)+\lambda _{k}\pi _k +\gamma _k(p_k-1)\,m_{\textrm{L},k}(t),\\ m_{\textrm{F},k}'(t)&= \sum _{\ell =1}^2q_{\ell k}m_{\textrm{F},\ell }(t)+\gamma _k\big (((I-1)s_k-1)m_{\textrm{F},k}(t)+r_k m_{\textrm{L},k}(t)\big ), \end{aligned} \end{aligned}$$

using the results found in Section 4.3.2. The transient means described by the differential equations above indeed match the simulation results nicely, as shown in Fig. 1. Focusing on stationarity, we obtain that

$$\begin{aligned} \left( \begin{array}{c}m_{\textrm{L},1}\\ m_{\textrm{L},2}\end{array}\right) = - \left( \begin{array}{cc}-q_1+\gamma _1(p_1-1)&{}q_2\\ q_1&{}-q_2+\gamma _2(p_2-1)\end{array}\right) ^{-1}\left( \begin{array}{c}\lambda _1\pi _1\\ \lambda _2\pi _2\end{array}\right) \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} \left( \begin{array}{c}m_{\textrm{F},1}\\ m_{\textrm{F},2}\end{array}\right)&= - \left( \begin{array}{cc}-q_1+\gamma _1((I-1)s_1-1)&{}q_2\\ q_1&{}-q_2+\gamma _2((I-1)s_2-1)\end{array}\right) ^{-1}\cdot \\&\,\,\,\hspace{3cm}\left( \begin{array}{cc}\gamma _1r_1&{}0\\ 0&{}\gamma _2r_2\end{array}\right) \left( \begin{array}{c}m_{\textrm{L},1}\\ m_{\textrm{L},2}\end{array}\right) . \end{aligned} \end{aligned}$$
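Numerically, these stationary means amount to two \(2\times 2\) linear solves. A minimal sketch, using the parameter values of Fig. 1:

```python
import numpy as np

# Stationary means of the leader (m_L) and a follower (m_F), obtained by
# the two 2x2 linear solves displayed above. Parameters as in Fig. 1.
q12, q21 = 1 / 100, 5 / 100
lam1, lam2 = 3.0, 1.0
gam1, gam2 = 2.0, 1.0
p1, p2 = 0.3, 0.6
I = 30
r1 = (1 - p1 - 5 / 100) / (I - 1)
r2 = (1 - p2 - 1 / 100) / (I - 1)
s1 = (1 - 5 / 100) / (I - 1)
s2 = (1 - 1 / 10) / (I - 1)

pi1, pi2 = q21 / (q12 + q21), q12 / (q12 + q21)  # stationary background law

K_L = np.array([[-q12 + gam1 * (p1 - 1), q21],
                [q12, -q21 + gam2 * (p2 - 1)]])
m_L = -np.linalg.solve(K_L, np.array([lam1 * pi1, lam2 * pi2]))

K_F = np.array([[-q12 + gam1 * ((I - 1) * s1 - 1), q21],
                [q12, -q21 + gam2 * ((I - 1) * s2 - 1)]])
m_F = -np.linalg.solve(K_F, np.array([gam1 * r1 * m_L[0], gam2 * r2 * m_L[1]]))
```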

Using Proposition 4.3 we can determine under what condition these stationary means are well defined. A similar system of equations can be set up to determine the corresponding stationary (reduced) second moments, in self-evident notation denoted by

$$\begin{aligned} (v_{\textrm{LL},1},v_{\textrm{LL},2}, v_{\textrm{FF},1} , v_{\textrm{FF},2}, v_{\mathrm{FF'},1}, v_{\mathrm{FF'},2}, v_{\textrm{LF},1}, v_{\textrm{LF},2}); \end{aligned}$$

as the underlying ideas are precisely the same as for the stationary means, we do not present the expressions here.

Fig. 1

Transient means of the leader’s and follower’s wealth. The numerical solutions have been evaluated by applying Euler’s method. For the simulated approximation we have run 2000 simulations for a population of \(I=30\). The chosen parameters are: \(q_{12}=1/100\), \(q_{21}=5/100\), \(\lambda _1 = 3\), \(\lambda _2=1\), \(\gamma _1 = 2\), \(\gamma _2 = 1\), \(p_1 = 0.3\), \(p_2 = 0.6\), \(1-p_1-(I-1)r_1 = 5/100\), \(1-p_2-(I-1)r_2 = 1/100\), \(1-(I-1)s_1 = 5/100\) and \(1-(I-1)s_2 = 1/10\)

Importantly, our method provides a near-instantaneous response. In our implementation we have used highly accurate, efficient and robust procedures for solving systems of linear differential equations, as available in any computational package. The alternative, Monte Carlo simulation, is much slower: for instance, generating the data points for Fig. 1b took about 20 minutes on a standard laptop, while Fig. 2 (to be discussed in the next subsection) took roughly an hour. It is also noted that Monte Carlo provides just an estimate of the quantity under study, with the inherent uncertainty quantified by means of a confidence interval. If one aims for precise estimates, i.e., estimates with a narrow confidence interval, one has to bear in mind the rule of thumb that halving the width of the confidence interval requires increasing the number of sampled paths by a factor of 4 (which follows from the central limit theorem).

Fig. 2

Cumulative distribution function of the follower’s wealth, approximated through simulations and the normal approximation. For the simulated approximation we have run 5000 simulations for a population of \(I=30\). The chosen parameters are: \(q_{12}=1/100\), \(q_{21}=5/100\), \(\lambda _1 = 10\), \(\lambda _2=6\), \(\gamma _1 = 4\), \(\gamma _2 = 2\), \(p_1 = 0.2\), \(p_2 = 0.4\), \(1-p_1-(I-1)r_1 = 5/100\), \(1-p_2-(I-1)r_2 = 1/100\), \(1-(I-1)s_1 = 3/100\) and \(1-(I-1)s_2 = 7/100\)

6.2 Poverty Trap for Single Follower

Let f be the probability that the stationary wealth of an arbitrary follower, denoted by \(M_{j}\) for some follower \(j=2,\ldots ,I\), is at or below some critical threshold c; in our stylized context, this f can be considered the probability of a follower ending up in the poverty trap. We propose a straightforward normal approximation, inspired by the central limit theorem (clt), that is accurate if the scale of the underlying model is sufficiently high. A key property of such clt-based approximations is that only the first two moments of \(M_j\) play a role. In our case, this means

$$\begin{aligned} f = {\mathbb P}(M_{j}\leqslant c) \approx f_N:=\Phi \left( \frac{c + 1/2 -m_\textrm{F}}{\sqrt{v^\circ _\textrm{FF}}}\right) , \end{aligned}$$

where the “\(+\,1/2\)” is the standard continuity correction, and in addition

$$\begin{aligned} m_\textrm{F}:= m_{\textrm{F},1}+ m_{\textrm{F},2},\,\,\,\,\,v_\textrm{FF}:= v_{\textrm{FF},1}+ v_{\textrm{FF},2},\,\,\,\,\,\, v^\circ _{\textrm{FF}}:=v_{\textrm{FF}}+m_{\textrm{F}}-m_{\textrm{F}}^2; \end{aligned}$$

as usual, \(\Phi (\cdot )\) denotes the cumulative distribution function of a standard normal random variable. To verify the accuracy of our findings, we have also estimated \({\mathbb P}(M_{j}\leqslant c)\) by performing 5000 independent simulation runs. The resulting distribution is indeed highly similar to the normal approximation \(f_N\), as shown in Fig. 2.
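The approximation \(f_N\) requires only \(\Phi \) and the two stationary moments; a minimal sketch, with hypothetical values for \(m_\textrm{F}\) and \(v_\textrm{FF}\):

```python
import math

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def poverty_prob(c, m_F, v_FF):
    """Normal approximation f_N of P(M_j <= c), with continuity correction.

    m_F and v_FF are the stationary mean and reduced second moment of a
    follower's wealth; the variance used is v_FF + m_F - m_F**2.
    """
    var = v_FF + m_F - m_F ** 2
    return Phi((c + 0.5 - m_F) / math.sqrt(var))

# Hypothetical stationary moments:
f_N = poverty_prob(c=1, m_F=2.0, v_FF=4.5)
```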

Note that the followers do not operate independently, as they react to a common background process. As a consequence, the number of followers ending up in a poverty trap, say B, is not binomially distributed (even though all of them experience the same probability f of doing so). In the remainder of this section we point out how to derive a good proxy for the distribution of B.

6.3 Poverty Trap for Full Follower Population

With \(B_j\) the indicator function of \(M_j\leqslant c\), for \(j=2,\ldots ,I\), we are interested in approximating the distribution of

$$\begin{aligned} B:=\sum _{j=2}^I B_j. \end{aligned}$$

We propose a normal approximation, which relies on the central limit theorem, entailing that it will be particularly accurate as I grows. Define, for \(j,j'=2,\ldots ,I\) such that \(j\not = j'\),

$$\begin{aligned} f' := {\mathbb P}(M_{j}\leqslant c, M_{j'}\leqslant c). \end{aligned}$$

This probability can be approximated by its Gaussian counterpart. Applying a continuity correction,

$$\begin{aligned} f'\approx f'_N:={\mathbb P}\left( M^\circ _{j}\leqslant c +\frac{1}{2}, M^\circ _{j'}\leqslant c +\frac{1}{2}\right) , \end{aligned}$$

with \((M^\circ _{j}, M^\circ _{j'})\) bivariate normal with mean \((m_\textrm{F},m_\textrm{F})\) and covariance matrix

$$\begin{aligned} \Sigma = \left( \begin{array}{cc}v^\circ _{\textrm{FF}}&{}v^\circ _{\mathrm{FF'}}\\ v^\circ _{\mathrm{FF'}}&{}v^\circ _{\textrm{FF}}\end{array}\right) , \end{aligned}$$

where \(v^\circ _{\textrm{FF}}:=v_{\textrm{FF}}+m_{\textrm{F}}-m_{\textrm{F}}^2\) (as before) and \(v^\circ _{\mathrm{FF'}}:=v_{\mathrm{FF'}}-m_{\textrm{F}}^2\). Notice that there are powerful numerical techniques to accurately evaluate bivariate normal probabilities; see for instance Cox and Wermuth (1991). It now follows that

$$\begin{aligned} \begin{aligned} {\mathbb E}\,B&\approx \mu _B:=(I-1)f_N,\\{\mathbb V}\textrm{ar}\,B&\approx \sigma ^2_B:=(I-1)f_N(1-f_N)+(I-1)(I-2)(f'_N-f_N^2). \end{aligned} \end{aligned}$$

Applying a standard continuity correction, this gives rise to the following approximation: for \(k=0,\ldots ,I-1\),

$$\begin{aligned} {\mathbb P}(B=k) \approx \Phi \left( \frac{k+\frac{1}{2}-\mu _B}{\sigma _B}\right) - \Phi \left( \frac{k-\frac{1}{2}-\mu _B}{\sigma _B}\right) . \end{aligned}$$

As shown in Fig. 3, the distribution following from the normal approximation matches the simulation-based approximation quite well.
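The full recipe (marginal probability \(f_N\), joint probability \(f'_N\), and the resulting approximation of the law of B) can be sketched as follows; the stationary moments below are hypothetical placeholders, and the bivariate normal probability is evaluated with SciPy:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Hypothetical stationary moments of a follower's wealth; in practice
# these come from the differential equations of Sections 4 and 5.
I = 50          # population size (leader plus followers)
c = 1           # poverty threshold
m_F = 2.0       # stationary mean
v_FF_c = 2.5    # variance v_FF^o
v_FFp_c = 0.4   # covariance v_FF'^o

# Marginal and joint probabilities, with continuity correction.
f_N = norm.cdf((c + 0.5 - m_F) / np.sqrt(v_FF_c))
cov = np.array([[v_FF_c, v_FFp_c], [v_FFp_c, v_FF_c]])
f_N2 = multivariate_normal(mean=[m_F, m_F], cov=cov).cdf([c + 0.5, c + 0.5])

# Approximate mean and variance of B, the number of "poor" followers.
mu_B = (I - 1) * f_N
sig2_B = (I - 1) * f_N * (1 - f_N) + (I - 1) * (I - 2) * (f_N2 - f_N ** 2)

def pmf_B(k):
    """Normal approximation of P(B = k), with continuity correction."""
    s = np.sqrt(sig2_B)
    return norm.cdf((k + 0.5 - mu_B) / s) - norm.cdf((k - 0.5 - mu_B) / s)
```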

Fig. 3

Distribution of the number of followers with wealth less than or equal to 1, evaluated through simulations and the normal approximation. For the simulated approximation we have run 2000 simulations for a population of \(I=50\). The chosen parameters are: \(q_{12}=1/100\), \(q_{21}=5/100\), \(\lambda _1 = 10\), \(\lambda _2=6\), \(\gamma _1 = 4\), \(\gamma _2 = 2\), \(p_1 = 0.2\), \(p_2 = 0.4\), \(1-p_1-(I-1)r_1 = 5/100\), \(1-p_2-(I-1)r_2 = 1/100\), \(1-(I-1)s_1 = 3/100\) and \(1-(I-1)s_2 = 7/100\)

7 Application 2: Opinion Dynamics

In the field of opinion dynamics one studies, predominantly based on mathematical models, the evolution of opinions in a collection of agents. Arguably the most basic, yet meaningful, model was proposed in DeGroot (1974), considering a set of n agents, each of them having an opinion on a particular subject. At every discrete time instant each agent updates her opinion based on the other agents’ opinions. More specifically, an agent’s opinion at a certain point in time is a weighted sum of all agents’ opinions at the previous time instant, with the weights summing to 1. To make the model more realistic, various extensions have been developed. In this context, one of the important contributions is Friedkin and Johnsen (1990), in which agents are assumed to have an internal opinion as well as an expressed opinion. In modelling terms, this means that the framework of Friedkin and Johnsen (1990) extends the one of DeGroot (1974) by dropping the assumption that the weights sum to 1 and by allowing (at any point in time) an external contribution to the opinion vector.

The goal of this section is to show that one can use the theory developed in the present paper to make opinion dynamics models considerably more realistic. Compared to existing frameworks on opinion dynamics, two significant improvements can be made. In the first place, most of the existing models on opinion dynamics that allow for analytical investigation are of a deterministic nature, and as such do not incorporate the random effects present in a population of agents influencing each other’s opinions; these can be included relying on our modelling framework. In the second place, it allows us, through the modulation mechanism, to incorporate randomly evolving external effects that have an impact on the agents’ opinions. Examples of external effects could relate to the current level of economic prosperity, the degree of access to digital communication, or the geographic area agents live in; these variables change over time (where the natural timescale could correspond to years), which results in different opinion dynamics.

Let us consider the first advantage, i.e., the option of introducing stochasticity, in more detail. The framework that we developed can in fact be interpreted as a stochastic generalization of the model studied in Friedkin and Johnsen (1990), which (in its most elementary form) can be summarized as follows. Let \(Y_t\) be a vector of opinions, \(W_t\) a matrix that describes the effects of each opinion held at time \(t-1\) on the opinions held at time t, \(X_t\) a matrix of scores on exogenous variables, \(B_t\) a vector of coefficients giving the effects of each of the exogenous variables, \(\alpha _t\) a scalar weight corresponding to the endogenous conditions and \(\beta _t\) a scalar weight corresponding to the exogenous conditions. Then the recurrence relation that defines the model in Friedkin and Johnsen (1990) is given, for \(t\in {\mathbb N}\), by

$$\begin{aligned} Y_t = \alpha _t W_t Y_{t-1} + \beta _t X_t B_t. \end{aligned}$$

The relation with our model can be seen immediately by comparing this recurrence relation with the differential equation of the transient means in Proposition 4.1. Indeed, just as in Friedkin and Johnsen (1990), the opinion of an agent is construed as a linear combination of its own and the others’ opinions before the time of redistribution. An important additional advantage of our approach is the option to explicitly characterize all (reduced) moments of the opinion vector through systems of differential equations, by repeated differentiation of the relation featuring in Proposition 3.1. To the best of our knowledge, this is a novel analytical tool that was not provided in other stochastic generalizations of the Friedkin–Johnsen model (Friedkin and Johnsen 1990).
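To fix ideas, the (deterministic) Friedkin–Johnsen recurrence with time-constant ingredients can be iterated as follows; all numbers are hypothetical.

```python
import numpy as np

# Iterating Y_t = alpha W Y_{t-1} + beta X B with time-constant
# (hypothetical) ingredients. Since the spectral radius of alpha*W is
# below 1, the iteration converges to beta (I - alpha W)^{-1} XB.
W = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.7]])       # row-stochastic influence weights
alpha, beta = 0.8, 0.2                # endogenous / exogenous weights
XB = np.array([1.0, 0.0, 0.5])        # exogenous contribution X_t B_t
Y = np.array([0.0, 1.0, 0.5])         # initial opinions Y_0

for _ in range(200):
    Y = alpha * W @ Y + beta * XB
```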

7.1 Opinion Dynamics with Markovian Background Process

In this example we consider opinion dynamics in a two-group population (groups A and B) where individuals of one group – say group A – have an opinion update mechanism that alternates between a “normal mode” and an “adapted mode”. This framework may be interpreted as a stylized model for groups of individuals occasionally visiting larger events such as conferences, demonstrations, or political gatherings. Formally, we describe the model as follows.

  • The background process X(t) has two states, i.e., one corresponding to the normal mode (corresponding to state 1) and one to the adapted mode (corresponding to state 2).

  • We consider the situation that individuals do not automatically increase the value of their opinions when the background process is in the normal mode. Since group B agents are not affected by the background process, we thus have \(\lambda _\textrm{A,1}=\lambda _\textrm{B,1}=\lambda _\textrm{B,2}=0\). Additionally, we let group A agents automatically strengthen their opinion at rate \(\lambda _\textrm{A,2} \geqslant 0\) in the adapted mode, i.e., they only strengthen their opinion as a result of external influence through attending conferences, political meetings, etc.

  • We let the Poisson rate of opinion redistribution \(\gamma > 0\) be unaffected by the background process. Since group A separates itself from group B when the background process is in the adapted state, we assume that group A and B individuals do not affect each other’s opinion in that state. In addition, in this example we allow group A agents to be “overenthusiastic” in the adapted state, resulting in distributing more opinion units than they originally have.

We first consider the situation that the background process is in state 1. Let there be \(I_\textrm{A}\) agents in group A, and \(I_\textrm{B}\) in group B. For \(j\in \{1,\ldots , I_\textrm{A}\}\) (i.e., corresponding to agents in group A), we let the vector

$$\begin{aligned} (W_{j,1,1}, \ldots , W_{j, I_\textrm{A},1},W_{j, I_\textrm{A} + 1, 1}, \ldots , W_{j ,I_\textrm{A}+ I_\textrm{B},1}) \end{aligned}$$

be multinomially distributed with parameters 1 and \((p_\textrm{AA,1},\ldots ,p_\textrm{AA,1},p_\textrm{AB,1},\ldots ,p_\textrm{AB,1})\); for \(j\in \{I_\textrm{A} +1 ,\ldots , I_\textrm{A} + I_\textrm{B}\}\) (i.e., corresponding to agents in group B) these parameters are 1 and \((p_\textrm{BA,1},\ldots ,p_\textrm{BA,1},p_\textrm{BB,1},\ldots ,p_\textrm{BB,1})\). Here we consider the situation that

$$\begin{aligned} I_\textrm{A} \, p_\textrm{AA,1} + I_\textrm{B} \, p_\textrm{AB,1} = 1\,\,\, \text{ and } \,\,\, I_\textrm{A} \, p_\textrm{BA,1} + I_\textrm{B} \, p_\textrm{BB,1} = 1, \end{aligned}$$

effectively meaning that “opinion mass cannot leak away from the system” when the background process is in state 1.

When the background process is in state 2, we let for \(j\in \{I_\textrm{A} +1 ,\ldots , I_\textrm{A} + I_\textrm{B}\}\) (i.e., for agents of group B)

$$\begin{aligned} (W_{j,1,2}, \ldots , W_{j, I_\textrm{A},2},W_{j, I_\textrm{A} + 1, 2}, \ldots , W_{j ,I_\textrm{A}+ I_\textrm{B},2}) \end{aligned}$$

be multinomially distributed with parameters 1 and \((0,\ldots ,0,p_\textrm{BB,2},\ldots ,p_\textrm{BB,2})\). Here, group B opinion does not leak away: \(I_B \, p_\textrm{BB,2} = 1\). For \(j\in \{1,\ldots , I_\textrm{A}\}\) (i.e., for agents of group A) we introduce an additional mechanism: with probability \(0 \leqslant \alpha \leqslant 1\) the unit of opinion to be distributed becomes two units and with probability \(1-\alpha \) it remains one unit. Afterwards, each unit is multinomially distributed with parameters 1 and \((p_\textrm{AA,2},\ldots ,p_\textrm{AA,2},0,\ldots ,0)\), with \(I_\textrm{A} \, p_\textrm{AA,2} = 1\).

We would like to stress that the mechanism described above is just one example of a specific type of opinion dynamics we can handle. Various alternative models can be dealt with similarly (more than two groups, more than two background states, the “\(\alpha \)-jumps” corresponding with multiplying the “opinion unit” by a number different from two, etc.).

The model proposed allows individuals of group A to “create opinion” when the background state is 2 (by \(\lambda _\textrm{A,2} > 0\)), and to multiply each of their existing opinion units by 2 with probability \(\alpha \). Since “opinion mass” does not leave the population in this setup, we typically expect opinions to grow unboundedly in time when \(\lambda _\textrm{A,2} > 0\) or \(\alpha > 0\). In the next subsection we show that this is indeed true for the means of the agents’ opinions.

7.2 Means and Variances

For ease, we assume that the background process is in stationarity at time 0. With the results found in Section 4.3.3 and the notation as in Proposition 4.1, we have the following collection of differential equations describing the transient means:

$$\begin{aligned} {\varvec{m}}'(t) = A\,{\varvec{m}}(t)+\Lambda \,{\varvec{\pi }}, \end{aligned}$$

with \({\varvec{m}}(t) = (m_\textrm{A, 1}(t), m_\textrm{A, 2}(t), m_\textrm{B, 1}(t), m_\textrm{B, 2}(t))^{\top }\),

$$\begin{aligned} A=\left( \begin{array}{cccc} -q_{12} + \gamma (I_\textrm{A}\, p_\mathrm{{AA},1} - 1) &{} q_{21} &{} \gamma \, I_\textrm{B}\, p_\mathrm{{BA},1} &{} 0\\ q_{12} &{} -q_{21} + \gamma \, \alpha &{} 0 &{} 0\\ \gamma \, I_\textrm{A}\, p_\mathrm{{AB},1} &{} 0 &{} -q_{12} + \gamma \, (I_\textrm{B}\, p_\mathrm{{BB},1} -1)&{}q_{21}\\ 0 &{} 0 &{} q_{12} &{} -q_{21} \end{array} \right) \end{aligned}$$

and \(\Lambda = \textrm{diag}\{0, \lambda _\textrm{A,2},0,0\}\). For the transient second moments, similar differential equations can be derived as well, relying on the results presented in Section 5.1.
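These transient means can again be integrated by forward Euler. The sketch below uses the parameters of Fig. 4 (with \(\alpha =0\) and \(\lambda _\textrm{A,2}=0\)); the individual \(p\)-values are one hypothetical choice consistent with the stated ratios and the conservation conditions.

```python
import numpy as np

# Forward-Euler integration of m'(t) = A m(t) + Lambda*pi for the
# two-group opinion model; parameters as in Fig. 4, p-values one
# consistent (hypothetical) choice.
q12, q21, gamma = 3 / 10, 2 / 10, 5 / 8
alpha, lam_A2 = 0.0, 0.0
I_A, I_B = 10, 30
p_AA1, p_AB1 = (2 / 3) / I_A, (1 / 3) / I_B   # I_A p_AA1 + I_B p_AB1 = 1
p_BA1, p_BB1 = (4 / 9) / I_A, (5 / 9) / I_B   # I_A p_BA1 + I_B p_BB1 = 1

A = np.array([
    [-q12 + gamma * (I_A * p_AA1 - 1), q21, gamma * I_B * p_BA1, 0.0],
    [q12, -q21 + gamma * alpha, 0.0, 0.0],
    [gamma * I_A * p_AB1, 0.0, -q12 + gamma * (I_B * p_BB1 - 1), q21],
    [0.0, 0.0, q12, -q21]])
pi = np.array([q21, q12, q21, q12]) / (q12 + q21)
b = np.diag([0.0, lam_A2, 0.0, 0.0]) @ pi

def euler(m0, T, dt=1e-2):
    """Integrate m'(t) = A m(t) + b from m(0) = m0 up to time T."""
    m = np.array(m0, dtype=float)
    for _ in range(int(T / dt)):
        m = m + dt * (A @ m + b)
    return m

# Background in stationarity at time 0; initial opinions 1 (group A)
# and 5 (group B), so the total opinion mass I_A*1 + I_B*5 = 160.
m0 = np.array([1 * pi[0], 1 * pi[1], 5 * pi[2], 5 * pi[3]])
```

Because no opinion mass is created or lost in this parameterization, the Euler iterates conserve the aggregate \(I_\textrm{A}(m_1+m_2)+I_\textrm{B}(m_3+m_4)\) exactly.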

It is straightforward to verify that a necessary condition for the existence of a steady state is \(\alpha =0\) and \(\lambda _\textrm{A,2}=0\), as anticipated in the previous section. Under these conditions, opinions are only redistributed since no additional opinion is “created” and, by assumption, no opinion is “lost”. The steady state of this model is, by solving \(A \, \varvec{m} = \varvec{0}\), proportional to the vector

$$\begin{aligned} \left( 1,\,\, \frac{q_{12}}{q_{21}}, \,\, \frac{p_\mathrm{{AB},1}}{p_\mathrm{{BA},1}},\,\, \frac{p_\mathrm{{AB},1} \, q_{12}}{p_\mathrm{{BA},1} \, q_{21}}\right) ^{\top }. \end{aligned}$$
(4)

This vector should be normalized such that the total “opinion mass” equals its initial value (i.e., \({\varvec{1}}^\top {\varvec{M}}(0)\)). Indeed, if the setup of the system corresponds to spending longer periods of time in state 2, or equivalently \(q_{12} > q_{21}\), then we expect \(m_\textrm{A,2} > m_\textrm{A,1}\). Also, when group \(\mathrm A\) agents distribute their opinions with a higher probability to group \(\mathrm B\) agents than the probability of group \(\mathrm B\) agents distributing their opinion to group \(\mathrm A\) agents, thus \(p_\mathrm{{AB},1} > p_\mathrm{{BA},1}\), we expect \(m_{\textrm{B},1} > m_{\textrm{A},1}\). Both properties are in line with the steady-state expression (4).

When \(\alpha =0\) and \(\lambda _\textrm{A,2}=0\), the adapted state can be interpreted as a temporary interruption of the opinion formation process between groups \(\mathrm A\) and \(\mathrm B\); in the adapted state, agents of both groups do not distribute their opinion to other group’s agents. This interruption should not affect the steady state of the system. Indeed, the steady states of the groups are related through \(m_\textrm{A} = ({p_\mathrm{{BA},1}}/{p_\mathrm{{AB},1}}) \, m_\textrm{B}\) which does not involve the background process parameters \(q_{12}\) and \(q_{21}\).

Figure 4 shows that the simulation-based approximation of the means aligns with the numerical solution of the differential equation of the means. In this setup the abscissa \(\omega \) of the matrix A is equal to 0, thus not satisfying the stability condition of Proposition 4.3. We see that, although \(\omega =0\), the transient means do converge to a steady state. Note that this does not contradict Proposition 4.3: the condition \(\omega <0\) is a sufficient condition, hence there can be stability even though \(\omega <0\) is not fulfilled. In Section 7.3 we provide an example where there is no stability while the abscissa \(\omega \) is 0. In the setup of Fig. 4, the agents of group \(\mathrm A\) tend to give more attention to the topic of interest than the agents of group \(\mathrm B\), in the sense that \(I_\textrm{A} \,p_\mathrm{{AA},1} + I_\textrm{B}\, p_\mathrm{{BA},1} = 5/3\) while \(I_\textrm{A} \,p_\mathrm{{AB},1} + I_\textrm{B} \,p_\mathrm{{BB},1} = 8/9\). This “unequal attentiveness” typically results in polarization, as discussed in Chan et al. (2023).

Fig. 4

Transient means of group A’s and B’s opinion. The numerical solutions have been evaluated by applying Euler’s method. For the simulated approximation we have run 2000 simulations. The group sizes are \(I_\textrm{A} = 10\) and \(I_\textrm{B} = 30\). The initial opinions of group A and B agents are 1 and 5 respectively. The aggregate opinion of all agents, \(I_\textrm{A} m_\textrm{A}(t) + I_\textrm{B} m_\textrm{B}(t)\), remains constant and equals \(I_\textrm{A} {m}_\textrm{A}(0) + I_\textrm{B} {m}_\textrm{B}(0) = 160\), as we have enforced “conservation of opinion”. The chosen parameters are: \(q_{12}=3/10\), \(q_{21}=2/10\), \(\lambda _\textrm{A, 2} = 0\), \(\alpha = 0\), \(\gamma = 5/8\), \(I_\textrm{A}\, p_\mathrm{{AA},1} / I_\textrm{B}\, p_\mathrm{{AB},1} = 2\) and \(I_\textrm{A}\, p_\mathrm{{BA},1} / I_\textrm{B}\, p_\mathrm{{BB},1} = 4/5\). The abscissa \(\omega \) is 0

In Fig. 5 we show the effect of different background parameters \(q_{12}\) and \(q_{21}\). The curve of the transient means corresponding to spending a larger fraction of time in the adapted state (i.e., the one with the higher \(q_{12}\)) moves more slowly to the steady state.

Fig. 5

Effect of a longer period of time in the adapted state on the transient means. The transient means have been evaluated by applying Euler’s method. The longer period of time in the adapted state is obtained by increasing \(q_{12}\) (i.e., we now set \(q_{12}=1\), versus \(q_{12}=3/10\) in the base case). All the other parameters are identical to the parameters used in Fig. 4. The abscissa \(\omega \) is 0 in both situations

7.3 Unbounded Opinion Mass and Relative Opinions

We now consider the setup where \(\alpha \) or \(\lambda _{\textrm{A},2}\) is assumed to be strictly positive. As reasoned earlier, in this situation we expect the total “opinion mass” to grow beyond any bound. In Fig. 6a we show the evolution of the transient means in the situations \((\alpha , \lambda _{\textrm{A},2}) = (0, 2)\) and \((\alpha , \lambda _{\textrm{A},2}) = (1/10, 0)\).

Fig. 6

Unbounded transient means become bounded when interpreted as relative opinions. The numerical solutions have been evaluated by applying Euler’s method. All the parameters chosen are identical to the setup as in Fig. 4 except for \(\lambda _\textrm{A, 2}\) and \(\alpha \). The abscissa \(\omega \) is 0 in the setup with \(\lambda _\textrm{A, 2} = 2\). The abscissa \(\omega \) is 0.027 in the setup with \(\alpha = 10\%\)

It is noted that, in principle, unbounded opinions (in our case due to \(\alpha > 0\) or \(\lambda _{\textrm{A},2} > 0\)) are an outcome of the model that cannot be supported by empirical data.

However, various techniques have been proposed to avert the phenomenon of unbounded opinions while maintaining the mentioned interaction mechanism, as explained in Flache et al. (2017). One of these ideas is to interpret the modeled opinions as relative opinions (Chan et al. 2023). This means that the absolute values of the entries of the opinion vector have no meaning; an interpretation is only given to the ratios between these entries. Concretely, the i-th entry divided by the j-th entry reflects the relative opinion of agent i with respect to agent j.

Figure 6b shows how the unbounded growth of the opinion mass in Fig. 6a stabilizes when the relative perspective is adopted. This illustrates that, when dealing with instances of the stochastic model in which the opinion mass may grow unboundedly, one may consider working with relative opinions instead. With this interpretation, one still obtains insight into the effects that the specific interaction mechanism between the agents has on the opinion formation process of the entire population. Further empirical and technical arguments in favor of working with relative opinions, rather than their absolute counterparts, are discussed in great detail in Chan et al. (2023).
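The mechanism behind this stabilization can be illustrated with a minimal sketch, using a generic two-dimensional linear system rather than the opinion model of this paper (the matrix and the initial vector below are purely illustrative choices): both components grow without bound, yet their ratio converges to the ratio of the entries of the dominant eigenvector.

```python
# Toy linear system m'(t) = A m(t); the matrix A and the initial vector are
# illustrative choices, not parameters of the opinion model in this paper.
# A has eigenvalues 0.3 and -0.2; the eigenvector of the dominant eigenvalue
# 0.3 is (1, 1), so the ratio m1(t)/m2(t) tends to 1 while both components
# grow without bound.
a11, a12, a21, a22 = 0.1, 0.2, 0.3, 0.0
m1, m2 = 1.0, 5.0                    # initial (mean) opinions

dt, t_end = 1e-3, 40.0
for _ in range(int(t_end / dt)):     # Euler's method, as elsewhere in the paper
    d1 = a11 * m1 + a12 * m2
    d2 = a21 * m1 + a22 * m2
    m1, m2 = m1 + dt * d1, m2 + dt * d2

ratio = m1 / m2                      # relative opinion: entry 1 over entry 2
```

The absolute values of m1 and m2 keep growing, while the relative opinion stabilizes, mirroring the qualitative behavior seen in Fig. 6.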

8 Application 3: File Storage Systems

Consider a system in which users generate files. For safety reasons, the files that have been saved at the clients’ locations are periodically copied to a central storage location, where a backup is made. The frequency of copying the clients’ files has to be sufficiently high to make sure that only a relatively small number of files has no centrally stored duplicate. In this section we discuss a sequence of models describing the dynamics of such file storage systems, starting with the most rudimentary variant. For more background on various aspects of data storage networks, we refer to e.g. Pessach (2013).

8.1 Basic Variant

A stylized model that describes the dynamics of this system is the following. Let \(M_1(t)\) record the number of files at the clients’ devices that have not been copied to the central storage unit by time \(t\geqslant 0\), and let \(M_2(t)\) denote the number of files at the storage unit at time \(t\geqslant 0\). Let \(\lambda \) be the Poisson rate at which the aggregate client population generates files. Let the times between subsequent backups be exponentially distributed with mean \(\gamma ^{-1}.\) In this system without modulation, we observe that \(W_{12}\equiv W_{22}\equiv 1\), and \(W_{11}\equiv W_{21}\equiv 0\).

It can be verified from the results of Section 4 that

$$\begin{aligned} m'_1(t) = \lambda - \gamma m_1(t),\,\,\,\,m_2'(t) = \gamma m_1(t). \end{aligned}$$

It is not hard to verify that the first moments can be computed explicitly: if the system starts empty, for \(t\geqslant 0\),

$$\begin{aligned} m_1(t) = \frac{\lambda }{\gamma }\big (1-e^{-\gamma t}\big ),\,\,\,\,m_2(t) = {\lambda t} - \frac{\lambda }{\gamma }\big (1-e^{-\gamma t}\big ). \end{aligned}$$
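These closed forms lend themselves to a quick numerical sanity check: integrating the differential equations with Euler’s method (the scheme used for the figures in this paper) should reproduce them. A minimal sketch in Python, with arbitrary illustrative parameter values:

```python
import math

def euler_means(lam, gamma, t_end, dt=1e-4):
    """Euler integration of m1'(t) = lam - gamma*m1(t) and
    m2'(t) = gamma*m1(t), starting from an empty system."""
    m1 = m2 = 0.0
    for _ in range(int(t_end / dt)):
        d1 = lam - gamma * m1
        d2 = gamma * m1
        m1, m2 = m1 + dt * d1, m2 + dt * d2
    return m1, m2

def exact_means(lam, gamma, t):
    """The closed-form solutions m1(t) and m2(t)."""
    u = (lam / gamma) * (1 - math.exp(-gamma * t))
    return u, lam * t - u
```

For instance, with \(\lambda = 2\), \(\gamma = 1/2\) and \(t = 3\), both routines agree up to the discretization error of order \({\Delta t}\).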

These expressions can be used to determine the optimal backup rate \(\gamma ^\star \), for instance by considering a cost function that encompasses the per-backup cost (with proportionality constant \(\kappa _\textrm{B}>0\)) and the cost associated with the number of files that have not been copied yet (with proportionality constant \(\kappa _\textrm{NC}>0\)). Concretely, for given time horizon \(t\geqslant 0\), this leads to the optimization problem

$$\begin{aligned} \min _{\gamma >0}F_1(\gamma )+F_2(\gamma ),\,\,\,\,\,\text{ with }\,\,F_1(\gamma ):= \gamma t\, \kappa _\textrm{B} ,\,\,\,F_2(\gamma ):= \frac{\lambda }{\gamma }\big (1-e^{-\gamma t}\big ) \kappa _\textrm{NC}, \end{aligned}$$

with the objective function \(F(\gamma ):=F_1(\gamma )+F_2(\gamma )\) being convex. It takes some calculus to verify that if \(F'(0)\geqslant 0\), or (equivalently) \(2\kappa _\textrm{B}\geqslant \lambda t \,\kappa _\textrm{NC}\), then \(\gamma =\gamma ^\star =0\) is optimal, which in practical terms means that one should essentially refrain from making backups; if on the other hand \(F'(0)< 0\), or (equivalently) \(2\kappa _\textrm{B}<\lambda t \,\kappa _\textrm{NC}\), there is a strictly positive optimal update rate \(\gamma ^\star \). In the latter case, \(\gamma ^\star \) cannot be computed in closed form, but it can easily be evaluated numerically; see also Fig. 7.
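The numerical evaluation of \(\gamma ^\star \) can be sketched as follows (illustrative parameter values; a golden-section search exploits the convexity, and hence unimodality, of \(F\)):

```python
import math

def F(gamma, lam, t, kB, kNC):
    """Total cost F1 + F2: backup cost plus cost of not-yet-copied files."""
    return gamma * t * kB + (lam / gamma) * (1 - math.exp(-gamma * t)) * kNC

def optimal_gamma(lam, t, kB, kNC):
    """Golden-section search for the minimizer of the convex function F.
    Returns 0 when 2*kB >= lam*t*kNC, in which case gamma* = 0 is optimal."""
    if 2 * kB >= lam * t * kNC:
        return 0.0
    lo, hi = 1e-9, 100.0                 # search interval, assumed to contain gamma*
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(200):
        a = hi - phi * (hi - lo)
        b = lo + phi * (hi - lo)
        if F(a, lam, t, kB, kNC) < F(b, lam, t, kB, kNC):
            hi = b
        else:
            lo = a
    return (lo + hi) / 2
```

The case distinction in `optimal_gamma` mirrors the criterion \(2\kappa _\textrm{B}\gtrless \lambda t \,\kappa _\textrm{NC}\) derived above.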

Fig. 7

The functions \(F(\gamma )\), \(F_1(\gamma )\), and \(F_2(\gamma )\) for \(2\kappa _\textrm{B}\geqslant \lambda t \,\kappa _\textrm{NC}\) (left panel) and for \(2\kappa _\textrm{B}< \lambda t \,\kappa _\textrm{NC}\) (right panel). In the former case \(\gamma ^\star =0\), whereas in the latter case \(\gamma ^\star >0\)

We can also find the (reduced) second moments \(v_{ij}(t)\), with \(i,j=1,2\), by using the techniques from Section 5. Indeed, for \(t\geqslant 0\), we have

$$\begin{aligned} \begin{aligned} v_{11}'(t)&=2\lambda \,m_1(t) - \gamma v_{11}(t),\\ v_{12}'(t)&=\lambda \,m_2(t) -\gamma v_{12}(t),\\ v_{22}'(t)&= \gamma v_{11}(t) + 2 \gamma \, v_{12}(t). \end{aligned} \end{aligned}$$

Solving the differential equations we obtain

$$\begin{aligned} \begin{aligned} v_{11}(t)&= \Big (\frac{\lambda }{\gamma }\Big )^2 \Big [ 2\,(1-e^{-\gamma t}) -2 \, \gamma \, t \, e^{- \gamma t} \Big ],\\ v_{12}(t)&= \Big (\frac{\lambda }{\gamma }\Big )^2 \Big [ \gamma t \,(1+ e^{-\gamma t}) + 2( e^{-\gamma t} -1) \Big ],\\ v_{22}(t)&= \Big (\frac{\lambda }{\gamma }\Big )^2 \Big [2 \,(1- e^{-\gamma t}) + \gamma ^2 \, t^2 - 2 \, \gamma \, t \Big ]. \end{aligned} \end{aligned}$$

Using the first and (reduced) second moments, we can derive the following expressions for the variances:

$$\begin{aligned} \begin{aligned} {\mathbb V}\textrm{ar}\, M_1(t)&= \Big (\frac{\lambda }{\gamma }\Big )^2 \Big [-2 \, \gamma \, t \, e^{-\gamma t} - e^{-2 \gamma t} - \frac{\gamma }{\lambda }\, e^{-\gamma t} + 1 + \frac{\gamma }{\lambda } \Big ],\\ {\mathbb V}\textrm{ar}\, M_2(t)&= \Big (\frac{\lambda }{\gamma }\Big )^2 \, \Big [-2 \, \gamma \, t \, e^{-\gamma t} - e^{-2 \gamma t} + \frac{\gamma }{\lambda }\, e^{-\gamma t} +\frac{\gamma ^2}{\lambda } \, t + 1 - \frac{\gamma }{\lambda } \Big ]. \end{aligned} \end{aligned}$$
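These variance expressions can be verified numerically: Euler-integrate the system for the means and the (reduced) second moments, and use that, with \(v_{ii}(t)\) interpreted as \({\mathbb E}[M_i(t)(M_i(t)-1)]\), one has \({\mathbb V}\textrm{ar}\,M_i(t) = v_{ii}(t) + m_i(t) - m_i(t)^2\). A sketch with illustrative parameter values:

```python
import math

def euler_moments(lam, gamma, t_end, dt=1e-4):
    """Euler integration of the ODEs for the means m1, m2 and the reduced
    second moments v11, v12, v22, starting from an empty system."""
    m1 = m2 = v11 = v12 = v22 = 0.0
    for _ in range(int(t_end / dt)):
        d_m1 = lam - gamma * m1
        d_m2 = gamma * m1
        d_v11 = 2 * lam * m1 - gamma * v11
        d_v12 = lam * m2 - gamma * v12
        d_v22 = gamma * v11 + 2 * gamma * v12
        m1 += dt * d_m1
        m2 += dt * d_m2
        v11 += dt * d_v11
        v12 += dt * d_v12
        v22 += dt * d_v22
    return m1, m2, v11, v12, v22

def exact_variances(lam, gamma, t):
    """The closed-form variances of M1(t) and M2(t)."""
    r, x = (lam / gamma) ** 2, math.exp(-gamma * t)
    var1 = r * (-2 * gamma * t * x - x * x - (gamma / lam) * x + 1 + gamma / lam)
    var2 = r * (-2 * gamma * t * x - x * x + (gamma / lam) * x
                + (gamma ** 2 / lam) * t + 1 - gamma / lam)
    return var1, var2
```

Both routes agree up to the Euler discretization error.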

8.2 Faulty Upload Link

This model can be further refined in various ways. In the first place, the link between the clients and the central storage unit may be faulty, in that it alternates between “functioning” and “broken”. Suppose that the time the link is up (down, respectively) is exponentially distributed with parameter \(q_\textrm{U}\) (\(q_\textrm{D}\), respectively). Then it is easily seen, using the techniques developed in Section 5, that, in self-evident notation, with the second index corresponding to the link being up or down,

$$\begin{aligned} \begin{aligned} m_{1\textrm{U}}'(t)&= -q_\textrm{U}m_{1\textrm{U}}(t) + q_\textrm{D}m_{1\textrm{D}}(t) +\lambda \pi _\textrm{U}(t) - \gamma m_{1\textrm{U}}(t),\\ m_{1\textrm{D}}'(t)&= q_\textrm{U}m_{1\textrm{U}}(t) - q_\textrm{D}m_{1\textrm{D}}(t) +\lambda \pi _\textrm{D}(t),\\ m_{2\textrm{U}}'(t)&= -q_\textrm{U}m_{2\textrm{U}}(t) + q_\textrm{D}m_{2\textrm{D}}(t) + \gamma m_{1\textrm{U}}(t),\\ m_{2\textrm{D}}'(t)&= q_\textrm{U}m_{2\textrm{U}}(t) - q_\textrm{D}m_{2\textrm{D}}(t); \end{aligned} \end{aligned}$$

here the probabilities \(\pi _\textrm{U}(t)\) and \(\pi _\textrm{D}(t)\) follow from

$$\begin{aligned} \pi _\textrm{U}(t) = \pi _\textrm{U}(0)\pi _\textrm{UU}(t) + \pi _\textrm{D}(0)\pi _\textrm{DU}(t),\,\,\, \pi _\textrm{D}(t) = \pi _\textrm{U}(0)\pi _\textrm{UD}(t) + \pi _\textrm{D}(0)\pi _\textrm{DD}(t), \end{aligned}$$

with, abbreviating \(q:=q_\textrm{D}+q_\textrm{U}\),

$$\begin{aligned} \pi _\textrm{DU}(t)= 1- \pi _\textrm{DD}(t) = \frac{q_\textrm{D}}{q}(1-e^{-qt}),\,\,\,\, \pi _\textrm{UD}(t)= 1- \pi _\textrm{UU}(t) = \frac{q_\textrm{U}}{q}(1-e^{-qt}). \end{aligned}$$
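Under the assumptions that the system starts empty and the link is initially up, the four mean equations can again be Euler-integrated, feeding in these transition probabilities. A useful sanity check, used in the sketch below, is that the total mean number of files equals \(\lambda t\): every generated file resides either at a client or at the storage unit.

```python
import math

def pi_up(t, qU, qD, up_at_zero=True):
    """P(link up at time t) for the two-state chain; qU is the rate
    up -> down, qD the rate down -> up."""
    q = qU + qD
    x = math.exp(-q * t)
    pi_DU = (qD / q) * (1 - x)   # started down, up at t
    pi_UD = (qU / q) * (1 - x)   # started up, down at t
    return 1 - pi_UD if up_at_zero else pi_DU

def euler_faulty_link_means(lam, gamma, qU, qD, t_end, dt=1e-4):
    """Euler integration of m1U, m1D, m2U, m2D for the faulty-link model,
    starting empty with the link up."""
    m1U = m1D = m2U = m2D = 0.0
    t = 0.0
    for _ in range(int(t_end / dt)):
        pU = pi_up(t, qU, qD)
        pD = 1 - pU
        d1U = -qU * m1U + qD * m1D + lam * pU - gamma * m1U
        d1D = qU * m1U - qD * m1D + lam * pD
        d2U = -qU * m2U + qD * m2D + gamma * m1U
        d2D = qU * m2U - qD * m2D
        m1U += dt * d1U
        m1D += dt * d1D
        m2U += dt * d2U
        m2D += dt * d2D
        t += dt
    return m1U, m1D, m2U, m2D
```

The parameter values in any experiment with this sketch are illustrative; the conservation check \(m_{1\textrm{U}}+m_{1\textrm{D}}+m_{2\textrm{U}}+m_{2\textrm{D}}=\lambda t\) holds regardless of the chosen rates.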

The (reduced) second moments can again be found by relying on the theory of Section 5. As they follow from the same line of reasoning as the first moments, we do not provide the resulting differential equations here.

8.3 Failures in Storage Unit

A second extension incorporates the effect that the storage unit itself can also be faulty, where upon failure all stored files are lost. We let these failures occur according to a Poisson process with intensity \(\bar{\gamma }\). Let \(M_3(t)\) denote the number of lost files. While the differential equations for \(m_{1\textrm{U}}(t)\) and \(m_{1\textrm{D}}(t)\) remain unchanged, the other ones become

$$\begin{aligned} \begin{aligned} m_{2\textrm{U}}'(t)&= -q_\textrm{U}m_{2\textrm{U}}(t) + q_\textrm{D}m_{2\textrm{D}}(t) + \gamma m_{1\textrm{U}}(t) -\bar{\gamma }m_{2\textrm{U}}(t) ,\\ m_{2\textrm{D}}'(t)&= q_\textrm{U}m_{2\textrm{U}}(t) - q_\textrm{D}m_{2\textrm{D}}(t)-\bar{\gamma }m_{2\textrm{D}}(t),\\ m_{3\textrm{U}}'(t)&= -q_\textrm{U}m_{3\textrm{U}}(t) + q_\textrm{D}m_{3\textrm{D}}(t)+\bar{\gamma }m_{2\textrm{U}}(t), \\ m_{3\textrm{D}}'(t)&= q_\textrm{U}m_{3\textrm{U}}(t) - q_\textrm{D}m_{3\textrm{D}}(t)+\bar{\gamma }m_{2\textrm{D}}(t). \end{aligned} \end{aligned}$$

For a company offering this service, a typical design goal would be: how small should \(q_\textrm{U}\), respectively \(\bar{\gamma }\), be to make sure that the mean number of lost files at time t, i.e., \(m_{3\textrm{U}}(t) + m_{3\textrm{D}}(t)\), is below a given threshold? More advanced criteria could also involve the corresponding variance.
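Such a design computation can be sketched as follows: Euler-integrate the six mean equations, read off \(m_{3\textrm{U}}(t)+m_{3\textrm{D}}(t)\), and select the largest failure rate for which the threshold is met. The helper `largest_safe_gbar`, the candidate grid, and all parameter values below are hypothetical, introduced purely for illustration (system starts empty, link initially up).

```python
import math

def mean_lost_files(lam, gamma, gbar, qU, qD, t_end, dt=1e-4):
    """Euler integration of the six mean ODEs for the model with
    storage-unit failures (rate gbar); returns m3U + m3D at t_end,
    the mean number of lost files. Starts empty with the link up."""
    m1U = m1D = m2U = m2D = m3U = m3D = 0.0
    q = qU + qD
    t = 0.0
    for _ in range(int(t_end / dt)):
        pU = 1 - (qU / q) * (1 - math.exp(-q * t))   # P(link up at t)
        pD = 1 - pU
        d1U = -qU * m1U + qD * m1D + lam * pU - gamma * m1U
        d1D = qU * m1U - qD * m1D + lam * pD
        d2U = -qU * m2U + qD * m2D + gamma * m1U - gbar * m2U
        d2D = qU * m2U - qD * m2D - gbar * m2D
        d3U = -qU * m3U + qD * m3D + gbar * m2U
        d3D = qU * m3U - qD * m3D + gbar * m2D
        m1U += dt * d1U
        m1D += dt * d1D
        m2U += dt * d2U
        m2D += dt * d2D
        m3U += dt * d3U
        m3D += dt * d3D
        t += dt
    return m3U + m3D

def largest_safe_gbar(lam, gamma, qU, qD, t_end, threshold, grid):
    """Scan a grid of candidate failure rates; return the largest one keeping
    the mean number of lost files below the threshold (None if none does)."""
    safe = [g for g in grid
            if mean_lost_files(lam, gamma, g, qU, qD, t_end) < threshold]
    return max(safe) if safe else None
```

The same scan can of course be performed over \(q_\textrm{U}\), or refined to a bisection, and extended with the variance-based criteria mentioned above.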

To mitigate the effect of file loss due to failures of the storage unit, an evident policy is to copy all files to multiple storage units. For instance, one could consider the mechanism in which every file is simultaneously copied to two storage units at Poisson instants (again with rate \(\gamma \)), and that each of the storage units fails at Poisson instants (again with rate \(\bar{\gamma }\)). This means that a file is lost only if both storage units have failed. The above analysis (for the case of a single storage unit, that is) can be extended in an evident manner to this situation, thus facilitating the quantification of the performance gain due to making multiple backups. In order to decide whether such a policy should be implemented, this gain should be compared to the cost of the additional storage unit.

9 Concluding Remarks

This paper developed a versatile Markovian model that describes the dissemination of wealth over a population, with the potential to be broadly applied across a wide range of disciplines. It is demonstrated that the evaluation of transient moments reduces to solving a system of coupled linear differential equations, while their stationary counterparts require solving (ordinary) linear systems of equations. A stability condition has also been derived. Three examples evidence the model’s wide application potential.

There is scope for generalizations in various directions. In particular, conditional on the state of the background process, the wealth units evolve independently, which is not a realistic mechanism in all application areas. In addition, one may try to relax the various exponentiality assumptions.