1 Introduction

The mathematical modelling of social interactions has been a topic of high recent interest. In early years this was naturally a field of qualitative models, which could at most be used to explain a few macroscopic statistical data, and hence the interest in computing detailed distributions or spatial dependencies was limited. With the spread of the internet, and in particular of social networks, the field has changed significantly in recent years, since there is suddenly a huge amount of data to which models and predictions can be compared. This change is accompanied by increasing computational power, which allows for microscopic simulations. The corresponding field of agent-based models is of increasing importance within the social sciences and related fields like history or linguistics (cf. e.g. [3, 40, 43]).

From a mathematical point of view, it is natural to approach the transition from microscopic interaction to macroscopic models with the methods of statistical physics and kinetic theory, yielding (systems of) partial differential equations for distributions, with well-established asymptotic methods to further simplify or to analyze pattern formation (cf. [6, 8, 17,18,19, 22, 34, 36, 46, 50]). Such approaches have been used recently with success to explain macroscopic distributions in socio-economic interactions (cf. e.g. [15, 31, 32, 41, 48, 50,51,52, 59, 60]) as well as several aspects of opinion formation and polarization (cf. e.g. [9, 12, 27, 58]).

An issue that is naturally built into social processes is the network structure of interactions, which is commonly modelled via random networks. In meso- and macroscopic limits the network structure is usually lost, and only a few general characteristics of the network models feed into the remaining equations. From a rigorous point of view, the network limit poses particular challenges that are only partly resolved (cf. [20, 23]). In this paper we want to take another route towards incorporating a certain network structure into meso- and macroscopic models. We avoid describing the detailed network structure of the N-particle (resp. agent) system, but rather describe the agents by a structural variable x (that can be considered as the spatial variable), which describes the position of an agent within a network. Then we can associate to each agent at x and a second agent at \(x^{\prime }\) a weight in the network, which corresponds to a rate of interaction, and consider the agents overall as indistinguishable in the larger space of configurations consisting of their state and position, which reduces the mean-field limit to a standard setting. The limit then yields a kinetic equation with an additional structural variable. Let us mention that the Boltzmann-Povzner kinetic models previously studied for multi-agent systems such as flocking (cf. [30]) can also be viewed in this spirit: due to the physics of the underlying systems, x is naturally a spatial variable in these models and the weight depends only on the distance \(x-x^{\prime }\). Slight differences to our approach arise from the fact that x changes by standard transport in those applications and the type of interaction is rather symmetric, neither of which is the case in the modelling setup we consider here.

When the structural variable x is viewed as a spatial variable, there is an interesting connection to the classical modelling of spatial systems with interactions such as reaction-diffusion equations (possibly with nonlocal diffusion). The key difference is that in our approach the particles do not change their position x when interacting with others, while in classical kinetic theory interactions are local in space and only appear when the particles change position. We will comment on the differences between these models and implications to macroscopic equations. Let us mention that with respect to social behaviour the classical approach is very natural for face-to-face interactions, thus basically the only relevant one up to the last century. In modern digital communication, a fully nonlocal interaction without any change of position seems to be more relevant however.

We will highlight our approach by case studies for three applications. The first concerns the propagation of dialects, which has been a topic of strong recent interest (cf. e.g. [13, 14, 37, 42, 47, 61]). Using different arguments, a macroscopic PDE model has been obtained by Burridge [13, 14], which is remarkably successful in explaining the dialect maps in different countries as well as their coarsening. Naturally the structural variable is space and the rate of interactions between x and \(x^{\prime }\) is a frequency of communication, still maximal at close distance. We will show that through the basic assumptions on the interactions made in [13], we can derive a network structured kinetic model. The original model by Burridge is recovered as a monokinetic solution of a Vlasov equation approximation of this model. Moreover, we show that solutions of the Vlasov model converge exponentially to monokinetic solutions in a Wasserstein-type metric. It is apparent that the original kinetic model, respectively a second-order Fokker–Planck type approximation, includes further information about the stochastic nature of the process. The second case study concerns the emergence of social norms, which has mainly been studied by agent-based models and their direct simulation (cf. e.g. [33, 39, 53]). We use a recent agent-based model proposed by Shaw [56] and perform analogous reasoning. The properties of the meso- and macroscopic models can easily be used to understand the long-time evolution and the emergence of segregation and coarsening effects as observed in simulations. The final one concerns the spread of an epidemic, actually a classical topic of mathematical modelling, for which we use a novel approach motivated by recent findings on pandemic spread.

As another outcome of these case studies we highlight the frequently very asymmetric nature of interactions. There is an active and a passive agent in each interaction. The active agent chooses some action with some probability depending on the state, but does not change its state due to the interaction. Vice versa, the passive agent only changes its state based on the action of the active one. Our approach can be used to derive macroscopic models and study pattern formation or phase separation effects in such models.

The overall organization of the paper is as follows: in Section 2 we introduce the basic modelling approach and derive mean-field models via a hierarchy of marginals. We also discuss further approximations for small change in the interactions. In Section 3 we investigate a model for the evolution of dialects (respectively phonetic variables) and show that our approach reproduces the previously proposed macroscopic model by Burridge [13] as a monokinetic solution in a natural Vlasov approximation. We also show that for this case there is a decay of variance in the Vlasov approximation, i.e. the monokinetic solutions approximate the overall dynamics well. In Section 4 we investigate a model for the construction of social norms, which has been studied by agent-based simulations previously. We demonstrate that the model fits into our framework in an analogous way and derive a macroscopic model of similar structure to the dialect model. In Section 5 we discuss a network structured model for the spread of a pandemic disease, which highlights the similarities and differences to conventional nonlocal reaction-diffusion models. We finally conclude and present several open questions in Section 6.

2 Network-structured Kinetic Equations

The setup in this paper is as follows: We start with a system of N particles (synonymously called agents), each described by a structural variable \(x_{i} \in \mathbb {R}^{d}\) and a state \(v_{i} \in \mathbb {R}^{s}\) for i = 1,…,N. We will use the notation \(z_{i} =(x_{i},v_{i}) \in \mathbb {R}^{d+s}\) for the phase-space variable. Contrary to the classical transport model, we assume the position to be fixed, i.e., \(\frac {dx_{i}}{dt} = 0\); it is only used to encode a weighting of the particle interactions with a rate \(\omega^{N}(x_{i},x_{j})\). We thus obtain a network (more precisely a finite weighted graph) of particles with weights \(w_{ij} = \omega^{N}(x_{i},x_{j})\), together with a discrete function \(z_{i}\) on the vertices of the network. Interactions between particles i and j are not assumed to be necessarily symmetric or conservative; instead there is often an active and a passive role. Let us start with a general two-particle interaction between the i-th and the j-th particle

$$ \begin{array}{@{}rcl@{}} v_{i}^{\prime} &=& v_{i} + a ,\\ v_{j}^{\prime} &=& v_{j} + b , \end{array} $$

where a and b are random variables chosen from a joint probability distribution \(\nu _{z_{i},z_{j}}\) on \(\mathbb {R}^{s} \times \mathbb {R}^{s}\).

The particularly relevant case for social interactions, as we shall also see in case studies below, is when one of the particles assumes an active and the other a passive role. An example is an opinion expressed by the active particle leading to a change of opinion of the passive one. Often \(\nu _{z_{i},z_{j}}\) is concentrated at zero in a, reflecting that the active agent does not change state. Given some initial distribution of the particles, the system can be described via the probability measure μN(⋅;t) on \(\mathbb {R}^{(d+s)N}\) for each \(t \in \mathbb {R}_{+}\), whose evolution is governed by

$$ \frac{d}{d t} \int \psi(Z) \mu_{N}(dZ_{N};t) = \underset{i \neq j}{\sum} \int\int \omega^{N}\left( x_{i}, x_{j}\right) \left( \psi(Z_{N}^{a,b,i,j})-\psi(Z_{N})\right) \nu_{z_{i},z_{j}}(da,db)\mu_{N}(dZ_{N};t) $$
(2.1)

for each smooth test function ψ on \(\mathbb {R}^{(d+s)N}\), where \(Z_{N} = (z_{1},\ldots,z_{N})\) and \(Z_{N}^{a,b,i,j}\) is a modified version of \(Z_{N}\) with \(v_{i}\) and \(v_{j}\) changed to \(v_{i} + a\) and \(v_{j} + b\). The right-hand side of (2.1) can be simplified using

$$ \begin{array}{@{}rcl@{}} &&\int\int \omega^{N}\left( x_{i}, x_{j}\right) \left( \psi(Z_{N}^{a,b,i,j})-\psi(Z_{N})\right)\nu_{z_{i},z_{j}}(da,db)\mu_{N}(dZ_{N};t) \\ &=& \int \omega^{N}(x_{i}, x_{j}) \left( \int \psi(Z_{N}^{a,b,i,j}) \nu_{z_{i},z_{j}}(da,db) - \psi(Z_{N}) \right) \mu_{N}(dZ_{N};t). \end{array} $$

We can verify the well-posedness of this evolution equation as in the Picard–Lindelöf Theorem on the space of Radon measures together with conservation of mass and nonnegativity if ωN and \(\nu _{z_{i},z_{j}}\) depend continuously on the states:

Theorem 2.1

Let \({\mu _{N}^{0}} \in \mathcal {P}(\mathbb {R}^{(d+s)N})\), \(\omega ^{N} \in C_{b}(\mathbb {R}^{2d})\), and \(\nu \in C_{b}(\mathbb {R}^{d+s} \times \mathbb {R}^{d+s};\mathcal { P}(\mathbb {R}^{2s}))\). Then there exists a unique solution

$$ \mu_{N} \in C^{1}(\mathbb{R}_{+}; \mathcal{P}(\mathbb{R}^{(d+s)N})) $$

of (2.1) with initial value \({\mu _{N}^{0}}\).

Proof

The evolution is of the form

$$ \partial_{t} \mu_{N} = \mathcal{L}^{\ast} \mu_{N}, $$

with \({\mathscr{L}}^{\ast }: \mathcal {P}(\mathbb {R}^{(d+s)N}) \rightarrow \mathcal {P}(\mathbb {R}^{(d+s)N})\) defined as the adjoint of

$$ \begin{array}{@{}rcl@{}} \mathcal{L}: C_{b}(\mathbb{R}^{(d+s)N})~&\rightarrow&~ C_{b}(\mathbb{R}^{(d+s)N}), \\ \psi~ &\mapsto&~ \underset{i \neq j}{\sum} \int \omega^{N}(x_{i}, x_{j}) \left( \psi(Z_{N}^{a,b,i,j})-\psi(Z_{N})\right)\nu_{z_{i},z_{j}}(da,db). \end{array} $$

With our assumptions on ν and ωN the linear operator \({\mathscr{L}}\) is well-defined and bounded and so is \({\mathscr{L}}^{\ast }\). Thus, the Picard–Lindelöf theorem immediately yields existence and uniqueness of a solution in the Banach space of Radon measures. The fact that μN preserves mass one in time follows immediately with ψ ≡ 1 and the preservation of nonnegativity follows with the form

$$ \begin{array}{@{}rcl@{}} && \frac{d}{d t} \left( e^{\lambda N(N-1)t}\int \psi(Z) \mu_{N}(dZ_{N};t) \right)\\ &&\quad= e^{\lambda N(N-1)t} \underset{i \neq j}{\sum} \int \omega^{N}(x_{i}, x_{j}) \int \psi(Z_{N}^{a,b,i,j})\nu_{z_{i},z_{j}}(da,db) \mu_{N}(dZ_{N};t)\\ &&\qquad + e^{\lambda N(N-1)t} \underset{i \neq j}{\sum} \int (\lambda- \omega^{N}(x_{i}, x_{j})) \psi(Z_{N}) \mu_{N}(dZ_{N};t) \end{array} $$

and \(\lambda \geq \|\omega ^{N}\|_{\infty }\). □

Let us mention that alternatively we can derive existence and uniqueness in the W1-Wasserstein metric (equivalent to the bounded Lipschitz metric)

$$ d_{W_{1}}(\mu_{N};\tilde{\mu}_{N}) = \underset{\psi \in C^{0,1}, \|\psi\| \leq 1}{\sup} \int \psi(z) \mu_{N}(dz) - \int \psi(z) \tilde{\mu}_{N}(dz), $$

if ωN and the maps

$$ (z,\tilde{z}) \mapsto \int |a| \nu_{z,\tilde{z}}(da,db), \qquad (z,\tilde{z}) \mapsto \int |b| \nu_{z,\tilde{z}}(da,db) $$

are Lipschitz-continuous.
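The N-agent dynamics behind (2.1) can be illustrated by a direct stochastic simulation. The following is a minimal sketch, not part of the original model description: it assumes that each ordered pair (i, j) interacts after an exponential waiting time with rate \(\omega^{N}(x_{i},x_{j})\), and the names `simulate_agents`, `sample_jumps` and `persuasion_jump` are hypothetical. The example jump law is an arbitrary choice with an active agent (a = 0) and a passive agent that moves halfway towards the state of the active one.

```python
import numpy as np

def simulate_agents(x, v, omega, sample_jumps, t_end, rng):
    """Stochastic (Gillespie-type) simulation of the N-agent system.

    x: (N, d) fixed structural variables; v: (N, s) initial states.
    omega(xi, xj): nonnegative interaction rate of the ordered pair (i, j).
    sample_jumps(zi, zj, rng): returns (a, b), a draw from nu_{zi, zj}.
    """
    x = np.asarray(x, dtype=float)
    v = np.array(v, dtype=float)  # copy; the states are updated below
    n = len(x)
    # The rates depend only on the fixed positions, so compute them once.
    rates = np.array([[omega(x[i], x[j]) if i != j else 0.0
                       for j in range(n)] for i in range(n)])
    total = rates.sum()
    if total <= 0.0:
        return v  # no interactions at all
    probs = (rates / total).ravel()
    t = rng.exponential(1.0 / total)
    while t < t_end:
        # Choose the interacting ordered pair proportionally to its rate.
        i, j = divmod(rng.choice(n * n, p=probs), n)
        a, b = sample_jumps((x[i], v[i]), (x[j], v[j]), rng)
        v[i] = v[i] + a
        v[j] = v[j] + b
        t += rng.exponential(1.0 / total)
    return v

def persuasion_jump(zi, zj, rng):
    """Hypothetical jump law: agent i is active (a = 0), agent j is passive."""
    vi, vj = zi[1], zj[1]
    return np.zeros_like(vi), 0.5 * (vi - vj)
```

Note that the positions never change in this sketch, in line with the assumption \(\frac {dx_{i}}{dt} = 0\).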

2.1 Mean-Field Limit

In the following we shall derive a mean-field limit of the evolution equation (2.1) using a key assumption on the weights, namely

$$ \omega^{N}(x,y) = \frac{1}N w(x,y) \qquad \forall (x,y) \in \mathbb{R}^{d} \times \mathbb{R}^{d}, $$

which is a natural scaling. Then (2.1) reads

$$ \frac{d}{d t} \int \psi(Z) \mu_{N}(dZ_{N};t) = \frac{1}N \underset{i \neq j}{\sum} \int\!\!\int w\left( x_{i}, x_{j}\right) \left( \psi(Z_{N}^{a,b,i,j})-\psi(Z_{N})\right) \nu_{z_{i},z_{j}}(da,db) \mu_{N}(dZ_{N};t) $$

and we can derive in a standard way a BBGKY-type hierarchy for the marginals

$$ \mu_{N:k} = \int {\ldots} \int \mu_{N}(dz_{k+1} {\ldots} dz_{N}). $$

Equations for the marginals can be derived easily when using a test function ψk that only depends on Zk = (z1,…,zk), which yields for k = 1,…,N

$$ \begin{array}{@{}rcl@{}} &&\frac{d}{d t} \int \psi_{k}(Z_{k}) \mu_{N:k}(dZ_{k};t)\\ && = \frac{1}N \underset{1 \leq i \neq j \leq k}{\sum} \int\int w\left( x_{i}, x_{j}\right) \left( \psi_{k}(Z_{k}^{a,b,i,j})-\psi_{k}(Z_{k})\right)\nu_{z_{i},z_{j}}(da,db) ~\mu_{N:k}(dZ_{k};t)\\ && \quad + \frac{N-k}N \sum\limits_{i=1}^{k} \int\int w\left( x_{i}, x_{k+1}\right) \left( \psi_{k}(Z_{k}^{a,i})-\psi_{k}(Z_{k})\right) ~\nu_{z_{i},z_{k+1}}(da,db) ~\mu_{N:k+1}(dZ_{k+1};t)\\ && \quad+ \frac{N-k}N \sum\limits_{i=1}^{k} \int\int w\left( x_{i}, x_{k+1}\right) \left( \psi_{k}(Z_{k}^{b,i})-\psi_{k}(Z_{k})\right) ~\nu_{z_{k+1},z_{i}}(da,db) ~\mu_{N:k+1}(dZ_{k+1};t). \end{array} $$

Here we use the notation \(Z_{k}^{a,i}\) (respectively \(Z_{k}^{b,i}\)) for a version of \(Z_{k}\) with \(v_{i}\) changed to \(v_{i} + a\) (respectively \(v_{i} + b\)). In the limit \(N\rightarrow \infty \) we formally arrive at the infinite hierarchy

$$ \begin{array}{@{}rcl@{}} && \frac{d}{d t} \int \psi_{k}(Z_{k}) \mu_{\infty:k}(dZ_{k};t) \\ &&= \sum\limits_{i=1}^{k} \int\int w\left( x_{i}, x_{k+1}\right) \left( \psi_{k}(Z_{k}^{a,i})-\psi_{k}(Z_{k})\right)\nu_{z_{i},z_{k+1}}(da,db) ~\mu_{\infty:k+1}(dZ_{k+1};t) \\ && \quad + \sum\limits_{i=1}^{k} \int\int w\left( x_{i}, x_{k+1}\right) \left( \psi_{k}(Z_{k}^{b,i})-\psi_{k}(Z_{k})\right) \nu_{z_{k+1},z_{i}}(da,db) ~\mu_{\infty:k+1}(dZ_{k+1};t) \end{array} $$

for \(k \in \mathbb {N}\).

2.2 Kinetic Equation

The infinite hierarchy of marginals allows for a solution in terms of product measures \(\mu _{\infty :k} = \mu ^{\otimes k}\), which characterizes the mean-field limit. The single-particle measure \(\mu =\mu _{\infty :1}\) solves the kinetic equation

$$ \begin{array}{@{}rcl@{}} \frac{d}{d t} \int \varphi(z) \mu (dz;t) &=& \int\int\int w\left( x,\tilde{x}\right) \left( \varphi(z^{a})-\varphi(z)\right)\nu_{z,\tilde{z}}(da,db)\mu(d\tilde{z};t)\mu(dz;t) \\ &&+ \int\int\int w\left( x,\tilde{x}\right) \left( \varphi(z^{b})-\varphi(z)\right) \nu_{\tilde{z},z}(da,db)\mu(d\tilde{z};t)\mu(dz;t). \end{array} $$

We define η as the projection of the measure μ to the structural variables, i.e.,

$$ \eta(\cdot;t) = \int \mu(\cdot,dv;t). $$
(2.2)

By using test functions of the form φ(z) = ψ(x) we immediately see

$$ \frac{d}{d t} \int \psi(x) \eta(dx;t) = \frac{d}{d t} \int \varphi(z) \mu (dz;t) = 0, $$

thus it is straight-forward to show the following result:

Lemma 2.2

Assume \(\eta \in C(0,T;\mathcal {P}(\mathbb {R}^{d}))\) is well-defined by (2.2). Then η is stationary, i.e. η(⋅;t) = η(⋅;0) for all t ∈ [0,T].

The stationarity of η is a natural consequence of our modelling assumption that the network does not change. For higher moments we do not get equally simple results, e.g. the evolution of the first moment in v is determined by

$$ \frac{d}{d t} \int v \mu (dz;t) = \int\int W(z,\tilde{z}) \mu(d\tilde{z};t)\mu(dz;t) $$

with

$$ W(z,\tilde{z}) = w(x,\tilde{x}) \left( \int a \nu_{z,\tilde{z}}(da,db) + \int b\nu_{\tilde{z},z}(da,db) \right). $$

2.3 Vlasov Approximation

As usual for kinetic equations we can proceed to local approximations if the changes a and b are small. If their higher order moments are negligible compared to the expectations, we can proceed in a straight-forward way to a Vlasov approximation, which is given in weak formulation

$$ \frac{d}{d t} \int \varphi(z) \mu (dz;t) = \int\int \nabla_{v} \varphi(z) W(z,\tilde{z})\mu(d\tilde{z};t) \mu(dz;t) $$
(2.3)

with

$$ W(z,\tilde{z}) = w(x,\tilde{x}) \left( \int a \nu_{z,\tilde{z}}(da,db) + \int b\nu_{\tilde{z},z}(da,db) \right). $$
(2.4)

In order to perform suitable asymptotic analysis it is more convenient to assume that there is a small parameter ε scaling the interactions and

$$ \int (v^{a} - v)\nu_{z,\tilde{z}}(da,db) + \int (v^{b} - v) \nu_{\tilde{z},z}(da,db) = \mathcal{O}(\varepsilon^{\alpha}) $$

for some α > 0, and rescale time by \(\varepsilon^{\alpha}\) as well. Then we again obtain (2.3), now with

$$ W(z,\tilde{z}) = \underset{\varepsilon \downarrow 0}{\lim}\varepsilon^{-\alpha} w(x,\tilde{x}) \left( \int a \nu_{z,\tilde{z}}(da,db) + \int b\nu_{\tilde{z},z}(da,db) \right). $$

The corresponding strong form is

$$ \partial_{t} \mu + \nabla_{v} \cdot (\mu \int W(z,\tilde{z})\mu(d\tilde{z};t) ) =0, $$
(2.5)

which is reminiscent of the classical Vlasov equation, however without transport term in x and possibly a strong interaction in the v-space. It is well-known that the stability of this type of equations for sufficiently smooth W can be derived in Wasserstein metrics (cf. [24, 36]) or equivalently via the method of characteristics (cf. [10, 36, 49]). In the case of (2.5), (2.4) the characteristic curves are given by the solutions of

$$ \frac{dX}{dt}(z,t) = 0, \qquad \frac{dV}{dt}(z,t) = \int W(X(z,t),V(z,t),X(\tilde{z},t),V(\tilde{z},t)) \mu_{0}(d\tilde{z}) $$

with initial values X(z,0) = x, V(z,0) = v. Due to the stationarity of X we have X(x,v,t) = x and can formulate the characteristics solely in V as

$$ \frac{dV}{dt}(z,t) = \int W(x,V(z,t),\tilde{x},V(\tilde{z},t)) \mu_{0}(d\tilde{z}). $$

If W is sufficiently regular, in particular Lipschitz with respect to V, the existence and uniqueness as well as stability estimates can again be obtained by ODE techniques.
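For an empirical initial measure \(\mu_{0}\) supported on n sample points, the characteristic system reduces to a finite system of ODEs. The following sketch integrates it with the explicit Euler method; the function names and the consensus-type kernel `consensus_W` are illustrative assumptions and are not taken from the case studies.

```python
import numpy as np

def vlasov_characteristics(x, v0, W, t_end, dt):
    """Explicit Euler for dV_k/dt = (1/n) sum_l W(x_k, V_k, x_l, V_l).

    x: (n, d) fixed positions, v0: (n, s) initial states, both sampling mu_0.
    W(x, v, xt, vt) must broadcast to an array of shape (n, n, s).
    """
    x = np.asarray(x, dtype=float)
    v = np.array(v0, dtype=float)
    for _ in range(int(round(t_end / dt))):
        drift = W(x[:, None, :], v[:, None, :], x[None, :, :], v[None, :, :])
        v = v + dt * drift.mean(axis=1)  # average over the second agent
    return v

def consensus_W(x, v, xt, vt):
    """Hypothetical kernel W = w(x, xt) (vt - v) with a Gaussian weight w."""
    w = np.exp(-np.sum((x - xt) ** 2, axis=-1, keepdims=True))
    return w * (vt - v)
```

Since the weight in `consensus_W` is symmetric, the mean of the states is conserved along this particular flow, which provides a simple sanity check of an implementation.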

Similar to the analysis along characteristics, we can also find a particular class of solutions corresponding to monokinetic solutions in classical kinetic theory. Monokinetic solutions are of the form \(\mu(dz;t) = \eta(dx) \otimes \delta_{V(x,t)}(dv)\), with a nonnegative Radon measure η on \(\mathbb {R}^{d}\) and V being a solution of

$$ \partial_{t} V(x,t) = \int W(x,V(x,t),\tilde{x},V(\tilde{x},t))\eta(d\tilde{x}). $$
(2.6)

Note that as before η is a stationary measure, which is due to the stationarity of characteristics in x-space. Under suitable properties of the kernel W (respectively the measure ν) we may find exponentially fast convergence of solutions of the Vlasov equation to monokinetic ones with initial values \(V(x,0) = {\int \limits } v \mu _{0}(dz)\), as we shall see in case studies below. Then equation (2.6) for V is the relevant one to understand the dynamics and possible pattern formation. Indeed it is an interesting nonlinear and nonlocal equation, which can yield rich dynamics such as phase separation and coarsening, again illustrated below in examples. In other cases it can be relevant to study the full network-structured kinetic equation respectively its Vlasov approximation.

2.4 Fokker–Planck Approximation

For a better approximation of the variance the second moment can be included in order to obtain a nonlinear and nonlocal Fokker–Planck equation, in weak form

$$ \begin{array}{@{}rcl@{}} \frac{d}{d t} \int \varphi(z) \mu (dz;t) &=& \int\int \nabla_{v} \varphi(z) W(z,\tilde{z})~\mu(d\tilde{z};t)\mu(dz;t) \\ &&+ \int\int {\nabla_{v}^{2}} \varphi(z) : A(z,\tilde{z})\mu(d\tilde{z};t)\mu(dz;t) \end{array} $$

with W defined by (2.4), the colon denoting the Frobenius scalar product, and

$$ A(z,\tilde{z}) = \frac{1}2 w(x,\tilde{x}) \left( \int a \otimes a \nu_{z,\tilde{z}}(da,db) + \int b \otimes b \nu_{\tilde{z},z}(da,db) \right). $$

In strong form the Fokker–Planck equation becomes

$$ \partial_{t} \mu + \nabla_{v} \cdot \left( \mu \int W(z,\tilde{z})\mu(d\tilde{z};t)\right) = \nabla_{v} \cdot \left( \nabla_{v} \cdot \left( \mu \int A(z,\tilde{z})\mu(d\tilde{z};t)\right)\right). $$

We leave a detailed discussion of the analysis of the second order equation for typical interactions as considered in the examples later to future research.

2.5 Variants: Discrete Structures or States

We have formulated the model above in a purely continuum setting. However, there are several variants of the model in discrete or semidiscrete settings. An obvious case is related to a finite set of structural variables x ∈{x1,…,xM}. This can be set up in an analogous way as the model above, choosing a measure of the form

$$ \mu(dx,dv;t) = \frac{1}M \sum\limits_{i=1}^{M} \delta_{x_{i}}(dx) \lambda_{i}(dv;t). $$

The weights w need to be specified only for the discrete values xi.

Another semidiscrete case concerns the state variables v, between which the agents oscillate. Such a model arises as a special case of our approach if \(\nu _{z,\tilde {z}}\) and μ(⋅,dv;t) are concentrated at a finite number of possible states, with transitions a and b such that this finite state space remains invariant.

3 Case Study: Evolution of Dialects

The first model we study in our framework is related to the evolution of dialects, as discussed by Burridge [13]. We will rederive this model as a monokinetic solution to a network-structured equation, respectively a spatially local approximation.

The model by Burridge [13] is based on the memory of the way certain vowels are used in words, which change during interactions with others (i.e. hearing them speak). Under the assumption that there are M ways to use that vowel, the memory of each agent is of the form v = (v1,…,vM) being an element of the convex set

$$ K=\left\{\textbf{v} \in \mathbb{R}^{M}~|~v_{i} \geq 0, \sum v_{j} = 1 \right\}, $$

i.e. \(v_{i}\) is perceived as a relative frequency of the appearance of certain words. When using the vowel in conversations, an agent with memory v will choose variant i with probability \(p_{i}(\textbf{v})\); the expression proposed in [13] is

$$ p_{i}(\textbf{v}) = \frac{v_{i}^{\alpha}}{{\sum}_{j} v_{j}^{\alpha}} $$
(3.1)

with α > 1 in order to give stronger weight to those with highest memory. We will also use the notation

$$ \textbf{p}: K \rightarrow K,\quad \textbf{v} \mapsto (p_{i}(\textbf{v}))_{i} . $$

It can be shown that with the above choice of p with α ≥ 1 we obtain a monotone invertible map \(\textbf {p}: K \rightarrow K\) (cf. [26]).
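The map p in (3.1) is straightforward to evaluate numerically. The following sketch (with the hypothetical function name `usage_probabilities`) illustrates how an exponent α > 1 amplifies the dominant variant, while α = 1 returns the memory itself.

```python
import numpy as np

def usage_probabilities(v, alpha):
    """Evaluate p_i(v) = v_i^alpha / sum_j v_j^alpha along the last axis."""
    v = np.asarray(v, dtype=float)
    powered = v ** alpha
    return powered / powered.sum(axis=-1, keepdims=True)

v = np.array([0.5, 0.3, 0.2])
p_linear = usage_probabilities(v, 1.0)  # identical to v
p_sharp = usage_probabilities(v, 2.0)   # gives more weight to the first variant
```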

The collisions are due to hearing a certain variant, with post-collisional memory

$$ \textbf{v}_{i}^{\prime}= \frac{1}{1+\gamma} \textbf{v} + \frac{\gamma}{1+\gamma} \textbf{e}_{i} $$

in case the speaker (active agent) has chosen variant i, with ei being the i-th unit vector. This leads to the following formula for the pre-collisional memory

$$ \textbf{v}_{i}^{\ast}= (1+\gamma) \textbf{v} - \gamma \textbf{e}_{i}. $$

Here γ > 0 is a parameter related to the weight given to the last appearance compared to the long-term memory. It is natural to think of γ as a small parameter, since a single appearance of a variant will have low impact.
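The two update formulas can be checked against each other in a few lines. In the following sketch the function names are hypothetical; it confirms that the pre-collisional formula inverts the post-collisional one and that the update keeps the memory in the set K.

```python
import numpy as np

def hear_variant(v, i, gamma):
    """Post-collisional memory v' = v / (1 + gamma) + gamma / (1 + gamma) e_i."""
    v = np.asarray(v, dtype=float)
    e = np.zeros_like(v)
    e[i] = 1.0
    return (v + gamma * e) / (1.0 + gamma)

def pre_collisional(v, i, gamma):
    """Pre-collisional memory v* = (1 + gamma) v - gamma e_i."""
    v = np.asarray(v, dtype=float)
    e = np.zeros_like(v)
    e[i] = 1.0
    return (1.0 + gamma) * v - gamma * e

v = np.array([0.5, 0.3, 0.2])
v_after = hear_variant(v, 1, 0.1)          # the listener hears variant 1
v_back = pre_collisional(v_after, 1, 0.1)  # recovers v up to rounding
```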

3.1 Boltzmann Equation

The dialect model can be put in a semidiscrete state setting corresponding to our general framework above. There are only M different transitions possible; the total state space is however continuous due to the dependence of the transitions on v. With the active agent in state z and the passive agent in state \(\tilde{z}\), the interaction measure is

$$ \nu_{z,\tilde{z}}(da,db) = \delta_{0}(da) \otimes \sum\limits_{i=1}^{M} p_{i}(\textbf{v}) \delta_{\frac{\gamma}{1+\gamma} (\textbf{e}_{i}-\tilde{\textbf{v}})}(db). $$

Thus, the corresponding Boltzmann-type equation for the evolution of the measure μ is given by

$$ \frac{d}{dt} \int \varphi(z) \mu(dz;t) = \sum\limits_{i=1}^{M} \int\int w(x,\tilde{x}) \left( \varphi(z_{i}^{\prime}) - \varphi(z)\right) p_{i}(\tilde{\textbf{v}}) \mu(d\tilde{z};t) \mu (dz;t) $$

with \(z_{i}^{\prime }=(x,\textbf {v}_{i}^{\prime })\). Note that due to \( {\sum }_{i=1}^{M} p_{i}(\textbf {v}) = 1\) for all v we can simplify the loss term to

$$ \sum\limits_{i=1}^{M} \int\int w(x,\tilde{x}) \varphi(z) p_{i}(\tilde{\textbf{v}}) \mu(d\tilde{z};t) \mu (dz;t) = \int \kappa(x) \varphi(z) \mu(dz;t) $$

with

$$ \kappa(x) = \int w(x,\tilde{x}) \mu(d\tilde{z};t) = \int w(x,\tilde{x}) \rho(\tilde{x})~d\tilde{x}. $$

Note that we used the stationarity of \({\int \limits } \mu (\cdot ,dv;t)\) to derive the stationary coefficient κ.

In the remainder of this section we shall assume that μ is absolutely continuous with respect to the Lebesgue measure on \(\mathbb {R}^{d} \times K\) and write μ = f dz with a probability density f. Moreover, we define the spatial density

$$ \rho(x,t) = \int f(x,v,t)~dv. $$

For this special interaction we can also get an equation for the mean value

$$ \rho(x,t) V(x;t) = \int v f(x,\textbf{v};t)~d\textbf{v} $$

by choosing φ(z) = ψ(x) ⊗ v, namely

$$ \begin{array}{@{}rcl@{}} \frac{d}{dt} \int \psi(x) \rho(x,t) V(x,t)~dx &=& \sum\limits_{i=1}^{M} \int\int w(x,\tilde{x}) \psi(x) \left( \frac{\gamma}{1+\gamma} (\textbf{e}_{i}-\textbf{v})\right) p_{i}(\tilde{\textbf{v}})f(\tilde{z};t)~d\tilde{z}~ f(z;t)~dz \\ &=& \frac{\gamma}{1+\gamma} \int\int \psi(x) w(x,\tilde{x}) \rho(x) \textbf{p}(\tilde{\textbf{v}}) f(\tilde{z};t)~d\tilde{z}~ dx\\ &&-\frac{\gamma}{1+\gamma} \int \psi(x) \kappa(x) \rho(x,t) V(x,t) ~dx. \end{array} $$

This is a non-closed equation if p is nonlinear; in particular we see that v is not a collision invariant, which is due to the asymmetric structure of the interactions.

3.2 Vlasov Approximation

Since γ is naturally a small parameter, we may perform a (formal) asymptotic expansion as \(\gamma \rightarrow 0\) in order to derive a Vlasov equation. As above, with an additional rescaling of time by \(\frac {\gamma }{1+\gamma }\) we arrive at the equation in weak form

$$ \frac{d}{dt} \int \varphi(x,\textbf{v}) f(x,\textbf{v},t)~dz = -\int w(x,\tilde{x}) \nabla_{\textbf{v }} \varphi(x,\textbf{v}) f(x,\textbf{v},t)\int f(\tilde{x},\tilde{\textbf{v}},t) (\textbf{v} - \textbf{p}(\tilde{\textbf{v}})) ~d\tilde{z}~dz $$

respectively in strong form

$$ \partial_{t} f(x,\textbf{v},t) = \nabla_{\textbf{v}} \cdot \left( f(x,\textbf{v},t) \int w(x,\tilde{x}) f(\tilde{x},\tilde{\textbf{v}},t) (\textbf{v} - \textbf{p}(\tilde{\textbf{v}})) ~d\tilde{z} \right). $$
(3.2)
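As a consistency check, the drift in (3.2) can be read off from the general formula (2.4) with the interaction measure of Section 3.1. The following short computation is a formal sketch that only uses \({\sum }_{i} p_{i}(\tilde {\textbf {v}}) = 1\):

$$ \begin{array}{@{}rcl@{}} \int a~ \nu_{z,\tilde{z}}(da,db) &=& 0, \\ \int b~ \nu_{\tilde{z},z}(da,db) &=& \frac{\gamma}{1+\gamma} \sum\limits_{i=1}^{M} p_{i}(\tilde{\textbf{v}}) (\textbf{e}_{i} - \textbf{v}) = \frac{\gamma}{1+\gamma} \left( \textbf{p}(\tilde{\textbf{v}}) - \textbf{v} \right). \end{array} $$

Hence, after rescaling time by \(\frac {\gamma }{1+\gamma }\), the kernel is \(W(z,\tilde {z}) = w(x,\tilde {x}) (\textbf {p}(\tilde {\textbf {v}}) - \textbf {v})\), and inserting it into (2.5) gives (3.2).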

3.2.1 Monokinetic Solutions

The monokinetic solutions in the case of the dialect model are of the form \(f(x,\textbf{v},t) = \rho(x)\delta_{V(x,t)}(\textbf{v})\), where V solves the nonlocal equation

$$ \partial_{t} V(x,t) = \int w(x,\tilde{x})\left( \textbf{p}(V(\tilde{x},t)) - V(x,t)\right) \rho(\tilde{x}) d\tilde{x}. $$

It is instructive to rewrite the model as

$$ \rho(x) \partial_{t} V(x,t) = - \rho(x) \kappa(x) (V - \textbf{p}(V)) + \int w(x,\tilde{x}) \rho(x)\rho(\tilde{x}) \left( \textbf{p}(V(\tilde{x},t)) - \textbf{p}(V(x,t)) \right) d\tilde{x}, $$
(3.3)

which highlights its structure as a nonlocal reaction-diffusion equation. The first term is a multistable reaction, and it is easy to figure out that its stable steady states are in the corners of K, which corresponds to phase separation. The second term is a nonlocal diffusion operator acting on p(V ), which is known to promote coarsening behaviour as in the celebrated Allen–Cahn equation (cf. [1, 28, 44]).
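The nonlocal equation for V can be explored numerically on a finite set of sites. The sketch below is an illustration under assumptions that are not part of the model: the integral is replaced by a sum over sites with masses `rho`, time is discretized by explicit Euler, and the function names are hypothetical. For a time step with \(\varDelta t\, \kappa \leq 1\) each step is a convex combination, so V stays in the set K.

```python
import numpy as np

def p_map(V, alpha):
    """Row-wise evaluation of the map p from (3.1)."""
    powered = V ** alpha
    return powered / powered.sum(axis=-1, keepdims=True)

def monokinetic_step(V, w, rho, alpha, dt):
    """One Euler step of dV_i/dt = sum_j w_ij rho_j (p(V_j) - V_i).

    V: (n, M) memories at n sites, w: (n, n) weights, rho: (n,) site masses.
    """
    weighted = w * rho[None, :]
    gain = weighted @ p_map(V, alpha)  # sum_j w_ij rho_j p(V_j)
    kappa = weighted.sum(axis=1)       # kappa_i = sum_j w_ij rho_j
    return V + dt * (gain - kappa[:, None] * V)
```

A state in which all sites sit in the same corner of K is stationary for this scheme, in agreement with the discussion of the reaction term above.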

Let us mention an alternative modelling approach, which is more common in the literature: assuming that agents interact only locally and move independently (with the same kind of kernel), we would obtain the more standard nonlocal reaction-diffusion model

$$ \rho(x) \partial_{t} V(x,t) = - \rho(x) \kappa(x) (V - \textbf{p}(V)) + \int w(x,\tilde{x}) \rho(x)\rho(\tilde{x}) \left( V(\tilde{x},t) - V(x,t) \right) d\tilde{x}. $$
(3.4)

The key difference to (3.3) is the linearity of the nonlocal diffusion term; the implications of this difference remain to be understood.

3.2.2 Concentration to Monokinetic Solutions

In the following we investigate the concentration behaviour of solutions to (3.2) in the v-space. For this purpose we compute the evolution of the variance. We denote the support of ρ by \({{\varOmega }} \subset \mathbb {R}^{d}\), assuming Ω is a regular domain, and perform all integrations with respect to x on Ω.

Proposition 3.1

Let f be a sufficiently regular weak solution of (3.2) and let

$$ \overline{\mathbf{V}}(x,t) = \frac{1}{\rho(x)} \int \mathbf{v} f(x,\mathbf{v};t) d\mathbf{v} $$

denote the expectation of v. Then the quadratic variation in v, given by

$$ \mathcal{V}(t) = \int \int \vert\mathbf{v} - \overline{\mathbf{V}}(x,t) \vert^{2} f(x,\mathbf{v},t)~dz, $$

is nonincreasing in time, in particular it is decreasing as long as f is not concentrated at \(\overline {\mathbf {V}}(x,t)\) on the support of κ. If there exists a positive constant \(\kappa_{0}\) such that \(\kappa_{0} \leq \kappa(x)\) for almost all \(x \in {{\varOmega}}\), then

$$ \mathcal{V}(t) \leq e^{-2 \kappa_{0} t} \mathcal{V}(0). $$

Proof

Multiplying (3.2) by v, integrating with respect to v, and dividing by ρ(x) we find

$$ \partial_{t} \overline{\textbf{V}}(x,t) = - \kappa(x) \overline{\textbf{V}}(x,t) + \int w(x,\tilde{x}) f(\tilde{x},\tilde{\textbf{v}},t)\textbf{p}(\tilde{\textbf{v}})~d\tilde{z}. $$

Thus,

$$ \begin{array}{@{}rcl@{}} && \frac{d}{dt} \frac{1}2 \int \int |\textbf{v} - \overline{\textbf{V}}(x,t)|^{2} f(x,\textbf{v},t)~dz \\ &&\qquad = \frac{1}2 \int |\textbf{v} - \overline{\textbf{V}}(x,t)|^{2} \partial_{t} f(x,\textbf{v},t)~dz - \int (\textbf{v} - \overline{\textbf{V}}(x,t))\cdot \partial_{t} \overline{\textbf{V}}(x,t) f(x,\textbf{v},t)~dz\\ &&\qquad= - \int \int w(x,\tilde{x}) (\textbf{v} - \overline{\textbf{V}}(x,t))\cdot f(x,\textbf{v},t) f(\tilde{x},\tilde{\textbf{v}},t) (\textbf{v} - \textbf{p}(\tilde{\textbf{v}})) ~d\tilde{z}~dz\\ &&\qquad \quad + \int (\textbf{v} - \overline{\textbf{V}}(x,t))\cdot \left( \kappa(x)\overline{\textbf{V}}(x,t) - \int w(x,\tilde{x}) f(\tilde{x},\tilde{\textbf{v}},t) \textbf{p}(\tilde{\textbf{v}}) ~d\tilde{z}\right) f(x,\textbf{v},t)~dz \\ &&\qquad = - \int \kappa(x) \vert\textbf{v} - \overline{\textbf{V}}(x,t) \vert^{2} f(x,\textbf{v},t)~dz. \end{array} $$

The assertions follow directly, respectively with Gronwall’s lemma. □
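The decay estimate of Proposition 3.1 can be observed in a particle approximation of (3.2). The following sketch assumes S sites carrying m particles each with equal masses, a given weight matrix `w`, and explicit Euler time stepping; these discretization choices and all names are illustrative assumptions rather than part of the analysis above.

```python
import numpy as np

def p_map(v, alpha):
    """Evaluation of the map p from (3.1) along the last axis."""
    powered = v ** alpha
    return powered / powered.sum(axis=-1, keepdims=True)

def variance_history(v, w, alpha, dt, steps):
    """Track the variance functional for particles v of shape (S, m, M).

    Site s carries m particles of mass 1 / (S m); w is the (S, S) weight matrix.
    Returns the variance after 0, 1, ..., steps Euler steps and kappa per site.
    """
    S = v.shape[0]
    kappa = w.mean(axis=1)  # kappa_s = (1 / S) sum_s' w_ss'
    history = []
    for _ in range(steps + 1):
        dev = v - v.mean(axis=1, keepdims=True)  # deviation from the site mean
        history.append((dev ** 2).sum(axis=-1).mean())
        gain = w @ p_map(v, alpha).mean(axis=1) / S  # shape (S, M)
        v = v + dt * (gain[:, None, :] - kappa[:, None, None] * v)
    return np.array(history), kappa
```

In this discretization the gain term is identical for all particles at one site, so the deviations from the site mean contract by the factor \(1 - \varDelta t\, \kappa\) per step, which mirrors the exponential bound with \(\kappa_{0} = \min \kappa\).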

Let us mention that analogous statements can be derived for other moments p ≥ 1; these are equivalent to estimates of the metric

$$ d_{p}(f,\rho \delta_{\overline{\textbf{V}}}) = \left( {\int}_{{{\varOmega}}} W_{p}\left( f(x,\cdot),\rho(x) \delta_{\overline{\textbf{V}}(x,\cdot)}\right)^{p}~dx\right)^{1/p}, $$

with Wp being the p-Wasserstein metric (taking into account the explicit form of Wasserstein metrics if one measure is concentrated). Together with a stability estimate on solutions of (3.2), we see that it can be expected that solutions are close to monokinetic ones, we leave a more quantitative analysis to future research.

3.3 Spatially Local Approximation

As a last step we consider the case of w being a spatially local kernel, for simplicity we assume it is convolutional and moreover ρ ≡ 1 on a domain \({{\varOmega }} \subset \mathbb {R}^{d}\). The local kernel is scaled such that

$$ w(x,\tilde{x}) = \varepsilon^{-d} k(\varepsilon^{-1} (x-\tilde{x})) $$

and k is assumed to be even. Then we find for a function φ being smooth

$$ \int w(x,\tilde{x}) (\varphi(\tilde{x}) - \varphi(x)) ~d\tilde{x} = \frac{C \varepsilon^{2}}2 {{\varDelta}} \varphi + \mathcal{O}(\varepsilon^{4}), $$

with C being the second moment of k. Using the notation \(\sigma = \sqrt {C} \varepsilon \) we obtain the following approximation for monokinetic solutions:

$$ \partial_{t} V = - \kappa (V - \textbf{p}(V)) + \frac{\sigma^{2}}2 {{\varDelta}} \textbf{p}(V). $$

This is a nonlinear reaction-diffusion equation that was originally derived by Burridge [13]. For M = 2 rigorous existence and uniqueness of classical and weak solutions (globally in time) can be shown (cf. [26]); for M > 2 only local existence of classical solutions is known so far. Global existence and a quantitative analysis of the coarsening dynamics are a challenging open problem due to the degenerate nonlinear cross-diffusion effects and the absence of a gradient flow structure. Let us mention that by introducing the inverse function of p, denoted by V, we can equivalently formulate an equation for the vector P of probabilities

$$ \partial_{t} \textbf{V}(P) = - \kappa (\textbf{V}(P) - P) + \frac{\sigma^{2}}2 {{\varDelta}} P. $$

From the derivation of the local equation we naturally expect \(\kappa \gg \sigma^{2}\). Using a time scaling such that σ is of order one, we see that κ is a large parameter. Thus to leading order we have V = p(V), so that the approximation by the standard Allen–Cahn equation

$$ \partial_{t} V = - \kappa (V - \textbf{p}(V)) + \frac{\sigma^{2}}2 {{\varDelta}} V $$

respectively

$$ \partial_{t} P = - \kappa (\textbf{V}(P) - P) + \frac{\sigma^{2}}2 {{\varDelta}} P $$

may be equally accurate in the local limit. Note that the approximation for V is the corresponding local approximation to the reaction-diffusion model (3.4), so at least in this scaling limit we expect the two modelling approaches to coincide.
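The second-order expansion of the scaled kernel stated at the beginning of this subsection can be checked numerically. The following is a minimal sketch in dimension d = 1, assuming a hypothetical even kernel \(k(y) = \frac{3}{4}(1-y^{2})\) on [−1,1] (unit mass, second moment C = 1/5) and the test function φ = sin; the names `kernel`, `nonlocal_operator` and `expansion_error` are illustrative and do not appear in the paper.

```python
import math

SECOND_MOMENT = 0.2  # C = int y^2 k(y) dy for the hypothetical kernel below


def kernel(y):
    # Hypothetical even kernel on [-1, 1] with unit mass (an assumption for illustration)
    return 0.75 * (1.0 - y * y) if abs(y) <= 1.0 else 0.0


def nonlocal_operator(phi, x, eps, n=4000):
    """Trapezoidal approximation of int w(x, xt) (phi(xt) - phi(x)) dxt for d = 1.

    Uses the substitution xt = x - eps * y, so that w(x, xt) dxt = k(y) dy.
    """
    dy = 2.0 / n
    total = 0.0
    for j in range(n + 1):
        y = -1.0 + j * dy
        weight = 0.5 if j in (0, n) else 1.0  # trapezoidal end-point weights
        total += weight * kernel(y) * (phi(x - eps * y) - phi(x))
    return total * dy


def expansion_error(eps, x=0.7):
    """Distance between the nonlocal operator and (C eps^2 / 2) phi'' for phi = sin."""
    local = 0.5 * SECOND_MOMENT * eps ** 2 * (-math.sin(x))
    return abs(nonlocal_operator(math.sin, x, eps) - local)
```

Halving ε should reduce `expansion_error` by a factor close to 16, in line with the fourth-order remainder.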

4 Case Study: Social Construction

The paper by Shaw [56] proposes an agent-based model of social learning, using a network of interaction between agents. In the model there are M (in particular M = 4 in [56]) different mental representations of a social action, each with a different weight \(\omega_{i}\). In an interaction with another agent, who plays action i, the vector ω of weights is updated via

$$ \boldsymbol{\omega}_{i}^{\prime} = \boldsymbol{\omega} + e_{i}. $$

On the other hand, given a weight vector ω, the action with the highest weight is played in the next interaction; if there are several actions with the highest weight, one of them is chosen with uniform probability. We can interpret this choice as a generalization of the probabilities \(p_{i}\) in the dialect model above to a concentrated probability measure; it actually corresponds to the limit \(\alpha \rightarrow 0\) in (3.1). Moreover, the network interaction is rather discrete, with N agents and associated interaction weights \(w_{k\ell}\) between agents k and ℓ.

In order to derive a Boltzmann-type model we perform a suitable rescaling of the states from \(\omega_{i}\) to

$$ v_{i} = \frac{\omega_{i}}{{\sum}_{j=1}^{M} \omega_{j}}, $$

and the number of interactions I to \(s = \frac {I}J\) for some reasonably large J. Thus \(\textbf{v} \in K\) with K as in the previous section and \(s \in \mathbb {R}_{+}\). The weight update in the interactions thus becomes

$$ \textbf{v}_{i}^{\prime} = \frac{s}{s+h} \textbf{v} + \frac{h}{s+h} e_{i}, \quad s^{\prime}=s+h $$

with \(h=\frac {1}J\). The measure \(\nu _{z,\tilde {z}}\) is given by

$$ \nu_{z,\tilde{z}} = \delta_{0}(da) \otimes \sum\limits_{i=1}^{M} \delta_{b_{i}}(db) p_{i}(\tilde{\textbf{v}}) $$

with \(b_{i}(\textbf {v},s) = \left (\frac {h}{s+h} (e_{i}-\textbf {v}), h\right )\), i.e. the increment of the state (v,s) in the update above.
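The consistency of the rescaled update with the original weight update can be verified in exact arithmetic. The sketch below assumes that the number of interactions equals the total weight, \(I = {\sum }_{j} \omega _{j}\) (an assumption made here so that the normalization and the interaction count are tied together); the function names are hypothetical.

```python
from fractions import Fraction


def raw_update(omega, i):
    """Original update: add one unit of weight to the observed action i."""
    return [w + (1 if j == i else 0) for j, w in enumerate(omega)]


def normalize(omega):
    """Rescaled state v_i = omega_i / sum_j omega_j, kept as exact fractions."""
    total = sum(omega)
    return [Fraction(w, total) for w in omega]


def rescaled_update(v, s, i, h):
    """Update of (v, s): v' = s/(s+h) v + h/(s+h) e_i and s' = s + h."""
    v_new = [s / (s + h) * vj + (h / (s + h) if j == i else 0)
             for j, vj in enumerate(v)]
    return v_new, s + h
```

For instance, with ω = (3,1,0,2), J = 10 and observed action i = 2 (zero-based), normalizing `raw_update(omega, 2)` and applying `rescaled_update` to `normalize(omega)` with s = 6/10, h = 1/10 give the same vector (3,1,1,2)/7.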

Assuming again the existence of a single particle density \(f_{k}(\textbf{v},s,t)\) on \(\{1,\ldots ,N\} \times K \times \mathbb {R}_{+}\), for t > 0, i.e.,

$$ \mu(\cdot,t) = (f_{k}(\cdot,t)) \mathcal{H} \otimes \mathcal{L}, $$

where \({\mathscr{H}}\) denotes the (M − 1)-dimensional Hausdorff-measure and \({\mathscr{L}}\) denotes the one-dimensional Lebesgue measure on \(\mathbb {R}_{+}\), we obtain the Boltzmann equation in weak formulation as

$$ \frac{d}{dt} \int \varphi(\textbf{v},s) f_{k}(\textbf{v},s,t)~dz = \underset{i}{\sum} \underset{\ell}{\sum} w_{k \ell} \int \int \left( \varphi(\textbf{v}_{i}^{\prime},s^{\prime}) - \varphi(\textbf{v},s)\right) p_{i}(\tilde{\textbf{v}}) f_{k}(\textbf{v},s,t) f_{\ell}(\tilde{\textbf{v}},\tilde{s},t) d\tilde{z}~dz $$

with z = (v,s). Again existence and uniqueness of solutions can be shown by ODE arguments, in this case the density

$$ \rho_{k} = \int f_{k}(\textbf{v},s,t) ~dz $$

is stationary.

Using smallness of h and rescaling time with h we can derive the Vlasov approximation

$$ \begin{array}{@{}rcl@{}} &&\frac{d}{dt} \int \varphi(\textbf{v},s) f_{k}(\textbf{v},s,t)~dz\\ && \qquad = \underset{\ell}{\sum} w_{k \ell} \int \int \left( \nabla_{\textbf{v}} \varphi(\textbf{v},s) \frac{1}{s} (\textbf{p}(\tilde{\textbf{v}}) - \textbf{v}) + \partial_{s} \varphi(\textbf{v},s) \right)f_{k}(\textbf{v},s,t) f_{\ell}(\tilde{\textbf{v}},\tilde{s},t)~d\tilde{z}~dz. \end{array} $$

In strong form we obtain

$$ \partial_{t} f_{k}(\textbf{v},s,t) + \nabla_{\textbf{v}} \cdot \left( f_{k}(\textbf{v},s,t) \underset{\ell}{\sum} w_{k \ell} \int \frac{1}{s} (\textbf{p}(\tilde{\textbf{v}}) - \textbf{v}) f_{\ell}(\tilde{\textbf{v}},\tilde{s},t)~d\tilde{z} \right) + \partial_{s} (f_{k}(\textbf{v},s,t) \lambda_{k}) = 0 $$

with \( \lambda _{k} = \underset {\ell }{\sum } w_{k \ell } \rho _{\ell } \). Possible monokinetic solutions are characterized by

$$ \frac{d\textbf{V}_{k}}{dt} = \underset{\ell}{\sum} w_{k \ell} \frac{1}{s_{k}} (\textbf{p}(\textbf{V}_{\ell}) - \textbf{V}_{k}), \qquad\frac{ds_{k}}{dt} = \lambda_{k}, $$

which can be simplified to

$$ \frac{d\textbf{V}_{k}}{dt} = \underset{\ell}{\sum} w_{k \ell} \frac{1}{{s_{k}^{0}} + \lambda_{k} t} (\textbf{p}(\textbf{V}_{\ell}) - \textbf{V}_{k}). $$

The structure of the equation analogous to the dialect model above makes the phase separation and coarsening behaviour, i.e. the emergence of few social norms, quite clear. Due to the time-dependent weighting of the interactions it might be expected that equilibria are more dynamic however. Let us mention that using the discontinuous choice of p as in [56] the existence and uniqueness of monokinetic solutions as well as of characteristics in the Vlasov equation cannot be shown easily and remains an interesting question for future research.
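The simplified monokinetic system can be explored with a few lines of code. The following sketch uses an explicit Euler scheme and assumes \(\rho _{\ell } = 1\), so that \(\lambda _{k} = {\sum }_{\ell } w_{k\ell }\). Since (3.1) is not restated here, the choice probabilities are replaced by a hypothetical smooth stand-in, \(p_{i}(\textbf {v}) \propto v_{i}^{a}\); both this choice and the function names are assumptions for illustration only.

```python
def p(v, a=2.0):
    """Hypothetical choice probabilities, p_i proportional to v_i ** a (stand-in for (3.1))."""
    powers = [vi ** a for vi in v]
    total = sum(powers)
    return [x / total for x in powers]


def simulate(V0, w, s0, t_end, dt=0.01):
    """Explicit Euler for dV_k/dt = sum_l w[k][l] / (s0[k] + lam_k t) * (p(V_l) - V_k)."""
    n, m = len(V0), len(V0[0])
    lam = [sum(row) for row in w]  # lambda_k, assuming rho_l = 1 for all l
    V = [list(v) for v in V0]
    steps = int(round(t_end / dt))
    for step in range(steps):
        t = step * dt
        P = [p(v) for v in V]
        V_new = []
        for k in range(n):
            rate = 1.0 / (s0[k] + lam[k] * t)  # time-dependent weighting 1 / s_k(t)
            V_new.append([
                V[k][i] + dt * rate * sum(w[k][l] * (P[l][i] - V[k][i]) for l in range(n))
                for i in range(m)
            ])
        V = V_new
    return V
```

As long as \(dt \sum _{\ell } w_{k\ell } \leq s_{k}^{0}\), each Euler step is a convex combination of points of the simplex, so the discrete states remain probability vectors.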

5 Case Study: Pandemic Spread

Modelling disease spread is by now a classical problem in applied mathematics (cf. [16]), and in particular reaction and reaction-diffusion models are a standard tool in epidemiology (cf. [25, 57]). However, nowadays diseases are spread by short-term travelers rather than by people moving to other locations. This is apparent in particular in the Covid-19 pandemic, where early infections in many countries and areas were due to short-term travel (cf. e.g. [4, 21, 35, 38]), so that human mobility networks may be a more relevant modelling structure (cf. e.g. [5, 7, 45, 54]).

In order to illustrate the effects we study a network-structured SIR-model in the following. This is a discrete state model with the three different states S for susceptible, I for infected, and R for removed. The corresponding probability measure μ is thus discrete in the second variable, and we assume it to be continuous with respect to the Lebesgue measure in space, i.e. it is composed of three spatial densities

$$ \mu(\cdot;t) = \left( \rho_{S}(\cdot,t), \rho_{I}(\cdot,t), \rho_{R}(\cdot,t)\right) \mathcal{L}, $$

where \({\mathscr{L}}\) is the Lebesgue measure in some set \({{\varOmega }} \subset \mathbb {R}^{d}\). There is only one pair interaction happening, namely between susceptible and infected, with the latter one being the active agent not changing its state, while the susceptible changes to infected. Moreover, the infected get removed at constant rate β. This yields

$$ \begin{array}{@{}rcl@{}} \partial_{t} \rho_{S}(x,t) &=& - {\int}_{{{\varOmega}}} w(x,\tilde{x}) \rho_{S}(x,t) \rho_{I}(\tilde{x},t) ~d\tilde{x}, \end{array} $$
(5.1)
$$ \begin{array}{@{}rcl@{}} \partial_{t} \rho_{I}(x,t) &=& {\int}_{{{\varOmega}}} w(x,\tilde{x}) \rho_{S}(x,t) \rho_{I}(\tilde{x},t) ~d\tilde{x} - \beta \rho_{I}(x,t), \end{array} $$
(5.2)
$$ \begin{array}{@{}rcl@{}} \partial_{t} \rho_{R} &=& \beta \rho_{I}. \end{array} $$
(5.3)

Here the spatial domain Ω represents the region on which we consider the epidemic, e.g. a country with travel restrictions from and to abroad, or even the whole world. The weight w thus encodes a frequency of travel between x and \(\tilde {x}\).

As usual in the SIR model we can ignore \(\rho_{R}\) and simply consider the two-by-two system (5.1) and (5.2). For brevity we denote the densities of susceptibles and infectives by u, v instead of \(\rho_{S}\), \(\rho_{I}\). By introducing the nonlocal Laplacian

$$ {{\varDelta}}_{w} \varphi(x) = \int w(x,\tilde{x}) (\varphi(\tilde{x}) - \varphi(x))~d\tilde{x} $$

and using the notation \(\alpha (x) = {\int \limits } w(x,\tilde {x}) ~d\tilde {x}\) we can rewrite the system in a reaction-diffusion form

$$ \begin{array}{@{}rcl@{}} \partial_{t} u &=& - \alpha u v - u {{\varDelta}}_{w} v, \end{array} $$
(5.4)
$$ \begin{array}{@{}rcl@{}} \partial_{t} v &=& \alpha u v + u {{\varDelta}}_{w} v - \beta v. \end{array} $$
(5.5)
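The rewriting rests on a single splitting of the interaction integral, obtained by adding and subtracting v(x):

$$ \int w(x,\tilde{x}) v(\tilde{x}) ~d\tilde{x} = v(x) \int w(x,\tilde{x}) ~d\tilde{x} + \int w(x,\tilde{x}) (v(\tilde{x}) - v(x)) ~d\tilde{x} = \alpha(x) v(x) + {{\varDelta}}_{w} v(x). $$

Multiplying by u(x) and inserting the result into (5.1) and (5.2) yields (5.4) and (5.5).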

This allows a comparison with the more standard reaction-diffusion models of epidemics, respectively their nonlocal version (cf. e.g. [7])

$$ \begin{array}{@{}rcl@{}} \partial_{t} u &=& - \alpha u v +D_{1} {{\varDelta}}_{w} u, \end{array} $$
(5.6)
$$ \begin{array}{@{}rcl@{}} \partial_{t} v &=& \alpha u v + D_{2} {{\varDelta}}_{w} v - \beta v. \end{array} $$
(5.7)

The key difference is the linearity and non-degeneracy in the diffusion part, which induce a dispersal of both u and v, while only v disperses in the network-structured model.

The behaviour of (5.4), (5.5) is illustrated in Fig. 1 together with a comparison to the nonlocal reaction-diffusion model (5.6), (5.7). The results are based on a numerical solution of the models on the unit interval with periodicity, using the kernel \(w(x,y) = \alpha (x_{0} - |x-y|)_{+}\) with \(x_{0} = 0.2\), α = 0.3, and β = 0.1. The initial value of u is constant equal to one, while the initial value of v is a peak at x = 0.5. The spatial grid size used is h = 0.01 and the time step is τ = 0.01. We see that the overall dynamics in the two models is similar, but the nonlocal reaction-diffusion model smooths the peak in the infected population more strongly (see the time sequence of v on the right), while the network-structured model does not introduce a local peak in the susceptible one (see the time sequence of u on the left). As a consequence, the network-structured model predicts a higher number of infected persons in the long run.

Fig. 1 Evolution of u (left) and v (right) at times t = 1,2,3,4,5, with the network-structured model in full lines and the nonlocal reaction-diffusion model in dash-dotted lines
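A simulation of the network-structured model (5.1), (5.2) along these lines can be sketched as follows. This is not the code used for Fig. 1: it assumes an explicit Euler scheme with rectangle-rule quadrature, reads the stated kernel literally with prefactor a = 0.3, and uses a hypothetical triangular initial peak of width 0.05 around x = 0.5, since the exact profile is not specified.

```python
def periodic_distance(x, y):
    """Distance on the unit interval with periodic boundary conditions."""
    d = abs(x - y)
    return min(d, 1.0 - d)


def build_kernel(n, a=0.3, x0=0.2):
    """Matrix w[i][j] = a * max(x0 - dist(x_i, x_j), 0) on a periodic grid with n points."""
    h = 1.0 / n
    return [[a * max(x0 - periodic_distance(i * h, j * h), 0.0) for j in range(n)]
            for i in range(n)]


def sir_step(u, v, w, beta, tau):
    """One explicit Euler step of (5.1), (5.2) with rectangle-rule quadrature."""
    n = len(u)
    h = 1.0 / n
    u_new, v_new = [], []
    for i in range(n):
        force = h * sum(w[i][j] * v[j] for j in range(n))  # int w(x_i, xt) v(xt) dxt
        infections = u[i] * force
        u_new.append(u[i] - tau * infections)
        v_new.append(v[i] + tau * (infections - beta * v[i]))
    return u_new, v_new


def simulate_sir(n=100, beta=0.1, tau=0.01, steps=100):
    """Return the list of states (u, v) after 0, 1, ..., steps time steps."""
    h = 1.0 / n
    w = build_kernel(n)
    u = [1.0] * n
    # Hypothetical initial peak of infected around x = 0.5 (an assumption)
    v = [max(0.0, 1.0 - abs(i * h - 0.5) / 0.05) for i in range(n)]
    history = [(u, v)]
    for _ in range(steps):
        u, v = sir_step(u, v, w, beta, tau)
        history.append((u, v))
    return history
```

With this discretization the susceptible density can only decrease pointwise and u + v never exceeds its initial value, which mirrors the structure of the continuous model.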

Let \({{\varOmega }} \subset \mathbb {R}^{d}\) be the maximal support of all involved functions, then we can confine the problem to Ω and provide a straightforward analysis:

Proposition 5.1

Let \(u_{0} \in L^{\infty }({{\varOmega }})\), \(v_{0} \in L^{\infty }({{\varOmega }})\) be nonnegative initial values, and \(w \in L^{\infty }({{\varOmega }};L^{1}({{\varOmega }}))\) be nonnegative. Then there exists a unique nonnegative solution \(u \in C^{1}(0,T;L^{\infty }({{\varOmega }}))\), \(v \in C^{1}(0,T;L^{\infty }({{\varOmega }}))\) of (5.4), (5.5). The solution satisfies

$$ 0 \leq u(x,t) + v(x,t) \leq u_{0}(x) + v_{0}(x) $$

for almost every \(x \in {{\varOmega }}\) and every t ∈ [0,T]. Moreover, \(\partial_{t} u\) is nonpositive almost everywhere.

Proof

Existence and uniqueness follow from a direct application of the Picard–Lindelöf Theorem in \(L^{\infty }({{\varOmega }})^{2}\) (cf. [11]), respectively a localized version. We apply the result first to obtain existence in the interval [0,τ] with time step

$$ \tau = \frac{1}{2 \|u_{0}\|_{L^{\infty}({{\varOmega}})} \|w \|_{L^{\infty}({{\varOmega}};L^{1}({{\varOmega}}))}} $$

to establish existence of a solution in the invariant subset \(\mathcal {I} \subset C(0,T;L^{\infty }({{\varOmega }}))\)

$$ \mathcal{I} = \{ 0 \leq u \leq \|u_{0}\|_{L^{\infty}({{\varOmega}})}, 0 \leq e^{\beta t} v \leq 2 \|v_{0}\|_{L^{\infty}({{\varOmega}})}\}. $$

Since τ is uniform we can incrementally apply the same result to obtain existence and uniqueness of nonnegative solutions in an arbitrary interval [0,T]. Finally, the nonpositivity of \(\partial_{t} u\) follows from the nonnegativity of u and v directly from (5.4). □
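The two pointwise properties in the proposition can be read off in one line each, given the nonnegativity of u, v and w. Adding (5.4) and (5.5), and undoing the splitting into α and the nonlocal Laplacian in (5.4), we have

$$ \partial_{t} (u+v) = - \beta v \leq 0, \qquad \partial_{t} u = - u(x,t) \int w(x,\tilde{x}) v(\tilde{x},t) ~d\tilde{x} \leq 0. $$

Integrating the first relation in time gives the upper bound on u + v by its initial value.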

As in the case of the dialect model, we can investigate the local limit, which, after appropriate scaling, is of the form

$$ \begin{array}{@{}rcl@{}} \partial_{t} u &=& - \alpha u v - u {{\varDelta}} v, \end{array} $$
(5.8)
$$ \begin{array}{@{}rcl@{}} \partial_{t} v &=& \alpha u v + u {{\varDelta}} v - \beta v. \end{array} $$
(5.9)

The system (5.8), (5.9) is a rather degenerate cross-diffusion system. It is easy to see that the operator (u,v) ↦ (uΔv, uΔv) respectively its linearization are not normally elliptic, hence the standard theory for parabolic systems (cf. e.g. [2]) does not apply. Even worse, we see that (5.8) destroys some basic properties the model should naturally inherit, such as the nonpositivity of \(\partial_{t} u\). If v is locally concave such that Δv < −αv, this results in \(\partial_{t} u > 0\), which contradicts the modelling assumption that the number of susceptibles cannot increase. The reason is that the local approximation beyond the leading order

$$ \partial_{t} u = - \alpha u v, \qquad \partial_{t} v = \alpha u v - \beta v $$

is justified only if the leading order solution is sufficiently small.
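A simple worked example illustrates the loss of monotonicity. As an assumption for illustration, take d = 1, a constant α, and a Gaussian bump of infected with hypothetical amplitude A > 0 and width δ > 0. Then at the centre of the bump

$$ v(x) = A e^{-x^{2}/(2\delta^{2})}, \qquad {{\varDelta}} v(0) = - \frac{A}{\delta^{2}}, \qquad \partial_{t} u(0,t) = - u \left( \alpha A - \frac{A}{\delta^{2}} \right) = u A \left( \frac{1}{\delta^{2}} - \alpha \right), $$

which is positive whenever u > 0 and \(\delta ^{2} < 1/\alpha \), i.e. for sufficiently narrow peaks.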

6 Conclusions and Outlook

In this paper we have derived network-structured kinetic equations and discussed their main properties, illustrated by some applications in human behaviour. We have demonstrated that challenging classes of (nonlocal) PDE systems can already arise as monokinetic equations of Vlasov approximations; further studies of the full model including a nontrivial variance in the state space are an interesting topic for future research. Our mainly formal approach also raises several further mathematical questions, e.g. the analysis of Vlasov and Fokker–Planck approximations, as well as the analysis of monokinetic equations and their local limits related to sparse graphs. Formal similarities to more standard models with explicit movement (or, abstractly, change in the structural variable), which we found in the local limit of monokinetic equations, also raise further questions about asymptotics and about the differences between solutions in the non-asymptotic case.

A rather open topic is the derivation of macroscopic equations beyond monokinetic ones. Since there is no natural distinction into transport in x and collision in v as in standard kinetic models, the derivation of hydrodynamic equations cannot be based on asymptotics in the collision operators, not even by formal asymptotics as in the Hilbert or Chapman–Enskog expansion. The derivation of macroscopic equations is further impeded by the rather complicated and non-symmetric type of interactions found in behavioural sciences.

An obvious question for extension of the models concerns the modification of the networks in time, which may become relevant e.g. for applications in social networks where the links are created or deleted on the same time scale as other processes like opinion formation. From a mathematical point of view it is a key issue to derive kinetic and macroscopic models including the full network structure, which seems a rather open problem.

Another aspect one may naturally ask about from a mathematical point of view is the (optimal) control of network-structured problems, in the Boltzmann, Vlasov or monokinetic case. From an ethical point of view, this raises some issues however, e.g. when trying to control (or just influence) opinions on social networks via bots. In other cases control may be beneficial, e.g. for avoiding pandemic spread or maybe also for counteracting the decrease of cultural diversity. Similar control problems for finding consensus (cf. [55]) may also arise in mean-field models for robot swarms with a network communication structure (cf. [29]).