1 Introduction

Opinion formation models have been studied extensively in several research communities over the last decades. Classical models are based on the assumption that an individual’s opinion is influenced by binary interactions with others as well as by their surroundings (for example, through social media). Most of them describe the dynamics of each individual, resulting in large complex systems, see, for example, DeGroot (1974), Amblard and Deffuant (2004), Hegselmann and Krause (2002), Galam (2008), Iacomini and Vellucci (2023), Motsch and Tadmor (2014), Sznajd-Weron and Sznajd (2000), Goddard et al. (2021), Jabin and Motsch (2014). In many of these models, the underlying microscopic dynamics lead to the formation of complex macroscopic patterns and collective states. Methodologies from statistical mechanics, especially kinetic theory, have been used successfully to derive and analyze these complex stationary states in suitable scaling limits. Toscani’s seminal work on kinetic opinion formation models, see Toscani (2006), was one of the starting points of various kinetic models for collective dynamics, studying, for example, the effects of leaders on the opinion formation process (Albi et al. 2014; Düring et al. 2009; Düring and Wolfram 2015), decision making (Pareschi et al. 2017, 2019) and the influence of exogenous factors (Bondesan et al. 2024; Franceschi et al. 2023; Zanella 2023), to name a few.

There is general agreement that individuals change their opinion due to interactions with others, and these interactions are almost always assumed to be binary. Most models assume that only like-minded individuals interact (the so-called bounded confidence models) and that dynamics are strongly driven by the tendency to compromise. In addition, they often assume that individuals change their opinion due to self-thinking, for example because of exposure to different media channels. There is a rich literature on models for opinion formation in large interacting agent systems, see, for example, Hegselmann and Krause (2002), Borra and Lorenzi (2013), Motsch and Tadmor (2014), as well as on the influence of social networks on the opinion formation process. The latter trend accelerated as more and more data from social networks, such as X (formerly known as Twitter) or Facebook, became available. This allowed researchers to investigate, for example, challenging questions related to the influence of voters’ behavior on the success of vaccination campaigns, see, for example, Albi et al. (2024, 2017). For other related works, see Albi et al. (2016) on the control of opinions on evolving networks, Lee et al. (2014), Matakos et al. (2017), Amelkin et al. (2017) on polarization, and Toscani et al. (2018) on marketing aspects.

Several works proposing different control strategies to enhance consensus formation can be found in the multi-agent system literature, see, for example, Carrillo et al. (2010a, 2010b). On the other hand, control strategies to prevent consensus formation, which we will refer to as declustering strategies, have been studied less (Piccoli et al. 2019). However, these declustering strategies can be a useful tool to understand how, for example, software-managed social media accounts, also known as bots, can prevent consensus formation or steer opinions in social networks; see, for example, Alothali et al. (2018), Gilani et al. (2019). It is therefore of interest to understand control mechanisms that prevent consensus formation in large social networks.

Social networks can be studied using graph theory, which has become one of the most active fields of research in connection with the collective behavior of large populations of agents, see, for example, Barabási and Albert (1999), Barabási (2009), Barré et al. (2017, 2018), Newman (2003), Watts and Strogatz (1998). The necessity to handle millions, and often billions, of vertices led to the study of large-scale statistical properties of graphs. In recent years, large discrete networks have been treated as continuous objects through the introduction of new mathematical structures called graphons, which stands for “graph functions”, see Borgs et al. (2014), Lovász (2012), Van Der Hofstad (2016), Caron et al. (2023), Glasscock (2015). The main feature of graphons is the possibility to bypass the classical adjacency matrix by looking at an associated function \(W(x,y)\) encoding all the information on the connectivity of the original discrete graph. Graph-theoretical research aside, the concept of graphons has also been applied to optimal control (Gao and Caines 2019; Hu et al. 2023) and especially epidemiological theory (Naldi and Patane 2022; de Dios et al. 2022; Erol et al. 2023). We mention the recent works on kinetic and mean-field equations on graphons (Bonnet et al. 2022; Bayraktar et al. 2023; Coppini 2022; Nurisso et al. 2024): In particular, graphons are leveraged to analyze the behavior of mathematical models acting on networks when their number of nodes becomes very large. A thorough analysis of such limiting procedures has also been carried out recently (see, e.g., Medvedev (2014), Petit et al. (2021) and references therein), increasing the interest in differential equation models as continuous limits of stochastic processes on discrete graphs, such as random walks and diffusion processes.

Fokker–Planck-type equations acting on suitable limits of large dense graphs in particular have been receiving increasing attention (Coppini 2022; Bayraktar et al. 2023). Within this framework, our analysis is devoted to a continuous model acting on a very large network (not necessarily dense) via its graphon representation rather than its limiting sequence of finite graphs. This allows for the very natural interpretation, from the kinetic theoretical point of view, of the network as a (continuous) interaction kernel, which then leads to an effective surrogate mean-field model that incorporates the graphon.

In this paper, we propose and investigate possible strategies to prevent consensus formation in a kinetic model for opinion formation on networks. Our main contributions are

  • Development and analysis of a minimal control problem on the agent-based level.

  • Derivation of a closed form solution of the controlled model in suitable scaling limits, showing that the proposed strategy does indeed prevent consensus in the long time limit (under certain conditions on the parameters).

  • Extensive computational experiments illustrating the effectiveness of the proposed control strategy for power-law, \(k\)-NN and small-world graphons.

This paper is structured as follows: In Sect. 2, we introduce and analyze a kinetic opinion formation model on a stationary network. In Sect. 3, we propose a simple but very effective control mechanism, which prevents the formation of consensus. In particular, we are able to provide closed form optimal controls, which prevent consensus formation in certain parameter regimes. We corroborate our analytical results with computational experiments in Sect. 4.

2 Kinetic Models for Opinion Dynamics on Graphon Structures

We consider a large population of indistinguishable agents, each characterized by their opinion w belonging to \(I:=[-1, 1]\), where \(\pm 1\) corresponds to two opposite beliefs. Agents change their opinion through binary interactions with an interaction frequency modulated by an underlying static network. The opinion formation process itself is based on two different mechanisms:

  • first compromise dynamics—so individuals with close opinion try to find a compromise

  • and second opinion fluctuation, which are included via random variables.

The interaction frequency of agents depends on the underlying graph structure, which we model by a graphon in this paper.

Before discussing the binary agent interactions, we recall some basics about graphons (and refer the interested reader to “Appendix A” for a more detailed introduction as well as further references). Graphons are continuous objects that generalize the concept of simple graphs with a large number of vertices. In case of discrete graphs, nodes are usually referred to using the index \(i = 1, 2,\ldots , N\), where N is the number of vertices. However, in graphons the discrete set \(\{1,\ldots , N\}\) is mapped onto the continuous interval [0, 1], so that nodes are labeled as \(x \in \Omega \subseteq [0,1]\), where \(\Omega \) is a suitable subset of the unit interval of \(\mathbb R\). We will therefore consider agents which are not only characterized by their individual opinion w on a topic, but also their static position on the graphon \(x\in \Omega \subseteq [0,1]\).
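To make the passage from discrete labels to graphon coordinates concrete, the following minimal sketch samples a finite simple graph from a graphon. It assumes the standard construction in which nodes receive independent uniform labels in [0, 1] and each edge is drawn with probability given by the graphon, which requires values in [0, 1]. The function names and the particular graphon `example_graphon` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_graph_from_graphon(graphon, n_nodes, seed=0):
    """Sample a simple undirected graph on n_nodes vertices from a graphon."""
    rng = np.random.default_rng(seed)
    # Latent labels x_i in [0, 1] play the role of the graphon coordinate.
    x = np.sort(rng.uniform(0.0, 1.0, n_nodes))
    # Edge probabilities W(x_i, x_j) for every pair of nodes.
    prob = graphon(x[:, None], x[None, :])
    # Draw each edge once (upper triangle), then symmetrize; no self-loops.
    upper = np.triu(rng.uniform(size=(n_nodes, n_nodes)) < prob, k=1)
    adjacency = (upper | upper.T).astype(int)
    return x, adjacency

def example_graphon(x, y):
    # Hypothetical bounded graphon with values in [0, 1] (assumption).
    return 1.0 - np.maximum(x, y)

n_nodes = 1000
x, adjacency = sample_graph_from_graphon(example_graphon, n_nodes)
# Fraction of realized edges among all ordered pairs of distinct nodes.
edge_density = adjacency.sum() / (n_nodes * (n_nodes - 1))
```

For this choice of graphon, the edge density of large samples should be close to the integral of the graphon over the unit square, which equals 1/3.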

We consider the following setup for binary interactions: Given two interacting agents characterized by their opinion and position in the graphon, that is, \((x,w),(y,w_*) \in \Omega \times I\), we compute their post-interaction opinions \((x,w^\prime ),(y,w_*^\prime )\) (note that their positions x and y in the graphon did not change) as:

$$\begin{aligned} \begin{aligned} w'&= w - \gamma P(x,y)(w - w_*) + \eta D(x,w)\\ w_*'&= w_* - \gamma P(y,x)(w_* - w) + \tilde{\eta }D(y,w_*), \end{aligned} \end{aligned}$$
(1)

where \(\gamma \in (0,1)\) is the so-called compromise parameter. The interaction function \(P(\,\cdot \,,\,\cdot \,) \in [0,1]\) depends on the graphon coordinates \(x,y \in \Omega \) and may also depend on the opinion variables \(w,w_* \in I\). For instance, in Toscani (2006) the case \(P = P(|w|)\), with P a non-increasing function of \(|w|\) such that \(0 \le P(|w|) \le 1\), is explored. A different choice, resembling a bounded confidence model like that of Hegselmann and Krause (2002), can be obtained by setting \(P = P(|w - w_*|)\). We remark that this particular choice has the advantage of ensuring the conservation of the average opinion but presents analytical difficulties arising from the presence of the absolute value. In the rest of the manuscript, we focus on the case where the interaction function only depends on the agents’ positions on the graphon. A possible choice could be

$$\begin{aligned} P(x,y) = \exp (-\alpha \, d_i(x)/d_i(y)), \quad \alpha > 0, \end{aligned}$$

where \(d_i(z)\) is the in-degree of the node at coordinate \(z\in \Omega \) as defined in Definition 2 of Appendix A. Hence, interactions depend on the connectivity of each agent—the more connected an agent is the less it is influenced by the other, while agents with a lower connectivity are affected more. In particular,

\(d_i(x)/d_i(y) \gg 1\):

implies that, on average with respect to the random variable \(\eta \), the agent with the highest degree keeps their opinion;

\(d_i(x)/d_i(y) \approx 1\):

implies that agents with a similar number of incoming connections are the ones that can most influence each other;

\(d_i(x)/d_i(y) \ll 1\):

implies that the less influential agents tend to adopt the opinion of the more connected ones.

A different possible choice of \(P(x,y)\) that would give rise to similar dynamics would be, for instance, \(P(x,y) = (1 + d_i(x)/d_i(y))^{-\alpha }\), with \(\alpha > 0\).
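The two degree-based interaction functions can be sketched as follows. This is only an illustration under two assumptions that are not fixed by the text above: the in-degree is taken to be the integral of the graphon in its second argument, and the graphon is the power-law graphon \((xy)^{-\beta }\), where the exponent name `beta` is hypothetical and used to avoid a clash with the parameter \(\alpha \) of P.

```python
import numpy as np

def in_degree_power_law(z, beta=0.5):
    # Assumed in-degree: d(z) = int_0^1 (z*y)**(-beta) dy = z**(-beta) / (1 - beta).
    return z ** (-beta) / (1.0 - beta)

def interaction_exp(x, y, alpha=1.0, degree=in_degree_power_law):
    # P(x, y) = exp(-alpha * d(x) / d(y)), with values in (0, 1).
    return np.exp(-alpha * degree(x) / degree(y))

def interaction_rational(x, y, alpha=1.0, degree=in_degree_power_law):
    # Alternative choice P(x, y) = (1 + d(x) / d(y))**(-alpha).
    return (1.0 + degree(x) / degree(y)) ** (-alpha)

# Node 0.1 is far better connected than node 0.9 for this graphon.
p_hub_listens = interaction_exp(0.1, 0.9)   # hub interacting with a peripheral node
p_leaf_listens = interaction_exp(0.9, 0.1)  # peripheral node interacting with a hub
```

In this sketch the well-connected node receives the smaller weight, so it moves less toward its partner than the poorly connected node does, in line with the three regimes listed above.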

In Eq. (1), \(\eta \) and \(\tilde{\eta }\) are independent and identically distributed centered random variables with finite moments up to order three and such that \(\langle \eta ^2\rangle = \langle \tilde{\eta }^2\rangle = \sigma ^2 < +\infty \). Here we denote by \(\langle \,\cdot \,\rangle \) the expectation with respect to the distribution of the random variables. The variables \(\eta \), \(\tilde{\eta }\) account for random fluctuations in an individual’s opinion due to, for example, media exposure. The function \(D(\,\cdot \,, \,\cdot \,)\ge 0\) encodes the local relevance of the diffusion; possible choices include \(D(w) = \sqrt{1-w^2}\). In this case, agents diffuse the most if they have an indifferent opinion, that is, \(w \approx 0\), while they are less influenced by external factors once they have settled on one of the two ‘extreme’ choices. Note that this choice also ensures that opinions stay within \(I\).

Next we discuss some basic properties of the binary interaction defined before. Under the assumption that the compromise propensity function P satisfies \( 0 \le P(x,y) \le 1\) and \(0 < \gamma \le 1/2\), the following Proposition (see, e.g., Pareschi et al. 2019; Toscani 2006; Toscani et al. 2018) ensures that the post-interaction opinions still belong to the reference interval.

Proposition 1

Assuming \(0 \le P(x,y) \le 1\) and

$$\begin{aligned} \left\{ \begin{aligned} |\eta |&\le \ell ,\\ |\tilde{\eta }|&\le \ell , \end{aligned} \right. \quad \text {with}\quad \ell :=\min _{w \in I} \Bigl \lbrace \frac{(1 - w)}{D(w)}, D(w) \Bigr \rbrace , \end{aligned}$$

then the binary interaction (1) preserves the interval and the post-interaction opinions are such that \(w^\prime \), \(w_*^\prime \in I\).

A direct computation shows that for all w, \(w_* \in I\)

$$\begin{aligned} \langle w' + w_*'\rangle = w + w_* + \gamma (P(x,y)-P(y,x))(w_*-w). \end{aligned}$$
(2)

If \(P(x,y) = P(y,x)\), i.e., the compromise function P is symmetric, the mean opinion is preserved in interactions, that is,

$$\begin{aligned} \langle w' + w_*'\rangle = w + w_*. \end{aligned}$$

On the other hand, the energy is not conserved on average since

$$\begin{aligned} \begin{aligned} \langle (w')^2 + (w_*')^2\rangle&= w^2 + w_*^2 \\&\quad + \gamma ^2\bigl [ P^2(x,y)+P^2(y,x)\bigr ](w_*-w)^2\\&\quad + 2\gamma \bigl [ P(x,y)w-P(y,x)w_*\bigr ](w_*-w) \\&\quad + \sigma ^2(D^2(x,w)+D^2(y,w_*)). \end{aligned} \end{aligned}$$

If \(\sigma ^2 \equiv 0\) and we have symmetric interactions, we see that the mean energy is dissipated.
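The two statements above can be checked numerically. The sketch below implements the binary rule (1) with the diffusion \(D(w) = \sqrt{1-w^2}\) and verifies, for \(\sigma = 0\) and a symmetric interaction function, that the pair sum is conserved and that the energy change equals \(-2\gamma P(1-\gamma P)(w-w_*)^2 \le 0\), which follows from the energy identity with \(P(x,y) = P(y,x) = P\). The function name and the sampled parameter values are illustrative assumptions.

```python
import numpy as np

def binary_interaction(w, w_star, p_xy, p_yx, gamma, eta=0.0, eta_tilde=0.0):
    """One binary interaction following rule (1), with D(w) = sqrt(1 - w^2)."""
    diffusion = lambda v: np.sqrt(1.0 - v ** 2)
    w_new = w - gamma * p_xy * (w - w_star) + eta * diffusion(w)
    w_star_new = w_star - gamma * p_yx * (w_star - w) + eta_tilde * diffusion(w_star)
    return w_new, w_star_new

rng = np.random.default_rng(1)
w = rng.uniform(-1.0, 1.0, 1000)
w_star = rng.uniform(-1.0, 1.0, 1000)
gamma = 0.3
p = rng.uniform(0.0, 1.0, 1000)  # symmetric case: P(x, y) = P(y, x) = p

# No noise (sigma = 0), symmetric interaction weights.
w_new, w_star_new = binary_interaction(w, w_star, p, p, gamma)

mean_change = (w_new + w_star_new) - (w + w_star)
energy_change = (w_new ** 2 + w_star_new ** 2) - (w ** 2 + w_star ** 2)
predicted_change = -2.0 * gamma * p * (1.0 - gamma * p) * (w - w_star) ** 2
```

Since \(0 \le \gamma P \le 1\), the predicted change is never positive, which is the dissipation of the mean energy stated above.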

We can now state the evolution equation for the distribution of agents \(f = f(x, w, t) \) with respect to their position \(x \in \Omega \) and opinion \(w \in I\). Consider a fixed number of agents, N; then the binary interactions (1) induce a discrete-time Markov process with N-particle joint probability distribution \(P_N(x_1,w_1,x_2,w_2,\ldots ,x_N, w_N,t)\). This allows us to write a kinetic equation for the one-marginal distribution function,

$$\begin{aligned} P_1(x,w,t)=\int P_N(x, w,x_2,w_2,\ldots , x_N,w_N,t)\, dx_2 dw_2 \cdots dx_N dw_N, \end{aligned}$$

using only the one- and two-particle distribution functions (Cercignani 2012; Cercignani et al. 1994),

$$\begin{aligned}{} & {} P_1(x, w,t+1)-P_1(x, w,t)\\{} & {} \quad =\Bigg \langle \frac{1}{N} \Biggl [\int P_2(x_i,w_i, x_j, w_j,t) \bigl ( \delta _0(x-x_i,w-w_i)\\{} & {} \qquad +\delta _0(x-x_j,w-w_j) \bigr )\, dx_idw_i dx_jdw_j - 2P_1(x, w,t) \Biggr ]\Biggr \rangle . \end{aligned}$$

Here, \(\langle \cdot \rangle \) denotes the mean with respect to the random variables \(\eta ,\tilde{\eta }\). By continuing this process, one obtains a hierarchy of equations, the so-called BBGKY hierarchy (Cercignani 2012; Cercignani et al. 1994), describing the dynamics of the system of a large number of interacting agents. It is a standard assumption to neglect correlations, implying that

$$\begin{aligned} P_2(x_i,w_i, x_j, w_j,t)=P_1(x_i, w_i,t)P_1(x_j, w_j,t). \end{aligned}$$

By scaling time and performing the thermodynamical limit \(N\rightarrow \infty \), one can use standard methods of kinetic theory (Cercignani 2012; Cercignani et al. 1994) to show that the time evolution of the one-agent distribution function f is governed by the following non-Maxwellian Boltzmann equation

$$\begin{aligned} \partial _t f(x,w,t) = Q(f,f)(x,w,t), \end{aligned}$$
(3)

where \(Q(f,f)\) is the so-called collisional operator

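$$\begin{aligned} Q(f,f)(x,w,t) = \Bigl \langle \int _\Omega \int _I{\mathcal {B}}(x,y) \Bigl ( \frac{1}{{}' J} f(x,{}' w,t) f(y,{}' w_*,t) - f(x,w,t) f(y,w_*,t) \Bigr )\, dw_*\, dy \Bigr \rangle . \end{aligned}$$
(4)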

Here \((x,{}' w)\) and \((y,{}' w_*)\) are the pre-interaction opinions generating the post-interaction opinions \((x,w)\) and \((y, w_*)\), and \({}' J\) is the Jacobian of the transformation \(({}' w,{}' w_*)\rightarrow (w,w_*)\). In Eq. (4), the kernel \({\mathcal {B}}(x,y) :\Omega ^2 \rightarrow \mathbb R^+\) is a given graphon. It can be thought of as the continuous equivalent of an adjacency matrix. Its use in (4) allows us to include an underlying network structure on the continuous level. Recent approaches to opinion formation modeling in the kinetic community, e.g., Albi et al. (2024, 2016, 2017), Toscani et al. (2018), take into account a graph structure via some of its statistical descriptions, for example by considering the number of connections of each agent as an additional variable. Instead, the use of a graphon kernel allows for a richer and more general description of individual connections among agents.

Solutions to Eq. (3) preserve features of the underlying microscopic interaction rule. To compute the evolution of the mean opinion, we consider the weak formulation of Eq. (3). Let \(\varphi (x,w)\) be a test function, then

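$$\begin{aligned} \frac{d}{dt}\int _{\Omega \times I} \varphi (x,w) f(x,w,t)\, dw\, dx = \int _{\Omega ^2}\int _{I^2} {\mathcal {B}}(x,y) \bigl \langle \varphi (x,w') - \varphi (x,w) \bigr \rangle f(x,w,t) f(y,w_*,t)\, dw\, dw_*\, dx\, dy. \end{aligned}$$
(5)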

Setting \(\varphi (x,w)\equiv 1\) in (5) yields conservation of mass. Furthermore, from the conservation of the microscopic average opinion (2) we see that \(\varphi (x,w) = w\) gives conservation of the mean opinion (again assuming that the interaction function is symmetric).

Using conservation of the mean opinion, we can show that for any \(\varphi (x,w) = w^\alpha \phi (x)\), where \(\phi (\,\cdot \,)\) is a suitable test function, the macroscopic quantities

(6)

are conserved in time. Indeed, from (5) choosing \(\varphi (x,w) = \phi (x) w^\alpha \) we get

From the above equation, we get—for \(\alpha = 0,1\)—conservation of any weighted macroscopic moment of order \(\alpha \) defined in (6).

To investigate the second-order moment of \(f(x,w,t)\), we introduce the quantities

$$\begin{aligned} \Lambda (x,t) :=\int _Iw f(x,w,t)\, dw, \quad \text {and} \quad \Xi (x,t) :=\int _Iw^2 f(x,w,t)\, dw, \end{aligned}$$
(7)

that is, the first- and second-order moment, respectively, with respect to w of agents with label \(x\in \Omega \). Clearly, integrating \(\Lambda (x,t)\) and \(\Xi (x,t)\) over \(\Omega \) gives us the mean opinion and the energy of the population.

We will see that for bounded graphons \({\mathcal {B}}(x,y)\) and no diffusion, that is, \(\sigma =0\), the agent distribution \(f(\,\cdot \,,w,t)\) converges toward a Dirac delta distribution centered in the initial mean opinion. This is not surprising since the opinion dynamics corresponds to a consensus formation process modulated by the graphon. Indeed, we have for all \(x \in \Omega \),

$$\begin{aligned} \begin{aligned} \frac{d}{dt}\Xi (x,t)&= \int _I\int _I\int _\Omega {\mathcal {B}}(x,y) \langle ({w'})^2 - w^2 \rangle f(x,w,t)f(y,w_*,t)\, dy\, dw\, dw_*\\&\le \Vert {\mathcal {B}}(x,y)\Vert _{L^\infty (\Omega ^2)} \int _I\int _I\int _\Omega \bigl [2\gamma ^2 P^2(x,y)(w-w_*)^2-2\gamma wP(x,y)(w-w_*)\bigr ]\\&f(x,w,t) f(y,w_*,t) \, dy\, dw\, dw_*. \end{aligned} \end{aligned}$$

Therefore, in the uniform interaction case \(P(x,y)\equiv 1\), even in the presence of non-homogeneous graphon structure supposing \(\Vert {\mathcal {B}}(x,y) \Vert _{L^\infty (\Omega \times \Omega )}>0\), we get

$$\begin{aligned} \frac{d}{dt}\Xi (x,t) \le -2\gamma (1-\gamma )\Vert {\mathcal {B}}(x,y)\Vert _{L^\infty (\Omega ^2)} \bigl (\Xi (x,t) - \Lambda ^2(x,t)\bigr ). \end{aligned}$$

Thus, we have that the second-order moment of \(f(\,\cdot \,,w,t)\) tends to its mean squared, and its variance vanishes exponentially fast. Integrating both sides with respect to \(x \in \Omega \) gives the estimate

$$\begin{aligned} \frac{d}{dt}E(t) \le -2\gamma (1-\gamma )\Vert {\mathcal {B}}(x,y)\Vert _{L^\infty (\Omega ^2)} \bigl (E(t) - m^2\bigr ), \end{aligned}$$

where

$$\begin{aligned} E(t) :=\int _{\Omega \times I} w^2 f(x,w,t)\, dx\, dw, \quad m :=\int _{\Omega \times I} w f(x,w,t)\, dx\, dw \end{aligned}$$

are the second-order moment for the whole population and the global mean opinion, respectively. The latter quantity, thanks to Eq. (6), is conserved in time. Therefore, a bounded graphon kernel implies exponential convergence of the distribution \(f(x,w,t)\) toward a Dirac delta centered at the initial mean opinion.
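The consensus mechanism described above can be illustrated with a small particle simulation. The sketch below is not the scheme used for the experiments in this paper; it is a minimal Monte Carlo illustration that assumes \(P\equiv 1\), \(\sigma = 0\), random pairing of agents, and a hypothetical bounded symmetric graphon `graphon` with values in (0, 1], used as the acceptance probability of each candidate interaction.

```python
import numpy as np

def graphon(x, y):
    # Hypothetical bounded symmetric graphon with values in (0, 1] (assumption).
    return np.exp(-np.abs(x - y))

def simulate_consensus(n_agents=2000, n_sweeps=40, gamma=0.25, seed=0):
    """Random binary interactions with P = 1 and no noise, modulated by a graphon."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n_agents)   # static positions on the graphon
    w = rng.uniform(-1.0, 1.0, n_agents)  # initial opinions
    means, variances = [w.mean()], [w.var()]
    half = n_agents // 2
    for _ in range(n_sweeps):
        # Pair agents at random; each agent appears in at most one pair per sweep.
        perm = rng.permutation(n_agents)
        i, j = perm[:half], perm[half:2 * half]
        # Accept a pair with probability given by the graphon value.
        accept = rng.uniform(size=half) < graphon(x[i], x[j])
        i, j = i[accept], j[accept]
        wi, wj = w[i], w[j]  # fancy indexing returns copies
        w[i] = wi - gamma * (wi - wj)
        w[j] = wj - gamma * (wj - wi)
        means.append(w.mean())
        variances.append(w.var())
    return np.array(means), np.array(variances)

means, variances = simulate_consensus()
```

In this setting the sample mean stays constant up to rounding, while the sample variance decreases from sweep to sweep, which is the discrete counterpart of the convergence toward a Dirac delta at the initial mean opinion.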

2.1 Derivation of a Mean-Field Description

Since large time statistical properties of the introduced kinetic model are very difficult to obtain, several reduced complexity models have been proposed. In this direction, a deeper insight into the large time distribution of the introduced kinetic model can be obtained in the quasi-invariant regime presented in Toscani (2006). The idea is to rescale both the interaction and diffusion parameters making the binary scheme (1) quasi-invariant. The idea has its roots in the so-called grazing collision limit of the classical Boltzmann equation, see Düring et al. (2009), Pareschi and Toscani (2013) and the references therein. The resulting model has the form of an aggregation–diffusion Fokker–Planck-type equation which is capable of encapsulating the information of microscopic dynamics and for which the study of asymptotic properties is typically easier.

We consider \(\epsilon \ll 1\) and introduce the following scaling

$$\begin{aligned} \gamma \mapsto \epsilon \gamma , \quad \sigma \mapsto \sqrt{\epsilon }\sigma , \end{aligned}$$

for which \(w'\approx w\) and \(w_*' \approx w_*\). Next we Taylor expand the term encoding the binary interactions in the weak form (5) of the collision operator:

$$\begin{aligned} \varphi (x,w') - \varphi (x,w)&= \partial _w \varphi (x,w)(w'-w) + \dfrac{1}{2}\partial _w^2 \varphi (x,w) (w'-w)^2 \\&\quad +\dfrac{1}{6}\partial _w^3\varphi (x,{\tilde{w}})(w'-w)^3, \end{aligned}$$

with \({\tilde{w}} \in (\min \{w,w'\},\max \{w,w'\})\). Introducing the new time variable \(\tau = \epsilon t\) and the corresponding rescaled density \(g(x,w,\tau ) = f(x,w,\tau /\epsilon )\), we can rewrite (5) as

where

(8)

and \(R_\varphi (f,f)/\epsilon \rightarrow 0\) under the hypothesis \(\langle |\eta |^3\rangle <+\infty \), see Cordier et al. (2005), Toscani (2006). Consequently, in the limit \(\epsilon \rightarrow 0^+\) we get

Therefore, integrating back by parts, we formally obtain a Fokker–Planck equation for the evolution of the distribution \(g(x,w,\tau )\)

$$\begin{aligned} \partial _\tau g(x,w,\tau )&= \gamma \partial _w \bigl [ {\mathcal {K}}[g](x,w,\tau )g(x,w,\tau ) \bigr ] \\&\quad +\frac{\sigma ^2}{2}\partial _{w}^2 \bigl [ {\mathcal {H}}[g](x,\tau ) D^2(x,w) g(x,w,\tau )\bigr ], \end{aligned}$$
(9)

where

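$$\begin{aligned} \begin{aligned} {\mathcal {K}}[g](x,w,\tau )&:=\int _\Omega \int _I{\mathcal {B}}(x,y)P(x,y)(w - w_*)\, g(y,w_*,\tau )\, dw_*\, dy,\\ {\mathcal {H}}[g](x,\tau )&:=\int _\Omega \int _I{\mathcal {B}}(x,y)\, g(y,w_*,\tau )\, dw_*\, dy. \end{aligned} \end{aligned}$$
(10)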

The operator \({\mathcal {K}}\) corresponds to the network-modulated compromise process, while the operator \({\mathcal {H}}\) corresponds to the network-weighted density \(g(x,w,\tau )\). Equation (9) is complemented with no-flux boundary conditions for all \(x \in \Omega \):

$$\begin{aligned} \begin{aligned} \gamma {\mathcal {K}}[g](x,w,\tau )g(x,w,\tau ) +\frac{\sigma ^2}{2} \partial _w \bigl [{\mathcal {H}}[g](x,\tau ) D^2(x,w) g(x,w,\tau )\bigr ]\Big |_{w=\pm 1}&= 0,\\ {\mathcal {H}}[g](x,\tau ) D^2(x,w) g(x,w,\tau )\Big |_{w=\pm 1}&= 0. \end{aligned} \end{aligned}$$
(11)

This choice of boundary conditions ensures that system (9)–(11) shares the same conservation properties as its microscopic kinetic counterpart. Indeed, we see that the mean opinion is conserved since

(12)

where we dropped for clarity the dependence on x, w and t for the operators \({\mathcal {K}}[g]\) and \({\mathcal {H}}[g]\). Furthermore, any macroscopic quantity of the form

(13)

is conserved in time.
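The role of the no-flux boundary conditions can be illustrated with a simple conservative discretization. The following sketch is an assumption-laden toy example, not the numerical method of this paper: it drops the graphon variable (that is, \({\mathcal {B}}\equiv 1\) and \(P\equiv 1\)), uses \(D(w)=\sqrt{1-w^2}\), and advances the flux form of the equation with an explicit finite-volume step, upwinding for the drift and zero flux at \(w=\pm 1\). All names and parameter values are hypothetical.

```python
import numpy as np

def fokker_planck_step(g, w, dw, dt, gamma, sigma2):
    """One explicit finite-volume step of the x-independent model (B = P = 1)."""
    mass = g.sum() * dw
    mean = (w * g).sum() * dw
    faces = 0.5 * (w[:-1] + w[1:])                  # interior cell faces
    drift = gamma * (mass * faces - mean)           # K[g] evaluated at the faces
    upwind = np.where(drift > 0.0, g[1:], g[:-1])   # transport toward the mean
    diff = 0.5 * sigma2 * mass * np.diff((1.0 - w ** 2) * g) / dw
    flux = np.zeros(len(g) + 1)                     # boundary fluxes stay zero
    flux[1:-1] = drift * upwind + diff
    return g + dt * np.diff(flux) / dw

n_cells, gamma, sigma2 = 100, 1.0, 0.5
dw = 2.0 / n_cells
w = -1.0 + (np.arange(n_cells) + 0.5) * dw          # cell centers in (-1, 1)
g = np.where(np.abs(w - 0.3) < 0.4, 1.0, 0.0)       # nonnegative initial bump
g /= g.sum() * dw                                   # normalize to unit mass
dt = 0.2 * dw ** 2 / sigma2                         # conservative explicit step
initial_mass = g.sum() * dw
for _ in range(5000):
    g = fokker_planck_step(g, w, dw, dt, gamma, sigma2)
final_mass = g.sum() * dw
```

Because the update is written as a difference of face fluxes and both boundary fluxes vanish, the discrete mass is conserved up to rounding, mirroring the conservation properties discussed above; the small time step is chosen so that the explicit update keeps the density nonnegative.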

2.2 Large Time Agent Distribution

Due to the presence of a general compromise propensity \(P(\,\cdot \,,\,\cdot \,)\), a closed solution to Eq. (9) is difficult to obtain. Nevertheless, under suitable assumptions on the graphon structure and on the diffusion function, we can write down a closed formulation for the large time agent distribution of (9).

In the following, we restrict our analysis to the simplified situation where the interactions are homogeneous, i.e., \(P(x,y)\equiv 1\), and the diffusion function is defined as

$$\begin{aligned} D(x,w) = \sqrt{1-w^2} \quad \text {for all}\quad x \in \Omega , \end{aligned}$$
(14)

We recall that this choice of diffusion function ensures that w stays within the domain \(I\). Furthermore, we suppose separability of the graphon \({\mathcal {B}}(x,y)\), which corresponds to

$$\begin{aligned} {\mathcal {B}}(x,y) = {\mathcal {B}}_1(x){\mathcal {B}}_2(y). \end{aligned}$$
(15)

From the modeling point of view, this choice is coherent with relevant examples of graphon structures, like the graphon associated with the case of scale-free networks as proposed in Borgs et al. (2014). Indeed, in this case we have

$$\begin{aligned} {\mathcal {B}}(x,y) = (xy)^{-\alpha }, \quad 0< \alpha < 1 \end{aligned}$$

satisfying the introduced separability assumption. Network structures that are found commonly in life and social sciences are often modeled using scale-free networks (Van Der Hofstad 2016; Barabási 2009; Barabási and Albert 1999), i.e., simple graphs whose degree distribution possesses fat tails.

From (13), we define the weighted mass and momentum as

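$$\begin{aligned} \rho :=\int _{\Omega \times I} {\mathcal {B}}_2(x)\, g(x,w,\tau )\, dw\, dx, \quad \mu :=\int _{\Omega \times I} {\mathcal {B}}_2(x)\, w\, g(x,w,\tau )\, dw\, dx. \end{aligned}$$
(16)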

Note that both quantities \(\rho \) and \(\mu \) are conserved in time.

Assuming relation (15) holds, the steady state \(g^\infty (x,w)\) of the Fokker–Planck model (9) satisfies the following equation

Due to mass conservation and definitions (16), we can simplify it further and obtain

$$\begin{aligned} \gamma {\mathcal {B}}_1(x) (\rho w - \mu )\, g^\infty (x,w) + \dfrac{\sigma ^2}{2}\partial _w \left[ {\mathcal {B}}_1(x)\rho \, (1-w^2)\, g^\infty (x,w)\right] = 0. \end{aligned}$$

For our particular choice of diffusion function, that is, (14), we can compute the steady state \(g^\infty (x,w)\) explicitly, see Toscani (2006). In particular, setting \(\lambda = \sigma ^2/\gamma \), we get

$$\begin{aligned} g^\infty (x,w)= & {} \frac{\Gamma (2/\lambda )2^{1 - 2/\lambda }}{\Gamma \Bigl (\frac{1 + \mu /\rho }{\lambda }\Bigr )\Gamma \Bigl (\frac{1 - \mu /\rho }{\lambda }\Bigr )d_i(x)\cdot C_{{\mathcal {B}}}}\nonumber \\{} & {} {\mathcal {B}}_1(x) (1 + w)^{\frac{1 + \mu /\rho }{\lambda } - 1} (1 - w)^{\frac{1 - \mu /\rho }{\lambda } - 1}. \end{aligned}$$
(17)

which, as a function of the opinion, is a Beta distribution, weighted by the in-degree \(d_i(x)\) at \(x \in \Omega \) times a graphon-dependent constant \(C_{{\mathcal {B}}}\) which depends on the way the splitting \({\mathcal {B}}(x,y) = {\mathcal {B}}_1(x){\mathcal {B}}_2(y)\) is obtained and such that the right-hand side of Eq. (17) has unitary mass.
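As a quick sanity check of the opinion profile in (17), the sketch below evaluates only its w-dependent part, that is, the Beta-type density on \([-1,1]\) without the x-dependent prefactor, and verifies numerically that it has unit mass and mean \(\mu /\rho \). The function name, the midpoint quadrature and the parameter values are illustrative assumptions; the argument `mean` stands for the ratio \(\mu /\rho \).

```python
import math
import numpy as np

def steady_state_profile(w, mean, lam):
    """Opinion part of the steady state: a Beta-type density on [-1, 1]."""
    a = (1.0 + mean) / lam
    b = (1.0 - mean) / lam
    norm = (math.gamma(2.0 / lam) * 2.0 ** (1.0 - 2.0 / lam)
            / (math.gamma(a) * math.gamma(b)))
    return norm * (1.0 + w) ** (a - 1.0) * (1.0 - w) ** (b - 1.0)

# Midpoint rule on a fine grid of [-1, 1].
n = 200_000
dw = 2.0 / n
w = -1.0 + (np.arange(n) + 0.5) * dw
profile = steady_state_profile(w, mean=0.2, lam=0.5)
total_mass = profile.sum() * dw          # should be close to 1
mean_opinion = (w * profile).sum() * dw  # should be close to mean = 0.2
```

The values \(\lambda = 0.5\) and \(\mu /\rho = 0.2\) are chosen so that both exponents are positive and the profile is bounded; for \(\lambda \) large the profile becomes singular at \(w = \pm 1\), which corresponds to the polarized regime.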

Model parameters appearing in the steady state allow us to gain insight into the shape and other characteristics of the equilibrium opinion distribution: For instance, when \(\mu = 0\), \(g^\infty \) is an even function of w, so that the population has a neutral opinion on average. On the other hand, the balance between compromise and self-thinking dynamics expressed by the parameter \(\lambda \) tells us that if self-thinking is much stronger than compromise, i.e., \(\lambda \gg 1\), then the population tends to polarize at the extremes, approaching a mixture of Dirac deltas at the boundary points of \(I\). We refer the interested reader to Toscani (2006) for an in-depth analysis of the roles of the interaction parameters on the equilibrium distribution.

2.3 Analytical Properties

We continue by discussing some analytical properties for solutions to (9). First we show that (9) preserves the \(L^1\) regularity. To this end, we may rewrite Eq. (9) as

$$\begin{aligned} \partial _\tau g(x,w,\tau )= & {} \gamma \rho _P(x,\tau ) \partial _w \bigl [(w - \mu _P(x,\tau ))g(x,w,\tau )\bigr ]\nonumber \\{} & {} +\frac{\sigma ^2}{2}{\mathcal {H}}[g](x,\tau ) \partial _w^2\bigl [D^2(w)g(x,w,\tau )\bigr ]. \end{aligned}$$
(18)

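Note that we will again consider a specific form of diffusion, that is, \( D(w)=\sqrt{1-w^2}\). Furthermore, we introduce the following quantities, dependent on the compromise propensity function:

$$\begin{aligned} \rho _P(x,\tau ) :=\int _\Omega \int _I{\mathcal {B}}(x,y)P(x,y)\, g(y,w_*,\tau )\, dw_*\, dy, \quad \mu _P(x,\tau ) :=\frac{1}{\rho _P(x,\tau )}\int _\Omega \int _I{\mathcal {B}}(x,y)P(x,y)\, w_*\, g(y,w_*,\tau )\, dw_*\, dy. \end{aligned}$$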

Note that \(\mu _P(\,\cdot \,, \tau )\) is well defined since we are considering P to be positive almost everywhere and the graphon \({\mathcal {B}}\) to be nonnegative as well. Next, we take, for a given parameter \(\xi \), a regularized non-decreasing approximation of the sign function \({{\,\textrm{sgn}\,}}_\xi (\,\cdot \,)\), and then introduce the antiderivative of \({{\,\textrm{sgn}\,}}_\xi [g (x,w,\tau )](w)\) for every \(w \in I\) as the function \(|g (x,w,\tau )|_\xi (w)\), where we stress the dependence of these functions on the variable w. Now, let us fix \(x \in \Omega \), multiply each side of Eq. (18) by \({{\,\textrm{sgn}\,}}_\xi [g (x,w,\tau )](w)\) and integrate with respect to w. This gives

where we used the boundary conditions (11). Since

we can substitute this expression and obtain

Now, the first integrand on the right-hand side vanishes as \(\xi \rightarrow 0^+\) if we integrate by parts one more time (since by construction \(\lim _{\xi \rightarrow 0^+} {{\,\textrm{sgn}\,}}_\xi [g (x,w,\tau )](w)g (x,w,\tau )= |g (x,w,\tau )|(w)\) for almost every \(w \in I\)). This leaves us with the second integrand, which is nonnegative since we chose a non-decreasing approximation of the sign function. Finally, since graphons are nonnegative by definition at all points in their domain, we conclude that

$$\begin{aligned} \frac{d}{d\tau } \Vert g (x,w,\tau )\Vert _{L^1([-1,1])} = \lim _{\xi \rightarrow 0^+} \frac{d}{d\tau } \int _I|g (x,w,\tau )|_\xi (w)\, dw \le 0 \end{aligned}$$

for all \(x \in \Omega \). This implies that an initial datum in \(L^1(I)\) would ensure that \(g (x,w,\tau )\in L^1(I)\) for all \(\tau > 0\).

Remark 1

The weak contractivity of the \(L^1\) norm with respect to the opinion also allows us to prove uniqueness of solutions to (18). The proof is based on contradiction; assume there exist two solutions \(g (x,w,\tau )\) and \(s(x,w,\tau )\) and evaluate the regularized modulus of their difference for each point \(x \in \Omega \). If we fix \(x \in \Omega \), then due to linearity with respect to w we have that \(g (x,w,\tau )- s(x,w,\tau )\) is a solution to Eq. (18), too. Therefore,

$$\begin{aligned} \lim _{\xi \rightarrow 0^+}\frac{d}{d\tau } \int _I|(g - s)(x,w,\tau )|_\xi \, dw \le 0, \end{aligned}$$

which implies that, at \(x \in \Omega \), \(g = s\) for almost all \(w \in I\) and \(\tau > 0\) since by construction we have \(g(x,w,0) = s(x,w,0)\) for all \(x \in \Omega \) and \(w \in I\). The claim then follows since \(x\in \Omega \) was chosen arbitrarily.

Remark 2

The weak contractivity of the \(L^1\) norm with respect to w (i.e., the norm is not increased in time) gives us as a corollary that the model (18) is positivity-preserving. The claim follows noting that its solution \(g (x,w,\tau )\) has a vanishing negative part if the initial datum is nonnegative. Indeed, we can express the negative part of \(g (x,w,\tau )\) via the regularization we introduced earlier, that is,

$$\begin{aligned} g ^-_\xi (x,w,\tau ) = \frac{1}{2} (|g (x,w,\tau )|_\xi (w)- g (x,w,\tau )), \quad \text {for all}\quad x \in \Omega \end{aligned}$$

This way, if we integrate with respect to w, we have

$$\begin{aligned} 2\frac{d}{d\tau } \int _Ig ^-_\xi (x,w,\tau )\, dw = \underbrace{\frac{d}{d\tau }\int _I|g (x,w,\tau )|_\xi (w)\, dw}_{\le 0} {}+ 0, \quad \text {for all}\quad x\in \Omega , \end{aligned}$$

thanks to the first boundary condition in (11). Then it holds

$$\begin{aligned} \frac{d}{d\tau } \int _Ig ^- (x,w,\tau )\, dw = \lim _{\xi \rightarrow 0^+} \frac{d}{d\tau } \int _Ig ^-_\xi (x,w,\tau )\, dw \le 0, \end{aligned}$$

for all \(x \in \Omega \), which implies that the non-negativity of the initial datum is preserved by the model (18).

We conclude this section by extending the previous regularity result—if the initial datum is in \(L^{p}(I)\), \(p>1\) at \(x\in \Omega \) then the solution \(g \in L^p(I)\) at x for all \(\tau > 0\). We will use both the positivity-preserving and the \(L^1\) regularity of its solution in the following.

We note that we only show \(L^p\)-regularity for \(p \ge 2\), since the result then also holds for \(p \in (1, 2)\) due to the boundedness of the interval \(I\). The idea is to rewrite Eq. (18) and impose the associated no-flux boundary conditions in order to estimate the time evolution of \(\Vert g\Vert _{L^p}\) using integration by parts. The hypothesis that p is greater than or equal to 2 is needed since we will take the derivative of \(g ^{p-1} (x,w,\tau )\) under the integral sign, but as stated above it is not restrictive.

We define the right-hand side of (18) as \({\mathcal {Q}}(g,g)(x,w,\tau )\), that is,

$$\begin{aligned} {\mathcal {Q}}(g,g)(x,w,\tau ):= & {} \gamma \rho _P(x,\tau )\biggl [\partial _w\bigl ([(1 - \sigma ^2)w - \mu _P(x,\tau )]g (x,w,\tau )\bigr )\\{} & {} +\frac{{\mathcal {H}}[g](x,\tau )\sigma ^2}{2\gamma \rho _P(x,\tau )}\partial _w\Bigl ((1 - w^2)\partial _wg (x,w,\tau )\Bigr )\biggr ]. \end{aligned}$$

Suppose that the following no-flux boundary conditions hold for all \(x \in \Omega \) and for all \(\tau > 0\)

$$\begin{aligned} \begin{aligned} \gamma \rho _P(x,\tau )\bigl [((1 - \sigma ^2)w - \mu _P(x,\tau ))g (x,w,\tau )\bigr ]\Big |_{w=\pm 1}&= 0,\\ \frac{{\mathcal {H}}[g](x,\tau )\sigma ^2}{2\gamma \rho _P(x,\tau )}\bigl [(1 - w^2) \partial _w g (x,w,\tau )\bigr ]\Big |_{w=\pm 1}&= 0. \end{aligned} \end{aligned}$$
(19)

Now let us multiply each side of Eq. (9) by \(g ^{p-1} (x,w,\tau )\) and integrate with respect to w

$$\begin{aligned} \begin{aligned}&\frac{1}{p} \frac{d}{d\tau }\Vert g (x,w,\tau )\Vert _{L^p([-1, 1])}^p = \int _I{\mathcal {Q}}(g,g)g ^{p-1} (x,w,\tau )\, dw\\&\quad = \underbrace{\gamma \rho _P(x,\tau )\int _I\frac{\partial }{\partial w}\bigl ([(1 - \sigma ^2)w - \mu _P(x,\tau )]g (x,w,\tau )\bigr )g ^{p-1} (x,w,\tau )\, dw}_{:={\mathcal {T}}_1}\\&\qquad + \underbrace{\frac{{\mathcal {H}}[g](x,\tau )\sigma ^2}{2\gamma \rho _P(x,\tau )}\int _I\frac{\partial }{\partial w}\Bigl ((1 - w^2)\frac{\partial }{\partial w}g (x,w,\tau )\Bigr )g ^{p-1} (x,w,\tau )\, dw}_{:={\mathcal {T}}_2}. \end{aligned} \end{aligned}$$

Now the goal is to show that \({\mathcal {T}}_2\) is non-positive, so that it can be ignored in the estimate, and then focus on \({\mathcal {T}}_1\). Starting with \({\mathcal {T}}_2\), we integrate by parts and use the second boundary condition in (19) to obtain

$$\begin{aligned} {\mathcal {T}}_2 = -\frac{{\mathcal {H}}[g](x,\tau )\sigma ^2(p-1)}{2\gamma \rho _P(x,\tau )}\int _I(1 - w^2) (\partial _w g (x,w,\tau ))^2 g ^{p - 2} (x,w,\tau )\, dw \le 0, \end{aligned}$$

since \(w \in I\), \(g (x,w,\tau )\) is nonnegative and \(\rho _P(x,\tau )\) is nonnegative at all \(x \in \Omega \). Next we consider \({\mathcal {T}}_1\) and derive two different estimates for it. If we expand the derivative with respect to w under the integral sign, we have

$$\begin{aligned} {\mathcal {T}}_1= & {} \gamma \rho _P(x,\tau )(1 - \sigma ^2)\int _Ig ^p (x,w,\tau )\, dw \\{} & {} + \gamma \rho _P(x,\tau )\int _I\bigl [[(1 - \sigma ^2)w - \mu _P(x,\tau )] \partial _wg (x,w,\tau )g ^{p-1} (x,w,\tau )\bigr ]\, dw. \end{aligned}$$

On the other hand, we can also integrate by parts and use the first boundary condition in (19) to get

$$\begin{aligned} {\mathcal {T}}_1 = -\gamma \rho _P(x,\tau )(p - 1)\int _I[(1 - \sigma ^2)w - \mu _P(x,\tau )] \partial _w g (x,w,\tau )g ^{p-1} (x,w,\tau )\, dw. \end{aligned}$$

If we now use the identity

$$\begin{aligned} {\mathcal {T}}_1 = \frac{p-1}{p} {\mathcal {T}}_1 + \frac{1}{p} {\mathcal {T}}_1 \end{aligned}$$

and replace \({\mathcal {T}}_1\) in the first term by the expanded expression and in the second term by the expression obtained by integration by parts, the integrals involving \(\partial _w g\) cancel and we obtain

$$\begin{aligned} {\mathcal {T}}_1 = \frac{p-1}{p}\,\gamma \rho _P(x,\tau )(1 - \sigma ^2)\int _Ig ^p (x,w,\tau )\, dw. \end{aligned}$$

Putting everything together, and recalling that \({\mathcal {T}}_2 \le 0\), we deduce that

$$\begin{aligned} \frac{d}{d\tau }\Vert g (x,w,\tau )\Vert _{L^p([-1, 1])}^p \le (p-1)\gamma \rho _P(x,\tau )(1 - \sigma ^2)\Vert g (x,w,\tau )\Vert _{L^p([-1, 1])}^p \end{aligned}$$

for a given \(x \in \Omega \). Then Gronwall’s lemma implies that if the initial datum belongs to \(L^p([-1, 1])\) at \(x \in \Omega \), then \(g (x,w,\tau )\in L^p([-1, 1])\) at x for all \(\tau > 0\).
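
The agreement between the expanded form and the integrated-by-parts form of \({\mathcal {T}}_1\) used above can be checked numerically. The following Python sketch does so for a hypothetical test profile \(g(w) = (1-w^2)^2\), which vanishes at \(w = \pm 1\) and therefore satisfies the first boundary condition in (19). The parameter values are arbitrary illustrations and are not taken from the paper; Gauss–Legendre quadrature is used because all integrands are polynomials, so the integrals are exact up to rounding.

```python
import numpy as np

def t1_two_ways(p=3, gamma=0.5, rho=1.0, sigma2=0.2, mu=0.1, n_nodes=40):
    """Evaluate T_1 in its expanded and integrated-by-parts forms.

    Assumption (illustration only): g(w) = (1 - w^2)^2 on [-1, 1],
    with constant rho_P = rho and mu_P = mu at the fixed network position x.
    """
    w, weights = np.polynomial.legendre.leggauss(n_nodes)
    g = (1.0 - w**2) ** 2
    dg = -4.0 * w * (1.0 - w**2)            # derivative of g with respect to w
    drift = (1.0 - sigma2) * w - mu         # (1 - sigma^2) w - mu_P

    integral_gp = np.sum(weights * g**p)
    integral_mixed = np.sum(weights * drift * dg * g ** (p - 1))

    expanded = gamma * rho * (1.0 - sigma2) * integral_gp + gamma * rho * integral_mixed
    by_parts = -gamma * rho * (p - 1) * integral_mixed
    return expanded, by_parts

expanded, by_parts = t1_two_ways()
print(expanded, by_parts)   # the two values should coincide up to rounding
```

The same check can be repeated for other integer values of p by changing the first argument.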

3 Declustering: Preventing Consensus via Control Strategies

In this section, we focus on control strategies to prevent consensus, in particular the consensus driven by the compromise process. This objective differs from more common optimal control strategies, which would, for example, steer the average opinion to a given target (Albi et al. 2017, 2015, 2016). To this end, we propose an additional controlled interaction that prevents the formation of opinion clusters. In particular, we consider a convex combination of two updates weighted by the parameter \(\theta \in (0, 1)\) such that a fraction \(1-\theta \) of the population follows an opinion transition of the type (1), whereas a fraction of size \(\theta \) follows an opinion update given by a controlled interaction of the form

$$\begin{aligned} w'' = w - \gamma u^* S(w), \end{aligned}$$
(20)

where \(u^*\) is an agent-based control arising from the solution of a suitable optimization problem and \(S(w) \ge 0\) is a suitable selection function dependent on the opinion. In particular, the control minimizes a suitable convex cost functional \({\mathcal {J}}\) over the set \({\mathcal {U}}\) of admissible controls, which in our case are those for which the post-interaction opinion \(w''\) stays within the interval \(I\), that is,

$$\begin{aligned} u^* = {\mathop {\mathrm {arg\,min}}\limits _{u \in {\mathcal {U}}}}\, {\mathcal {J}}(w''_*, u). \end{aligned}$$
(21)

The quantity \(w''_*\) appearing on the right-hand side of (21) is a virtual update which the functional \({\mathcal {J}}\) would be subject to: In fact, the optimal control \(u^*\) is the solution of the following optimization problem

$$\begin{aligned} \left\{ \begin{aligned} u^*&= \mathop {\mathrm {arg\,min}}\limits _{u \in {\mathcal {U}}} \left( \frac{1}{2} \left( w''_* - m\right) ^2 + \frac{\nu }{2} u^2 \right) ,\\ w''_*&= w +\gamma u^* S(w), \end{aligned} \right. \end{aligned}$$
(22)

where m denotes the average opinion of the population on the network at time t and \(\nu > 0\) is a regularization parameter. Notice that the actual update \(w''\) and the virtual update \(w''_*\) have an opposite effect on w. Solving the associated Lagrange multiplier problem

$$\begin{aligned} \left( w''_* - m\right) \frac{\partial w''_*}{\partial u^*} + \nu u^* = 0, \end{aligned}$$

gives

$$\begin{aligned} u^* = -\frac{\gamma S(w)}{\nu + \gamma ^2 S^2(w)} (w - m), \end{aligned}$$

and so, the resulting interaction is

$$\begin{aligned} w'' = w + \frac{\gamma ^2 S^2(w)}{\gamma ^2 S^2(w)+ \nu } (w - m). \end{aligned}$$
(23)
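
As an illustration of (20)–(23), the following Python sketch evaluates the optimal control and the resulting controlled update for a single agent, and compares the closed-form control with the cost in (22). It is a minimal sketch under stated assumptions: the selection function `unit_selection` (\(S \equiv 1\)) is a hypothetical placeholder and not the selection function of the paper, the admissibility constraint on \({\mathcal {U}}\) is ignored, and the parameter values are arbitrary.

```python
def unit_selection(w):
    """Hypothetical placeholder selection function S(w) = 1 (an assumption)."""
    return 1.0

def optimal_control(w, m, gamma, nu, S=unit_selection):
    """Closed-form minimizer u* of the cost in (22), without constraints."""
    s = S(w)
    return -gamma * s * (w - m) / (nu + gamma**2 * s**2)

def controlled_update(w, m, gamma, nu, S=unit_selection):
    """Actual update (20): w'' = w - gamma * u* * S(w)."""
    return w - gamma * optimal_control(w, m, gamma, nu, S) * S(w)

def cost(u, w, m, gamma, nu, S=unit_selection):
    """Cost in (22), evaluated on the virtual update w + gamma * u * S(w)."""
    w_virtual = w + gamma * u * S(w)
    return 0.5 * (w_virtual - m) ** 2 + 0.5 * nu * u**2

w, m, gamma, nu = 0.3, 0.0, 0.5, 0.1          # arbitrary illustrative values
u_star = optimal_control(w, m, gamma, nu)
w_new = controlled_update(w, m, gamma, nu)
print(u_star, w_new)
```

In this sketch the virtual update moves the opinion toward the mean m, while the actual update (20) moves it away from m, which is the declustering mechanism described above.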

In particular, in the rest of the paper we focus on the selection function

(24)

Remark 3

Thanks to the presence of the indicator function in (24), we can verify that \(w'' \in [-1, 1]\) and therefore that the controlled interaction is admissible.

We recall that we balance the two types of interactions: At rate \(1-\theta \), agents update their opinion according to (1); at rate \(\theta \), they interact with the external control (23). This yields a kinetic model whose right-hand side is a convex combination of two non-Maxwellian operators

$$\begin{aligned} \partial _t f(x,w,t) = (1- \theta ) Q(f,f)(x,w,t) + \theta Q_u(f)(x,w,t). \end{aligned}$$
(25)

The role of the parameter \(\theta \in [0,1]\) is to model the frequency at which the different kinds of interaction take place: It can be thought of as the percentage of automated users (e.g., bots programmed by a third party) on the network. The operator \(Q(f,f)\) is the same as the one introduced in Eq. (4); the operator \(Q_u(f)(x,w,t)\), instead, encodes the controlled update of agents’ opinions as prescribed by the elementary interaction (23) and is therefore given by

(26)

for any test function \(\varphi (\,\cdot \,,\,\cdot \,)\).

3.1 Mean-Field Limit of the Controlled Model

In this section, we explicitly show how the introduced control is capable of breaking consensus at the mean-field level for a suitable choice of the penalization. We proceed as in Sect. 2.1 to derive a more approachable mean-field limit of Eq. (25), using the same scaling

$$\begin{aligned} \tau \rightarrow t/\epsilon , \quad \gamma \rightarrow \gamma \epsilon , \quad \sigma ^2 \rightarrow \sigma ^2\epsilon , \quad \nu \rightarrow \kappa \epsilon , \end{aligned}$$
(27)

for a certain \(\kappa \in \mathbb R^+\). In this case, particular care is needed in treating the weak form (26) due to the presence of the indicator function in the interaction (23). Using Taylor expansion like we did in Sect. 2.1 yields

Here, \({\mathcal {K}}[g]\), \({\mathcal {H}}[g]\) and \(R_\phi (g,g)\) are the same operators as in Eqs. (10) and (8) in Sect. 2.1, while we denote

which can be shown to go to zero with computations analogous to the ones for \(R_\phi (g,g)\). We continue with the term \(\theta {\mathcal {A}}[g](x,w,\tau )/\epsilon \):

since in the limit \(\epsilon \rightarrow 0^+\) we have

where we recall that \(\kappa > 0\) and \(|m| \le 1\). Therefore, we obtain the following Fokker–Planck equation

$$\begin{aligned} \partial _\tau g(x,w,\tau )= & {} \gamma \partial _w \bigl [ {\mathcal {K}}_\theta ^u[g](x,w,\tau )g(x,w,\tau )\bigr ]\nonumber \\{} & {} + \frac{\sigma ^2}{2}\partial _{w}^2 \bigl [ {\mathcal {H}}_\theta [g](x,\tau ) D^2(x,w) g(x,w,\tau )\bigr ], \end{aligned}$$
(28)

where this time we define

(29)

Here \(d_i(x)\) is the in-degree of x as defined in Sect. A and \({\mathcal {H}}_\theta [g] :=(1 - \theta ){\mathcal {H}}[g]\). The associated boundary conditions which are necessary to perform the integration by parts are

$$\begin{aligned} \begin{aligned} \gamma \bigl [ {\mathcal {K}}_\theta ^u[g](x,w,\tau )g(x,w,\tau ) \bigr ] +\frac{\sigma ^2}{2} \partial _w \bigl [{\mathcal {H}}_\theta [g](x,\tau ) D^2(x,w) g(x,w,\tau )\bigr ]\Big |_{w=-1}^{w=1}&= 0,\\ {\mathcal {H}}_\theta [g](x,\tau ) D^2(x,w) g(x,w,\tau )\Big |_{w=-1}^{w=1}&= 0. \end{aligned} \end{aligned}$$
(30)

Remark 4

The mean opinion of the population is preserved. Indeed, multiplying each side of Eq. (28) by \(w\) and then integrating by parts we get

thanks to Eqs. (29) and (12). The evolution of the second-order moment is trickier, due to the presence of a general graphon kernel and of the compromise propensity function. In the case of the specific diffusion function \(D(\,\cdot \,,\,\cdot \,)\) in (14), we can simplify the expression and obtain

(31)

where we note \(\Phi (x,\tau ) :=\int _Ig(x,w,\tau )\, dw\). Equation (31) is still quite general: Further insights into the trend of the second-order moment can be found in the specialized setting of Remark 6.

3.2 Effects of the Penalization Coefficient on the Quasi-Equilibrium Distribution

In the following, we will compute the quasi-equilibrium distribution of (28). This corresponds to solving

$$\begin{aligned} \gamma \partial _w \bigl [ {\mathcal {K}}_\theta ^u[g](x,w,\tau )g(x,w,\tau ) \bigr ] + \frac{\sigma ^2}{2}\partial _{w}^2 \bigl [ {\mathcal {H}}_\theta [g](x,\tau ) D^2(x,w) g(x,w,\tau )\bigr ] = 0 \end{aligned}$$

which we can write in closed form (due to the no-flux boundary conditions (30)). Then the quasi-equilibrium distribution \(g^\textrm{qe}\) is given by

$$\begin{aligned} g^\textrm{qe}(x,w,\tau ) = C\, \exp \left( -\int _{-1}^w \frac{\gamma {\mathcal {K}}_\theta ^u[g](x,v,\tau ) + \sigma ^2/2 \, {\mathcal {H}}_\theta [g](x,\tau )\, \partial _v D^2(x,v)}{\sigma ^2/2\, {\mathcal {H}}_\theta [g](x,\tau )\, D^2(x,v)}\, dv\right) , \end{aligned}$$

where C is a normalizing constant and under the formal assumption that \({\mathcal {H}}[g](x,\tau ) D^2(x,v)\) can be taken to be nonzero almost everywhere at \(x\in \Omega \) and on \([-1, 1]\).

To compare the controlled case to the one obtained in Eq. (17), we focus on the case \(D(x,w) = \sqrt{1 - w^2}\), and having chosen a unitary compromise propensity function, we can write

$$\begin{aligned} g^\textrm{qe}(x,w,\tau ) = C (1 - w)^{\alpha _-} (1 + w)^{\alpha _+}, \end{aligned}$$
(32)

where we define

$$\begin{aligned} \begin{aligned} \alpha _-(\theta ,\gamma ,\sigma ,\kappa ,\tau )&:=\frac{\gamma \bigl [\kappa (1-\theta )(\rho _P(x,\tau ) - \mu _P(x,\tau )) - \gamma \theta d_i(x)(1 - m)\bigr ]}{\kappa \rho _P(x,\tau )(1-\theta )\sigma ^2} - 1,\\ \alpha _+(\theta ,\gamma ,\sigma ,\kappa ,\tau )&:=\frac{\gamma \bigl [\kappa (1-\theta )(\rho _P(x,\tau ) + \mu _P(x,\tau )) - \gamma \theta d_i(x)(1 + m)\bigr ]}{\kappa \rho _P(x,\tau )(1-\theta )\sigma ^2} - 1. \end{aligned} \end{aligned}$$

If we fix \(x \in \Omega \), the quasi-equilibrium state of Eq. (32) is a Beta distribution with respect to the variable w. In fact, taking \(\theta = 0\) and a separable graphon kernel \({\mathcal {B}}(x,y)\) in Eq. (32) gives us precisely the steady state we found for the uncontrolled problem, given by (17).

We can exploit our knowledge of the quasi-equilibrium to influence the level of declustering of the system: Indeed, the opinion distribution is in its least clustered form when it is uniform, i.e., when \(\alpha _- = \alpha _+ = 0\). If we impose these constraints, we can solve them for the penalty term \(\kappa \) as a function of the network position \(x\in \Omega \) and of time \(\tau \), i.e., \(\kappa =\kappa (x,\tau )\). We obtain

$$\begin{aligned} \left\{ \begin{aligned} \kappa (x,\tau )&= \frac{\gamma ^2\theta d_i(x)}{\rho _P(x,\tau )(1-\theta )(\gamma - \sigma ^2)}\\ m&= 0\\ \mu _P(x,\tau )&= 0, \end{aligned} \right. \end{aligned}$$
(33)

which, under the further hypothesis that \({\mathcal {B}}(x,y) = {\mathcal {B}}_1(x){\mathcal {B}}_2(y)\) and \(P(x,y) \equiv 1\), simplifies to

$$\begin{aligned} \bar{\kappa }= \frac{\gamma ^2\theta }{(1-\theta )(\gamma - \sigma ^2)}. \end{aligned}$$
(34)
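
The relation between the exponents in (32) and the penalty (33) can be verified directly. The following Python sketch evaluates \(\alpha _\pm \) and \(\kappa (x,\tau )\) at a single network position; the function names and the numerical values of \(\rho _P\), \(d_i\), \(\gamma \), \(\sigma ^2\) and \(\theta \) are hypothetical and chosen only for illustration. Note that, from the formula itself, \(\kappa > 0\) requires \(\gamma > \sigma ^2\).

```python
def beta_exponents(theta, gamma, sigma2, kappa, rho, mu, d_in, m):
    """Exponents alpha_- and alpha_+ of the quasi-equilibrium (32)."""
    denom = kappa * rho * (1.0 - theta) * sigma2
    alpha_minus = gamma * (kappa * (1.0 - theta) * (rho - mu)
                           - gamma * theta * d_in * (1.0 - m)) / denom - 1.0
    alpha_plus = gamma * (kappa * (1.0 - theta) * (rho + mu)
                          - gamma * theta * d_in * (1.0 + m)) / denom - 1.0
    return alpha_minus, alpha_plus

def declustering_penalty(theta, gamma, sigma2, rho, d_in):
    """Penalty kappa(x, tau) from (33); assumes gamma > sigma2 and theta < 1."""
    return gamma**2 * theta * d_in / (rho * (1.0 - theta) * (gamma - sigma2))

# Arbitrary illustrative values at one network position x.
theta, gamma, sigma2, rho, d_in = 0.3, 0.5, 0.1, 0.8, 1.2
kappa = declustering_penalty(theta, gamma, sigma2, rho, d_in)
print(beta_exponents(theta, gamma, sigma2, kappa, rho, mu=0.0, d_in=d_in, m=0.0))
```

With \(m = 0\) and \(\mu _P = 0\), both exponents vanish up to rounding, so that (32) reduces to a constant in w. Setting `rho == d_in` in `declustering_penalty` reproduces the simplified value (34).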

Remark 5

The constraint \(m = 0\) in (33) is necessary to have a uniform distribution over a symmetric domain, while imposing \(\mu _1(x,\tau ) = 0\) is needed to have \(\alpha _- =0\) and \(\alpha _+ = 0\) simultaneously.

When \(g \approx U(\Omega \times [-1, 1])\), that is, the distribution is almost uniform over the domain \( \Omega \times [-1, 1]\), we have \(\rho _1(x,\tau ) \approx d_i(x)\), so that

$$\begin{aligned} g^\textrm{qe}(x,w,\tau ) \rightarrow g^\infty (x,w) \implies \kappa (x,\tau ) \rightarrow \bar{\kappa }, \quad \text {for all}\quad x\in \Omega . \end{aligned}$$

Finally, we stress that the choices (33) and (34) that would appear in the controlled update (20) come from the analysis of a quasi-equilibrium state for the distribution \(g(x,w,\tau )\) which has been computed from the Fokker–Planck Eq. (28). This implies that the penalty term would be effective in achieving the declustering effect only for a parameter regime in which \(\epsilon \) is sufficiently small.

Remark 6

If we consider again the evolution of the second-order moment as in (31), assume that \(P(\,\cdot \,,\,\cdot \,) \in L^\infty (\Omega \times \Omega )\), and use the \(\kappa = \kappa (x,\tau )\) defined in Eq. (33), we obtain

$$\begin{aligned} \frac{d}{d\tau } \Xi (x,\tau )\le & {} -3\Vert P(x,y)\Vert _{L^\infty (\Omega ^2)}(1-\theta ){\mathcal {H}}[g](x,\tau )\sigma ^2\\{} & {} \Bigl [\Xi (x,\tau ) - \frac{\phi (x,\tau )}{3} + \bigl (\gamma \mu _P(x,\tau ) + (\gamma - \sigma ^2)m\bigr )\Lambda (x,\tau )\Bigr ]. \end{aligned}$$

This estimate can be further simplified if \(m = \Lambda (x,\tau ) = 0\) for all \(x\in \Omega \) (a condition needed to observe a centered uniform distribution):

$$\begin{aligned} \frac{d}{d\tau } \Xi (x,\tau ) \le -3\Vert P(x,y)\Vert _{L^\infty (\Omega ^2)}(1-\theta ){\mathcal {H}}[g](x,\tau )\sigma ^2 \Bigl [\Xi (x,\tau ) - \frac{\phi (x,\tau )}{3}\Bigr ]. \end{aligned}$$

In particular, by Gronwall’s inequality we have that whenever the graphon kernel is bounded the variance converges exponentially in time toward 1/3, since integrating both sides with respect to \(x\in \Omega \) gives

$$\begin{aligned} \frac{d}{d\tau } E(\tau ) \le -3\Vert P(x,y)\Vert _{L^\infty (\Omega ^2)}\Vert {\mathcal {B}}(x,y)\Vert _{L^\infty (\Omega ^2)}(1-\theta )\sigma ^2 (E(\tau ) - 1/3), \end{aligned}$$
(35)

where 1/3 is the variance of the uniform distribution over \(\Omega \times I\).

4 Numerical Tests

We conclude by illustrating the declustering strategy with various computational experiments. We show first the consistency of the quasi-invariant limit of the controlled model (25) in the network-homogeneous case. In this case, we choose a uniform, constant graphon kernel \({\mathcal {B}}(x,y)\equiv c \in (0,1]\); in particular, \({\mathcal {B}}(x,y) \equiv 1\). In the second example, we check the quasi-invariant limit using the power-law graphon, which is separable, to model an interaction happening on a scale-free network. In the third experiment, we illustrate the dynamics for non-separable graphon kernels. All tests were performed using direct simulation Monte Carlo methods for the Boltzmann Eq. (25); we refer to Pareschi and Toscani (2013), Pareschi et al. (2019) and references therein for further details.

Table 1 List of parameters used within numerical experiments

We start describing our method by rewriting Eq. (25) in strong form as a sum of gain and loss parts:

(36)

We indicate with \(Q^\Sigma \) and \(Q_u^\Sigma \) the operators obtained replacing the graphon kernel \({\mathcal {B}}(x,y)\) with the approximated version \({\mathcal {B}}^\Sigma (x,y)\), given by

$$\begin{aligned} {\mathcal {B}}^\Sigma (x,y) :=\min \{ {\mathcal {B}}(x,y), \Sigma \}, \end{aligned}$$

where \(\Sigma \) is an upper bound for \({\mathcal {B}}(x,y)\) over \(\Omega ^2\). Whenever we consider an unbounded interaction kernel (e.g., the power-law case), we consider a suitable truncation for \(\Sigma \). If we now highlight the gain and loss parts of \(Q^\Sigma \) and \(Q_u^\Sigma \), we have

where we define

Then, we discretize the time interval [0, T] with time step \(\Delta t > 0\), denote by \(f^n(x,w)\) the approximation of \(f(x,w,n\Delta t)\), and consider the forward Euler-type scheme

$$\begin{aligned} f^{n+1} = (1 - \Sigma \Delta t) f^n + \Sigma \Delta t \frac{{\mathcal {S}}(f^n, f^n)}{\Sigma }, \end{aligned}$$

where we define

We remark that under the condition \(\Sigma \Delta t \le 1\), \(f^{n+1}\) is well defined as a probability density.
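
The role of the condition \(\Sigma \Delta t \le 1\) can be illustrated with a short Python sketch of one step of the scheme on a discretized opinion grid. It is a sketch under stated assumptions: `gain` stands for \({\mathcal {S}}(f^n,f^n)/\Sigma \) and is assumed, as is standard for schemes of this type, to be itself a probability density; here it is replaced by a hypothetical uniform density and not by the actual interaction operator of the model.

```python
import numpy as np

def euler_step(f, gain, Sigma, dt):
    """One step f^{n+1} = (1 - Sigma*dt) f^n + Sigma*dt * gain.

    `gain` is a placeholder for S(f^n, f^n)/Sigma (assumed to be a density).
    """
    if Sigma * dt > 1.0:
        raise ValueError("Sigma * dt must not exceed 1")
    return (1.0 - Sigma * dt) * f + Sigma * dt * gain

# Cell-centred grid on [-1, 1].
n_cells = 200
dw = 2.0 / n_cells
w = -1.0 + dw * (np.arange(n_cells) + 0.5)

f = np.exp(-w**2 / 0.2)
f /= f.sum() * dw                      # normalise to unit mass
gain = np.full(n_cells, 0.5)           # hypothetical stand-in density

f_next = euler_step(f, gain, Sigma=1.0, dt=0.5)
print(f_next.sum() * dw, f_next.min())  # mass stays 1, values stay nonnegative
```

Since the update is a convex combination of two densities whenever \(\Sigma \Delta t \le 1\), nonnegativity and unit mass are inherited by \(f^{n+1}\); this convex structure is also what allows a probabilistic (Monte Carlo) interpretation of the scheme.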

Table 1 presents all the parameters we used in our computational experiments. Moreover, we always fix \(D(x,w) = \sqrt{1 - w^2}\) as diffusion function and \(P(x,y) \equiv 1\) as compromise tendency function. Finally, we consider as initial distribution a state close to full consensus represented by a truncated Gaussian distribution over the interval \(I\) for all \(x \in \Omega \)

$$\begin{aligned} f_0(w) ={\left\{ \begin{array}{ll} C e^{-\frac{(w-u_0)^2}{2\sigma _0^2}}, &{} w \in [-1,1], \\ 0 &{}\text {otherwise}, \end{array}\right. } \end{aligned}$$

where \(u_0 = 0\), \(\sigma _0^2 = \frac{1}{10}\) and \(C>0\) is a normalization constant.
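
For concreteness, the normalization constant and a particle sample of this initial state can be obtained as in the following Python sketch. It assumes the usual Gaussian form with a negative exponent, \(f_0(w) = C \exp (-(w-u_0)^2/(2\sigma _0^2))\) on \([-1,1]\); the function names and the rejection sampler are illustrative choices and not necessarily those used for the experiments reported here.

```python
import math
import numpy as np

def truncated_gaussian_constant(u0=0.0, sigma0_sq=0.1):
    """Constant C such that C * exp(-(w - u0)^2 / (2 sigma0^2)) has unit mass on [-1, 1]."""
    s = math.sqrt(sigma0_sq)
    mass = s * math.sqrt(math.pi / 2.0) * (
        math.erf((1.0 - u0) / (s * math.sqrt(2.0)))
        - math.erf((-1.0 - u0) / (s * math.sqrt(2.0)))
    )
    return 1.0 / mass

def sample_initial_opinions(n, u0=0.0, sigma0_sq=0.1, seed=0):
    """Draw n opinions from the truncated Gaussian by rejection sampling."""
    rng = np.random.default_rng(seed)
    s = math.sqrt(sigma0_sq)
    accepted = np.empty(0)
    while accepted.size < n:
        draw = rng.normal(u0, s, size=2 * n)
        accepted = np.concatenate([accepted, draw[np.abs(draw) <= 1.0]])
    return accepted[:n]

C = truncated_gaussian_constant()
opinions = sample_initial_opinions(1000)
print(C, opinions.min(), opinions.max())
```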

4.1 Consistency of the Mean-Field Model: The Network-Homogeneous Case

We take model (28) with \({\mathcal {B}}(x,y) \equiv 1\), which corresponds to a fully connected network, in which every node is adjacent to every other. Using the quasi-invariant scaling, we again approximate the dynamics of (25) by the one-dimensional Fokker–Planck equation

$$\begin{aligned} \partial _\tau g(w,\tau )= & {} \gamma \partial _w\bigl [\bigl (1 - \theta (1 + \gamma /\kappa )\bigr )(w-m)g(w,\tau )\bigr ]\\{} & {} + (1-\theta )\frac{\sigma ^2}{2}\partial _w^2\bigl [(1 - w^2)g(w,\tau )\bigr ]. \end{aligned}$$

Note that we choose \(\kappa = \bar{\kappa }\), as in Eq. (34), which corresponds to the optimal scaling to ensure declustering. Figure 1 shows that as \(\epsilon \) approaches zero, the controlled update gets fully effective and the state relaxes toward a uniform distribution.
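
The relaxation toward the uniform state can also be observed by integrating the one-dimensional Fokker–Planck equation directly. The following Python sketch uses a naive explicit finite-volume discretization with zero boundary fluxes; it is not the Monte Carlo method used for the experiments reported here. It reads the drift coefficient as \(\gamma \bigl (1 - \theta (1 + \gamma /\kappa )\bigr )(w-m)\), and all parameter values are hypothetical (they are not those of Table 1), chosen so that \(\gamma > \sigma ^2\).

```python
import numpy as np

def solve_controlled_fp(g0, w, gamma, sigma2, theta, kappa, m, T, dt):
    """Explicit finite-volume sketch for
    d_tau g = d_w[ gamma*(1 - theta*(1 + gamma/kappa))*(w - m)*g
                   + (1 - theta)*sigma2/2 * d_w((1 - w^2) g) ],
    with zero flux at w = -1 and w = +1."""
    dw = w[1] - w[0]
    w_face = 0.5 * (w[:-1] + w[1:])                    # interior interfaces
    drift = gamma * (1.0 - theta * (1.0 + gamma / kappa))
    diff = 0.5 * (1.0 - theta) * sigma2
    g = g0.copy()
    for _ in range(int(round(T / dt))):
        flux = (drift * (w_face - m) * 0.5 * (g[:-1] + g[1:])
                + diff * np.diff((1.0 - w**2) * g) / dw)
        flux = np.concatenate(([0.0], flux, [0.0]))    # no-flux boundaries
        g = g + dt * np.diff(flux) / dw
    return g

# Hypothetical parameters with gamma > sigma2, and kappa as in Eq. (34).
gamma, sigma2, theta = 0.5, 0.25, 0.5
kappa_bar = gamma**2 * theta / ((1.0 - theta) * (gamma - sigma2))

n_cells = 50
dw = 2.0 / n_cells
w = -1.0 + dw * (np.arange(n_cells) + 0.5)             # cell centres
g0 = np.exp(-w**2 / (2.0 * 0.1))
g0 /= g0.sum() * dw                                    # unit mass, zero mean

g_final = solve_controlled_fp(g0, w, gamma, sigma2, theta, kappa_bar,
                              m=0.0, T=32.0, dt=0.01)
print(np.max(np.abs(g_final - 0.5)))                   # distance from uniform
```

The time step is chosen to respect the usual stability restriction of explicit diffusion schemes for these parameter values; other choices of parameters or grid may require a smaller step or a more robust scheme.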

Fig. 1 Time evolution of the opinion distribution \(g(w,\tau )\) for the network-homogeneous case for different values of the parameter \(\epsilon \), namely \(\epsilon = 10^{-1}\), \(\epsilon = 10^{-2}\) and \(\epsilon = 5\times 10^{-4}\) from left to right

This is also confirmed by Fig. 2, where we report the profile of the distribution \(g(w,\tau )\) at time \(\tau = T\), with \(T=8\). We also report the evolution from \(\tau = 0\) to \(\tau = T\) of the entropy, computed as

$$\begin{aligned} H[g](\tau ) = -\int _Ig(w,\tau )\log g(w,\tau )\, dw. \end{aligned}$$

We can see that when \(\epsilon \ll 1\) the entropy approaches the value \(\log (2)\), which corresponds to the entropy of the uniform distribution over the interval \(I\).
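
The reference value \(\log (2)\) can be reproduced with a few lines of Python. The sketch below evaluates the entropy of a density sampled on a grid by the trapezoidal rule; the function name and the peaked comparison density are illustrative assumptions.

```python
import numpy as np

def entropy(g, w):
    """Approximate H[g] = -int g log g dw by the trapezoidal rule (0 log 0 := 0)."""
    safe = np.where(g > 0.0, g, 1.0)
    integrand = np.where(g > 0.0, g * np.log(safe), 0.0)
    return -np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(w))

w = np.linspace(-1.0, 1.0, 2001)

uniform = np.full_like(w, 0.5)                       # uniform density on [-1, 1]

peaked = np.exp(-w**2 / (2.0 * 0.1))                 # peaked density near w = 0
peaked /= np.sum(0.5 * (peaked[1:] + peaked[:-1]) * np.diff(w))

print(entropy(uniform, w), np.log(2.0), entropy(peaked, w))
```

The uniform density attains \(\log (2)\), while a density concentrated around a single opinion has a strictly smaller entropy, which is why the entropy is a convenient indicator of declustering.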

Fig. 2 Left: comparison of the distribution \(g(w,T)\) for various values of \(\epsilon \), as in Fig. 1. Right: comparison of the time evolution of the entropy \(H[g](\tau )\) for the same values of \(\epsilon \)

4.2 Consistency of the Mean-Field Model: The Power-Law Network Case

Next, we take model (28) with \({\mathcal {B}}(x,y) = 9/16(xy)^{-1/4}\), i.e., the power-law graphon. We recall that this special choice yields Eq. (9). Since the power-law graphon is separable, the optimal value for the penalty term is \(\bar{\kappa }\) of Eq. (34), as for the network-homogeneous case. Figure 3 shows the evolution of \(g(x,w,\tau )\) for different values of \(\epsilon \). Since the distribution depends on both the opinion and the network position, we illustrate \(g(x,w,\tau )\) at three instances in time in the first row, that is, \(\tau = 0\), \(\tau = T/2\) and \(\tau = T\), where this time we fix \(T = 32\). The second row of plots in Fig. 3 shows the opinion marginal \(\int _\Omega g(x,w,\tau )\, dx \) for different values of \(\epsilon \).

Fig. 3 Top row: evolution in time of the distribution function \(g(x,w,\tau )\) for different time snapshots, respectively, \(\tau = 0\), \(\tau = 16\) and \(\tau = 32\). Bottom row: evolution in time of the opinion marginal \(g(w,\tau )\). Columns, from left to right: simulation results for \(\epsilon = 10^{-1}\), \(\epsilon = 10^{-2}\) and \(\epsilon = 5\times 10^{-4}\)

Figure 4 again shows for ease of viewing the power-law graphon kernel that we use in our simulations and the time evolution of the entropy, computed as

Again, we see that in the limit \(\epsilon \rightarrow 0^+\) the state reaches a uniform distribution over \(\Omega \times I= [0, 1]\times [-1, 1]\), since the power-law graphon is defined on the entire unit square \([0, 1]^2\).

Fig. 4 Left: surface plot of \({\mathcal {B}}(x,y) = 9/16(xy)^{-1/4}\). Right: time evolution of the entropy \(H[g](\tau )\) for the same values of \(\epsilon \) as in Fig. 3

4.3 Declustering on Non-separable Networks

The last computational experiments illustrate the dynamics in the case of non-separable graphons: the \(k\)-NN graphon and the small-world graphon. The first one models \(k\)-NN networks as described, e.g., in de Dios et al. (2022), Watts and Strogatz (1998), Van Der Hofstad (2016), and it is defined in de Dios et al. (2022) as

where \(\chi (\,\cdot \,)\) is the indicator function and \(r, p \in (0, 1)\) are constant real numbers, while the small-world graphon (see, e.g., de Dios et al. 2022; Watts and Strogatz 1998; Van Der Hofstad 2016) is defined as

Fig. 5 Top row: simulation results for the small-world graphon kernel. Bottom row: simulation results for the \(k\)-NN graphon kernel. Column-wise, from left to right: surface plot of the graphon kernel; time snapshots of the distribution \(g (x,w,\tau )\) for \(\tau =0\), \(\tau = 4\) and \(\tau = 8\); time evolution of the opinion marginal distribution \(g(w,\tau )\)

Figure 5 shows the surface plots for both graphon kernels for fixed values of \(r = 1/8\) and \(p = 3/4\). We consider model (28) with scaling parameter \(\epsilon =10^{-3}\) and let it evolve in time until \(T = 8\), where for this test we use the network- and time-dependent optimal penalty coefficient \(\bar{\kappa }_{x,\tau }\) as written in Eq. (33). As we can see, \(g (x,w,\tau )\) approaches a uniform distribution over both network topologies.

5 Conclusion

In this paper, we proposed a simple yet very efficient optimal control strategy to break consensus in a kinetic model for opinion formation on graphons. The proposed approach allows us to include complex microscopic features, such as social networks, in the continuum limit and to understand the impact of simple declustering mechanisms.

In doing so, we investigated the uncontrolled and controlled models in the mean-field limit. We then studied the large time behavior and were able to write down closed form solutions of the (quasi)-stationary agent distribution of the uncontrolled and controlled problem for certain choices of parameters. This formulation allows us to identify the necessary controls to prevent consensus and steer the crowd toward a uniform distribution. We corroborated our analytical results with computational experiments for various types of graphons. The numerical results confirm our theoretical findings and the success of the proposed declustering strategy. Extensions of the designed approach to include dynamic networks for fully nonlinear equations are currently under study and will be presented in future research.