Mean-Field Selective Optimal Control via Transient Leadership

A mean-field selective optimal control problem of multipopulation dynamics via transient leadership is considered. The agents in the system are described by their spatial position and their probability of belonging to a certain population. The dynamics in the control problem is characterized by the presence of an activation function which tunes the control on each agent according to its membership to a population, which, in turn, evolves according to a Markov-type jump process. In this way, a hypothetical policy maker can select a restricted pool of agents to act upon based, for instance, on their time-dependent influence on the rest of the population. A finite-particle control problem is studied and its mean-field limit is identified via Γ-convergence, ensuring convergence of optimal controls. The dynamics of the mean-field optimal control is governed by a continuity-type equation without diffusion. Specific applications in the context of opinion dynamics are discussed with some numerical experiments.


Introduction
Multipopulation agent systems have drawn much attention in the last decades as a tool to describe the evolution of groups of individuals with some features that can change with time. These models find their application in contexts as varied as evolutionary population dynamics [11,36,48], economics [53], chemical reaction networks [39,42,44], and kinetic models of opinion formation [31,50]. In these models, each agent carries a label that may describe, for instance, membership to a population (e.g., leaders or followers), or the strategy used in a game. While this label space is often discrete, for many applications (and also as a necessary condition for the existence of Nash equilibria [43]) it is useful to attach to each agent located at a point x ∈ R^d a continuous variable which describes their mixed strategies or, referring back to the context of leaders and followers, their degree of influence. If U denotes the space of labels, this may be encoded by a probability measure λ ∈ P(U). It is natural to postulate that λ can vary with time according to a spatially inhomogeneous Markov-type jump process with a transition rate T(x, λ, (x, λ)) that may depend on the position x of the agent and on the global state of the system (x, λ), containing the positions and the labels of all the agents. Leadership may indeed be temporary and affected, for instance, by circumstances, need, location, and mutual distance among the agents.
Mean-field descriptions of such systems allow for an efficient treatment by replacing the many-agent paradigm with a kinetic one [21,23], consisting of a limit PDE whose unknown is the distribution of agents with their label, as those obtained in [6,7,41,49] (see also [40] for a related Boltzmann-type approach).
A further step which we devise in this paper is the extension of the mean-field point of view to the problem of controlling such systems, possibly in a selective way. The underlying idea is the presence of a policy maker whose control action, at any instant of time, concentrates on a subset of the population chosen according to the level of influence of the agents.
More precisely, in a population of N agents, the time-dependent state of the i-th agent is given by t ↦ y_i(t) = (x_i(t), λ_i(t)), where x_i ∈ R^d and λ_i ∈ P(U), for every i = 1, …, N, and evolves according to the controlled ODE system (1), where v is a velocity field, u_i is the control on the i-th agent, belonging to a compact convex subset K of R^d, and h is a non-negative activation function selecting the set of agents targeted by the decision of the policy maker, depending on their state and, possibly, on the global state of the system. The values u_i are determined by minimization of the cost functional (2), where Ψ^N_t is the empirical measure defined as Ψ^N_t := (1/N) ∑_{i=1}^N δ_{y_i(t)}, and φ is a positive convex cost function, superlinear at infinity and such that φ(0) = 0; finally, the Lagrangian L_N(·) is continuous and symmetric (see Definition 1 and Remark 2 below).
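The displayed equations (1) and (2) did not survive extraction. For orientation, the following is a plausible reconstruction based solely on the surrounding description (control entering linearly through the activation function h, labels driven by the transition operator T); the exact form in the original may differ:

```latex
% Sketch of the controlled particle system described above (eq. (1)):
% the control u_i acts only where the activation function h is active.
\dot{x}_i(t) = v_{\Psi^N_t}\big(x_i(t),\lambda_i(t)\big)
             + h_{\Psi^N_t}\big(x_i(t),\lambda_i(t)\big)\, u_i(t),
\qquad
\dot{\lambda}_i(t) = T_{\Psi^N_t}\big(x_i(t),\lambda_i(t)\big),
\qquad i = 1,\dots,N.

% Sketch of the cost (2): a Lagrangian term plus a control cost
% weighted by the activation function.
\mathcal{E}_N(\boldsymbol{y},\boldsymbol{u})
 = \int_0^T \frac{1}{N}\sum_{i=1}^N
   \Big[\, L_N\big(y_i(t), \boldsymbol{y}(t)\big)
        + \phi\big(u_i(t)\big)\, h_{\Psi^N_t}\big(y_i(t)\big) \Big]\, dt.
```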
In this paper we show that the variational limit, in the sense of Γ-convergence [13,26] in a suitable topology, of the functional introduced in (2) is given by (3), where L is a certain limit Lagrangian cost and where Ψ_t ∈ P(R^d × P(U)) and w are coupled by the mean-field continuity equation (4), with the request that w be integrable with respect to the measure hΨ_t ⊗ dt. From the point of view of the applications, we remark that our main result, Theorem 2, implies that a minimizing pair (Ψ, w) for the optimal control problem (3) can be obtained as the limit of minimizers (y^N, u^N) of the finite-particle optimal control problem (2) (a precise statement is given in Corollary 1). In this sense, our result extends to the multipopulation setting the results of [32,34], with the relevant feature that the activation function h allows the policy maker to tune the control action on a subset of the entire population which is not prescribed a priori, but rather depends on the evolution of the system. At any fixed time t > 0, the policy maker can target its intervention on the most influential elements of the population according to a threshold encoded by h. This is similar, in spirit, to a principle of sparse control, as considered, e.g., in [1,22,33]. Our model, however, includes additional features; in particular, a control action only through leaders is already present in [33], where the leaders population is fixed a priori and discrete. A localized control action on a small time-varying subset ω(t) of the state space of the system is presented in [22] as an infinite-dimensional generalization of [37]; there, no optimal control is considered and the evolution of ω(t) is algorithmically constructed to reach a desired target, instead of being determined by the evolution itself. The numerical approach of [1] makes use of a selective state-dependent control specifically designed for the Cucker-Smale model.
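Again for orientation, a hedged sketch of the limit cost (3) and of the continuity equation (4), reconstructed from the requirement that w be integrable with respect to hΨ_t ⊗ dt and from the structure of the particle system; lower-order details may differ from the original displays:

```latex
% Sketch of the mean-field cost (3) ...
\mathcal{E}(\Psi, w)
 = \int_0^T \!\!\int_{\mathbb{R}^d\times \mathcal{P}(U)}
   \Big[\, L\big(y, \Psi_t\big) + \phi\big(w(t,y)\big)\, h_{\Psi_t}(y) \Big]
   \, d\Psi_t(y)\, dt,

% ... and of the continuity-type constraint (4), without diffusion:
\partial_t \Psi_t
 + \mathrm{div}\Big( \big( v_{\Psi_t} + h_{\Psi_t}\, w(t,\cdot),\;
     T_{\Psi_t} \big)\, \Psi_t \Big) = 0
 \quad \text{in } (0,T)\times\big(\mathbb{R}^d\times\mathcal{P}(U)\big).
```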
For other recent examples of localized/sparse intervention in mean-field control systems, we refer the reader to [3,4,18,24,38,46,51].
The role of the variable λ deserves some attention. It can be generally understood as a measure of the influence of an agent, accounting for a number of different interpretations according to the context. Similar background variables have been used in recent literature to describe wealth distribution [28,30,45], degree of knowledge [16,17], degree of connectivity of an agent in a network [5,15], and also applications to opinion formation [29], just to name a few. Compared with these other approaches, our mean-field approximation (3), (4) features a more profound interplay between the variable λ and the spatial distribution of the agents, resulting in a higher flexibility of the model: not only is λ changing in time, but its variation is driven by an optimality principle steered by the controls.
We present two applications in Sect. 5 in the context of opinion dynamics, where λ represents the transient degree of leadership of the agents. Specifically, in the former example we highlight the emergence of leaders and how this can be exploited by a policy maker; in the latter, two competing populations of leaders with different targets and campaigning styles are considered, and the effect of the control action in favoring one of them is analyzed.
The plan of the paper is the following: in Sect. 2 we introduce the functional setting of the problem and we list the standing assumptions on the velocity field v, on the transition operator T , and on the cost functions L N , L, and φ; in Sect. 3 we present and discuss the existence of solutions to the finite-particle control problem; in Sect. 4 we introduce the mean-field control problem and prove the main theorem on the Γ -convergence to the continuous problem. In Sect. 5 we discuss the applications mentioned above.

Technical Aspects
We highlight the main technical aspects of the proof of Theorem 2. The Γ-liminf inequality builds upon a compactness property of sequences of empirical measures Ψ^N_t with uniformly bounded cost E_N. The hypotheses on the velocity field v and on the transition operator T in (1) (see Sect. 2) imply, by a Grönwall-type argument, a uniform-in-time estimate on the support of Ψ^N_t, ensuring the convergence to a limit Ψ_t. The lower bound and the identification of the control field w are consequences of the convergence of L_N to L and of the convexity and superlinear growth of the cost function φ. As for the Γ-limsup inequality, we remark that the sole integrability of w (contrary to the situation considered in [32]) does not guarantee the existence of a flow map for the associated Cauchy problem, and therefore does not allow for a direct construction of a recovery sequence based on the analysis of (5), due to the lack of continuity with respect to the data. Following the main ideas of [34], we base our approximation strategy on the superposition principle [11, Theorem 5.2], [41, Theorem 3.11] (see also [9,10,20,47]), which indeed selects a sequence of trajectories z^N such that the corresponding empirical measures Λ^N_t converge to Ψ_t. The explicit dependence of (5) on the global state of the system calls for a further modification of the trajectories z^N. Here, the fact that h may take the value 0 introduces an additional technical difficulty, as we cannot exploit the linear dependence on the controls in (1). To overcome this problem, we resort once again to the local Lipschitz continuity of v and of T, and construct the trajectories y^N by solving the Cauchy problem (6). By Grönwall estimates, we can conclude that the distance between Λ^N_t and the empirical measure Ψ^N_t generated by y^N is infinitesimal, so that we obtain the desired convergences of Ψ^N_t to Ψ_t and of E_N(y^N, u^N) to E(Ψ, w).
Let us also mention that the symmetry of the cost is used in a crucial way to deal with the initial conditions in (6).

Mathematical Setting
In this section we introduce the mathematical framework and notation to study our system. Basic notation. Given a metric space (X, d_X), we denote by M(X) the space of signed Borel measures μ on X with finite total variation ‖μ‖_TV, and by M_+(X) and P(X) the convex subsets of nonnegative measures and probability measures, respectively. We say that μ ∈ P_c(X) if μ ∈ P(X) and the support spt μ is a compact subset of X. For any K ⊆ X, the symbol P(K) denotes the set of measures μ ∈ P(X) such that spt μ ⊆ K. Moreover, M(X; R^d) denotes the space of R^d-valued Borel measures with finite total variation.
As usual, if (Z, d_Z) is another metric space, for every μ ∈ M_+(X) and every μ-measurable map f : X → Z we denote by f_#μ ∈ M_+(Z) the push-forward of μ through f. For a Lipschitz function f : X → R we define its Lipschitz constant by Lip(f) := sup{|f(x) − f(y)|/d_X(x, y) : x, y ∈ X, x ≠ y}, and we denote by Lip(X) and Lip_b(X) the spaces of Lipschitz and bounded Lipschitz functions on X, respectively. Both are normed spaces with the norm ‖f‖_Lip := ‖f‖_∞ + Lip(f), where ‖·‖_∞ is the supremum norm. Furthermore, we use the notation Lip_1(X) for the set of functions f ∈ Lip_b(X) such that Lip(f) ≤ 1.
In a complete and separable metric space (X, d_X), we shall use the Wasserstein distance W_1 on the set P(X), defined as W_1(μ, ν) := inf{∫_{X×X} d_X(x, y) dπ(x, y) : π ∈ P(X × X) with marginals μ and ν}. Notice that W_1(μ, ν) is finite if μ and ν belong to the space P_1(X) of probability measures with finite first moment. If (E, ‖·‖_E) is a Banach space and μ ∈ M_+(E), we define the first moment m_1(μ) as m_1(μ) := ∫_E ‖y‖_E dμ(y). Notice that for a probability measure μ finiteness of the integral above is equivalent to μ ∈ P_1(E), whenever E is endowed with the distance induced by the norm ‖·‖_E. Furthermore, the notation C^1_b(E) will be used to denote the subspace of C_b(E) of functions having bounded continuous Fréchet differential at each point. The symbol ∇ will be used to denote the Fréchet differential. In the case of a function φ : [0, T] × E → R, the symbol ∂_t will be used to denote partial differentiation with respect to t. Functional setting. We consider a set of pure strategies U, where U is a compact metric space, and we denote by Y := R^d × P(U) the state space of the system.
According to the functional setting considered in [11,41], we consider the space Y := R^d × F(U), where F(U) is defined in (7) (see, e.g., [8,12] and [52, Chap. 3]). The closure in (7) is taken with respect to the bounded Lipschitz norm ‖·‖_BL. We notice that, by definition of ‖·‖_BL, we always have ‖λ‖_BL ≤ ‖λ‖_TV; in particular, ‖λ‖_BL ≤ 1 for every λ ∈ P(U). Finally, we endow Y with the norm ‖y‖_Y = ‖(x, λ)‖_Y := |x| + ‖λ‖_BL. For every R > 0, we denote by B^Y_R the closed ball of radius R in Y, namely B^Y_R = {y ∈ Y : ‖y‖_Y ≤ R}, and notice that, in our setting, B^Y_R is a compact set. As in [41], we consider, for every Ψ ∈ P_1(Y), a velocity field v_Ψ : Y → R^d satisfying the boundedness and Lipschitz continuity assumptions (v_1)–(v_3). As for T, for every Ψ ∈ P_1(Y) we assume that the operator T_Ψ : Y → F(U) is such that: (T_0) for every (y, Ψ) ∈ Y × P_1(Y), the constants belong to the kernel of T_Ψ(y), i.e., ⟨T_Ψ(y), 1⟩ = 0, where ⟨·, ·⟩ denotes the duality product; (T_1) there exists M_T > 0 such that ‖T_Ψ(y)‖_BL ≤ M_T for every y ∈ Y and every Ψ ∈ P_1(Y); (T_2) for every R > 0, there exists L_{T,R} > 0 such that for every (y_1, Ψ_1), (y_2, Ψ_2) ∈ B^Y_R × P_1(B^Y_R), ‖T_{Ψ_1}(y_1) − T_{Ψ_2}(y_2)‖_BL ≤ L_{T,R}(‖y_1 − y_2‖_Y + W_1(Ψ_1, Ψ_2)); (T_3) for every R > 0 there exists δ_R > 0 such that for every Ψ ∈ P_1(Y) and every … . For every y ∈ Y and every Ψ ∈ P_1(Y) we set b_Ψ(y) := (v_Ψ(y), T_Ψ(y)), which is the velocity field driving the evolution; we also consider an activation function h_Ψ : Y → [0, +∞) satisfying: (h_1) h_Ψ is bounded uniformly with respect to Ψ ∈ P_1(Y); (h_2) for every R > 0 there exists L_{h,R} > 0 such that for every Ψ_1, Ψ_2 ∈ P_1(B^Y_R) and every y_1, y_2 ∈ B^Y_R, |h_{Ψ_1}(y_1) − h_{Ψ_2}(y_2)| ≤ L_{h,R}(‖y_1 − y_2‖_Y + W_1(Ψ_1, Ψ_2)).
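The display defining ‖·‖_BL and F(U) is missing; the standard definitions used in the setting of [41], which we expect (7) to contain, read:

```latex
% Bounded Lipschitz (BL) norm and the space F(U) (cf. [8,12,41]);
% assumed to coincide with the missing display (7).
\|\mu\|_{BL} := \sup\Big\{ \langle f, \mu\rangle \,:\,
    f \in \mathrm{Lip}(U),\ \|f\|_{\mathrm{Lip}} \le 1 \Big\},
\qquad
F(U) := \overline{\mathrm{span}\,\mathcal{P}(U)}^{\,\|\cdot\|_{BL}}.
```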
In order to define the optimal control problems of Sects. 3 and 4, we have to introduce some further notation. For every N ∈ N, we define the set P_N(Y) of empirical measures of N points. In particular, we notice that, up to a permutation, every N-tuple y^N := (y_1, …, y_N) ∈ Y^N can be identified with an element Ψ ∈ P_N(Y). We now give the following two definitions (see also [34, Definition 2.1]).
Remark 2 Notice that, by symmetry and by identifying y^N with the corresponding empirical measure, … . For the cost functionals for the finite-particle control problem and for their mean-field limit we consider the functions φ : R^d → [0, +∞), L_N : Y × Y^N → [0, +∞), and L : Y × P_1(Y) → [0, +∞) such that: (φ_1) φ is convex and superlinear with φ(0) = 0; (L_1) L_N is continuous and symmetric; (L_2) L_N P_1-converges to L uniformly on compact sets. The following conditions enter the well-posedness result of Theorem 1 for the Cauchy problem associated with an operator A : [0, T] × C → E, with C ⊆ E and E a Banach space: (i) for every R > 0 there exists a constant L_R ≥ 0 such that for every c_1, c_2 ∈ C ∩ B_R and t ∈ [0, T], …; (ii) there exists M > 0 such that for every c ∈ C there holds …; (iii) for every c ∈ C the map t ↦ A(t, c) belongs to L^1([0, T]; E); (iv) for every R > 0 there exists θ > 0 such that … . Then for every c̄ ∈ C there exists a unique absolutely continuous curve c : [0, T] → C solving the corresponding Cauchy problem.

The Finite Particle Control Problem
We now introduce the finite particle control problem. We fix a compact and convex subset K of R^d of admissible controls, with 0 ∈ K. For every N ∈ N and every control function u_i ∈ L^1([0, T]; K), i = 1, …, N, the dynamics of the N-particle system is driven by the Cauchy problem (8). Applying Theorem 1, we deduce that the Cauchy problem (8) admits a unique solution y^N := (y_1, …, y_N) ∈ AC([0, T]; Y^N), which is also identified, up to a permutation, with the empirical measure Ψ^N_t. To ease the notation in our analysis, we give the following definition. In a similar way, if y^N_0 = (y_{0,1}, …, y_{0,N}) ∈ Y^N, we say that …

Definition 3 We say that
Given y^N_0 = (y_{0,1}, …, y_{0,N}) ∈ Y^N, we define the set S(y^N_0) of trajectory-control pairs solving the Cauchy problem (8). Given functions φ, L_N, and L satisfying conditions (φ_1), (L_1), and (L_2), for every initial condition y^N_0 ∈ Y^N and every (y, u) ∈ S(y^N_0) we define the cost functional E^{y^N_0}_N(y, u) as in (10), where (Ψ^N, ν^N) is the pair generated by (y, u). Therefore, the optimal control problem for the N-particle system reads as in (11). We now prove the existence of solutions of the minimum problem (11). First, we state the boundedness of the trajectories y for given control and initial datum, which will also be useful in the Γ-convergence analysis of Sect. 4.

Proposition 1 For every N
for a positive constant C independent of N .
for some positive constant C depending only on h and K . Taking the supremum over i ∈ {1, . . . , N } in the previous inequality and applying Grönwall inequality we deduce (12).

Proposition 2 For every N ∈ N and every initial datum
is a solution of (11). If the cost function φ satisfies …, then every solution (y^N, u^N) of (11) satisfies u^N = ũ^N, with ũ^N as in (13). Proof Let (y^N_k, u^N_k), k ∈ N, be a minimizing sequence for (11). Then, by (v_1), (T_1), and (h_1), for every s < t ∈ [0, T], every i = 1, …, N, and every k we have that … . Thus, y^N_k is bounded and equi-Lipschitz continuous in [0, T]. By the Ascoli-Arzelà Theorem, y^N_k converges uniformly to some y^N. Finally, the continuity of L_N and the convexity of φ yield the lower semicontinuity of the cost functional E, and (y^N, u^N) is a solution of (11).
The second part of the statement follows from the structure of system (8). Indeed, if we define ũ^N as in (13), the trajectory y^N solving (8) does not change and …; the previous inequality and the minimality of (y^N, u^N) then imply that u_i(t) = ũ_i(t) for t ∈ [0, T] and i = 1, …, N, and the proof is concluded.

Mean-Field Control Problem
Before introducing the mean-field optimal control problem and stating the main Γ-convergence result, we discuss the compactness of sequences of trajectory-control pairs (y^N, u^N). To ease the notation, given a curve …, … . Then, up to a subsequence, the curve … . The proof of Proposition 3 is provided in Sect. 4.1.
In view of the compactness result in Proposition 3, for Ψ ∈ C([0, T]; (P_1(Y); W_1)) and ν ∈ M([0, T] × Y; R^d) we define the cost functional for the mean-field control problem, where we have set … . With the above notation at hand, the mean-field optimal control problem reads as in (19). In order to discuss the existence of solutions to (19), we introduce the auxiliary functionals defined in (20). In the next two propositions we show the existence of solutions to (19). We start by proving that, for each (Ψ, ν) ∈ S(Ψ̄_0), the support of Ψ_t is bounded in Y uniformly with respect to t.
Proof Let (Ψ, ν) be as in the statement of the proposition. In particular, we may write … . Since φ(0) = 0 and φ ≥ 0, without loss of generality we may suppose w(t, ·) = 0 on the set where h vanishes. Let us first give a bound on the first moment m_1(Ψ_t). To do this, we fix a function ζ ∈ C_c(F(U)) such that 0 ≤ ζ ≤ 1 and ζ(λ) = 1 for λ ∈ P(U), which is possible since P(U) is a compact subset of F(U). For every n ∈ N and every ε > 0, let us fix g_ε(x) := (|x|^2 + ε^2)^{1/2} and θ_n(…) … . Since …, for a positive constant C depending only on h and on K. Passing to the limit, first as ε → 0 and then as n → ∞, and using (v_3) and (T_1), we deduce from (22) that …, for some positive constant C only depending on h, K, v, and T. We now prove the uniform bound on the support of Ψ_t. To do this, we will apply the superposition principle [11, Theorem 5.2]. The curve Ψ ∈ AC([0, T]; (P_1(Y); W_1)) solves the continuity equation (25), where the velocity field b is … and is extended to 0 elsewhere. By (24), (v_3), (T_1), and (h_1), and by the fact that w(t, y) ∈ K, we can estimate … . We are therefore in a position to apply [11, Theorem 5.2] with velocity field b. Hence, there exists π ∈ P(C([0, T]; Y)) such that … . Moreover, π is concentrated on solutions of the Cauchy problem (29). For …, where C is as in (22). Again by the Grönwall inequality, since y_0 ∈ spt(Ψ̄_0) and (24) holds, we deduce from (30) that there exists R > 0 independent of t such that every solution t ↦ y(t) of the Cauchy problem (29) stays in B^Y_R for every t ∈ [0, T] and every y ∈ spt Ψ_t. Since Ψ solves (25), we deduce that t ↦ Ψ_t is Lipschitz continuous, with Lipschitz constant L only depending on R. In particular, all the above computations are independent of the choice of (Ψ, ν) ∈ S(Ψ̄_0). This concludes the proof of the proposition.

Proposition 5 For every
Proof The proof of existence follows from the Direct Method. Let (Ψ_k, ν_k) ∈ S(Ψ̄_0) be a minimizing sequence for (19). For every k, we may write … . Without loss of generality we may suppose w_k(t, ·) = 0 on the set where h vanishes. By Proposition 4, the measures Ψ_{k,t} have uniformly bounded supports in Y and the curves Ψ_k are equi-Lipschitz continuous. By the Ascoli-Arzelà Theorem, there exists Ψ ∈ AC([0, T]; (P_1(Y); W_1)) such that, up to a subsequence, Ψ_k converges to Ψ uniformly in C([0, T]; (P_1(Y); W_1)). Since …, in particular, we may assume that … . Thus, thanks to (h_1), to the uniform convergence of Ψ_k to Ψ, and to the fact that spt(Ψ_{k,t}) ⊆ B^Y_R, we also have that ν = h_Ψ μ. By definition of Φ_min and of Φ (see (17), (18), and (20)), we have that for every k … . Applying [19, Corollary 3.4.2] we infer that μ ≪ Ψ and … . Thus, (Ψ, ν) ∈ S(Ψ̄_0) and, by (32), … . Finally, by (L_3), by the uniform convergence of Ψ_k to Ψ, and by the uniform inclusion of the supports, … . Combining (33) and (34) we infer that …, which concludes the proof of the proposition.
We are now in a position to state our main Γ -convergence result.
. Then the following facts hold: Then We provide the proof of Theorem 2 in Sect. 4.1.
As a corollary of Theorem 2, we obtain the convergence of minima and minimizers.
Proof The result is standard in Γ -convergence theory (see, e.g., [13,26]) and follows from the compactness result in Proposition 3 and from Theorem 2.

Proofs of Proposition 3 and Theorem 2
Before proving Proposition 3 and Theorem 2, we state two lemmas regarding the control part of the cost functional E^{y^N_0}_N and the functionals Φ and φ defined in (20). Let (Ψ^N, ν^N) be generated by (y^N, u^N); finally, let … . Then, for a.e. t ∈ [0, T] we have … . If … for every N and t ∈ [0, T], then for a.e. t ∈ [0, T] we have that … . Proof The proof of (38) can be found in [34, Lemma 6.2, formula (6.2)]. Arguing as in the proof of [34, Lemma 6.2, formula (6.3)] we may also prove (39). Referring to the notation in [34, Lemma 6.2], the only modification we have to make is that, whenever …, a.e. in S, and the proof can be concluded as in [34, Lemma 6.2].
Proof We define the auxiliary measures μ^N and μ^N_t as in (37) and we notice that … . In particular, we deduce that there exists μ ∈ M([0, T] × Y; R^d) such that, up to a not relabelled subsequence, μ^N converges to μ weakly*. Since Ψ^N → Ψ in C([0, T]; (P_1(Y); W_1)), and μ^N and ν^N have uniformly compact support, in the limit it holds ν = h_Ψ μ.
We now prove Proposition 3. Proof Let y^N_0, (y^N, u^N), (Ψ^N, ν^N), and Ψ^N_0 be as in the statement of the proposition. Since W_1(Ψ^N_0, Ψ̄_0) → 0 as N → ∞, by Proposition 1 we obtain that for every t ∈ [0, T] the probability measure Ψ^N_t has support contained in the compact set B^Y_R for a suitable R > 0 independent of t and N. This implies that the curve Ψ^N takes values in a compact subset of P_1(Y) with respect to the 1-Wasserstein distance. Let us now show that the sequence Ψ^N is equi-continuous. Thanks to the assumptions (v_1), (T_1), and (h_1), and to the fact that u^N(t) ∈ K^N and spt(Ψ^N_t) ⊆ B^Y_R, …
Since u N takes values in K N with K compact and h Ψ N is bounded by (h 1 ), we have that, up to a further subsequence, We finally show that (Ψ , ν) solves the corresponding continuity equation in the sense of distributions. By the uniform convergence of Ψ N to Ψ , we have that we only have to determine the limit of the second integral on the right-hand side of (43). To do this, we estimate By the regularity of the test function ϕ, by assumptions (v 2 ) and (T 2 ), and by the uniform inclusion spt(Ψ N t ) ⊆ B Y R , we may estimate I N 1 with for a positive constant L R depending only on R. Since Ψ N → Ψ in C([0, T ]; (P 1 (Y ); W 1 )), we deduce from the previous inequality that I N 1 → 0 as N → ∞. Again by (v 2 ) and (T 2 ), the function y → ∇ϕ(t, y)b Ψ t (t) is Lipschitz continuous on B Y R for every t ∈ [0, T ], with Lipschitz constant C R > 0 uniformly bounded in time. Since and I N 2 → 0 as N → ∞. We can now pass to the limit in (44) to obtain that which in turn implies, by passing to the limit in (43), that By the arbitrariness of ϕ ∈ C ∞ c ((0, T ) × Y ), we conclude that (Ψ , ν) ∈ S( Ψ 0 ). This completes the proof.
Eventually, we prove the Γ -convergence result.

Proof of Theorem 2
The proof follows the lines of [34,Theorem 3.2]. We divide the proof into two steps.
Step 1: Γ -liminf inequality. Let (Ψ , ν), ( y N , u N ), y N 0 , (Ψ N , ν N ), and Ψ N 0 be as in the statement. If lim inf N →∞ E y N 0 N ( y N , u N ) = +∞ there is nothing to show. Without loss of generality we may therefore assume that which implies, by definition (10) for every N . Furthermore, by Proposition 1 there exists R > 0 independent of N and t such that spt(Ψ N t ) ⊆ B Y R . By Proposition 3 we have that the limit pair (Ψ , ν) belongs to S( Ψ 0 ) and spt(Ψ t ) ⊆ B Y R for every t ∈ [0, T ]. Applying Lemma 2 we infer that Since L N P 1 -converges to L uniformly on compact sets and spt( Combining (45) and (46) we conclude that which is (35).
Step 2: Γ-limsup inequality. We will construct a sequence (y^N, u^N) such that (47) holds, and we recall that this condition is equivalent to (36). In particular, we may assume that w = 0 on the set where h_Ψ vanishes. As in [34, Theorem 3.2], the construction of a recovery sequence is based on the superposition principle [11, Theorem 5.2]. The curve Ψ ∈ AC([0, T]; (P_1(Y); W_1)) indeed solves the continuity equation …, where the velocity field b is … and is extended to 0 elsewhere. By Proposition 4, there exists R > 0 such that spt(Ψ_t) ⊆ B^Y_R for every t ∈ [0, T], and arguing as in (27) we infer that there exists a probability measure π ∈ P(Γ) concentrated on Δ such that Ψ_t = (ev_t)_# π for every t ∈ [0, T]. We further notice that Proposition 4, the boundedness of the control w, and assumptions (v_3) and (T_1) yield … . We define the auxiliary functional F. We notice that, by the Fubini Theorem, … = E(Ψ, ν).
Furthermore, F is lower semicontinuous on Δ. Indeed, if γ_k, γ ∈ Δ are such that γ_k → γ with respect to the uniform convergence in Γ, since w takes values in the compact set K we immediately deduce that w(·, γ_k(·)) is bounded in L^∞([0, T]; R^d), and therefore converges weakly*, up to a subsequence, to some g ∈ L^∞([0, T]; R^d); by convexity of φ, we have the lower semicontinuity … (see, e.g., [25, Theorem 3.23]). Since γ_k ∈ Δ for every k, for s < t ∈ [0, T] we can write … . Passing to the limit in the previous equality we deduce, thanks to (v_1), (T_2), and (h_2), … . On the other hand, being γ ∈ Δ, we have that …, which implies, by the arbitrariness of s and t, that … . Since φ ≥ 0 and φ(0) = 0, we finally obtain … . By the Lusin theorem, we can select an increasing sequence of compact sets Δ_k ⊆ Δ_{k+1} ⊆ Δ such that π(Δ \ Δ_k) < 1/k and F is continuous on Δ_k. Setting …, we have that … . In particular, the first limit follows from (50) and from the narrow convergence of π_k to π. The second limit is a consequence of the first. Since Δ_k is compact, we can select a sequence of curves {(γ_k)^m_i : i = 1, …, m, m ∈ N} ⊆ Δ_k such that for every k the measures … satisfy (53), where the second equality is due to the fact that F is continuous and bounded on Δ_k. Let us fix a countable dense set {ϕ_ℓ : ℓ ∈ N} … . We recall that, by construction, on the set Δ_k the function γ ↦ F(γ) is continuous. Since φ is superlinear, this implies that w(·, γ_j(·)) → w(·, γ(·)) in L^p([0, T]; R^d) for every p < +∞ whenever γ_j, γ ∈ Δ_k with γ_j → γ. Hence, also the map γ ↦ ∫_0^T ϕ_ℓ(t, γ(t)) w(t, γ(t)) h_{Ψ_t}(γ(t)) dt is continuous on Δ_k for every ℓ ∈ N. Combining this fact with (52) and (53), we are able to select a suitable strictly increasing sequence m(k) such that for every m ≥ m(k) it holds …, where in the last inequality we have used that π^m_k converges narrowly to π_k as m → ∞ and that π^m_k is concentrated on curves belonging to Δ_k.
Therefore, we set π^N := π^N_k for m(k) ≤ N < m(k + 1) and obtain that lim_{N→∞} …, so that … . We now construct the recovery sequence (y^N, u^N). First, we define the auxiliary curves Λ^N_t := (ev_t)_# π^N ∈ AC([0, T]; (P_1(Y); W_1)) and the corresponding curves z^N = (z_1, …, z_N) for every i = 1, …, N and every N ∈ N, and u^N := (u_1, …, u_N) ∈ L^1([0, T]; K^N). In particular, each component of z^N solves the ODE (59), with initial point z_i(0) ∈ spt(Ψ̄_0). The curves z^N have to be further modified, since in the ODE (59) the velocity field b_{Ψ_t} still contains the state of the limit system Ψ_t rather than Λ^N, and the initial data z^N_0 = (z_1(0), …, z_N(0)) do not coincide with y^N_0. Being Ψ^N_0 and Λ^N_0 two empirical measures, we can find a sequence of permutations … . Let us further denote by σ^N_{R^d} … . Consider then the Cauchy problem (61), where, as for the Cauchy problem in (8), we have set …

By [41, Corollary 2.3], system (61) admits a unique solution, and …, where, with a slight abuse of notation, we have denoted by σ^N_{R^d} the action of the permutation σ^N on … . We denote by (Ψ^N, ν^N) and (Λ^N, η^N) the pairs generated by (y^N, u^N) and by (z^N, u^N), respectively, and notice that, by invariance with respect to permutations, (Ψ^N, ν^N) coincides with the pair generated by (y^N, u^N). We want to show that (62) holds. To do this, we will prove (63) and (64), so that (62) follows by the triangle inequality. Let us consider the pair (Λ^N, η^N). Since z_i(0) ∈ spt(Ψ̄_0) for every i = 1, …, N and Ψ̄_0 ∈ P_c(Y), Proposition 1 yields the existence of R > 0 independent of N and t such that spt(Λ^N_t) ⊆ B^Y_R. Repeating the computations performed in (42), we obtain that Λ^N is equi-Lipschitz continuous with respect to t. The convergence in (57) implies that W_1(Λ^N_t, Ψ_t) → 0 for every t ∈ [0, T] as N → ∞, so that an application of the Ascoli-Arzelà Theorem yields that Λ^N → Ψ in C([0, T]; (P_1(Y); W_1)). This proves the second convergence in (63).
To prove the first convergence in (63), we estimate the distance between y^N and z^N. First we notice that, up to possibly taking a larger R, we have ‖y_i(t)‖_Y ≤ R for every i = 1, …, N, every N ∈ N, and every t ∈ [0, T], so that spt(Ψ^N_t) ⊆ B^Y_R. … for some positive constant L_R independent of N. Hence, by the Grönwall and triangle inequalities we deduce from (65) that … . Summing (66) over i = 1, …, N and recalling (60), we infer that for every t ∈ [0, T] … . Applying once again the Grönwall inequality to (67), we obtain for every t ∈ [0, T] … . Since W_1(Ψ^N_0, Λ^N_0) → 0 and the second limit in (63) holds, from (68) we conclude (63) and the convergence of Ψ^N to Ψ in C([0, T]; (P_1(Y); W_1)).
We now turn our attention to (64). The second convergence in (64) is a matter of a direct computation. Indeed, for every …, for some positive constant C independent of ε. We now estimate the right-hand side of (69). By definition of π^N and by (54) and (56), for every N ∈ [m(k), m(k + 1)) with k ≥ ℓ we have that

∫_Γ ∫_0^T ϕ_ℓ(t, γ(t)) w(t, γ(t)) h_{Ψ_t}(γ(t)) dt dπ_k(γ) ≤ … .

Passing to the limit as N → ∞ in the previous inequality we get, by the boundedness of w, ϕ, and h, that … . Therefore, passing to the limsup as N → ∞ in (69) we obtain lim sup … . By the arbitrariness of ε and ϕ we infer that η^N ⇀ ν weakly* in M([0, T] × Y; R^d).
We now turn to the first convergence in (64). For every ϕ ∈ C c ([0, T ] × Y ; R d ) we have that, using the definition of ν N , of η N , and of the controls u N , In order to continue in (70) let us fix a modulus of continuity ω ϕ for the function ϕ. Notice that, without loss of generality, we may assume ω ϕ to be increasing and concave. Thus, by (h 1 ), (h 2 ), by the fact that w(t, z i (t)) ∈ K and y i , z i ∈ B Y R for every t ∈ [0, T ] and every i = 1, . . . , N , and by the inequalities (67), (68), we can further estimate (70) with where C > 0 is a constant independent of N . Therefore, by (63) we conclude that which yields the first convergence in (64). Finally, we prove (47). As already observed, ( y N , u N ) ∈ S( y N 0 ) by construction, so that Since spt(Ψ N t ), spt(Ψ t ) ⊆ B Y R for every t ∈ [0, T ] and, by (L 1 ) and (L 2 ), L N is continuous and L N P 1 -converges to L uniformly on compact sets, we have that As for the second term on the right-hand side of (71), we recall that In view of (58), we infer that which implies, together with (72), that which is (47). This concludes the proof of the theorem.

Numerical Experiments
In this section we consider specific applications of our model in the context of opinion dynamics. In Sect. 5.1, we discuss the effects of controlling a single population of leaders. In Sect. 5.2, instead, two competing populations of leaders and a residual population of followers are considered, but the policy maker favors only one of the populations of leaders towards their goal. In both cases, for the continuity equation (4) we use a finite volume scheme with dimensional splitting for the state space discretization, following an approach similar to the one employed in [6]. Introducing a suitable discretization of the density Ψ n i = Ψ (t n , y i ) on a uniform grid with parameters Δx, Δλ in the state space, and Δt in time, the resulting scheme reads where T i±1/2 , V i±1/2 are suitable discretizations of the transition operator and the non-local velocity flux, respectively, and w n denotes the control computed at the corresponding time. Notice that the update of Ψ follows a two-step approximation, first in λ then in x, of the continuity equation (4) (see also [7] for a rigorous convergence result).
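The splitting update can be sketched as follows. This is a minimal first-order upwind scheme on a rectangular (x, λ) grid, not the actual discretization T i±1/2 , V i±1/2 used in the experiments; the velocity fields passed in are illustrative stand-ins for (discretizations of) the transition operator T and the non-local velocity v, and all names are ours.

```python
import numpy as np

def upwind_1d(rho, vel, h, dt, axis):
    """First-order upwind update of rho along one axis, with cell-centred
    velocity vel (same shape as rho) and zero flux through the domain
    boundary, so that total mass is conserved."""
    vp = np.clip(vel, 0.0, None)            # positive part of the velocity
    vm = np.clip(vel, None, 0.0)            # negative part of the velocity
    # flux through the right face of each cell
    flux = vp * rho + vm * np.roll(rho, -1, axis=axis)
    idx = [slice(None)] * rho.ndim
    idx[axis] = slice(-1, None)
    flux[tuple(idx)] = 0.0                  # no flux through the last face
    flux_left = np.roll(flux, 1, axis=axis)
    idx[axis] = slice(0, 1)
    flux_left[tuple(idx)] = 0.0             # no flux through the first face
    return rho - dt / h * (flux - flux_left)

def split_step(Psi, v_lam, v_x, dlam, dx, dt):
    """One dimensional-splitting step for a density Psi[i, j] on an
    (x, lambda) grid: transport first in the label variable lambda,
    then in the spatial variable x."""
    Psi = upwind_1d(Psi, v_lam, dlam, dt, axis=1)   # label dynamics
    Psi = upwind_1d(Psi, v_x, dx, dt, axis=0)       # spatial transport
    return Psi
```

Under the usual CFL restriction dt · max|v| ≤ min(Δx, Δλ), this update preserves positivity and total mass, which is the property one wants for an approximation of the diffusion-free continuity equation (4).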
The realization of the control is approximated using a nonlinear Model Predictive Control (MPC) technique. Hence, an open-loop optimal control action is synthesized over a prediction horizon [0, T p ], by solving the optimal control problem (3), (4). Having prescribed the system dynamics and the running cost, this optimization problem depends on the initial state and the horizon T p only. The control w * , which is obtained for the whole horizon [0, T p ], is implemented over a possibly shorter control horizon [0, T c ]. At t = T c the initial state of the system is re-initialized to Ψ (T c ) and the optimization is repeated. In this setting, to allow for an efficient solution of the dynamics, we perform the MPC optimization selecting T p = T c = Δt. This choice of the horizons corresponds to an instantaneous relaxation towards the target state. For further discussion on the MPC literature we refer to [1,27,35] and references therein.
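The receding-horizon loop with T p = T c = Δt can be sketched as follows. The inner open-loop optimization is replaced here by a crude search over a finite set of candidate controls, which is an assumption of this sketch, not the solver used in the paper; `step` and `stage_cost` stand in for the discretized dynamics and running cost.

```python
def mpc_loop(state0, step, stage_cost, candidates, n_steps):
    """Receding-horizon loop with prediction and control horizons both
    equal to one time step (T_p = T_c = dt).  At each step the control
    whose predicted next state has the smallest cost is applied, and the
    state is re-initialised before the next optimisation."""
    state, applied = state0, []
    for _ in range(n_steps):
        # one-step-ahead "open-loop" optimisation over the candidates
        w = min(candidates, key=lambda w: stage_cost(step(state, w), w))
        state = step(state, w)   # re-initialise at t = T_c
        applied.append(w)
    return state, applied
```

For instance, with scalar dynamics x_{n+1} = x_n + Δt·w and cost (x − 1)² + 0.01 w², the loop steers the state to the target and then switches the control off, which is the instantaneous-relaxation behavior described above.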

A Leader-Follower Dynamics
In this setting, the set U consists of two elements, that is U := {F, L}, and is endowed with a two-valued distance. The space P 1 ({F, L}) is identified with the interval [0, 1]; accordingly, in the discrete model, λ i is a scalar value describing the probability of the i-th particle being a follower.
In order to tune the influence of the control, the simplest possible choice is to fix a function h Ψ (x, λ) = h(λ) in (8) for a suitable bounded non-negative Lipschitz function h : [0, 1] → R. In the applications, where the policy maker aims at controlling only the population of leaders, the ideal function h should be non-increasing and equal to zero when λ is close to 1. As shown in Proposition 2, if the cost function φ satisfies {φ = 0} = {0}, the optimal control will steer only agents with small λ.
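A concrete instance of such an activation function is the following piecewise-linear profile; the thresholds lo and hi are our assumptions, not values from the paper.

```python
def h(lam, lo=0.6, hi=0.9):
    """Illustrative activation function h(lambda): bounded, non-negative,
    Lipschitz, non-increasing, equal to 1 for lam <= lo and to 0 for
    lam >= hi, so that agents which are almost surely followers
    (lam close to 1) are not acted upon by the control."""
    return min(1.0, max(0.0, (hi - lam) / (hi - lo)))
```

Any smooth non-increasing cutoff with the same support would serve equally well; what matters for Proposition 2 is that h vanishes where the control must be inactive.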
It is natural to partition the total population into leaders and followers, according to λ. Given Ψ ∈ P(R d × [0, 1]), and for a fixed Lipschitz function g : [0, 1] → [0, 1], we define the followers and leaders distributions as for each Borel set B ⊂ R d . In particular, the sum μ F Ψ (B) + μ L Ψ (B) coincides with the first marginal of Ψ and therefore it counts the total population contained in B. In the discrete setting, the leaders and followers distributions in (73) are given by A typical choice for g is any Lipschitz regularization of the indicator function of the set {λ ≥ m}, with m ≥ 0 a small given threshold. Doing so amounts to classifying agents with small λ (and therefore high influence) as leaders and the remaining ones as followers. However, different and softer choices for g are possible. For instance, the choice g(λ) = λ allows one to measure the average degree of influence of an agent sitting in the region B on the remaining ones. It is a common feature of many-particle models to assume that each agent experiences a velocity which combines the action of the overall followers and leaders distributions. Hence, these velocities are an average velocity of the system, weighted by the probability λ that an agent located at x has of being a follower, and have the general form where the functions g i : [0, 1] → R (for i = 1, 2) are given Lipschitz continuous functions and K •◦ : R d → R d for •, ◦ ∈ {F, L} are suitable Lipschitz continuous interaction kernels. Let us remark that the choice g 1 = g 2 = g, so that the velocities actually depend on Ψ through the distributions μ F Ψ and μ L Ψ , is quite plausible in this kind of modeling. In the discrete setting, a velocity field of this kind reads as Similar principles can be used for defining the transition rates.
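The discrete followers/leaders distributions can be sketched as follows: each of the N particles carries mass 1/N, split into a follower part g(λ i )/N and a leader part (1 − g(λ i ))/N. The regularized-indicator choice of g is included with assumed threshold and ramp width; the function names are ours.

```python
import numpy as np

def follower_leader_weights(lam, g):
    """Discrete counterpart of the distributions in (73): returns the
    per-particle follower and leader weights.  Summing the weights of
    the particles lying in a Borel set B gives mu_F(B) and mu_L(B);
    the two weight vectors always sum to the total mass 1."""
    lam = np.asarray(lam, dtype=float)
    wF = g(lam) / lam.size
    wL = (1.0 - g(lam)) / lam.size
    return wF, wL

def g_threshold(lam, m=0.2, eps=0.1):
    """A Lipschitz regularisation of the indicator of {lam >= m}: 0 for
    lam <= m - eps, 1 for lam >= m, linear in between.  Agents with
    small lam (high influence) are thereby classified as leaders."""
    return np.clip((lam - (m - eps)) / eps, 0.0, 1.0)
```

With `g = lambda l: l` one recovers the softer choice g(λ) = λ, measuring the average degree of influence instead of a hard classification.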
According to the identification of P 1 ({F, L}) with [0, 1], the transition operator T Ψ (x, λ) will be identified with a scalar (see (76) below), instead of taking values in the two-dimensional space F ({F, L}). Indeed, in this case (T 0 ) uniquely determines the second component of T Ψ once the first one is known. For instance, one can consider with α • having the typical form are satisfied (equivalently, the evolution of λ is confined into [0, 1]). If one chooses g 3 (λ) = λ, for fixed x and Ψ the evolution of λ is governed by a linear master equation. Instead, for g 3 = g, the switching rates α F and α L are activated depending on the population to which an agent belongs. The function H • can be used to localize the effect of the overall distribution on the transition rates; within this model, an agent sitting at x is able to interact only with agents in a small neighborhood around x. Similarly, with a proper choice of • , one can tune the influence of the surrounding agents according to their probability of belonging to the populations of followers or leaders. The choice F = 1 − L = g corresponds to having rates which depend on Ψ through the distributions μ F Ψ and μ L Ψ . Let us however stress that, in general, also with these choices it is not possible to decouple equation (14) into a system of equations for μ F Ψ and μ L Ψ , which, on the contrary, can only be reconstructed after solving for Ψ first. Some particular cases where this is instead possible are discussed in [41,Proposition 4.8].
With the arguments of [41,Sect. 4], one can see that the choices of v Ψ and T Ψ made in (75) and (76) fit in our general framework. Let us remark that in [41,Sect. 4] only the case g(λ) = g i (λ) = λ, i = 1, 2, 3, was discussed, but the adaptation to the current, more general situation is straightforward.
A typical Lagrangian that we may consider should penalize the distance of the leaders from a desired goal. This may be encoded by a function of the form where x̄ ∈ R d is the position of the desired goal and θ : [0, 1] → [0, 1] is zero when λ is above a given threshold (a possible choice is even θ(λ) = 1 − g(λ)). Moreover, a competing effect, depending on the overall distribution of the population, can be taken into account: leaders should stay as close as possible to the population of followers, in order to influence their behavior. This may be encoded by a function of the form which favors a leader agent being close to the barycenter of the followers distribution. Notice that the function L 2 depends continuously on Ψ as long as μ F Ψ (R d ) > 0, which is always the case in practical situations. Hence, the Lagrangian of the system is the sum for α ∈ [0, 1] a given constant. Finally, a very simple and natural family of cost functions is In particular, φ p is strictly convex and {φ p = 0} = {0}, so that the conclusions of Proposition 2 hold true in the case h Ψ = h mentioned above. Namely, the optimal control u ∈ L 1 ([0, T ]; (R d ) N ) in the N -particle problem will actually act only on the population of leaders, while the evolution of the population of followers will be determined by the velocities and transition rates detailed above.
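A minimal sketch of this cost structure follows. Weighting the barycenter term L 2 by the same cutoff θ is our reading of (79), hence an assumption, and all function names are ours.

```python
import numpy as np

def lagrangian(x, lam, x_goal, bary_F, alpha, theta):
    """Per-agent Lagrangian alpha*L1 + (1 - alpha)*L2: L1 penalises the
    distance of leaders from the goal x_goal, L2 their distance from the
    followers' barycentre bary_F.  theta(lam) vanishes for lam above a
    threshold, so only leaders are penalised."""
    x, x_goal, bary_F = map(np.asarray, (x, x_goal, bary_F))
    L1 = theta(lam) * np.sum((x - x_goal) ** 2)
    L2 = theta(lam) * np.sum((x - bary_F) ** 2)
    return alpha * L1 + (1 - alpha) * L2

def phi_p(u, p=2):
    """Control cost |u|^p / p: for p > 1 it is strictly convex and
    vanishes only at u = 0, as required by Proposition 2."""
    return np.sum(np.abs(np.asarray(u, dtype=float)) ** p) / p
```

For an agent with θ(λ) = 0 the whole stage cost reduces to φ p (u), so the optimal control on that agent is zero: this is the mechanism by which the control acts only on the leaders.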

Test 1: Opinion Dynamics with Emerging Leaders Population
We study the setting proposed in [2,31] for opinion dynamics in the presence of leaders' influence, and we assume that x ∈ [−1, 1], where {±1} identify two opposite opinions. The interaction field v Ψ (75) is characterized by bounded confidence kernels with the following structure where ε ≥ 0 is a regularization parameter for the characteristic function χ and κ • represent the confidence intervals with the following numerical values, The weighting functions g 1 , g 2 are such that g 1 (λ) ≡ g 2 (λ) ≡ (λ) with The transition operator T Ψ (x, λ) in (76) is identified by the following quantities where the functions D F and D L represent the concentration of followers and leaders at position x and are defined by In Fig. 1, we report the choice of the initial data, the marginals μ F Ψ (t, x), μ L Ψ (t, x) relative to the opinion space, and the marginals ν F Ψ (t, λ), ν L Ψ (t, λ) relative to the label space. The structure of the initial data is a bimodal Gaussian distribution defined as follows 3, x L = 0.7 and C 0 is a normalizing constant. Figure 2 reports from left to right four frames of the marginals up to time t = 10, without control. We observe transitions from leader to follower, and vice versa, where, without the action of a policy maker, the initial clusters of opinions remain far apart and no consensus is reached. In Fig. 3, the control is activated and in this case we observe the steering action of the leaders towards the target position x̄.
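A regularized bounded-confidence kernel of the kind used here can be sketched as follows, assuming a Hegselmann–Krause-type attraction; the exact form of χ and the values of κ • are not reproduced, so the ramp shape and defaults below are assumptions.

```python
import numpy as np

def chi_eps(s, kappa, eps=0.01):
    """Smoothed characteristic function of the confidence interval
    [0, kappa]: equal to 1 for s <= kappa, 0 for s >= kappa + eps, with
    a linear ramp of width eps (the regularisation parameter)."""
    return np.clip((kappa + eps - s) / eps, 0.0, 1.0)

def K(x, y, kappa, eps=0.01):
    """Bounded-confidence interaction kernel: attraction of the opinion
    x toward y, active only when |y - x| lies within the confidence
    interval of width kappa."""
    return chi_eps(np.abs(y - x), kappa, eps) * (y - x)
```

Different confidence radii κ FF , κ FL , κ LF , κ LL are then obtained simply by calling `K` with different values of `kappa` for each pair of populations.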
We summarize the evolution up to the final time T = 50 in Fig. 4, comparing the controlled and uncontrolled cases. We compare the marginals μ F Ψ and μ L Ψ and the percentage of followers and leaders as functions of time.

Two-Leader Game
A rather natural extension of the situation considered in Sect. 5.1 consists in studying the interaction between three different populations: one of followers, still denoted with the label F, and two of leaders, denoted by L 1 and L 2 , respectively, competing for gaining consensus among the followers and working to attract them towards their own objectives. A policy maker may choose to promote one of the two populations of leaders by favoring the interactions among these leaders and the followers. We discuss here how to model such a scenario within our analytical setting.
or, equivalently, to the subset Δ of R 2 . Hence, in a discrete model the scalar values λ L 1 ,i , λ L 2 ,i stand for the probability of the i-th particle being an L 1 -leader and an L 2 -leader, respectively. Clearly, λ F,i = 1 − λ L 1 ,i − λ L 2 ,i represents the probability of being a follower.
Assuming that the policy maker wants to promote the goals of the leaders L 1 , the influence of the controls on the populations dynamics may be tuned by the function h Ψ (x, λ) = h(λ L 1 ) for a bounded non-negative Lipschitz function h : [0, 1] → [0, 1] such that h(λ L 1 ) = 1 for λ L 1 close to 1 and h(λ L 1 ) = 0 for λ L 1 close to 0. Considering a cost function φ of the form (81), for instance, the control u ∈ R d will act only on the L 1 -leaders, as a consequence of Proposition 2.
Given Ψ ∈ P(R d × Δ) and Lipschitz continuous functions f L j , for j = 1, 2, we define the followers and leaders distributions as for every Borel subset B of R d . In the discrete setting, the leaders and followers distributions are A possible choice for f L j is any Lipschitz regularization of the indicator function of the set {λ L j ≥ m} with m > 1/2 and such that f L j (λ L j ) = 0 for λ L j ≤ 1/2, compatible with the request that f maps Δ into Δ. We further notice that the choice f L j (λ L j ) = λ L j is still allowed, with the same interpretation given in (74).
The velocity field v Ψ (x, λ) in (75) can be easily modified for the current scenario by setting, for instance, under the additional condition that f F (λ) = 1 − f L 1 (λ L 1 ) − f L 2 (λ L 2 ). The transition T Ψ (x, λ) is now given by where the transition rates α are defined as in (77) with the obvious modifications, and g L j have similar properties as f L j . To comply with (T 0 ), we need (see [41,Proposition 5.1]) in view of which we can write (omitting the dependence on x, Ψ ) in order to determine the evolution of the two independent parameters λ L 1 and λ L 2 .
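The evolution of the two independent label components can be sketched as an explicit Euler step of the resulting master equation. Direct switching between the two leader populations is switched off (α L 1 L 2 = α L 2 L 1 = 0, as in Test 2 below); the constant rates a_* are illustrative stand-ins for the rates α, which in the model depend on (x, Ψ ).

```python
def label_step(lam_L1, lam_L2, a_FL1, a_FL2, a_L1F, a_L2F, dt):
    """One explicit Euler step for (lam_L1, lam_L2); the follower
    probability is recovered as lam_F = 1 - lam_L1 - lam_L2.  Each
    leader component gains mass from the followers at rate a_FLj and
    loses mass back to them at rate a_LjF."""
    lam_F = 1.0 - lam_L1 - lam_L2
    d_L1 = a_FL1 * lam_F - a_L1F * lam_L1   # gain from / loss to followers
    d_L2 = a_FL2 * lam_F - a_L2F * lam_L2
    return lam_L1 + dt * d_L1, lam_L2 + dt * d_L2
```

For small enough dt the step preserves the constraint λ ∈ Δ, i.e., both components and their sum stay in [0, 1], which is the discrete counterpart of condition (T 0 ).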
Since the policy maker promotes the L 1 -leaders, the Lagrangian should penalize the distance of the population L 1 from their goal. As in (78), this is done by considering a function of the form L 1 (x, λ) := θ(λ L 1 )|x − x̄| 2 , where x̄ ∈ R d denotes the desired goal of the L 1 -leaders and θ : [0, 1] → [0, 1] is a continuous function which is 0 close to 0 and 1 close to 1. With the same idea, the second term (79) is modified in order to penalize only the distance of the L 1 -leaders from the barycenter of the followers. Again, we notice that L 2 is continuous as long as μ F Ψ (R d ) > 0. Finally, the Lagrangian L of the system has the same structure as (80), i.e., L = α L 1 + (1 − α) L 2 for a parameter α ∈ [0, 1] to be tuned.

Test 2: Opinion Dynamics with Competing Leaders
We consider the opinion dynamics presented in Test 1, where the opinion variable is x ∈ [−1, 1] with {±1} two opposite opinions. We introduce two populations of leaders competing over the consensus of the followers. The first population of leaders L 1 has a radical attitude, aiming to maintain their position, and their strategy is driven by the policy maker. Instead, the population L 2 is characterized by a populist attitude, without the intervention of an optimization process: they are willing to move from their position in order to have a broader range of interaction with the remaining agents.
The transition operator T Ψ (x, λ) in (86) is identified by the following quantities and α L 1 L 2 (x, Ψ ) = α L 2 L 1 (x, Ψ ) = 0, consistently with (87). The functions D F and D L represent the concentration of followers and the total concentration of leaders at position x, defined similarly to (84). We use the following parameters a L j F = 0.015, a L j L j = 0.025, j = 1, 2.
The weighting function g j (λ) is defined as in (89) with C = 20 and λ̄ = 0.5. Finally, the cost functional is given by the Lagrangian defined in (80), with λ = λ L 1 since only the radical leaders are controlled. The radical leaders aim to steer the followers towards x̄ = − 0.75, while keeping track of the followers' average position with weighting parameter α = 0.85 and θ(λ) = 1 − (λ). We account for a quadratic penalization of the control in (81) by choosing γ = 2. In Fig. 5, we report the choice of the initial data, the marginals μ F Ψ (t, x), μ L 1 Ψ (t, x) and μ L 2 Ψ (t, x) relative to the opinion space, and the marginal ν Ψ (t, λ) relative to the label space, defined as follows where here λ = (λ L 1 , λ L 2 ), the parameters are σ 2 λ,F = 1/40, σ 2 λ, j = 1/100, σ 2 x, j = 1/60, σ 2 x,F = 1/250, λ̄ F = (0.2, 0.2), λ̄ 1 = (0.2, 0.65), λ̄ 2 = (0.65, 0.2), x F = 0, x 1 = − 0.65, x 2 = 0.65, and C 0 is a normalizing constant. Figure 6 reports from left to right four frames of the marginals at times t = 5, 15, 27.5, 50, without control. Without the action of a policy maker, the majority of followers are driven close to the initial position of the populist leaders L 2 , who interact with a wider portion of agents. In Fig. 7, the control action of the policy maker is activated, resulting in a different distribution of the followers: while the populist leaders retain some capability of attraction, the portion of the followers which is driven towards the target position x̄ of the radical leaders L 1 is considerably larger than in the uncontrolled case (Fig. 8).