Kinetic models for optimal control of wealth inequalities

We introduce and discuss optimal control strategies for kinetic models for wealth distribution in a simple market economy, acting to minimize the variance of the wealth density among the population. Our analysis is based on a finite time horizon approximation, or model predictive control, of the corresponding control problem for the microscopic agents’ dynamic and results in an alternative theoretical approach to the taxation and redistribution policy at a global level. It is shown that in general the control is able to modify the Pareto index of the stationary solution of the corresponding Boltzmann kinetic equation, and that this modification can be exactly quantified. Connections between previous Fokker–Planck based models for taxation-redistribution policies and the present approach are also discussed.


Introduction
Any society with a growing reliance on capital experiences an increasing concentration of wealth, which leads in general to a marked social inequality. How to reduce these social inequalities in capitalistic countries is a debated issue. The usual government policies are to use proportional taxation, with the expectation that a progressive tax system would prevent excess concentration of wealth. A recent approach to this relevant economic question can be found in Piketty [1], whose main conviction is that the effect of the tax on capital income is not only to reduce the total accumulation of wealth, but to modify the structure of the wealth distribution over the long run. In other words, a confiscatory tax on high incomes combined with a progressive tax on the value of the capital is viewed by Piketty as the only way to prevent the natural tendency of capitalism to head towards excessive inequality.
As a matter of fact, long term predictions on economic systems are very difficult to justify, and a serious debate would require a rigorous analysis based on well-established models of wealth distribution. In this developing area of research, mathematical modeling of economic systems has had interesting advances in recent years [2][3][4][5].
Starting from the pioneering studies of Angle [6], most of these models sink their roots into statistical mechanics [7,8], and are based on methods borrowed from the kinetic theory of rarefied gases and the Boltzmann equation [9,10]. The main original motivation at the basis of this modeling was to understand the possible reasons for the formation of heavy tails in the distribution of wealth, as predicted by the economic analysis of the Italian economist Vilfredo Pareto [11].
One of the kinetic models of wealth distribution able to reproduce the formation of Pareto tails on the basis of few physically plausible hypotheses has been introduced in 2005 in [12]. There, the evolution of wealth has been based on binary trades modeled to include the idea that wealth changes hands for a specific reason: one agent intends to invest their wealth in some asset, property etc. in possession of their trade partner. Typically, such investments bear some risk, and either provide the buyer with some additional wealth, or lead to the loss of wealth in a non-deterministic way. An easy realisation of this idea consists in coupling the saving propensity parameter [13,14] with some risky investment that yields an immediate gain or loss proportional to the current wealth of the investing agent. Leaving the details of the microscopic trade to Section 2.2, we recall here that the model for wealth distribution introduced in [12] revealed to be very flexible with respect to the addition of further economic aspects, including the possibility of studying the effects of taxation and redistribution [15][16][17], the role and consequences of the addition of a parameter describing agent's knowledge [18], and the possibility to use the kinetic interaction operator to construct suitable equations of hydrodynamics [19,20].
Going back to the problem of capitalistic societies and wealth inequality, it is interesting to remark that the numerical simulation of the evolution of the kinetic model for wealth and knowledge developed in [18], led to the conclusion that the unequal distribution of knowledge in a multi-agent society is itself a cause of an unequal distribution of wealth among agents. Other aspects of wealth inequality and surplus theory have been recently analysed from the mathematical point of view [21], with the aim to find a relationship between agents' risk aversion and inequality of incomes. These studies clearly outline the importance of resorting to mathematical modeling to test and eventually verify economical hypotheses.
In this paper, we will discuss a possible alternative to the standard taxation and redistribution rules, which relies on a suitable control applied to the microscopic trades describing the wealth distribution of the multiagent system. Recent applications of control problems to kinetic models with binary interactions describing opinion formation can be found in [22,23] (cf. also [24] for an exhaustive review). Indeed, the possibility to effectively exercise a control on opinion and to evaluate the impact of modern communication systems, like social networks, to the dynamics of opinions, is a challenging problem of increasing importance.
More precisely, we assume the existence of a policy maker (a government or a local administrator) that applies a suitable control process to each economic interaction with the aim of minimizing a given cost functional measuring the wealth inequalities in the system. This control acts as an agent-dependent taxation/redistribution dynamic and, for the sake of simplicity, it is assumed to be conservative over the whole set of agents, so that the total amount of wealth remains unchanged. The resulting constrained dynamic takes the form of an optimal control problem which, for a large set of agents, turns out to be computationally prohibitive due to its intrinsic complexity; therefore approximate solutions are sought, even if suboptimal. Among various possible approaches, here, following [22,23], we apply a finite time horizon strategy based on model predictive control. In the simpler case of instantaneous control the problem can be solved explicitly, giving rise to a feedback control that can be embedded in the microscopic system.
By considering binary interactions, the application of this feedback control can be shown to change the saving propensities of the agents, which induces a smaller variance for the density of wealth of the population. For the binary dynamic introduced in [12] the corresponding feedback control originates a Boltzmann equation whose stationary states, compared to the original uncontrolled model, have a larger Pareto index. An explicit result in the direction of Piketty's opinion [1] is that, in the quasi-invariant interaction limit, among others, we can recover the same Fokker-Planck equation resulting from a standard taxation and redistribution process [15,16].
The rest of the manuscript is organized as follows. In Section 2 we introduce the microscopic model in the optimal control setting. For this model we derive the explicit feedback control in a finite time horizon approximation and focus on the binary interaction case. Section 3 is devoted to the study of the corresponding kinetic models. We focus on the CPT model [12] and show that the action of the control is capable to increase the Pareto index of the corresponding wealth distribution, thereby reducing inequalities. To have a further insight in the stationary states of the system, in Section 4 we pass to the limit controlled Fokker-Planck equation and show how it can be reinterpreted as a taxation-redistribution model. Some numerical simulations which confirm our analysis are also reported.
2 Optimal control of wealth inequalities

A microscopic model with control
Let us consider the microscopic evolution of the wealths of $N$ agents, where each agent's wealth $w_i$, $i = 1, \dots, N$, evolves according to the first order dynamical system (1). In (1) the nonnegative constants $a_{ij}$ define the exchange parameters of the trades and the $u_i$ are control terms. In general, to ensure the positivity in time of the wealth variables, it is assumed that the exchange parameters satisfy $a_{ij} < 1$ for $i, j = 1, 2, \dots, N$. The controls $u_i$ act to redistribute wealth with the aim of decreasing the variance of wealth among agents. This can be achieved by minimizing the functional (2) over the space $U$ of admissible controls, where $w = (w_1, \dots, w_N)$, $u = (u_1, \dots, u_N)$ and $L_j(w)$ is a target cost functional which measures the level of wealth inequalities in the system. An example is given by the cost (3), which depends on a parameter $m$: for $m = 2$ we have a classical quadratic cost functional, which corresponds to minimizing the variance of the wealth among all agents. The constant $\nu > 0$ is a selective penalization parameter which takes into account that we may want to apply different taxation rules to different levels of income. As we shall discuss later on, since the control essentially acts on interactions among agents of the system, the constant $\nu$ can be assumed to depend on the frequency of exchanges. In this way, the control $u$ can be understood as the external action of a government which aims to reduce inequalities, by acting on exchanges, through wealth-dependent taxation and redistribution among agents.
Problems (1)-(2) can be reformulated as Mayer's problem and solved by dynamic programming or Pontryagin's maximum principle [25,26]. However, the main drawback is that the equation for the adjoint variable has to be solved backwards in time over the full time interval $[0, T]$. In particular, for large values of $N$ the computational effort becomes prohibitive. Also, assuming $u = A(x)$ where $A$ fulfills a Riccati differential equation cannot be pursued here due to the large dimension of $A \in \mathbb{R}^{N \times N}$ and a possible general nonlinearity in the coefficients $a_{ij}$ (see [27]). A standard methodology, when dealing with such complex systems, is based on model predictive control where, instead of solving the control problem over the whole time horizon, the system is approximated by an iterative solution over a sequence of finite time steps [28].

Instantaneous control
We derive a feedback control u based on a finite time horizon strategy. This feedback control will in general only be suboptimal. Rigorous results on the properties of u for quadratic cost functional and linear and nonlinear dynamics are available, for example, in [28]. The receding horizon framework applied here is also called instantaneous control in the engineering literature.
Following the approach in [24], we assume a finite time horizon $\Delta t \le 1$ and, in a time-discrete setting with times $t^n = n\Delta t$, we consider the discretized problem (4). In this case we are led to minimize the corresponding cost functional $J_{\Delta t}(w, u)$. Let us first consider the case of a quadratic cost functional, namely (3) in the case $m = 2$. The necessary optimality conditions, which can be obtained by direct differentiation with respect to $u_i^n$, involve the Kronecker delta $\delta_{ij}$. Solving them for the controls $u_i^n$ gives an expression in terms of $\bar{w}^{n+1} = \frac{1}{N}\sum_{j=1}^{N} w_j^{n+1}$, the mean wealth of the agents at time $(n + 1)\Delta t$. Note that these controls satisfy the identity $\sum_i u_i^n = 0$, which implies that all taxes are redistributed among agents.
Using the discrete dynamics (4) we finally obtain the explicit expression (7). Expression (7) furnishes a feedback control for the fully discretized problem, which can be plugged as an instantaneous control into (4). Note, however, that the instantaneous control (7) in the discretized dynamics (4) is of order $O(\Delta t)$. In order to obtain an effective contribution of the control in the dynamics we make some further natural assumptions. First, we assume that the penalization parameter $\nu$ scales with the time discretization as $\nu = 2\gamma\Delta t$. This is consistent with the idea that for very short time horizons we need a stronger control to achieve the desired goal. Second, if one agrees that a control on wealth has to depend also on the frequency and intensity of interactions, one is led to assume that the parameter $\gamma$ depends on the sum $A$ of the exchange parameters $a_{ij}$, and is inversely proportional to $A$. This guarantees that, in the absence of exchanges in the system, the control on wealth loses its meaning. These two assumptions fix the final form of the instantaneous control.
In the above setting, if we assume $a_{ij} = a_{ji}$, the mean wealth is conserved, so that $\bar{w}^{n+1} = \bar{w}$, and the minimization of the functional $J_{\Delta t}(w, u)$ corresponds to minimizing a quadratic inequality indicator. Note that a standard indicator of wealth inequality, closely related to the quadratic one, is the Gini coefficient (10). In our setting, minimization of the Gini coefficient corresponds to the cost functional (3) with $m = 1$, where again we have $\sum_i u_i^n = 0$ and therefore all taxes are redistributed. In this case, however, even using the expression of the dynamic (4) it is not possible to give an explicit expression for the control term. Similar conclusions are obtained for $m > 2$.
Remark 1. A more realistic dynamic typically includes a random part in the evolution of the wealth system, which then takes the form (12). In (12), the $\eta_i$, $i = 1, 2, \dots, n$, denote a sequence of independent and identically distributed random variables such that $\langle \eta_i \rangle = 0$ and $\langle \eta_i^2 \rangle = \sigma$, where $\langle \cdot \rangle$ denotes mathematical expectation. The additional random part represents risks which are always present in economic trades [5]. It is reasonable, however, to assume that the control could act only on the deterministic part of the evolution.

Control of binary interactions
The special case $N = 2$ describes binary interactions. Binary interactions are at the basis of the kinetic description of wealth distribution in multi-agent systems [5]. In the absence of risky components we obtain the binary dynamics (13), with exchange parameters $\bar{a}_{ij}$ and $\bar{a}_{ji}$, and in the case of a quadratic cost functional the feedback control follows as in the previous section. Note that in this formulation both the dynamics and the control functional operate at the level of the binary interaction pair $(w_i, w_j)$. Note again that the binary dynamics preserves the local mean wealth if and only if $\bar{a}_{ij} = \bar{a}_{ji}$. If we now define $p = \Delta t\,\bar{a}_{ij}$, $q = \Delta t\,\bar{a}_{ji}$, we can write the controlled binary Boltzmann dynamics for the pair $(v, w)$ in the form (15). Collecting all terms together and introducing a control parameter $\beta$, the binary relations (15) can be rewritten as (18). Hence, we observe that in the binary case the feedback control can be reformulated as a modification of the original mixing coefficients of the binary interaction. Note that, since by definition both $p$ and $q$ are less than one, and $0 < \beta < 1$ (where $\beta = 0$ coincides with absence of control and $\beta = 1$ yields maximum control), the new mixing coefficients still lie strictly between 0 and 1.
It is interesting to consider the model predictive control approximation originated by the minimization of the cost functional for $m = 1$ in the case of binary interactions. In this case, assuming $\nu = 2\gamma\Delta t$, the control is defined only implicitly. Setting $z_{ij}^n = w_i^n - w_j^n$, from the binary interaction dynamic (13) we obtain a nonlinear equation, which is easily seen to admit a solution only under a restriction on the wealth difference. Using the same notation as in (15), this yields an explicit feedback control.

Therefore, a fixed taxation amount is applied to the richer (and redistributed to the poorer) of the two agents only if the difference in wealth is above a certain threshold. Note that the taxation process is such that $v^* \ge 0$ and $w^* \ge 0$, and that the resulting dynamic cannot be reformulated as a modification of the original mixing coefficients of the binary interaction as in (18).

Boltzmann models for wealth distribution with control
The basic model discussed in this section was introduced in 2005 in [12] within the framework of classical models of wealth distribution in economy, to understand the possible formation of heavy tails, as predicted by the economic analysis of the Italian economist Vilfredo Pareto [11]. This model belongs to a class of models in which the interacting agents are indistinguishable. In most of these models an agent's state at any instant of time $t \ge 0$ is completely characterized by his current wealth $v \ge 0$ [3,4]. When two agents meet in a trade, their pre-trade wealths $v$, $w$ change into the post-trade wealths $v^*$, $w^*$ according to the rule [13,14]

$$v^* = p_1 v + q_1 w, \qquad w^* = q_2 v + p_2 w.$$

The interaction coefficients $p_i$ and $q_i$ are non-negative random variables. While $q_1$ denotes the fraction of the second agent's wealth transferred to the first agent, the difference $p_1 - q_2$ is the relative gain (or loss) of wealth of the first agent due to market risks. It is usually assumed that $p_i$ and $q_i$ have fixed laws, which are independent of $v$ and $w$, and of time. This means that the amount of wealth an agent contributes to a trade is (on the average) proportional to the respective agent's wealth.

The control of the Cordier-Pareschi-Toscani (CPT) model
In [12] the trade has been modelled to include the idea that wealth changes hands for a specific reason: one agent intends to invest his wealth in some asset, property etc. in possession of his trade partner. Typically, such investments bear some risk, and either provide the buyer with some additional wealth, or lead to the loss of wealth in a non-deterministic way. An easy realization of this idea consists in coupling a constant saving propensity parameter [13,14] with some risky investment that yields an immediate gain or loss proportional to the current wealth of the investing agent, as expressed by the trade rule (21). Here $0 < \lambda < 1$ is the parameter which identifies the saving propensity, namely the intuitive behavior which prevents the agent from putting the whole amount of his money into a single trade. In this case the interaction coefficients $p_i$ and $q_i$ are determined by $\lambda$ and by the random parameters $\eta_i$. As specified above, the coefficients $\eta_1$, $\eta_2$ are random parameters, which are independent of $v$ and $w$, and distributed so that always $v^*, w^* \ge 0$, i.e. $\eta_1, \eta_2 \ge -(1 + \lambda)/2$. Owing to classical arguments of kinetic theory [5], it has been shown in [12] that the evolution of the wealth density consequent to the binary interactions (21) obeys a Boltzmann-type equation. Let us denote by $f(v, t)$ the distribution of the agents' wealth $v \ge 0$ at time $t > 0$. Then, the equation for the evolution of $f(v, t)$ can be fruitfully written in weak form: for any smooth function $\varphi$, $f$ satisfies equation (22). A simple computation shows that, unless the random variables are centered, i.e. $\langle \eta_1 \rangle = \langle \eta_2 \rangle = 0$, the mean wealth is not preserved, but increases or decreases exponentially (see the computations in [12]). For centered $\eta_i$ the average wealth is conserved. Various specific choices for the $\eta_i$ have been discussed in [29]. The easiest one leading to interesting results is $\eta_i = \pm\mu$, where each sign comes with probability 1/2.
The factor $\mu \in (0, \lambda)$ should be understood as the intrinsic risk of the market: it quantifies the fraction of wealth agents are willing to gamble on. Within this choice, one can display the various regimes for the steady state of wealth in dependence of $\lambda$ and $\mu$, which follow from numerical evaluation. In the zone corresponding to low market risk, the wealth distribution shows socialistic behavior with slim tails. Increasing the risk, one falls into the capitalistic zone, where the wealth distribution displays the desired Pareto tail. A minimum of saving ($\lambda > 1/2$) is necessary for this passage; this is expected since if wealth is spent too quickly after earning, agents cannot accumulate enough to become rich. Inside the capitalistic zone, the Pareto index decreases from $+\infty$ at the border with the socialist zone to unity. Finally, one can obtain a steady wealth distribution which is a Dirac delta located at zero: both risk and saving propensity are so high that a marginal number of individuals manages to monopolize all of the society's wealth. In the long-time limit, these few agents become infinitely rich, leaving all other agents as true paupers. One obtains four zones as depicted in Figure 1. Note that Zone I is not allowed since $|\mu| < \lambda$.
Using the notations of Section 2.3 we can solve the control problem for the CPT model with risk (23). In the case of a quadratic cost functional we obtain the feedback control (25), acting on the deterministic part. Consequently, the post-control interaction (18) has deterministic interaction coefficients modified by the control. Finally, if we assume $\Delta t = 1$ ($\alpha = 1/2$), we can write the controlled binary interactions in the form (27), where now $\beta = 1/(1 + \gamma)$. Note that negative values of the wealth are now avoided if $\mu \in (0, \lambda(1 - \beta))$ for $\lambda(1 - \beta) > 1/2$. This gives an upper bound for the maximum admissible control, $\beta < 1 - 1/(2\lambda)$.
In a similar way, if we consider the explicit control obtained for the cost functional (3) with $m = 1$, using (17) we obtain the control (28) for the deterministic part of the CPT model. In the presence of noise, positivity of the wealth is guaranteed for $\mu \in (0, \lambda/2)$ and all values of $\beta < 1$ are admissible. It should be noted, however, that large values of $\beta$ imply a stronger control but over a smaller number of agents (see Fig. 2, left).

Control and Pareto tails
The formation of stationary states and their properties have been systematically investigated in [3,29]. We briefly recall the main results. The stationary curve $f_\infty(w)$ satisfies the Pareto law with index $r$, provided that $f_\infty$ decays like an inverse power function for large $w$,

$$f_\infty(w) \propto w^{-(r+1)} \quad \text{as } w \to +\infty.$$

More precisely, $f_\infty$ has Pareto index $r \in [1, +\infty)$ if the moments

$$M_s = \int_0^{\infty} w^s f_\infty(w)\, dw$$

are finite for all positive $s < r$, and infinite for $s > r$. If all $M_s$ are finite (e.g. for a Gamma distribution), then $f_\infty$ is said to possess a slim tail.
One studies the evolution equation (32) for the moments $M_s(t)$, which is obtained by integrating (22) against $\varphi(w) = w^s$. Using an elementary inequality valid for $x, y \ge 0$, $s \ge 1$, one estimates the right-hand side of (32) as in (34), where $S$ is the characteristic function of the interaction. Solving (32) with (34), one finds that either $M_s(t)$ remains bounded for all times when $S(s) < 0$, or it diverges like $\exp[tS(s)]$ when $S(s) > 0$. The function $S$ is convex in $s > 0$ and $S(0) = 1$. It has a trivial root at $s = 1$ (due to the conservation of the mean). It may have another non-trivial root, either in $(1, \infty)$ or in $(0, 1)$. There are three distinct cases: (i) if $s = 1$ is the only root and $S(s) < 0$ for all $s > 1$, then all moments are bounded, and the steady state distribution has an exponential tail; (ii) if a non-trivial root $s = r$ in $(1, \infty)$ exists, moments up to the $r$th moment are bounded and the steady state distribution has a Pareto tail; (iii) if $S(r) = 0$ for some $0 < r < 1$, then $f_\infty(w) = \delta_0(w)$, a Dirac delta at $w = 0$. For further details, we refer to [3,29]; see also [30] for more complicated wealth-condensed distributions.
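As a rough numerical illustration of this classification, the sketch below evaluates a characteristic function of the form $S(s) = \frac{1}{2}\sum_{i=1,2}\langle p_i^s + q_i^s\rangle - 1$ and searches for a non-trivial root $r > 1$ by bisection. Both this form of $S$ (chosen because it satisfies $S(0) = 1$ and $S(1) = 0$ for trades that conserve wealth in the mean) and the sample coefficients (two equally likely values of $p$, a fixed $q$) are assumptions made for illustration; the names `char_function` and `pareto_index` are hypothetical.

```python
def char_function(s, p_values, q):
    """Assumed characteristic function S(s) = <p^s> + q^s - 1.

    p_values: equally likely realizations of the random coefficient p
    q: deterministic coefficient, identical for both agents (assumption)
    """
    mean_p_s = sum(p ** s for p in p_values) / len(p_values)
    return mean_p_s + q ** s - 1.0


def pareto_index(p_values, q, s_max=50.0, tol=1e-10):
    """Return the root r of S in (1, s_max), or None if no sign change is found."""
    lo, hi = 1.0 + 1e-6, s_max
    if char_function(lo, p_values, q) >= 0.0:
        # S convex with S(1) = 0 and not negative just above 1: no root beyond 1
        return None
    if char_function(hi, p_values, q) <= 0.0:
        # S stays negative up to s_max: slim tail on the searched range
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if char_function(mid, p_values, q) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical coefficients: p = 0.95 +/- 0.25 with probability 1/2 each, q = 0.05.
r = pareto_index([1.2, 0.7], 0.05)
print(r)
```

Bisection is used here because the convexity of $S$ guarantees at most one sign change beyond $s = 1$; a return value of `None` corresponds to either the slim-tail or the condensation regime, which this sketch does not distinguish.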
We now illustrate the effect of the instantaneous control in the quadratic case (25) on the formation of the Pareto tail. Figure 3 shows the effect of the control on the formation of tails in the CPT model for different parameters λ and µ.
The left plot shows the uncontrolled case. It is obtained by numerical evaluation of the characteristic function S. In Zone II, s = 1 is the only root and S(s) < 0 for all s > 1, hence all moments are bounded, and the steady state distribution has an exponential tail. In Zone III, a non-trivial root s = r in (1, ∞) exists, and moments up to the rth moment are bounded, i.e. the steady state distribution has a Pareto tail. The color coding in Zone III indicates the increasing Pareto tail index, increasing from darker blue (r close to one) to lighter blue as r increases and to yellow as r → ∞. In Zone IV, there is a non-trivial root in (0, 1), and condensation occurs: the steady state is a delta distribution at zero.
We can similarly consider the controlled case, and numerically evaluate the characteristic function with modified mixing parameters. The right plot in Figure 3 shows the effect of the control. As the control is applied the region with slim tails (Zone II) is enlarged, while the zone with Pareto tails (Zone III) is shifted towards the condensation zone (Zone IV). The dashed green curves indicate the position of the contours in the uncontrolled case for comparison.

4 Quasi invariant limits

4.1 Controlled limit Fokker-Planck equation
The analysis of [29] essentially shows that the microscopic interaction (21) considered in [12] is such that the kinetic equation (22) is able to describe all interesting behaviours of wealth distribution in a multiagent society.
Under the quasi-invariant scaling (36), and assuming a unitary average value of the initial density, it has been shown in [12] that the scaled density $h(v, \tau) = f(v, t)$ satisfies in the limit $\varepsilon \to 0$ the Fokker-Planck equation (37). Equation (37) has a unique stationary solution of unit mass, given by a $\Gamma$-like distribution [12,31]. This stationary distribution exhibits a power-law tail for large values of the wealth variable. The limit procedure induced by the scaling (36), called the quasi-invariant limit of the kinetic equation (22), corresponds to the situation in which the prevalent exchanges of wealth produce an extremely small modification of the pre-interaction wealths (grazing interactions), but one waits long enough to still see their effects.
By using the same scaling in the controlled interactions (23) for $\Delta t = 1$, we formally obtain in the limit $\varepsilon \to 0$ the Fokker-Planck equation (39), in which $u_0(v, w)$ is the limiting value of the scaled control.
More precisely, by further assuming $\beta = \nu\varepsilon$, in the quadratic cost case we obtain the same Fokker-Planck equation (37), where now $\lambda_0$ is replaced by the coefficient $\lambda_1$ given in (42). Since the variance of the steady state is decreasing with respect to $\lambda$, whenever $\lambda_1 > \lambda_0$ the control improves the distribution of wealth towards equality in terms of variance. In contrast, a control based on minimizing the Gini functional ($m = 1$) leads to a different limiting control. In this latter case the limiting equation has a different structure with respect to (37) and we cannot compute the steady state explicitly.

Taxation-redistribution and limit Fokker-Planck equation
The CPT model with taxation and redistribution has been proposed in [16]. There, taxation acts on the interactions (21) to take away a percentage $\delta$ of the traded wealth, see (44). The wealth taken away is then redistributed according to some redistribution policy, given by a redistribution operator $R_\chi^\delta$ of the form (45). Here, $m(t)$ denotes the first moment of $f$, which, in general, makes the operator $R_\chi^\delta$ nonlinear. Hence, in the presence of taxation and redistribution, the weak form of the CPT model takes the form (46). Note that, by construction, the mean wealth in the system is preserved by equation (46). The weight factor multiplying the distribution function inside the square brackets in (45) has been taken to be linear in $v$ for simplicity, and also in order to involve in the mechanism only the most meaningful moments, those of order zero and one. Such a weight function contains only one disposable real parameter $\chi$, a constant that characterizes the type of redistribution, and that determines the slope of the straight line as well as the value of $v$, whether physical or non-physical, at which the weight itself vanishes. For $\chi > 0$ the redistribution acts to reduce inequalities proportionally to the distance from the mean wealth $m(t)$. The other parameter has been determined by the constraint that the redistribution operator preserves the number of agents and actually redistributes the total amount of money that is being collected by taxation. Further details on the redistribution operator can be found in [5,16].
In a very recent paper [15] the quasi-invariant limit of the kinetic equation (46) has been considered under the same scaling (36), by further assuming that $\delta = \kappa\varepsilon$. The resulting Fokker-Planck equation is now (47), with drift coefficient $\lambda_2$ given in (48). Note that $\lambda_2 > \lambda_0$ whenever $\chi > -1$. In this case, the effect of the taxation and redistribution is to improve the distribution of wealth towards equality.
In the case of a quadratic cost functional, apart from the different meaning of the parameters appearing in (42) and (48), both control and taxation with redistribution have the same effect on the quasi-invariant limit of the CPT model, namely to increase the value of the coefficient of the drift operator in the resulting Fokker-Planck equation, thus giving a stationary distribution with smaller variance with respect to the original one. Interestingly enough, at least at the level of the Fokker-Planck equation, the effect of the taxation and redistribution (the constant $\lambda_2$) can be obtained by an instantaneous optimal control of the binary interaction, simply by choosing the penalization so that $\lambda_1 = \lambda_2$. This gives the identity (49). From this point of view, the conjecture by Piketty [1] is verified at the level of this simple kinetic model.

A numerical comparison
In this section we first compare the effects of the different control mechanisms induced by different cost functionals in our feedback controlled kinetic models and then compare the behavior of the kinetic model with local control originated by a quadratic cost functional with the kinetic model based on a global redistribution mechanism. All models have been solved using a direct Monte Carlo simulation method (see [5] for more details). The noise term has been taken as $\eta_i = \pm\mu$, where each sign comes with probability 1/2. Therefore, we have $\langle \eta_i^2 \rangle = \mu^2$ in (36). The number of simulated sample agents has been fixed to $N = 5 \times 10^4$ and standard averaging procedures have been used after the steady state has been reached to reduce the statistical fluctuations.
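A direct Monte Carlo simulation of binary trades can be sketched as follows. The sketch assumes trades of the generic form $v^* = (p + \eta_1) v + q w$, $w^* = q v + (p + \eta_2) w$ with $\eta_i = \pm\mu$, random pairing of the agents in each sweep, and purely illustrative parameter values; it is not claimed to be the scheme described in [5], and the function name `simulate_trades` is hypothetical.

```python
import random


def simulate_trades(n_agents=1000, n_sweeps=100, p=0.95, q=0.05, mu=0.25, seed=0):
    """Monte Carlo sketch: each sweep pairs agents at random, one trade per pair."""
    if p - mu < 0.0 or q < 0.0:
        raise ValueError("coefficients must keep post-trade wealth non-negative")
    rng = random.Random(seed)
    wealth = [1.0] * n_agents  # every agent starts with unit wealth
    indices = list(range(n_agents))
    for _ in range(n_sweeps):
        rng.shuffle(indices)
        # consecutive entries of the shuffled list form the trading pairs
        for k in range(0, n_agents - 1, 2):
            i, j = indices[k], indices[k + 1]
            eta_i = mu if rng.random() < 0.5 else -mu
            eta_j = mu if rng.random() < 0.5 else -mu
            v, w = wealth[i], wealth[j]
            wealth[i] = (p + eta_i) * v + q * w
            wealth[j] = q * v + (p + eta_j) * w
    return wealth


sample = simulate_trades()
print(sum(sample) / len(sample))  # mean wealth, conserved only on average when mu > 0
```

With `mu=0` and `p + q = 1` each trade conserves the total wealth of the pair, which gives a simple consistency check; for `mu > 0` the mean fluctuates around its initial value, so averaging over many sweeps after the transient is advisable.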

Test 1. The effects of different feedback controls
First we compare the effects of the different controls induced by the choice of the cost functional in (3). More precisely, we compare the controlled kinetic model (cCPT) defined by (23), where the feedback control is defined by (25) for $m = 2$ and by (28) for $m = 1$. We fix the strength of noise $\mu = 0.25$ and select $\lambda = 0.95$. With these choices we are in the power law asymptotic region of the CPT model (see Fig. 1). The maximum admissible control value for $\beta$ is about 0.47 for $m = 2$. Initially each sample agent has wealth $w = 1$, so that $f_0(w) = \delta(w - 1)$. The results are reported in Figure 4 for $\beta = 0.2$ and $\beta = 0.4$. Both control mechanisms provide a marked reduction of inequalities in the system; in particular, the increase of the Pareto index in the power law tail is proportional to the penalization term $\beta$ and comparable in the two models (see Fig. 4, left). On the other hand, the effects of the different control processes are clearly evident for lower values of the wealth. Increasing $\beta$ for $m = 1$ implies a taxation/redistribution process for larger differences in wealth (according to Fig. 2), which as a result gives fewer opportunities for agents with low wealth values to benefit from the inequality reduction process.
Next, to emphasize the reduction of wealth inequality, in Figure 4 (right) we have also plotted the Lorenz curve. The Gini coefficient (10) can then be thought of as the ratio of the area that lies between the line of perfect equality (the line $y = x$) and the Lorenz curve over the total area under the line of equality. In our test case we have a value of $G_1 \approx 0.46$ in the uncontrolled case. For $\beta = 0.2$ the feedback control with $m = 1$ yields a stronger reduction of the Gini coefficient ($G_1 \approx 0.3$ for $m = 2$ and $G_1 \approx 0.28$ for $m = 1$), whereas for $\beta = 0.4$ the two controls give analogous results ($G_1 \approx 0.25$ for both models).
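The Gini coefficient and the Lorenz curve of a simulated sample can be estimated as in the sketch below. It assumes the common sample definition $G = \frac{1}{2N^2\bar{w}}\sum_{i,j}|w_i - w_j|$, evaluated through the equivalent sorted-sample formula, and a piecewise-linear Lorenz curve; the function names are hypothetical and the normalization may differ from the one used in (10).

```python
def gini(wealth):
    """Sample Gini coefficient via the sorted-sample identity (non-negative data assumed)."""
    w = sorted(wealth)
    n, total = len(w), sum(w)
    if n == 0 or total == 0:
        raise ValueError("need a non-empty sample with positive total wealth")
    # sum_{i,j} |w_i - w_j| = 2 * sum_k (2k - n - 1) * w_(k) for sorted w, k = 1..n
    weighted = sum((2 * k - n - 1) * x for k, x in enumerate(w, start=1))
    return weighted / (n * total)


def lorenz_curve(wealth):
    """Return points (population share, cumulative wealth share), starting at (0, 0)."""
    w = sorted(wealth)
    n, total = len(w), sum(w)
    points, running = [(0.0, 0.0)], 0.0
    for k, x in enumerate(w, start=1):
        running += x
        points.append((k / n, running / total))
    return points


print(gini([1.0, 1.0, 1.0, 1.0]))  # perfectly equal sample
print(gini([0.0, 0.0, 0.0, 1.0]))  # one agent owns everything
```

Sorting first reduces the double sum over all pairs to a single pass, which matters for samples of the size used in the simulations.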

Test 2. Local control and global redistribution
Next we compare the kinetic model obtained by minimization of a quadratic cost functional defined by (27) with the corresponding model based on a global taxation/redistribution process in (44)-(45). The simulation is performed in the quasi-invariant scaling defined by (36) together with the further scaling $\beta = \nu\varepsilon$ and $\delta = \kappa\varepsilon$. In all test cases, the parameters in the two models are related by assumption (49), so that in the limit $\varepsilon \to 0$ their solutions should coincide. We report the results obtained with the different models for various values of the scaling parameter $\varepsilon$. In this way, for $\varepsilon = O(1)$ we can emphasize the different behavior of the local control when compared to a global redistribution policy, whereas for $\varepsilon \ll 1$ we can verify the asymptotic procedure that leads to the same Fokker-Planck equation.

The ε = O(1) regime
In the first test case we compare the controlled (cCPT) model for m = 2 and the redistributed (rCPT) model in the absence of scaling, or equivalently taking ε = 1. We set µ = 0.25 and λ = 0.95 as in Test 1 and use the same initial data. For the redistributed model we fix δ = λβ/2, so that the taxation process of the two models is the same in each binary interaction, and choose χ = 0, so that the redistribution process is independent of the wealth. As expected, with these choices the two models show rather similar behavior. The corresponding stationary solutions are reported in Figure 5 (left) for β = 0, 0.15 and 0.3. The different slopes of the tails clearly show that both models are capable of reducing the inequalities in the wealth distribution. Note that the behavior of the two models differs for small values of the wealth, since in the redistributed model the density of agents with wealth below δ is exactly zero. In Figure 5 (right) we report the corresponding Lorenz curves.

The limit ε → 0
Finally, we consider the scaling process that leads from the kinetic Boltzmann models to their corresponding Fokker-Planck descriptions (39) and (47). We consider the same data as before, but with a fixed value β = 0.15 and two values of the scaling parameter, ε = 0.1 and ε = 0.01. For this choice of parameters, in the limit ε → 0 we obtain a Pareto index r = 1.8 in the uncontrolled case and r = 4.2 in the controlled case with β = 0.15.
The results are reported in Figure 6. For large values of the wealth the two models give very similar results and show the same power law behavior as the limit Fokker-Planck model; to highlight the differences, we therefore focus on a region of the density function close to the left boundary w = 0. The convergence of both models towards the analytic steady state of the Fokker-Planck model is evident.
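When the stationary state is only available as a set of sample agents, the Pareto index can also be estimated directly from the largest wealth values. The following is a minimal sketch using the Hill estimator, a standard tail-index estimator that is swapped in here for illustration and is not the procedure used in the paper. It assumes the convention that the fraction of agents with wealth above w decays like w^(−r); the function names, the synthetic Pareto data and the choice of k are hypothetical.

```python
import math
import random


def hill_estimator(samples, k):
    """Hill estimate of the tail index r, assuming P(W > w) ~ w**(-r).

    Uses the k largest samples, measured relative to the (k+1)-th largest.
    """
    if not 0 < k < len(samples):
        raise ValueError("k must satisfy 0 < k < len(samples)")
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest value
    log_excess = sum(math.log(x / threshold) for x in ordered[:k])
    return k / log_excess


def pareto_samples(r, n, seed=0):
    """Synthetic test data: exact Pareto samples with P(W > w) = w**(-r), w >= 1."""
    rng = random.Random(seed)
    # Inverse transform sampling; 1 - random() lies in (0, 1], so no division by zero.
    return [(1.0 - rng.random()) ** (-1.0 / r) for _ in range(n)]


# Synthetic check with the uncontrolled index quoted in the text (r = 1.8).
samples = pareto_samples(r=1.8, n=100_000)
r_hat = hill_estimator(samples, k=10_000)
print(f"estimated Pareto index: {r_hat:.2f}")
```

For samples drawn from an actual kinetic simulation, the estimate depends on k: too small a value gives a noisy result, while too large a value includes agents outside the power law region, so in practice one would inspect the estimate over a range of k.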

Conclusions
In this paper, we have introduced a possible alternative to the standard taxation and redistribution rules, which relies on a suitable control applied to the microscopic trades describing the wealth distribution of a multi-agent system. The constrained system is then approximated by a finite time horizon strategy, which allows one to embed the control explicitly in the interaction rules. We emphasize that the resulting form of the control is closely related to the choice of the cost functional: different cost functionals give rise to different taxation/redistribution strategies. We analyze in detail the case of a cost functional that aims at minimizing the variance of the wealth distribution and the case of a cost functional that minimizes the well-known Gini coefficient. The corresponding kinetic models based on binary interactions can then be derived, and they show that the control is able to modify the corresponding Pareto tails. This is further analyzed, with the aid of numerical simulations, by considering the corresponding quasi-invariant Fokker-Planck limit and its relationship with previous models based on global taxation and redistribution.

Author contribution statement
Each author has contributed equally to the theoretical findings of this paper. The numerical experiments were carried out by B. Düring and L. Pareschi.
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.