Taxation and evasion: a dynamic model

In this paper we study tax evasion by means of a unified framework, based on a behavioral approach, where each individual decision with respect to tax compliance is driven by either personal evaluations of the available information, correlated to income and the perception of the quality of the public good, and social influences, derived by the known decision of neighbors. Our model relies on individual utility functions and describes the tax-evasion problem by means of a personal evolutionary scheme, in which each citizen dynamically adapts her behavior as a response to changing economic and social factors. We will show basic economic intuitions on the relevance of penalties, imitation, satisfaction and risk aversion by means of an analytical model and its agent-based companion version, in order to analyse different elements influencing tax evasion and their dynamic effects. In particular, it is shown how the tax-evasion probability changes as a function of the risk-aversion and specific focus is dedicated to the role played by non-monetary elements of utility in inducing high levels of tax compliance also at substantially reduced fines.


Introduction
The tax evasion and, more broadly, the black economy results as collective problems deriving from the aggregation of selfish behaviors assumed by free riders, i.e., citizens consuming public goods without contributing to their costs. Both result in diminishing tax revenues, thus leading to perverse systemic effects, such as severe limitations to the possibilities of governmental expenditure decisions, reduction in the quality and the quantity of supplied services, inefficient and negative redistributive consequences, decrease in social welfare. Being the free rider problem a consequence of the degree of ethical awareness of persons, reflected collectively in the degree of civic maturity of populations, it is perhaps useful to address personal and social elements influencing the decision to evade the tax burden. Indeed, the related literature has shown that, for example, people are more likely to evade if they perceive that others are evading, especially in presence of not too high probabilities of being caught (see, for example, Alm et al. 2015, Dubin (2007, McGee (2012), and Torgler (2007)). Notable differences can be observed also depending on countries, as in Pickhardt and Prinz (2012), Riahi-Belkaoui (2004), and Torgler and Schaltegger (2005), while Kirchler (2007) and Shu et al. (2012) offer a focus on a series of factors affecting tax compliance, such as perceptions about tax fairness, attitudes and trust toward government, social and personal norms. Consequences of evasion are also addressed in other contributions, as in Andreoni et al. (1998), Slemrod and Yitzhaki (2002), Torgler (2002), Kirchler (2007), and Slemrod (2007), among others.
Models of tax evasion have been developed since the Seventies, by considering optimizing individuals -fully informed on audit rates and penalties -facing different audit strategies. Such contributions discuss also the impact on evasion of different tax rates, income distribution and the choice of the basis for taxes computation. Examples are, among others, Allingham and Sandmo (1972), Yitzhaki (1987), Clotfelter (1983), Crane and Nourzad (1987), Poterba (1987), Panades (2004), Dalamagas (2011), Alstadsaeter et al. (2017), Modanese (2014, 2016). A relatively recent approach to the problem has been advanced by adopting agent-based models, as shown by Bloomquist (2006), Alm (2012), Hokamp (2013), Pickhardt and Prinz (2014), and Bazart et al. (2016). The specific advantage of this class of models is that they study the rise of bottom-up decentralized decisions, which is particularly helpful to analyze collective problems, specially with reference to complex aggregate contexts. Simulative models can be, then, particularly helpful in studying behavioral attributes, as in previous studies of socio-economic and financial markets analysis, as in Pluchino et al. (2010Pluchino et al. ( , 2018, Biondo et al. (2013), Squazzoni et al. (2014) and Biondo (2019). Dealing with tax evasion, agent-based models can replicate the importance of social norms and auditing, as in Hokamp and Pickhardt (2010), the effect of social networks on the tax compliance, as in Vale (2015), or even the impact of network configurations on the compliance, as in Andrei et al. (2014), and in Gamannossi degl'Innocenti and Rablen 2020. Korobow, Johnson and Axtell 2007 consider geographical spillovers and contagion effects in a model with networked agents, who are aware of their neighbors' actions and perceive the reputational pressure of conformity. Related literature debates on the hypothesis that social information plays a role in influencing the compliance behavior as, among others, in Alm et al. (2017) and Alm andYunus (2009), andFortin et al. (2007). Levenko and Staehr (2021) found that social norms and awareness about the fiscal system are key factors in inducing tax compliance.
From a policy perspective, there are a number of reasons to fight tax evasion. Nonetheless, politicians cannot assume drastic positions about the topic, specially in contexts where the problem is highly diffuse. The (https:// www. irs. gov/ PUP/ newsr oom/) reports that the expenditures paid by governmental authorities to induce virtuous behaviors are significant. However, in many cases, free riders remain unpunished while providing the bad example of being smart enough to exploit common resources without paying their part. One of the main findings of Biondo et al. (2020) confirms that it is possible to find a suitable configuration of audits and fines that can eradicate evasion. Nonetheless, the prevailing economic intuition since the Nineties is that, although taxpayer reporting increases with greater audit and penalty rates, these responses are not large (Alm et al. 1992). The prevailing literature suggests that authorities should prefer a client-service approach, instead of a cops-robbers enforced compliance (Kirchler et al. 2008;Prinz et al. 2014). The known conflict between individual and collective rationality (Rapoport 1974) creates the paradoxical outcome of the prisoner's dilemma. A vast amount of literature, regarding the production of public goods, as in Heckathorn (1996), the emergence of social norms and social interaction, as in Hardin (1995), and Voss (2001), among others, pose the question about the individual decision to cooperate. Cooperation could be the optimal choice because of different reasons. First of all, simple altruism, as recalled, for just some examples, in Stevens (2018) and Epstein (1993); secondly, imitation, as in Callen and Shapero (1974), in Elsenbroich and Gilbert (2014) and McDonald and Crandall (2015); alternatively, needed quality of the public good (i.e., quality of Institutions) or perceived amount of public expenditure, as in Nicolaides (2014), Feld and Frey (2002), and Torgler and Schneider (2009), and Pellizzari and Rizzi (2014), among others.
The main aim of this paper is to study tax evasion by means of a unified framework, based on a behavioral approach, where each individual decision with respect to tax compliance is driven by either personal evaluations of the available information, correlated to income and the perception of the quality of the public good, and social influences, derived by the known decision of neighbors. In particular, defined the individual utility function, we aim at providing a description of the tax-evasion problem by means of an individual-based evolutionary scheme, in which each citizen dynamically adapts her behavior as a response to changing economic and social factors. We will show basic economic intuitions on the relevance of penalties, imitation, satisfaction and risk aversion by means of an analytical model and its agentbased companion version, in order to analyse different elements influencing tax evasion. In particular, we define a generalized utility function aimed at providing a description of the tax-evasion problem in terms of three components: income, social influence and perceived quality of the public good. We will study dynamical effects produced by all utility components and their respective parameters.
We will show how the tax-evasion probability changes as a function of the riskaversion, thus analytically proving and numerically verifying that the level of riskaversion determines the value of the penalty factor above which evasion ceases to be an optimal strategy. On the other hand, we will underline the role played by nonmonetary terms elements of utility in inducing high levels of tax compliance also at substantially reduced fines. In particular, under the action of both social influence and quality assessment, we find that a taxation scheme shifting the tax burden towards the fewer, highest income agents --which provide the largest fraction of the tax revenue--is generally more effective in fostering tax compliance.
The rest of the paper is structured as follows. In Section 2 the model is presented. In Section 3 we present the case of the homogeneous, mean-field population, in which all agents are equivalent and all connected to each other. The model will thus be thoroughly explored for a comprehensive understanding of the dynamics of the system and its stability landscape. In Section 4, we move to consider the case of the heterogeneous population, opening to a more policy-oriented analysis. In Section 5 we make a further step by considering the case of heterogeneous populations structured in social networks and analyse the significance of breaking the mean-field condition. Section 6 presents final concluding remarks. Some analytical details are reported in the I.

The model
Consider a set of N agents, Γ = {a1,a2,...,aN}, representing a population of citizens placed on the vertices of a (temporal) network G (Γ,L) , where L is the set of links representing social engagement among them. At each time step t, each agent a i gets a periodic gross income c i,t initially drawn from a given distribution p(c). For simplicity, income is assumed to be constant in time, so that c i,t = c i . The generic agent a i is also responsible to report to the Government the received income. In this respect, she adopts a simple "all-or-nothing" choice: she can either pay the due tax burden or evade it completely. Government is assumed to have always the possibility to control the fiscal compliance of citizens; therefore, each agent a i has a probability π a to be audited, which is invariant in time.
At each time step t, agent a i makes her choice by comparing utility levels deriving from two competing alternatives, by means of her individual utility function, defined as: being σ ∈{pay,evade} the label indicating the strategy for which the utility is computed. Utility is expressed as a function of three components. An endogenous one, Y i,t , which depends on the monetary income as a function of the tax compliance of the agent; and two exogenous: S i,t , depending on the imitation of peers' behavior in order to minimize the psychological cost entailed by acting differently (i.e., contrarily); and Q i,t , related to the individual assessment of the supplied public good (in turn dependent on collected tax revenues.) More precisely: -When the agent a i receives the amount c i , if the chosen strategy is pay, the net income is reduced to X = c i − T(c i ), being T(c i ) < c i the due tax burden. Instead, should the agent choose the strategy evade, by accepting the risk to be audited, the final income received by her would be either X = c i − PT(c i ) or X = c i , according to the fact that she has been fined after being audited, or not, being P the penalty factor. Thus, similarly to Allingham and Sandmo (1972), the first component of utility is computed as a continuous function y i (x), supposed increasing and concave, i.e., with y � i (x) > 0 and y �� i (x) < 0 , in order to model risk-aversion for perceived gains. Agents ignore the exact value of the audit rate π a , but they infer it on the basis of the available partial information, coming either directly from their own past experience (whether or not they were already audited at the previous time step) or indirectly by shared information by their neighbors. In particular, given the network G (Γ,L) , each agent has a personal neighborhood I i,t ⊂Γ defined as the set containing herself and other agents connected to her in G (Γ,L) . We assume that the personal estimate of the audit probability of agent a i is π i,t = |Ξ i,t− 1 |/|I i,t− 1 |, where |Ξ i,t− 1 |, is the cardinality of the set containing agents audited at the previous time step belonging to her neighborhood. Therefore, in general, π i,t ≠π a . In order to have a common scale for all utility components, we normalize the disposable income with respect to its maximal value, and define x = X/c i , so to have y i (x). Thus, the utility coming from the monetary gain is perceived by the agent in relative terms, consistently with the assumption that she knows in advance her gross income. Accordingly, the monetary contributions to the individual utility entailed by the strategies pay and evade are defined as, respectively, and -Each agent a i estimates the current proportion of tax evaders in the entire population as the number of evaders at time t − 1 among her neighbors.Implicitly, we are assuming that an agent is always able to retrieve such information from her social contacts. This assumption is here useful to show the impact of such a complete awareness; it might be not fully satisfied in reality, although not impossible. Reduced information can be reasonably thought as providing a more moderate impact. Defined as A the adjacency matrix of the network G (Γ,L) , with generic element A ij = 1 if (i,j) ∈ L, i.e., agent a i and agent a j are linked neighbors, and A ij = 0 otherwise, the estimate of the proportion of evaders computed by agent a i is defined as: where e j,t− 1 = 1 if agent a j evaded at time t − 1, and e j,t− 1 = 0 otherwise. Then, the agent a i perceives the social pressure toward, respectively, evasion or compliance, if ρ i,t > 1/2 or ρ i,t < 1/2, while feeling no pressure if ρ i,t = 1/2. We define the term S i,t by the continuous, increasing and odd function s i i,t − 1∕2 , i.e., s � i (x) > 0 and s i (−x) = −s i (x), with s i (0) = 0 . Then, the contribution to the utility is given by either S evade , for strategy pay.
-The quality of the public good supplied by the government at each time step t is modelled as a function of the tax revenue at time t − 1, i.e., , thus it can be considered as dependent on the level of tax compliance of the population, whose maximal tax revenue -obtained in the absence of tax evaders-would be T max = ∑ N i=1 T(c i ) . More specifically, we introduce a quality function G(x), bounded in [0, 1] , increasing and concave so that the production function of Government is implicitly assumed to exhibit decreasing returns, i.e., with G � (x) > 0 and G �� (x) < 0 , with x = T t−1 ∕T max . We also assign to each agent a personal expected quality level q i ∈ [0,1], which remains constant in time: the comparison between q i and G t determines either satisfaction or frustration in the agent a i , thus inducing adaptive behavior. The model considers both cases of positive and negative adaptations. A positive feedback would consist in the fact that a citizen satisfied by the quality of the public good (i.e., q i < G t ) maintains a high tax compliance as to award the Government for the appreciated provision, whereas an unsatisfied citizen (i.e., q i > G t ) would choose a revenge strategy by evading if the quality of the public good is low. Instead, a negative feedback would consist in the fact that a citizen satisfied by the quality of the public good may decide to evade in order to maximize the free rider rent, whereas an unsatisfied citizen decides to pay because she understands the relevance of collecting more taxes. Thus, the term Q i,t related to the quality of the public good is modelled by i g i q i − G t , with γ i = 1 for the positive-feedback case and γ i = − 1 for the negative-feedback, being g i (x) a continuous, increasing and odd function. Thus, the contribution to the utility is computed as either Q evade , for strategy pay.
Agents' choice is simplified at the highest grade: each agent will be prone to adopt, at each time t, the strategy entailing the highest overall utility level. Eq. (1) is actually computed, for each agent a i , as a weighted average of the three abovepresented utility components, i the vector of a i 's individual weights. Accordingly to Eq. (5), a i is more likely to select the evade (resp. pay) strategy if U evade More precisely, the more positive (resp. negative) is the utility difference ΔU i,t ≡ U evade i,t − U pay i,t , the more (resp. the less) probable the agent is to Taxation and evasion: a dynamic model adopt strategy evade (resp. pay).
the final form of the utility difference results to be Conveniently, the functions y(x), s(x) and g(x) will be chosen to take values in the interval [− 1,1]. As a consequence, also the three terms (excluding the weights) in both, Eqs. (5) and (6), will lie in that interval, so that the a priori contribution of each of them is entirely committed to the magnitude of the respective weight. The reader is referred to Section 3 for a detailed discussion about the dynamical roles of each term.
We adopt a bounded rationality approach (Simon 1990), considering that agents select what they believe is the best strategy, given the elements they observe around them, at each time step. Indeed, the choice of agent a i at time t is determined on the basis of the information collected (directly and indirectly) by her. Then, a possible perspective of imperfect rationality, meant in the gametheoretical sense of a partial lack of consistency with respect to the best response strategy, is here modeled by assuming that the strategy entailing the highest utility is associated to a probability smaller than 1, whereas it attains exactly 1 in the asymptotic and hypothetical scenario that we could label as "perfect consistency" (or "perfect rationality"). In order to obtain such a behavioral configuration, the probability ξ i,t that agent a i chooses the evade strategy at time t, is computed as a logistic function, being, instead, i,t −ΔU i,t = 1 − i,t ΔU i,t the probability that she chooses the pay strategy. We, thus, account for the possibility that a i is compliant although ΔU i,t > 0 and, vice versa, that she evades although ΔU i,t < 0. The parameter β i ≥ 0 can be interpreted as a measure of the degree of ex post consistency (rationality) of agent a i : in the limit β i → ∞ , the function ξ i,t (x) converges to a step function ζ i,t (x), such that ζ i,t (x > 0) = 1, ζ i,t (x < 0) = 0 and ζ i,t (x = 0) = 0.5, thus granting that the agent's choice is fully consistent with the utility difference given by Eq. (6). Contrariwise, for any finite β i , 0 < ξ i,t < 1 is obtained. In particular, the case with β i = 0, would describe a completely random choice, i.e., ξ i,t = 0.5. Ideally, β can be calibrated to guarantee a certain level of consistency, for instance, at the maximal utility difference, |ΔU| = 1. Then β is chosen to provide an agent with a given percentage of (maximal) consistency, given by ξ(1) = (1 + e −β ) − 1 (for example, β = 6 would correspond to a maximal consistency of 99.8%). The lack of full consistency, as entailed by any finite value of β, can stem from various elements: people may have limited computational ability; they may generally possess only partial and noisy information about how fines and audits are implemented; they may feel persuaded by the behavior and the income of their peers, or, lastly, have a biased perception of the quality of public goods. Further, their behavior could be driven by endogenous factors as personal beliefs or cultural traits, leading, for instance, to an unconditioned compliance.
It is worth to notice that, since the utility difference ΔU i,t is linear in the weights, one may prefer, alternatively, to refer to ex post weights defined as β (X) i = βk (X) i , X ∈{Y,S,Q}, by absorbing β i in their definition. Thus, the higher is β (X) i , the more consistent is agent a i with respect to the (partial) best response strategy dictated by the utility component X. Being just a matter of definition, here we choose to follow with the first convention regarding k (X) i , X ∈{Y,S,Q}, as a weight and β i as an overall consistency parameter. All in all, individual behavior evolves as schematised in Fig. 1.
The numerical implementation of the model, addressed in the next sections, requires the selection of specific functional forms for the functions y(x), s(x) and g(x), defining the components of the utility function in Eq. (5). Following Kirkwood (2004), we choose being sgn(x) = x/|x| the sign function, and Eqs. (8)-(10) define generalised logistic functions whose growth rate λ i ≥ 0 regulates how quickly they converge to the boundaries {− 1,1}. It holds λ i = −(d 2 u i /dx 2 )/(du i /dx), ∀u i ∈{y i ,s i ,g i }. Therefore, for the monetary utility component y i , (Y) i represents the constant absolute risk aversion (CARA) of agent a i . Accordingly, her aversion to risk grows with (Y) i , while she becomes risk neutral in the (Y) i govern how the susceptibility of the respective utility components changes under, respectively, shifts in the fraction of neighbors adopting a certain strategy, and shifts in the quality of the public good. Lastly, the quality function G(x) is conveniently modelled by a logistic function similar to those in Eqs. (8)-(10), Strategy adoption scheme for a generic agent a i at the time step t of a Monte Carlo simulation. Here, r is a random variable, re-drawn uniformly in (0, 1) at each step, and for each of the agents 1 3 Taxation and evasion: a dynamic model with x = T t−1 ∕T max ∈ [0, 1] , as explained above, and η ≥ 0 is a constant parameter regulating how fast the returns of the production function of the Government decrease as collected tax revenues increase.

Homogeneous mean-field populations
We start by considering a homogeneous society where all agents are equivalent among them and all mutually connected (i.e., they possess complete information on any other agent). Specifically, all the agents share the same values of the individual parameters, i.e., Since all agents get the same gross income c, we define the tax simply as T(c) = c, being ∈ (0,1) the chosen tax rate. Moreover, since agents are assumed to interact, at each time step, with any other agent in the population, they agree on all estimates: Therefore, each agent interacts with an effective mean-field ρ t , which is also the probability of being a taxevader at time t for any agent in the population. As a consequence, ΔU i,t = ΔU t and i,t = t = 1∕(1 + e −βΔU t ) , ∀i. The system is, therefore, described by a single dynamic equation for ρ t : Equation (12) defines our analytical model. It will be studied in order to understand basic mechanisms driving dynamical features of the here proposed agent-based model. In the following, we first analyze the role of each utility components in influencing the stability of the system, and then we perform a linear stability analysis to know how attractive points of Eq. (12) are affected by changes in model parameters. Finally, we show potentially interesting results for some cases.

Stability analysis
Since individuals' choice is (fully) consistent with the utility difference, Eq. (6), -and therefore with its sign-we can grasp an essential understanding of the model by looking for the conditions at which that difference vanishes and the agent is indifferent with respect to the two strategies. To this end, let us first consider the special cases in which only one of the three components is present: -Monetary term, k = (1,0,0). This is the only endogenous term, giving a non dynamic contribution to an agent's utility, and also the only one depending on controllable parameters, that is, taxation and penalty policies. Given the audit probability π a and a tax rate = T(c)/c, there exists a value P of the penalty at which ΔU = ΔY/2 = 0. Although a general closed expression for P is found with some algebra from ΔU = 0, it has a complicated dependence on the other parameters. Instead, it is more instructive to look at the limits → 0 , at which y(x) ≃ x , and → ∞ , at which y(x) approaches a step function such that y(x < 0) = − 1, y(0) = 0 and y(x > 0) = 1. In the first case, ΔU = (1 − π a P) /2, hence P = a −1 , independently of the tax rate. On the other hand, ΔU is linearly proportional to the latter, therefore, for P <P , the larger the tax rate, the higher is the probability that an agent evades (vice versa for P >P ). In the second limit, ΔU approaches a step function such that ΔU = 0 for P < − 1 , ΔU = −π a /2 for P = − 1 and ΔU = −π a for P > − 1 , thus the higher the tax rate, the lower the threshold penalty to exceed in order to sharply increase the compliance probability of an agent. In this case, while being always non positive, ΔU presents a step whose magnitude is exactly equal to the audit probability. All in all, we can say that, to some extent, tax rate and audit probability progressively exchange their role when shifting from one limit to the other. Lastly, since ΔY (hence ΔU) does not depend on ρ, it is insensitive to initial conditions, meaning the system has only one attractive point, given by ρ ⋆ = (1 + e −βΔY/2 ) − 1 .
-Social term, k = (0, 1, 0). This is a purely dynamic term, depending exclusively on the fraction ρ of evaders in the population. The sign of ΔU = s( − 1∕2) is solely determined by how ρ compares with 1/2. Then, since ΔU is positive (negative) if and only if ρ > 1/2 (ρ < 1/2), if the inequality ρ > 1/2 holds at a given time, it will hold at all future times. Consequently, two attractive fixed points exist, one below and one above the repelling point ρ = 1/2. The initial condition ρ 0 will decide which of the two is reached by the system. The social term can, thus, be seen as always acting as a positive feedback loop. -Quality term, k = (0,0,1). This is a dynamic term depending on the fraction ρ of evaders through the normalized tax revenue T∕T max = 1 − , argument of the quality function G ≡ G(1 − ρ). Therefore, ΔU = g(q − G) = 0 for ρ such that q = G, easily found to be ̃≡̃(q, ) = 1 + ln 1 − q(1 − e − ) ∕ . From the fact that G decreases with ρ and increases with η, it can be shown that ̃ is an increasing function of η, while it decreases with the expected quality q.
Although ̃ plays here the role that 1/2 plays for the social term, the fact that, in general, it is ̃≠ 1∕2 , makes the analysis more involved. Let us first consider the positive feedback case (γ = 1). As for the social term, since ΔU is an increasing function of ρ, and can have more than one zero and, correspondingly, the system more then one fixed point. Suppose, for instance, that ̃> 1∕2 . Let us define ρ + ≡ ρ + (q,η,β) as the lowest ρ such that, if ρ ≥ ρ + at some time, it will be so for all future times. Since ΔU = 0 at ̃ , at the next time step = 1∕2 <̃ , hence necessary + >̃ . From the given definition follows that, if currently ∈ ̃, + , then ρ will eventually decrease below ̃ , and therefore converge to a fixed point below 1/2. Thus ρ + represents a repelling point, separating the basins of attraction of the two attractive points, one above it and one below 1/2. For fixed q and η, it depends on β through Eq. (12). Specifically, in the β → ∞ limit, + →̃ , for the new ρ will tend to 1 or 0 depending on whether the current ρ is greater or smaller than ̃ . For any finite value of β, instead, the fact that >̃ does not guarantee it will be so at future times: a stronger condition for ρ is needed, namely, for it to stay above ρ + . The smaller β, the smaller |ρ| for any fixed ΔU, and thus the higher ρ + . Eventually, if β is low enough, ρ + > 1, meaning the system has only one fixed point, i.e., that one below 1/2. The case with ̃< 1∕2 symmetrically mirrors this, as the analysis repeats identically in relation to − <̃ , the repelling point that, if positive, traps ρ below it (hence below ̃ ), while the other attractive point now lies above 1/2. At last, if ̃= 1∕2 , then simply + = − =̃ , with the two attractive points found one below and one above 1/2. Let us now consider agents with negative feedback (γ = − 1). In this case ΔU = −g(q − G) and ξ(ρ) is a decreasing function of ρ, implying either an oscillatory transient and that ρ − ξ(ρ), increasing with ρ, can only have one zero. The latter corresponds to an attractive or repulsive point depending on the chosen set of parameters. If attractive, the oscillatory transient dies away. If repulsive, the transient stabilizes into an attracting sequence of period 2 whose periodic points correspond to the fixed points of the second iterate ξ (2) (ρ) ≡ ξ(ξ(ρ)) (which, consistently, is an increasing function of ρ  1∕2] ), or, one below (above) 1/2 and one above (below) ̃ . On which side of the period-doubling bifurcation the system is, mainly depends on the value of β. In the β → 0 limit, ξ(ρ) becomes linear and can only vary in the small interval 1/2 ±β /4, thus |dξ(ρ)/dρ| < 1 everywhere and the unique fixed point is attractive. By increasing β, |dξ(ρ)/dρ| grows more the closer ρ is to ̃ , at which |dξ(ρ)/dρ| is approximately linear with respect to β. The bifurcation is then marked by the smallest β for which |dξ(ρ)/dρ| = 1 at the fixed point. Accordingly, in the β → ∞ limit, it is immediately verified that ρ alternates between 0 and 1.
All in all, the joint contribution of the three terms yields various stability landscapes. Depending on the set of parameters, it is yielded: a single, attractive, fixed point; a bistability; a tristability, under the combined action of the social and the quality terms (for positive feedback only), given ̃ is far enough from 1/2; a sequence of period 2 (for negative feedback only).

Stability of equilibrium states
We are now ready to look for the stability conditions of the equilibrium states of Eq. (12). For positive feedback, these are always fixed points (of ξ(ρ)). For negative feedback, they are either fixed points (of ξ(ρ)) or periodic points of period 2 (i.e., fixed points of the second iterate ξ (2) (ρ)). Even in the latter case, since the perioddoubling bifurcation occurs when the fixed point becomes repelling, knowing whether the condition for the stability of the fixed point is satisfied or not is enough 1 3 to establish on which side of the bifurcation the system lies. Therefore, we proceed by analysing the stability of the fixed points of ξ(ρ). Given the values of all parameters, these can be calculated from Eq. (12) once the utility components, i.e., the functions y(x), s(x) and g(x), have been chosen, for instance, as specified in Eqs. (8)-(10). Nonetheless, we can preliminary find the general condition that must hold in order for a fixed point ρ = ρ ⋆ to be attractive, for any opportune choice of these functions. Thus, suppose to displace the system from ρ ⋆ by an infinitesimal amount , so that it is now in the state ρ t = ρ ⋆ + . Expanding Eq. (12) up to the first order around = 0, one finds that t+1 = ⋆ + (ΔU)∕ | =0 = ⋆ + (ΔU)∕ | = ⋆ , being ≡ ∕ (ΔU)| = ⋆ = β e −β|ΔU| 2 = ⋆ a non-negative factor. It follows that ρ ⋆ is attractive if and only if ( )∕ | = ⋆ = | (ΔU)∕ | = ⋆ < 1 , being Equation (13) does not get any contribution from the monetary term, thus weighting it enough (i.e., lowering the other terms) entails the presence of only one, attractive, fixed point for the system. Still without making explicit the form of the utility components, we can study how an equilibrium ρ ⋆ depends on various parameters. Let a be the varying parameter (excluding β at this stage). Differentiating Eq. (12) with respect to a, at ρ = ρ ⋆ , and solving for ∂ρ ⋆ /∂a, yields The stability condition, whenever fulfilled, ensures the denominator to be positive. Consequently, the sign of ∂ρ ⋆ /∂a simply coincides with the sign of (ΔU)∕ a| = ⋆ . As shown in the I, the derivatives with respect to the penalty factor P, to the audit probability π a , and to the quality parameter q, are solely determined by the first order requirement on the utility components. In particular, for any set of values of all parameters, we find ∂ρ ⋆ /∂P < 0, ∂ρ ⋆ /∂π a < 0, and ∂ρ ⋆ /∂q > 0 for γ = 1 and ∂ρ ⋆ /∂q < 0 for γ = − 1.
Concerning the tax rate , the sign of ∂ρ ⋆ /∂ depends on the value of itself, given the utility component y(x) and the values of P and π a . Specifically, for any risk-averse agent, the range of values of P and π a for which ∂ρ ⋆ /∂ < 0 shrinks (or even disappears) as increases (see the I). Therefore, we can say that the higher , the smaller the region of parameters (P, π a ) for which an increase of the tax rate lowers ρ ⋆ . In particular, given the relatively low values taken by the audit probability and the penalty rate, ∂ρ ⋆ /∂ > 0 is typically verified in real scenarios, even supposing risk-neutral agents.

Results
Some results obtained by solving Eq. (12) are here presented and discussed. Given that the monetary parameters are the only ones -at short time scales, at least-to be controllable from a realistic policy perspective, we study how the equilibria ρ = ρ ⋆ are affected by them. In the following, we fix the tax rate at = 0.345 as in the flat taxation later described in Section 4.2, so that we can make a proper comparison between these results and those obtained for heterogeneous populations. Also, the audit probability is fixed to π a = 0.1, which is estimated to be an upper bound for many countries (Bernasconi 1998), whereas lower values of π a do not yield any qualitative change. Therefore, we study ρ ⋆ as a function of the penalty factor, P, and of the risk-aversion, λ (Y) , which is obviously essential to decide the impact of a penalty, albeit not controllable. This same setting is reproduced for different values of the consistency parameter, β. We then take q = 0.5 and η = 1, implying ̃(q, ) ≈ 0.62 ,and λ (S) = λ (Q) = 2, taken equal to better appreciate the effect of the sign of the feedback. Variations of these choices are discussed further on. At last, for the reasons explained at the end of Section 2, we focus on populations where the monetary part is dominant. Specifically, the weight vector is taken as either k = (4∕6, 1∕6, 1∕6) (from now on, denoted as configuration 'Y + '), in which case the monetary component weighs twice the sum of the other two, or k = (1, 0, 0) (denoted as 'Y 1 '), providing a reference for the classical, limit, scenario of exclusively monetary utility maximization, in line with Allingham and Sandmo (1972).
Results are depicted in Figs. 2 and 3, and support the analysis in Section 3.1. As we already know, by varying a single variable (i.e., P or λ (Y ) ), there is always only one fixed point in Y 1 . In Y + , instead, the system has one fixed point for negative Fig. 2 Stable fixed point ρ = ρ ⋆ (color-coded) of the homogeneous mean-field system described by Eq. (12), as a function of the risk-aversion, λ (Y) , and the penalty factor, P, for a population in the Y 1 configuration, i.e., only accounting for the monetary component. From left to right, the consistency parameter, β, is equal to 6, 10 and 20, respectively. Moreover, = 0.345, π a = 0.1 and ρ 0 = 0.3. The horizontal, grey line denotes P = − 1 , set as a threshold for values of risk-aversion high enough. For comparison with the curves obtained in the Y + configuration and those shown after for heterogeneous populations, we highlighted the respective results obtained here for λ (Y) ∈{0.5,5.0} by vertical, dashed white lines feedback (γ = − 1) almost everywhere and two fixed points for positive feedback (γ = 1). Altogether, the surface ρ ⋆ ≡ ρ ⋆ (P,λ (Y ) ) can change multiple times by varying the initial condition ρ 0 , while it can do it only once at a given point whenever a bistability exists. Specifically, for γ = 1, we show the results for ρ 0 = 0.3 and 0.6, while for γ = − 1 we show only the case with ρ 0 = 0.3, as in this case the system is almost insensible to initial conditions. The comparison between Figs. 2 and 3 displays the strong impact that even a small contribution coming from the non-monetary utility components can have on the proportion of evaders in the population.
The configuration Y 1 stands as a reference to evaluate the effect of non-monetary contributions to the utility. Figure 2 confirms what we learnt from the stability analysis in Section 3.1. For agent with small risk-aversion, increased fines have an impact as the penalty rate approaches P = −1 a (here equal to 10). On the other hand, for agents with high risk-aversion, compliance (i.e., 1 − ρ ⋆ ) is largely insensible to P below the threshold P ≈ − 1 (becoming an exact equality in the limit (Y) → ∞ ), while raising abruptly at it. No surprise then, in both cases, if the values of the penalty factor adopted in many countries -typically below 2 (Bernasconi 1998)-turn out to be largely ineffective. This is indeed confirmed by the experimental study Fig. 3 Stable fixed point ρ = ρ ⋆ (color-coded) of the homogeneous mean-field system described by Eq. (12), as a function of the risk-aversion, λ (Y) , and the penalty factor, P, for a population in the Y + configuration, i.e., with weight vector k = (4∕6, 1∕6, 1∕6) . For positive feedback (γ = 1) two solutions are shown, one for ρ 0 = 0.3 and one for ρ 0 = 0.6. For negative feedback (γ = − 1) we used ρ 0 = 0.3. Moreover, = 0.345, π a = 0.1, q = 0.5, η = 1 and λ (S) = λ (Q) = 2. The horizontal, grey line denotes P = − 1 , working as a threshold for values of risk-aversion high enough. For comparison with the curves shown after for heterogeneous populations, we highlighted the respective results obtained here for λ (Y) ∈{0.5,5.0} via vertical, dashed white lines 1 3 Taxation and evasion: a dynamic model of Alm et al. (1992) 1 . Looking at the precise values taken by ρ ⋆ , we see what we expect: indeed, since ΔY > 0 for P sufficiently close to 1, ρ ⋆ stays above 1/2 for any finite λ (Y) , while decreasing with the latter. As a consequence, in the limit of large β, ρ ⋆ is sharpened towards 1, except for large enough P and λ (Y) , for which it converges to 0.
For positive feedback (γ = 1), ρ ⋆ is sensibly larger or smaller in Y + than in Y 1 , depending on whether both λ (Y) and P are relatively low or high, respectively. How much low or high they should be, in turn, depends on the value of β, and we can understand this from the previous analysis. Indeed, in Y 1 , since ΔU does not depend on ρ, increasing β has the sole effect of making ρ ⋆ more extreme. On the other hand, whenever k ≠ (1, 0, 0) , changing β has the further effect of shifting the repelling point separating the two basins of attraction. As a consequence, by increasing the latter, we see in Fig. 3 (first two rows) that (i) the threshold curve moves, so that some points end up on the other side of it; and (ii) ρ ⋆ is sharpened at all points. The reason for (i) is to be found in the superposition between the positive feedback coming from the quality term and the always positive feedback coming from the social one. Indeed, all other conditions being equal, an increased β raises the initial effect of the non-monetary terms, which cause the threshold curve to move towards lower or upper values of P and λ (Y) depending on whether ρ 0 is low or high enough, respectively. The comparison with the Y 1 configuration proves that the effect of penalties is overall enhanced by the presence of the non-monetary terms. For β low enough, the abrupt drop of ρ ⋆ occurs at penalty factors smaller than the threshold P = −1 a holding for Y 1 , even in the limit of risk neutrality, whereas for larger λ (Y) compliance is always high (see the next paragraph for further comments on this), with a small, but sharp decrease at P ≈ − 1 . By increasing β, due to the shift of the threshold curve, in the lower solution (ρ 0 = 0.3) ρ ⋆ is always very low, and there is no point in increasing P; in the upper solution (ρ 0 = 0.6) it drops abruptly for high enough risk-aversion, provided P is sufficiently close to − 1 , while it remains very high for less risk-averse agents, unless P lies enough above P = −1 a . We note at this point that for P close or even equal to 1, contrarily to what one may expect by relying on the monetary term only, a very high compliance is found for wide ranges of the parameters, especially for high initial compliance (e.g., 1 − ρ 0 = 0.7). However, this is not surprising from what has been said above, as a sufficiently small ρ 0 makes both the social and the quality term raise the utility of the pay strategy enough to make it the preferred one. This is also true, independently of ρ 0 , whenever β and λ (Y) are, respectively, low and high enough. Indeed, by increasing λ (Y) , ΔY approaches 0 when P is enough close to 1 (precisely, when P < − 1 in the limit (Y) → ∞ ), while the contribution from non-monetary terms, even if positive (i.e., towards evasion) for large ρ 0 , is small, as their weigh is small. Consequently, given that β is not large, ρ first decreases towards 0.5 and from there, once smaller than ( q, , towards a small ρ ⋆ (below 0.5). A convergence to high compliance even when the fine does not imply any surcharge could still seem an unrealistic outcome. Under the assumptions of the model, this might be used as a criterion to constrain the parameters. However, the prediction of the degree of compliance of a population is beyond the scope of the present release of the model.
For negative feedback (γ = − 1), Y + yields similar results to Y 1 . This can be partially understood from the fact that the negative feedback tends to mitigate the final ρ ⋆ by compensating, to some extent, the positive feedback coming from the social term (indeed, such compensation is maximal for ̃(q, ) approaching 1/2, as the two terms always interfere destructively). Clearly, this is possible as long as the social and the quality terms contribute comparably to the utility function (which, given k (S) = k (Q) , motivates the choice λ (S) = λ (Q) ). Still, significant deviations between the two cases are seen while increasing β, as the sharp transition curve for high enough P (above P ≈ − 1 , becoming an exact threshold in the limit (Y) → ∞ ) and λ (Y) , settles much more rapidly in Y + than in Y 1 . In other words, the large β limit is reached earlier in the former.

Heterogeneous mean-field populations
In this section we consider populations where agents have heterogeneous characteristics, while being all connected. Before discussing the results, we adapt the microscopic Markov chain approach (MMCA) introduced by Gómez et al. (2010), which is an analytical model that, while preserving the heterogeneities of the agents and their (non trivial) social networks --studied in Section 5--, has the advantage of being much faster to simulate than the corresponding ABM.

MMCA approach
The model proposed in Section 2 defines a system depending, at each time step, on the previous time step only, thus describable as a Markov process. By means of the MMCA, we can track -analytically-the probability for agent a i to be a tax evader at time t, P i,t , by the following dynamic equation with ξ i,t given in Eq. (7). The proportion of evaders estimated by agent a i is while the tax revenue at time t − 1 is given by Taxation and evasion: a dynamic model Finally, being any agent audited independently with probability π a at each time step, the estimate of the audit probability π i,t , made by agent a i at any time t, coincides with its expected value, i.e., π a itself. While preserving the local heterogeneities of the system, the MCCA assumes the microscopic state variables P i,t i=1,…,N are mutually independent.

Results
In what follows, we present the numerical results obtained by integrating the MMCA according to Eq. (16) and compare them with the MC simulation results of an equivalent agent-based model (ABM). We consider populations of N = 2000 agents, all mutually connected, whose individual parameters, except their income, are drawn from normal distributions. Unless differently specified, we assume the following mean (denoted through a bar, e.g., x ) and standard deviations (indicated as σ, e.g., σ x ) for the various normal distributions: k i = (4∕6, 1∕6, 1∕6) ( k (X) = 0.02 , X ∈{Y,S,Q}), q = 0.5 (σ q = 0.05), ̄( S) =̄( Q) = 2 ( (S) = (Q) = 0.1 ). Moreover, as before, we take π a = 0.1, η = 1 and β i = 6, ∀i = 1, … , N . As a proxy for real distributions, the gross income of agents is drawn from a Pareto distribution with exponent α = 1.16 (Arnold (2014)). Clearly, the tax rate, i = T(c i )/c i , now generally depends on the gross income. Specifically, we consider three different taxation policies (see Fig. 4): a flat tax, yielding a constant tax rate -thus allowing us to make a proper comparison with the results shown in Section 3 for a homogeneous mean-field population; a progressive scheme with three tax brackets; and a continuously progressive one, i.e., with a continuously varying tax rate. In order to compare them directly, we The used gross income distribution implies that the agents in the last quartile -the third bracket-provides about the 99% of the maximal tax revenue, for any of the three taxation policies (precisely, the 98.9% for the flat tax, the 99.3% for the bracket progressive, and the 99.5% for the continuously progressive). This is a first fact to account for when interpreting the results. Moreover, as we know from Section 3.1, when the risk-aversion is low enough, the monetary term --hence the probability of evading--increases or decreases linearly with the tax rate, i , depending on whether the penalty rate lies below or above P = 1∕ a . Taking π a = 0.1 yields P = 10 , which is a higher value than those assumed in this section. Even more so by recalling that π a = 0.1 is likely an upper bound for realistic audit probabilities (Bernasconi 1998). Therefore, for the considered range of penalty rates, a higher gross income always entails a lower compliance when the risk-aversion is small. In the opposite scenario of high risk-aversion, as shown in Section 3.1, there is a critical value of the penalty at which the monetary term sharply decreases -in turn raising the probability of tax compliance-, defined by P i = 1, which generally depends on the income of agent a i . In any progressive taxation, this implies that lower values of the critical penalty are associated to higher gross incomes. Figure 5 illustrates the obtained results. We used ρ 0 = 0.3, which seems to be a realistic value for real-world levels of compliance 2 . Nonetheless, the results persist unchanged whenever ρ 0 stays below approximately 0.5, which is already an extraordinary level of evasion. Each of MMCA curves, resulting from the integration of the system of Eqs. (16), is constructed by connecting 100 points, each showing the average (solid line) and the standard error (ribbon) computed over 50 runs. Being the MMCA a deterministic approach, the variance in results solely derives from redrawing the individual parameters at each run, and not from statistical fluctuations. The MC points represent values averaged over 10 runs, whose standard errors are reported as error bars. MMCA and MC always appear to match nicely.
To begin with, we note that, under the flat tax, curves for the final fraction of evaders -and consequently for the tax revenue-quantitatively superpose to those found for the respective homogeneous population (see Fig. 3). This shows that, whenever the heterogeneity in the parameters is not large, i.e., whenever it makes sense to consider a population as reflecting a certain type of representative agent, the homogeneous mean-field model is an excellent approximation. On the other hand, while it cannot represent scenarios with variable tax rates, it nonetheless provides the (quantitative) insights to correctly understand them. In such scenarios, we observe phenomena exclusively deriving from the (high) heterogeneity among the gross incomes (therefore tax rates) of the agents.
Looking first at the case of low risk-aversion, i.e., ̄( Y) = 0.5 , we note two differences between the progressive policies and the flat tax. For positive feedback (γ = 1), the sharp drop in the fraction of evaders occurs at notably lower values of the penalty factor; precisely, at P ≈ 3 and P ≈ 2 for bracket and continuously progressive schemes, respectively, against a P ≈ 4 found for the flat tax. Although the precise value of such critical penalty factors depends on the various parameters and cannot be taken as quantitatively indicative for practical purposes, the strong variations seen here suggest that penalties substantially lower than expected could suffice to have an important impact on compliance. In this regard, as we have shown for homogeneous populations, such evidence is out of the scope of approaches based on representative agents, where also the gross income is homogeneous (Allingham and Sandmo 1972;Bernasconi 1998) 3 . As already explained for homogeneous populations, the existence of the sharp transition for all taxation policies stems from the interplay between the monetary term and the non-monetary ones: when P is above a critical value, the monetary term (which is positive, given that P <P ) is no longer able to compensate the contribution coming from the sum of the other two (which are initially negative for ρ 0 = 0.3), and the fraction of evaders drops. The smaller the tax rate, the smaller the critical P at which this occurs. In light of that, the shift of the transition towards lower values of the penalty factor is explained by the proportionality between the monetary term and the tax rate. As illustrated in Fig. 4, the tax burden due by the large majority of agents (approximately the 91%) is smaller in the progressive schemes than in the flat one. More precisely, the tax burden is less than half of the one due with a flat tax for about the 53% of agents in the bracket progressive, and for about the 68% in the continuously progressive. As a consequence, there is a large fraction of agents with a tax rate (hence a monetary term) substantially smaller than the flat tax rate, so that a smaller P is now sufficient to raise their probability of compliance and, in a rapid cascade catalyzed by the positive feedback provided by both the non-monetary terms, the compliance of almost the entire population. This is made explicit in the first row of Fig. 6, where the probability for an agent to evade (pay) is shown to always increase (decrease) with her gross income, for any progressive taxation. Said differently, progressive taxation combined with small riskaversion let the bulk of agents with a smaller gross income drive the system towards higher levels of compliance at lower penalty factors.
Still in the low risk-aversion case, the results for negative feedback (γ = − 1) are easily understood from what we just observed. As before, for any fixed penalty, the smaller the tax rate for an agent, the higher her compliance (see Fig. 6). Therefore, as we see in Fig. 4, the compliance slightly increases passing from the flat tax to the bracket progressive, and finally to the continuously progressive. However, as for Section 3, the effect of the fine is strongly reduced by the approximate compensation between the social and the quality terms, thus explaining the presence of very small variations for γ = − 1, independently of the taxation policy.
In the high risk-aversion scenario, i.e., ̄( Y) = 5.0 , the condition P i = 1 -defining the threshold value of P at which the monetary term sharply decreases for an agent with tax rate i -makes outcomes found for progressive schemes deviate from the ones found for flat tax -or homogeneous populations. In Fig. 5, for the bracket progressive, we highlighted two particular values of P: the first one is the value at which P i = 1 for the agents of highest income (i.e., in the third bracket); the second one Average probability for an agent to evade, ⟨P i ⟩ , as as a function of her gross income (each bin includes 40 agents), c i , for heterogeneous mean-field populations in the (average) configuration Y + ( k (X) = 0.02 , X ∈{Y,S,Q}), under the three taxation schemes used: flat (left column), bracket progressive (middle column) and continuously progressive (right column). For the bracket progressive taxation, the vertical dashed lines denote the values of income separating consecutive brackets, as in Fig. 4. It is worth to notice that the line on the right corresponds to the first threshold shown in Fig. 5, while the one on the left to the second threshold. In all cases, (Y) = 0.1 . Moreover, π a = 0.1, q = 0.5 (σ q = 0.05), η = 1, ̄( S) =̄( Q) = 2 ( (S) = (Q) = 0.1 ) and ρ 0 = 0.3 1 3 is the value at which P i = 1 for the agents of highest income in the second bracket. Therefore, approaching and overcoming the first threshold, only agents in the third bracket initially increase their compliance. As they contribute to the large part of the tax revenue, the quality term sharply increases. Then, if the feedback is positive, the quality term increases the compliance of other agents too; if the feedback is negative, it has the opposite effect. This is why the variation of ρ, alongside the variation of the tax revenue, is maximal at the first threshold. For negative feedback, interestingly, ρ increases with P in a range around the threshold. This is explained by the large difference in the proportion between high income agents, for which the condition for the threshold is (nearly) satisfied -making them more compliant-, and the rest of agents, whose decision is essentially affected by the change in the quality term -making them less compliant when γ = − 1. This is illustrated in the second row of Fig. 6, where we see that, by increasing P from below the first threshold to the second one, the probability of evasion increases or decreases depending on whether the agent has a high or low gross income, respectively. The second threshold works analogously. However, since at that point all agents in the third bracket have already passed their threshold, the effect on the tax revenue and on the overall compliance is already vanished. For this reason, the second threshold is better understood as an upper bound on P below which the observed effects are expected. The results found for the continuously progressive regime emerge from the same mechanism. Although clear brackets cannot be distinguished in this case, we see from Fig. 4 that the tax rate imposed on high gross incomes is close for the two progressive schemes. Consequently, as seen in Fig. 5, the first threshold of the bracket progressive effectively works for the continuously progressive as well. Nonetheless, the small surplus of tax rate in the continuously progressive, makes the change in ρ (and the raise in the tax revenue) slightly more marked. On the other hand, being the tax rate imposed on low gross incomes substantially lower than in the bracket progressive, further increasing P has a weaker effect in the continuously progressive. This explains why at higher penalties the level of compliance decreases passing from the latter to the bracket progressive, and then to the flat tax. All in all, we have seen that when risk-aversion is high and taxation is progressive, the compliance of the system is mainly driven by agents with higher or lower gross income, depending on whether the adopted penalty factor is, respectively, close to or larger enough than the threshold associated to high gross incomes (and thus high tax rates).
So far, we focused on heterogeneous populations where each of individual parameters (except the gross income) is drawn from a normal distribution, taken the same for all agents. This allows to study different populations where all agents are sufficiently well represented by a typical, "average" agent, whose parameters coincide with the population averages. We refer to them as one-type populations. Another possibility, allowing for more heterogeneity, is to consider a multi-type population by dividing it into sub-populations, each represented by a different type of agent (characterised by the means of the respective normal distributions of the parameters). Following this line, we started by studying populations with three or four different types, distinguished by how they assign weights to the utility components. What we found is that, whenever the weight vector averaged over the entire population is equal or close to the one of a corresponding one-type population, the results almost superpose. For this reason, we do not show them here.
Novel results derive, instead, by considering two-type populations where types differ solely for the sign of the feedback (see Fig. 7). In particular, we took half of the population with γ = 1 and half with γ = − 1. For the same reasons explained before, for low risk-aversion the compliance increases going from the flat tax to the bracket progressive, and then to the continuously progressive. The same is true for high risk-aversion when the penalty factor is not much higher than the threshold penalty for the third bracket, while the reversed order holds for higher penalties. Moreover, we verified that the probability of compliance is always higher for positive feedback agents when risk-aversion is high, in line with the fact that ρ is always lower for one-type populations with positive feedback (see Fig. 5). With low risk-aversion, instead, the sharp transition in ρ is strongly delayed to higher penalties (at P ≈ 9.5 for the used setup), and therefore the negative feedback agents are the most compliant for any penalty below the critical one, in accordance to what can be observed by comparing one-type populations of opposite feedback (see Fig. 5). The delay derives from the fact that only half of the agents -those with positive feedback-now contribute in favor of the transition. Furthermore, for any P below the critical value for the low risk-aversion and positive feedback population, the value of ρ is always intermediate with respect to the values found for the two one-type populations, although generally it is not close to their average value. We note also that the penalty has a stronger impact on compliance than it has for one-type populations, given ρ now spans larger ranges of values. Additionally, for the progressive policies, the increase of ρ found for the high riskaversion and negative feedback population is now suppressed, hence compliance always grows when increasing the penalty. Along what we observed for one-type populations, this further suggests that penalties may be more effective than expected when the heterogeneities of agents are taken into account. At last, we verified that this symmetric twotype population is not equivalent to a one-type population in which the quality term has no weight, as one would expect if the overall contribution of agents with opposite feedback compensated. This is the case only when both, G(x) (Eq. (11)) and ξ(x) (Eq. (7)) become linear, which only happens in the limits → 0 and β → 0 , respectively, the latter defining however a case of no interest.

Heterogeneous structured populations
In this last section, we consider populations where agents, besides having heterogeneous characteristics, engage and exchange information in accordance to a certain network of acquaintances. To this end, one should note that the dynamics, as defined by Eqs. (5)-(7), does not explicitly depend on local network properties of the agents. As a consequence, one can expect the sole structure of interactions not to play an essential role. In fact, although the very dissimilar networks we used, we found the respective results to generally superpose. Despite this, the network acquires importance when it translates the existence of some correlation in the characteristics of agents. Indeed, they estimate both the audit probability and the fraction of evaders according to their neighborhood. Therefore, if some individual characteristics affect the state of an agent (as seen above in relation to, for instance, tax rates and feedback), they also have an indirect effect on the neighbors of the agent. While considering non-random policies for the probability of audit is an interesting and easily implemented extension of the presented model, that indirect effect is for the moment only conveyed by the social term. In particular, we focus here on how tax compliance is affected by a positive correlation (or assortativity) between the incomes of acquaintances. In sociology, such a correlation is known as homophily (McPherson et al. 2001) and denotes the similarity, as measured in relation to various features, between individuals that are socially connected. In particular, both social and spatial segregation --the two being entangled-is well documented to be strongly correlated with income and wealth (Reardon and Bischoff (2011) and McPherson et al. (2001) and references therein). Verified that the network properties are not a main factor, we present the results for a population of N = 2000 agents arranged on the vertices of a two-dimensional grid. We consider a one-type population in the same setup as the one used in Section 4.2. To study how the income-income correlation affects the results, we induce different levels of correlation as measured by the Pearson correlation coefficient 4 .

3
Taxation and evasion: a dynamic model Figure 8 nicely illustrates the effect of the correlation on the overall compliance. One immediately notices some loss of accuracy of the MMCA, even if it is still able to reproduce the qualitative trends of the MC points 5 . For low risk-aversion and positive feedback, the cascade mechanism through which agents with lower gross income drive the system to a sharp increase in compliance is weakened by the correlation, which is why we observe the transition to be delayed and slightly smoothed. Indeed, that mechanism is based on the social influence that lower income agents have on higher income ones. However, increasing the income-income correlation makes the network more and more segregated, in turn reducing that influence. For negative feedback, as already observed for mean-field populations, the effectiveness of this mechanism is largely compensated by the quality term. Nonetheless, due to segregation, high income agents have a slightly higher probability of evasion which, translating into a lower public goods' quality, in turn pushes the overall compliance. This is indeed what we see in Fig. 8.   Fig. 8 Final fraction of evaders as a function of the penalty factor, P, for heterogeneous populations in the (average) configuration Y + ( k (X) = 0.02 , X ∈{Y,S,Q}), arranged in a two-dimensional grid. The results are shown for the bracket progressive taxation and three different levels of income-income correlation, as measured by the Pearson correlation coefficient, r. The final fraction of evaders in the population is reported as a function of the penalty factor, P. The results from the integration of the MMCA equations and from the MC simulations are indicated, respectively, with solid lines and points, whose standard errors are denoted by ribbons and error bars. The vertical dashed lines indicate each threshold P = − 1 , = T(c)/c, only meaningful for ̄( Y) = 5.0 . In all cases, (Y) = 0.1 . Moreover, π a = 0.1, q = 0.5 (σ q = 0.05), η = 1, ̄( S) =̄( Q) = 2 ( (S) = (Q) = 0.1 ) and ρ 0 = 0.3 5 We recall that, in the MMCA, an agent estimates (i) the audit probability as equal to the global value, π a = 0.1, and (ii) the proportion of evaders as the average of the probabilities of evasion in her neighborhood, which is therefore a real number. On the contrary, in the MC, (i) the estimation of the audit probability fluctuates around π a = 0.1 with a large standard deviation of approximately 0.28, given a neighborhood of 5 agents, as in the grid; and (ii) the estimated fraction of evaders can only take rational values as n/5, n = 0, … , 5 . Since the discrepancy between MMCA and MC is found even in the case with no income-income correlation and for different networks, we conclude that it originates from the combination of the differences in (i)-(ii) with the various dynamical non-linearities. Consistently, we verified that the MMCA approaches the MC when the edge density of the network is increased, for the differences in (i)-(ii) tend to vanish the more the network is close to a complete one, i.e., a mean-field scenario.

Results
For high risk-aversion, since the mechanism through which the high income agents drive the system towards increased compliance (for penalty factors below or around the threshold associated to the third bracket) is based on the quality term, it is mostly unaffected by the network segregation. Indeed, for positive feedback, segregation raises and lowers the compliance probability of, respectively, high and low income agents, in comparison with the case of random mixing. Therefore, while the quality term increases under the contribution of high income agents, this is compensated by the decrease of the social term as seen by the large majority of the population. The overall compliance thus results to be almost insensible to the correlation. For negative feedback, instead, since a higher public goods' quality has the opposite effect of lowering compliance, this sum up to the decrease of the social term, eventually enhancing the aforementioned mechanism, as seen in Fig. 8.
In the end, social segregation as encoded in an income-income assortativity seems to reduce the overall effectiveness of penalties (although compliance is generally still higher than in the absence of non-monetary terms). Nonetheless, we cannot exclude that accounting for the correlations induced by other individual features could change or even overturn what found here. We want to stress that the presence of an assortative mixing among agents can importantly affect the dynamics of the system whenever the individual behavior is affected by the social structure, as already found for other problems as, among others, the adoption of conventions and innovations (Nair et al. 2021) and references therein) or the spreading of an epidemics in relation to the adoption of health behavior (Munzert et al. 2021;Burgio et al. 2021).

Conclusions
In this paper we present a new model introducing a unified framework to analyse different elements influencing tax evasion. In particular, we define a generalized utility function aimed at providing a description of the tax-evasion problem in terms of three components: income, social influence and perceived quality of the public good. We considered a behavioral economics perspective, in which optimal strategies are chosen according to the corresponding understanding of available information.
Through a stability analysis performed assuming a homogeneous mean-field population, where all agents are equivalent and all mutually interacting, we study the dynamical effects produced by all three utility components and their respective parameters. Taking the penalty factor as a policy parameter, a first set of results (where only monetary factors are considered) shows how the tax-evasion probability changes as a function of the risk-aversion: we analytically prove and numerically verify the way in which, in agreement with economic literature, the level of risk-aversion determines the value of the penalty factor above which evasion ceases to be an optimal strategy. In particular, we confirm that, as expected, low values of penalty factors adopted in many countries turn out to be largely ineffective. On the other hand, by considering even a small contribution from the non-monetary terms, high levels of tax compliance become accessible also at substantially reduced fines. This result is further confirmed for more realistic heterogeneous populations, where agents with Pareto distributed incomes and therefore different tax rates are modeled. Both MMCA and Monte Carlo numerical results show that, in these cases, the effectiveness of penalties is actually even reinforced, pointing out the importance of accounting for such and other heterogeneities among agents. In particular, under the action of both social influence and quality assessment, we find that a taxation scheme that shifts the tax burden towards the fewer, highest income agents --which provide the largest fraction of the tax revenue--, while easing it for the numerous, lower income ones, is generally more effective in fostering tax compliance.
Finally, considering heterogeneous populations arranged according to a social network reflecting income correlations among peers, we find the previous effect to be slightly weakened, although compliance is generally higher than in the sole presence of the monetary term (in which case the network has no effect).
Further research is in our agenda for many aspects currently being investigated in forthcoming companion papers. First of all, the hypothesis of partial evasion, in which the decision is not binary and the citizen may decide to hide a portion of her income to the Government. Secondly, the multi-period framework, in which the penalty can all previous time steps and not only the current time. Further, this opens the problem to a dynamic optimization framework, in which each agent tries to maximize the utility by referring also to future periods. Third, a differentiation among public goods and services, to deal with the Governmental choice about their quality and reveal that a strategy may optimize the expenditure while trying to reduce evasion. Fourth, the audit probability could be differentiated according to the income, thus distributing differently in the population the incentives determining the choice about compliance/evasion. Finally, further individual features may have a key-role, such as cultural variables, traditions and customs, which may determine a strong variability of behaviors and induce widely diverse assessment of all elements driving the contribution of citizens to public goods for the community.

Appendix: Further analysis of the stable equilibria
As shown by Eq. (14), given a stable fixed point ρ = ρ ⋆ and a parameter a, the sign of ∂ρ ⋆ /∂a simply coincides with the sign of (ΔU)∕ a| = ⋆ . Here we give the explicit expression of the latter for any a ∈ {P, π a , , q, c}. For a = P, we get For a = π a , For a = , (24) (ΔU) | | | | = ⋆ = (ΔU) = k (Y) 2 y � (1 − ) − a P y � (1 − P ) , and the sign depends on itself, on P and π a , and on y(x). For any risk-averse agent, since y � (x) is maximized at x = 0, the region of parameters (P, π a ) for which ∂(ΔU)/∂ < 0 shrinks (or even disappears) when approaches 1. Specifically, the higher is , the higher P and π a needs to be to make ∂(ΔU)/∂ < 0. Therefore, fixed P and π a , ∂(ΔU)/∂ < 0 if is large enough. At last, for a = q, one gets whose sign follows the feedback, i.e., positive for γ = 1 and negative for γ = − 1. (25)