Pollution Abatement and Lobbying in a Cournot Game. An Agent-Based Modelling approach

The application of Agent-Based Modelling to Game Theory allows us to benefit from the strengths of both approaches and to enrich the study of games whose solutions are difficult to derive analytically. Using an agent-based approach for sequential games, however, poses some issues, which is why few applications of this type exist. We contribute to this literature by applying the agent-based approach to a lobbying game involving environmental regulation and firms’ choice of abatement. We simulate this game and test the robustness of its game-theoretical predictions against the simulated results. We find that, while the theoretical predictions are generally consistent with the simulated results, this novel approach highlights a few differences. First, the market converges to a green state in a larger number of cases than the theory predicts. Second, simulations show that it is possible for this market to converge to a polluting state in the very long run, a result that is not envisaged by the theoretical predictions. Sensitivity experiments on the main model parameters confirm the robustness of our findings.


Introduction
Agent-based modelling (ABM) is a computational method for simulating models composed of autonomous and possibly heterogeneous agents that interact with each other and with the environment according to set rules of behaviour (Salgado and Gilbert, 2013; Tesfatsion, 2006). Important features of agent-based models include the possibility of modelling realistic agent characteristics, such as limited rationality and heterogeneity, and of identifying the mechanisms that determine emergent phenomena.
Given the similarities in aim and the shared focus on individual behaviour, it is not surprising that ABMs have also found applications in Game Theory, since they make it possible to study games that present non-standard specifications or are difficult to treat analytically. A large body of research has adopted this methodology to study the prisoner's dilemma game and cooperation (Axelrod et al., 1987; Ladley et al., 2015; Hales, 2002; Szilagyi, 2012). Rational behaviour has been contrasted with inductive learning through the well-known El Farol problem (Arthur, 1991, 1994). More specifically, applications of ABM can also be found in Industrial Organization (IO). However, attempts to replicate standard oligopoly models such as Cournot are limited and rely on rather simple models reproducing simultaneous games (Kimbrough and Murphy, 2009, 2013).
An isolated example of an agent-based approach to a sequential game is given by van Leeuwen and Lijesen (2016), who propose an agent-based version of Hotelling's game of spatial competition. The issue with sequential games is that, while in ABMs agents form expectations on the basis of past information, the typical solution concept of subgame perfection adopted to solve sequential games implies that agents anticipate other players' optimal actions. Nevertheless, van Leeuwen and Lijesen (2016) claim that, despite the evident differences between the two approaches, Game Theory and ABM can be used in a complementary way, such that one approach's weaknesses correspond to the other's strengths and vice versa, allowing the synergies between the two to be exploited. Following this idea, in this paper we apply an ABM approach to a lobbying game.
More precisely, we adapt the model used by Catola and D'Alessandro (2020) to analyse an application of lobbying to environmental regulation. Among the several approaches to modelling lobbying, this specific type analyses the lobbying interaction as a common agency problem through menu auctions, as pioneered by Bernheim and Whinston (1986). The lobbyist offers the politician a menu of possible contributions for every given choice of the target policy, and the government picks the one that it prefers. This approach was popularised by the seminal paper of Grossman and Helpman (1994), who applied this model to analyse trade policies and in particular lobbying in favour of trade protection. In that paper, the lobby is exogenously given, while subsequent extensions of the model (see e.g., Mitra, 1999; Laussel, 2006; Bombardini, 2008) relax this assumption and explicitly consider firms' costly decision to form the lobby. This class of models has found several applications, mostly in international trade (e.g., Grossman and Helpman, 2002), multinational firms' location (Polk et al., 2014) and industrial structure (Cerqueti et al., 2021), and has been widely applied in the analysis of environmental regulation of imperfectly competitive markets (see for example Fredriksson, 1997; Aidt, 1998; Fredriksson and Svensson, 2003; Fredriksson and Wollscheid, 2008; and the aforementioned Catola and D'Alessandro, 2020).
This paper contributes to the literature in several ways. The first contribution is methodological. To the best of our knowledge, this is the first attempt to apply the ABM approach to the study of lobbying, and one of the few attempts to model sequential games in general. In our ABM, the economic actors combine maximising behaviour with bounded rationality and limited access to information. Moreover, we provide an algorithm to reproduce the subgame perfect interaction between the government and the lobby as modelled by Grossman and Helpman (1994).
Secondly, our approach helps to test the robustness of the predictions obtained with a theoretical model in standard IO. Indeed, one of the main limitations of traditional IO models is that the analysis is limited to one-shot games. While they still provide valuable insights into the incentives at play, they lack the dynamics connected to the temporal dimension. At the same time, introducing such dynamics through repeated games suffers from severe limitations connected to tractability as well as the abundance of possible equilibria in such games (Folk Theorem). In this regard, ABM could provide a useful tool to compare the behaviour of complex systems with the predictions obtained from one-shot games. In the case of Catola and D'Alessandro (2020), our results show that their equilibrium is reasonably robust to simulation over long periods, even when players are not fully rational as assumed in the standard models.
The paper is organised as follows. Section 2 summarises the original game, Section 3 presents the simulated model, and Section 4 discusses the parametric setup of the model. Section 5 discusses the main results of our simulations, while Section 6 presents some sensitivity analysis on key parameters of the model. Finally, Section 7 concludes the paper.

The reference model
In this section, we briefly summarise the key features of the model presented by Catola and D'Alessandro (2020). They study a Cournot oligopoly with N firms and a linear demand p = a − Q, where firms choose between two technologies. The first one (polluting) has zero marginal cost but generates polluting emissions and is subject to a unit tax τ. The second one (green) has a marginal cost c but does not cause pollution. Profits therefore differ across technologies: a polluting firm pays the tax τ on each unit produced, while a green firm bears the marginal cost c.
The government imposes taxation following a twofold motive. On the one hand, it wants to minimise the environmental damage D. On the other hand, it is inclined to accept political contributions L. The damage is modelled as a linear function of the emissions and of the difference between the gross unitary damage and the tax. The government's utility function G is assumed to be linear.
The final actor is the lobby, formed by all the polluting firms, which offers political contributions L in exchange for a lower τ. The objective of the lobby, V, is to maximise the profits of its members under the constraint of breaking even, taking into account F, the fixed cost connected to operating the lobby.
The costs of the lobby are then equally divided among its members in the form of a fee l. The game is finally solved as an entry game, which determines the number of firms that join the lobby. The main result of the paper is that, for values of g and F sufficiently low, both technologies co-exist in the market; otherwise, the polluting technology is pushed out of the market and all firms switch to the green one.

The ABM model
In this section, we present the adaptation of the original theoretical model to an ABM framework in which we combine the optimal Nash behaviour of agents with typical features of ABM such as bounded rationality and limited access to information.
We code and simulate the model in NetLogo (Wilensky, 1999), an open-source programming environment specifically designed for creating and experimenting with agent-based models. NetLogo allows for the visualisation of agents in its world, a two-dimensional grid of "patches"; each patch is a square piece of "ground" over which agents are located. In our model, the agents are the n firms.

Firms
In each period, firms have to make two decisions: in the first stage of the period they decide which technology to use, and in the last stage they decide how much to produce and then obtain profits.

Choice of type
At the beginning of period t, firms must decide whether or not to abate their emissions. In other words, they have to choose their type k ∈ {g, p}, where g is the green type and p is the polluting one.
To decide on its type, each firm, given its type in the previous period, compares its profit from the previous period with the average profit of the other type and decides whether or not to switch.
The first issue concerns how much information a single firm has about the profits of its competitors in the market. It would be unreasonable to think that firms have access to information about the profits (i.e. the quantity produced) of every competitor in the previous period. For this reason, we assume that each firm has access to this information for only a subset of its competitors. We model this limitation by borrowing the concept of Social Circle proposed by Hamill and Gilbert (2009).
The idea is simple. Consider firm i. Its location in the space is the centre of a circumference O_i of a given radius, which creates a spatial cut-off. Firm i has information about the profits of any competitor located within the circle.
An example is provided in Figure 2, where the circumference defines the set of competitors for the blue firm. Firms coloured in yellow, outside of the circumference, are not competitors. Bigger (smaller) values of the radius are associated with bigger (smaller) submarkets and thus include more (fewer) competitors.
Following Hamill and Gilbert (2009), the relationship between agents is only established if agents can reciprocate, which is always the case if the radius takes the same value across firms, while it is not ensured when the radius differs across agents (see Figure 3). In our case, we depart from the assumption of reciprocity by allowing the radius to be heterogeneous across firms. This implies that, given firms x and y, it may be the case that y ∈ O_x while x ∉ O_y, meaning that firm x will have information about firm y but not the other way around. While this is typically not the case for social interactions, in the context of industrial competition it is perfectly plausible that some firms have access to better information than others.
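The non-reciprocal social circle can be illustrated with a minimal Python sketch. It is not the NetLogo code of the model: the function name `visible_rivals` and the coordinates are hypothetical, and the sketch assumes plain Euclidean distance, ignoring any wrapping of the NetLogo world.

```python
import math

def visible_rivals(positions, radii):
    """Return, for each firm i, the set of firms located inside its circle O_i.

    positions: list of (x, y) coordinates, one per firm (illustrative values).
    radii: list of radii, one per firm; they may differ across firms.
    """
    circles = []
    for i, (xi, yi) in enumerate(positions):
        inside = set()
        for j, (xj, yj) in enumerate(positions):
            if i != j and math.hypot(xi - xj, yi - yj) <= radii[i]:
                inside.add(j)
        circles.append(inside)
    return circles

# Firm 0 has a large radius, firm 1 a small one: firm 0 observes firm 1,
# but firm 1 does not observe firm 0 (no reciprocity).
circles = visible_rivals([(0.0, 0.0), (10.0, 0.0)], [20.0, 5.0])
```

With equal radii the relation would be symmetric by construction; heterogeneous radii are what break reciprocity.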
Once the information about the profits of the competitors within its circle is collected, each firm compares its profit in the previous period with the average profit of firms of the other type. If its profit is higher than the average, it remains of the same type. If its profit is lower than the average, the firm considers switching type (i.e. technology).
Formally, at the beginning of every period, each firm chooses whether to maintain its type k from the previous period or to switch to the other type (−k) with a switching probability θ. This probability depends on the gap between the firm's own profit and the average profit of the rivals of the other type within firm i's circle, and on λ, a parameter measuring the "intensity of choice". This formulation captures the idea that, on the one hand, switching technology is a process that costs time and money, while, on the other hand, firm i only has partial information about the success of its competitors. Therefore, firm i will be more willing to switch the larger the difference in profits obtained by its rivals, but the switch is not certain.
After all firms have taken this decision, the total number of polluting firms (i.e. firms of type p), n_p, and the total number of green firms (i.e. firms of type g), n_g, are computed.
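A probabilistic switching rule of this kind can be sketched in Python as follows. The exponential functional form is an assumption, chosen in the spirit of the switching probability of Delli Gatti et al. (2010); the exact expression used in the model may differ. The function names are hypothetical.

```python
import math
import random

def switch_probability(own_profit, rival_avg_profit, lam=0.5):
    """Probability of switching type (ASSUMED exponential form).

    Zero when the firm does at least as well as the rivals of the other
    type; otherwise increasing in the relative profit gap and in lam.
    """
    if rival_avg_profit <= 0 or own_profit >= rival_avg_profit:
        return 0.0
    gap = (own_profit - rival_avg_profit) / rival_avg_profit  # negative here
    return 1.0 - math.exp(lam * gap)

def choose_type(current, own_profit, rival_avg_profit, lam=0.5, rng=random):
    """Return the firm's type ('g' or 'p') for the new period."""
    theta = switch_probability(own_profit, rival_avg_profit, lam)
    if rng.random() < theta:
        return "p" if current == "g" else "g"
    return current
```

Under this form the probability always lies in [0, 1), and a larger λ makes a given profit gap more likely to trigger a switch without ever changing its direction.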

Choice of quantity
In the last stage of the period, the n firms compete à la Cournot in a market characterised by a linear demand.
As this is the final stage, the value of the tax τ has already been determined by the government and is therefore known to all firms. Moreover, the abatement cost is the same for every firm. If we assume that the number of green firms (n_g) and of polluting firms (n_p) is also common knowledge, each firm can choose its Nash quantity as a function of its own marginal cost c_k and the average marginal cost c̄. After every firm has made this decision, the total quantity is computed, thus determining the market-equilibrium price p*. Finally, the profits of each firm are realised, as is the environmental damage. At the end of the period, each firm then realises a net payoff, which for the polluting firms must also include the cost of lobbying.
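The quantity stage can be illustrated with the textbook Cournot-Nash solution for linear demand p = a − Q and heterogeneous constant marginal costs, where each firm produces q_k = (a + n·c̄)/(n + 1) − c_k. The sketch below is a standard result rather than a reproduction of the model's own equation; it assumes an interior solution (both quantities positive), treats the tax τ as the polluting firms' marginal cost, and does not deduct the lobby fee. The function name is hypothetical.

```python
def cournot_outcome(a, c, tau, n_g, n_p):
    """Nash quantities, price and gross profits with linear demand p = a - Q.

    Green firms have marginal cost c; polluting firms pay the unit tax tau.
    Assumes an interior equilibrium in which every firm produces.
    """
    n = n_g + n_p
    avg_cost = (n_g * c + n_p * tau) / n
    price = (a + n * avg_cost) / (n + 1)
    q = {"g": price - c, "p": price - tau}          # q_k = p* - c_k
    profit = {k: q_k ** 2 for k, q_k in q.items()}  # (p* - c_k) * q_k
    return q, price, profit
```

For example, `cournot_outcome(1000, 10, 8, 50, 50)` returns a larger quantity for polluting firms than for green ones, since the tax is below the abatement cost.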

The lobbying game
In this section we discuss our implementation of the lobbying game à la Grossman and Helpman (1994) in the context of the ABM. Such a game is structured as a menu auction, in which the lobbyist offers a contribution schedule that associates different amounts of political contribution with different decisions by the government. The government then picks its preferred option given this schedule. A convenient feature of this approach is that, in a subgame perfect equilibrium, the resulting tax level is in fact the one that maximises the objective function of the lobby. We can exploit this result to build an algorithm that simulates the same procedure:
i) First, the lobby proposes a contribution schedule in the form of a vector L⃗.
ii) For every value L_j (i.e., for every element of L⃗), the government determines the associated value τ_j. Thus, L⃗ is associated with a second vector T⃗ (of the same length) whose elements are the possible values of the tax.
iii) The lobby computes its objective function V_j for every possible pair (L_j, τ_j) and implements the one that provides the highest value of V.
In what follows we discuss how each of these steps is implemented.

Lobbying schedule
To determine the lobbying schedule L⃗, the lobby applies a rule of thumb based on the results from the previous year. Namely, the lobby collects from every polluting firm a contribution computed as a fixed share α_j of the average profits that polluting firms obtained in the previous year. These collected resources, net of the fixed cost of running the lobby (F), are then used to finance the campaign contribution. Therefore, to every possible value of α there corresponds a given offer to the government. In the model, we choose the following possible values for α: α⃗ = (0.2; 0.4; 0.6) (10)
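The rule of thumb above can be sketched in Python. The sketch assumes the reading L_j = α_j · n_p · (average polluting profit in the previous year) − F; the function and argument names are hypothetical, and the numbers in the usage line are purely illustrative.

```python
ALPHAS = (0.2, 0.4, 0.6)  # shares of average profit, as listed in the text

def contribution_schedule(n_p, avg_profit_prev, fixed_cost, alphas=ALPHAS):
    """Contribution offered to the government for each share alpha_j.

    Assumed reading: every polluting firm pays alpha_j times last year's
    average polluting profit, and the lobby offers what is left after F.
    """
    return [alpha * n_p * avg_profit_prev - fixed_cost for alpha in alphas]

# Illustrative values only: 50 polluting firms, average profit 100, F = 200.
schedule = contribution_schedule(50, 100.0, 200.0)
```

Each element of the returned list is one entry of the contribution vector, so the schedule has as many offers as there are admissible shares.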

Tax schedule
The government weighs the political contribution offered by the polluting lobby against the expected environmental damage that will follow from the tax. Therefore, the trade-off to evaluate is between the contribution offered, L, and the expected environmental damage, E(D), which depends on E(Q_p), the expected total quantity produced by polluting firms.
Given the expected damage and the government's objective function, the government chooses a tax for each contribution offered. As for E(Q_p), we assume that the government has a refined version of static expectations concerning quantity. More specifically, the government expects that the average individual quantity produced by polluting firms will remain the same as in the previous period; however, it updates its expectation of the total quantity using the information about the number of polluting firms. Formally, E(Q_p) in period t equals the number of polluting firms in period t multiplied by the average individual quantity produced by polluting firms in period t − 1. Combining equations 12 and 13 yields the optimal value of the tax set by the government in period t. Thus, at the end of this process, the vector T⃗ is determined.
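The government's side of the algorithm can be sketched in Python under two explicit assumptions that are not confirmed by the text: the expected damage is (g − τ) · E(Q_p), with g the gross unitary damage, and the government sets the tax at which the contribution exactly offsets that expected damage. Function names are hypothetical.

```python
def expected_polluting_output(n_p_now, total_q_p_prev, n_p_prev):
    """E(Q_p): last period's average polluting quantity times today's n_p.

    Assumes there was at least one polluting firm in the previous period.
    """
    return n_p_now * (total_q_p_prev / n_p_prev)

def tax_schedule(contributions, g, expected_q_p):
    """Tax tau_j associated with each offer L_j.

    ASSUMPTION: the government accepts the tax at which the contribution
    exactly offsets the expected damage, L_j = (g - tau_j) * E(Q_p).
    """
    return [g - L_j / expected_q_p for L_j in contributions]
```

Under these assumptions a larger contribution maps to a lower tax, and the tax falls without bound as the expected polluting output shrinks.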

Tax Determination
Once the vector of taxes is determined, the lobby combines L⃗ and T⃗ to compute the value of its objective function for every value of α.
The aim of the lobby is to maximise the profits of its members, which depend on the expectations of future profits given the level of the chosen tax and the expenditure on lobbying. As L_j is already determined at this stage, the crucial variable is the expected profit of the polluting firms given the tax level τ_j. Similarly to what we did for the government, we assume that the lobby has a refined form of static expectations. On the one hand, if the tax remains the same, the lobby assumes that the average individual profit of its members in the upcoming period is equal to the profit in the last period. On the other hand, if the tax in the new period changes, the lobby updates its expectations about the profit in the same proportion.
Formally, the objective function of the lobby V_j is evaluated for every level of α_j using δ_j, the growth rate of the tax given α_j: δ_j = (τ_j^t − τ^{t−1}) / τ^{t−1}. The lobby computes V_j for every possibility and selects the one with the highest value.
The lobby's selection determines the realised value of α for period t, which in turn determines the values of L and τ.
Once the lobbying process is over, the lobby fee that each polluting firm needs to pay is computed. The fee covers the total amount contributed to the government (L_t) plus the fixed cost of running the lobby F, divided evenly among members.
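The lobby's selection step and the resulting fee can be sketched in Python. Two assumptions are made here and are not confirmed by the text: expected profits are scaled by (1 − δ_j), so that a tax increase lowers them proportionally, and V_j is total expected profit net of the contribution and the fixed cost. All names are hypothetical.

```python
def lobby_choice(contributions, taxes, tau_prev, n_p, avg_profit_prev, fixed_cost):
    """Pick the (L_j, tau_j) pair with the highest expected net payoff V_j.

    ASSUMPTIONS: expected profit scales by (1 - delta_j), where delta_j is
    the growth rate of the tax, and V_j is total expected profit net of the
    contribution and the fixed cost of the lobby.
    """
    best = None
    for L_j, tau_j in zip(contributions, taxes):
        delta_j = (tau_j - tau_prev) / tau_prev
        v_j = n_p * avg_profit_prev * (1.0 - delta_j) - L_j - fixed_cost
        if best is None or v_j > best[0]:
            best = (v_j, L_j, tau_j)
    _, L, tau = best
    fee = (L + fixed_cost) / n_p  # equal split of contribution and fixed cost
    return L, tau, fee
```

The function returns the realised contribution, the realised tax and the per-member fee; ties are resolved in favour of the first option in the schedule.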

Model setup
In setting up the model, we choose a market size a = 1000 and a number of firms n = 100. At initialisation, each firm is assigned a random position and a random radius for its social circle in the range (5, 25), and flips a coin to decide its type at round 0, thus determining n_g^0 and n_p^0. After every firm has been placed in the space, the initial value of the tax (τ_0) is determined. To make the comparison with the theoretical model as close as possible, we take as the starting value for the tax the equilibrium tax level given the number of polluting firms. It is worth mentioning that assuming this starting point also imposes a condition on the value of the marginal cost c, namely c ≤ (a + n * t)/(1 + n), which constrains the selection of the value at the setup. We chose a value of the marginal cost equal to 10.
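The initialisation step can be sketched in Python as follows. The world size of 50 is a hypothetical value chosen for illustration, as are the function name and the dictionary layout; only the number of firms, the radius range and the coin flip come from the text.

```python
import random

def setup_firms(n=100, world_size=50.0, radius_range=(5.0, 25.0), seed=None):
    """Create n firms with random position, social-circle radius and type."""
    rng = random.Random(seed)
    firms = []
    for _ in range(n):
        firms.append({
            "pos": (rng.uniform(0, world_size), rng.uniform(0, world_size)),
            "radius": rng.uniform(*radius_range),
            "type": rng.choice(["g", "p"]),  # fair coin flip at round 0
        })
    return firms

firms = setup_firms(seed=1)
n_g0 = sum(f["type"] == "g" for f in firms)  # initial number of green firms
n_p0 = len(firms) - n_g0                     # initial number of polluting firms
```

Passing a seed makes a run reproducible, which is convenient when comparing scenarios that differ in a single parameter.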
As for the switching probability from equation 6, we selected a value of λ that would ensure that θ ∈ [0, 1]. Among those values, we selected λ = 0.5.
An overview of the values used at the model setup is reported in Table 1. After the setup, we run the model until the market converges to one technology (i.e. either n_p^t = 0 or n_g^t = 0), but we impose a limit of 3000 time steps, after which the simulation stops.
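The stopping rule can be sketched in Python. Here `step` is a hypothetical callable standing in for one full period of the model, and the label "mixed" for runs that hit the cap is an illustrative choice rather than terminology from the model.

```python
def run_until_convergence(step, n_g, n_p, max_ticks=3000):
    """Iterate `step` until one technology disappears or max_ticks is hit.

    `step` is a hypothetical callable mapping (n_g, n_p) to the new counts.
    """
    ticks = 0
    while n_g > 0 and n_p > 0 and ticks < max_ticks:
        n_g, n_p = step(n_g, n_p)
        ticks += 1
    if n_p == 0:
        state = "green"
    elif n_g == 0:
        state = "polluting"
    else:
        state = "mixed"  # cap reached with both technologies still present
    return state, ticks
```

Returning the number of ticks alongside the final state allows convergence times to be compared across scenarios.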
Since we want to test the robustness of the theoretical predictions of the model, we use environmental damage as the main variable, and we run different scenarios for different values of g and compare our results with the predictions from the game.
The following Box provides an overview of the simulation algorithm.

Results
In this section, we present the results of the Monte Carlo simulations. We run 500 simulations for each value of g and let the model run until it converges to either of the two states (i.e. polluting or green). Results are summarised in Table 2. They show that for values of g sufficiently low, the market actually converges to a situation in which all firms choose not to abate (polluting state), while for g ≥ 10.75 the market converges to a situation in which all firms choose the clean technology (green state). There is also an "ambiguous case", corresponding to g = 10.7, where depending on the starting point the market can converge to either state. Notably, though, the time required for the market to converge to the polluting state increases sharply with g, with cases in which the number of time steps approaches the upper bound of 3000.
We can compare these results with the theoretical predictions of Catola and D'Alessandro (2020). The first difference between the theoretical and the simulated model concerns the possibility that the market reaches the polluting state. In the theoretical model this outcome is not possible: polluting firms may be able to remain in the market but are never able to push the green ones out. Our results instead suggest that this could happen for values of g sufficiently close to c, meaning that the theoretical model might be too optimistic concerning the inability of lobbying to push out the green technology. However, it is also worth mentioning that this scenario requires a substantial amount of time even for values of g very close to 10.
The second point concerns the green state. According to the theoretical prediction, the green state requires as a necessary condition values of g ≥ (a + cn)/(1 + n) ≈ 19.80. Instead, our simulations show that the market converges to green even for significantly smaller values of g, and in a short amount of time. In this sense, we could argue that the theoretical model is too pessimistic about the ability of the green firms to push polluting firms out of the market.
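The numerical value of this threshold can be checked directly from the setup values a = 1000, c = 10 and n = 100:

```latex
\[
\frac{a + cn}{1 + n} = \frac{1000 + 10 \times 100}{1 + 100} = \frac{2000}{101} \approx 19.80
\]
```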
A final point concerns the timing. The number of time steps that the model requires to converge varies significantly across scenarios. Notably, it is increasing in g for the outcomes in which the market converges to the polluting state and decreasing in g for the outcomes in which the market converges to the green state. On the one hand, this can be considered in line with the theoretical prediction if we think of speed as reflecting the strength of the economic incentives to choose one technology over the other. On the other hand, we should also reconsider the prediction of a mixed market outcome. Indeed, if we accept that a time step corresponds to a cycle that includes production, sales and the political process leading to a new tax, it is clear that a market that requires several hundred ticks to converge (which is often the case for the polluting state) is in fact a market that one could reasonably consider as mixed.
To provide further insights, we are interested in observing the behaviour of the tax throughout the simulations.In Figure 4 we plot the difference between the values of the actual tax and of the theoretical value according to the model for every tick.
We include 6 simulations for 6 different values of g: two scenarios in which the market converges to polluting and four in which it converges to green.
At the beginning of the simulation, the two values coincide since the tax is initialised at the theoretical value. Then, as time steps move on, the two values diverge, and as the model tends to converge to either outcome the difference explodes, as the theoretical value becomes extremely large (if the model converges to polluting) or extremely negative (if the model converges to green). We can notice that, for values of g close to the critical values, the behaviour of the two taxes is similar for a longer period before they start to diverge. Conversely, for higher and lower values of g, the divergence happens in a much smaller number of time steps.

For λ, we test 9 values, including the value λ = 0.5 used in the main Monte Carlo simulations. This parameter measures the intensity of the response to the gap in profits with respect to competitors. Thus, whatever the sign of this gap, λ reinforces but never changes the effect of the gap on the probability of switching, and thus on the preference to go green or to pollute. For this reason, we run the experiment on two values of g, so as to test the effect in both cases of convergence. Results confirm the expectation, as displayed in Figure 6. For greater intensity of the profit gap (increasing values of λ), the system converges faster to the expected state for the given values of g.
We also test how changes in the initial number of firms of each type affect the final outcome. In our main model we assumed that, at the setup, each firm is assigned the green technology with probability 0.5; therefore, on average, firms are equally divided between the types. In what follows we test the model again by imposing a fixed share of green firms, from 10% to 90%, for two specifications of the model.
The results are reported in Figure 7. In this case, the effect of the change in the initial share of green firms on the time steps required to converge is not monotonic.
However, despite the changes in time steps, the different shares of green firms do not affect the final outcome in terms of convergence. Additionally, the case of g = 10.7 deserves a closer look, since the main simulations present two possible outcomes in terms of convergence. We want to investigate whether changes in λ or in the initial share of green firms can affect the convergence.
We run additional sensitivity experiments holding g = 10.7 and using the same seed of 1000.
In this case, we let the model run until convergence without imposing any time limit since, g = 10.7 being a critical value, we expect changes in those key parameters to affect the convergence. Results (reported in Figure 8) confirm our expectations.
The type of convergence is sensitive to the initial values of both λ and the share of green firms. We can see that, while under the parametrisation of the main analysis this simulation would converge to green, there are values of λ and initial shares of green firms for which the simulation would converge to polluting. What is confirmed, though, is the large difference in terms of time steps. Convergence to polluting always requires a substantially larger amount of time compared to convergence to green.

In this article, we develop an agent-based version of the one-shot lobbying game analysed by Catola and D'Alessandro (2020), which we use to test the robustness of their theoretical predictions. To this purpose, we provide a novel approach to simulate the specific lobbying interaction à la Grossman and Helpman (1994) and tackle the issue of compatibility between ABM and the solution concepts typical of sequential games. More specifically, we develop a model where, on the one hand, firms decide how much to produce following Nash behaviour but decide which technology to adopt based on limited information about their direct competitors. On the other hand, the government and the lobby interact to decide the tax level while being constrained to static expectations.
We simulate the model for a set of values of the externality intensity parameter g for 3000 time steps, or until the market converges to one technology, and compare our results with the equilibrium of the game, which predicts either a fully green market or a mixed equilibrium where both technologies coexist.
Our results show that the market converges to green in a larger number of cases than predicted by the game, and with a steadily decreasing number of time steps for increasing values of g. In this regard, we could argue that the original prediction was rather pessimistic. On the other hand, however, our simulations show that for values of g sufficiently low the market actually converges to polluting, an outcome that was excluded by the original model. In this sense, we could argue that the model was too optimistic regarding the inability of the lobby to push out the green technology. We also highlight a critical value of g at which the market can converge to either outcome. However, there is a substantial difference in the number of time steps required between polluting and green convergence. The time factor is, indeed, a crucial feature of the model, since it can reconcile the mixed equilibrium with our simulations. In fact, convergence to the polluting state usually requires a number of time steps that is hardly comparable with the life span of any real market.
Finally, this being a first attempt to simulate such games, many features have been left out, thus leaving ample scope for further research. In particular, future work could aim at increasing the complexity of the model by further relaxing the Nash behaviour of firms, for example by introducing rules of thumb for the production decision. This would allow for the possibility of negative profits and even bankruptcies among firms. Another interesting avenue for further research is the use of more sophisticated expectations for the government or the lobby.

Figure 1 provides a snapshot of the NetLogo world of the model.The polluting firms are depicted in red; the green firms are the non-polluting ones.

Figure 1: The NetLogo world of the model at its set-up.

Figure 2: Application of a social circle to define the set of firm's competitors.

Figure 3: Types of Social Circle

For the switching probability θ we again borrow from the ABM literature and adapt the probability of switching from Delli Gatti et al. (2010) and Riccetti et al. (2013) (time indices are omitted).

Figure 7: Sensitivity experiments on the share of green firms (seed = 1000).

Figure 8: Sensitivity experiments for g = 10.7 (seed = 1000). Red dots indicate convergence to the polluting state, green dots convergence to the green state.

Table 1: Variables' values used in the model setup.