Market-Based Environmental Policies in the Power Sector

So-called Green New Deals aim to decarbonise the power sector via market-based instruments. Consequently, engineering-economic models for analysing the sustainable-energy transition have proliferated. This review categorises existing approaches and identifies areas for extending the state of the art. We cluster the extant literature into two groups: engineering/operational research (M1) and environmental economics (M2). While M1 focuses on the power sector’s technical and spatio-temporal aspects, M2’s emphasis is on incentives and externalities. Depending on the nature of the research question, either perspective (or both) may be suitable. Since the envisaged electrification of the wider economy implies tighter coupling between formerly distinct sectors, e.g., power and heat, both M1 and M2 will have to adapt to the new paradigm in terms of methodology and application areas. Here, eliciting coordinating mechanisms, enhancing algorithms for solving hierarchical models, soft-linking bottom-up and top-down models, and crafting robust environmental policy in the face of uncertainty over externalities are some of the vistas for future research.


Introduction
Spurred by calls for climate action, many industrialised countries have adopted legally binding targets for reducing greenhouse gas (GHG) emissions, viz., from CO2. A prominent example of such an environmental measure is the Fit for 55 package of the European Union (EU), which aims to reduce GHG emissions by at least 55% by 2030 vis-à-vis 1990 levels [1]. Similar to other so-called Green New Deals posited worldwide, e.g., in the USA [2], the legislation is itself a stepping stone to a climate-neutral Europe by the year 2050. Besides mechanisms for limiting GHG emissions and promoting the adoption of renewable energy (RE), such packages also envisage electrification of sectors such as heating, industrial processes, and transport. Hence, the electricity industry is to figure prominently by not only reducing its own GHG emissions but also facilitating a sustainable transition of the broader economy.
Makoto Tanaka and Yihsu Chen contributed equally to this work. Contact: Afzal S. Siddiqui, asiddiq@dsv.su.se
Such an endeavour is rendered more complex by the deregulated status of the electricity industry in most OECD countries. Beginning in the 1980s with reforms in Chile and the UK, the generation and retailing functions of vertically integrated utilities were separated into privatised entities in order to foster competition [3]. Indeed, the emergence of small-scale generators and advances in telecommunications eroded the economies of scale of large central-station power plants. By contrast, the economies of scale of the transmission and distribution functions meant that these networks remained largely state-regulated. Thus, broadly speaking, the modern electricity industry's functions are handled in a decentralised manner by organisations that have distinct and often conflicting objectives, e.g., profit maximisation versus welfare maximisation.
It is notable that environmental concerns were largely absent from the nascent debate over electricity-industry restructuring, which was motivated by the desire to improve economic efficiency. In fact, only over the past decade have climate objectives risen to the forefront with policies intended to support the adoption of RE. For example, the German Energiewende began in 2010 with a shift away from coal and nuclear power using feed-in tariffs (FiTs) to entice adoption of RE [4]. While such support has more than doubled the share of renewables in the German power mix to over 40% in 2020 vis-à-vis 2010 [5] and reduced sector GHG emissions by nearly 35% over the same period [6], the intermittent nature of variable renewable energy (VRE) output has created challenges for the existing German grid and market design [7,8]. Nevertheless, such support measures have reduced overnight capital costs of VRE to near parity with those of gas-fired plants by 2020, e.g., a 430 MW combined-cycle gas turbine (CCGT) plant and a 200 MW on-shore wind plant have capital costs of $1084/kW and $1265/kW, respectively [9]. Given that VRE is now a viable competitor in deregulated electricity industries, subsequent policies to tackle climate change will have to be more targeted in order to decarbonise the rest of the economy without blunting the electricity market's ability to function efficiently.
In this context, decarbonisation in climate packages, such as Fit for 55 [1], is typically cast in terms of either attainment of RE targets, e.g., at least 40% of overall energy production from RE by 2030, or reduction in GHG emissions, e.g., at least 55% by 2030 relative to 1990 levels. Naturally, decision-support models, e.g., in the engineering and operational research (OR) literature, reflect this perspective in applying targets to power-sector decarbonisation while overlooking the deregulated nature of the electricity industry. In particular, this strand of the literature optimises electricity-sector investment and operations as if a single entity were responsible for all decisions. Although imperfect competition is included in game-theoretic models, the perspective taken is not a regulatory one but rather that of strategic producers subject to GHG policy. Thus, instead of facilitating a tradeoff between economic and environmental objectives, the prevailing emphasis is on technical considerations, viz., balancing VRE output's intermittency via spatio-temporal sources of flexibility [10,11], underpinned by targets, e.g., for RE production via renewable portfolio standards (RPS) [12]. Yet, the use of such proxy measures may obfuscate the quantification of the tradeoffs between fundamental economic and environmental objectives [13].
By contrast, the literature on environmental economics explicitly trades off economic and environmental considerations via a damage-cost function [14]. Instead of attempting to find the optimal power-system configuration under a given set of climate targets, e.g., net-zero emissions, this strand of the literature endogenises the environmental cost of GHG emissions [15]. Consequently, the socially optimal environmental policy, e.g., a CO2 tax, can be set endogenously in anticipation of industry's incentives. For example, in a Stackelberg leader-follower framework, it may be prudent for a regulator (leader) to reduce the CO2 tax applied to an imperfectly competitive industry (follower) vis-à-vis perfect competition [16]. Intuitively, a producer with market power will attempt to increase the market-clearing price by withholding output from its (ostensibly) polluting facility, which also reduces GHG emissions. However, a higher CO2 tax may actually exacerbate the welfare losses from curbed production. In such a situation, the tradeoff between the economic and environmental metrics may be assessed by varying the damage-cost parameter to strike the socially optimal balance between the two objectives [17]. With rare exceptions, this branch of the literature typically abstracts from the spatio-temporal details of power sectors, e.g., transmission constraints, heterogeneous firms, and intermittent VRE output, in order to focus on analytical solutions that are amenable to rigorous comparative statics. Hence, in a mirror image of the engineering/OR literature, environmental-economics models may draw policy conclusions that are not underpinned by the fundamentals of power systems.
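The intuition for a lower CO2 tax under imperfect competition can be illustrated with a minimal sketch. It assumes a single-node market with linear inverse demand, constant marginal cost, a constant emission rate, and constant marginal damage; the function name and all parameter values are hypothetical and are not taken from [16].

```python
def optimal_taxes(a, b, c, r, d):
    """Return (welfare-optimal output, tax under perfect competition, tax under monopoly).

    Assumed stylised setting (illustrative only):
      inverse demand  P(q) = a - b*q
      marginal cost   c        ($/MWh)
      emission rate   r        (t/MWh)
      marginal damage d        ($/t)
    """
    # Social optimum: price equals marginal private cost plus marginal damage.
    q_star = (a - c - r * d) / b
    if q_star <= 0:
        raise ValueError("parameters imply zero optimal output")
    # Price takers produce where P = c + r*t, so t = d restores the optimum.
    tax_competitive = d
    # A monopolist produces where marginal revenue a - 2*b*q equals c + r*t.
    # Solving a - 2*b*q_star = c + r*t for t gives the tax that induces q_star.
    tax_monopoly = (a - 2 * b * q_star - c) / r
    return q_star, tax_competitive, tax_monopoly


# Hypothetical parameter values chosen only for illustration.
q_star, t_pc, t_m = optimal_taxes(a=100.0, b=1.0, c=20.0, r=1.0, d=50.0)
print(q_star, t_pc, t_m)
```

Under these assumed numbers, the tax that induces the welfare-optimal output is lower for the monopolist than for price takers, because the monopolist already withholds output.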
Since policymakers in deregulated electricity industries cannot directly intervene in investment and operational decisions, they must resort to guiding socially optimal outcomes via market-based instruments. Besides the aforementioned CO2 tax, FiT, and RPS schemes, other measures include cap-and-trade (C&T) markets for GHG emissions [18] and tradeable emission standards [19]. By determining the levels of these price-, quantity-, or rate-based mechanisms, policymakers can induce a given response by industry, which treats these policy variables as if they were exogenous. In order to be credible, policies should be crafted with both the economic and technical features of the power system in mind. However, such realism necessitates more sophisticated modelling at the interface of heretofore disparate strands of the literature, viz., engineering/OR and environmental economics.
In summary, there are two broad methodological approaches to addressing environmental issues in the power sector:

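Methodology M1 Engineering/OR, which focuses on the power sector's technical and spatio-temporal aspects
Methodology M2 Environmental economics, which aims to reflect how policies impact decision makers' incentives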
While most papers cannot be neatly categorised, they nevertheless lie somewhere on the spectrum between M1 and M2. Furthermore, each approach is appropriate depending on the nature of the research questions. Our survey of the literature also identifies four broad application areas:

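Application A1 Market-based environmental policies
Application A2 Carbon leakage stemming from environmental policy
Application A3 Market power in permit markets
Application A4 Endogenous policy targets for emissions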
Again, it is not clear-cut how to partition application areas precisely. However, broadly speaking, A1 refers to the analysis of environmental policies such as CO2 taxation, C&T schemes, and RPS when producers and consumers are price takers with respect to permit prices. A2 concerns a phenomenon that arises when a jurisdiction with more stringent environmental policy induces its production to relocate to an area with more lax standards. A3, in contrast to A1, takes the view that the power sector includes firms that are large enough to manipulate emission trading system (ETS) or renewable energy certificate (REC) prices through their generation in the primary energy market. Rarer is A4, in which a policymaker explicitly sets environmental targets in anticipation of industry's response, in contrast to the exogenous targets in A1-A3.
As we will highlight in the "Existing State of the Art" section of this review paper, the engineering/OR strand of the literature (M1) abstracts away from the power system's economic realities, viz., conflicting objectives and externalities, whereas the one on environmental economics (M2) shies away from its technical attributes, viz., resource and spatio-temporal constraints. Using this high-level classification of the salient features of the two strands of the literature (M1-M2) and application areas (A1-A4), we will next distil pertinent research gaps and identify methodological requirements to underpin policy analysis of the power sector in the "Research Gaps and Methodological Requirements" section. Based on this "wishlist," the "Conclusions" section will summarise the state of the art, point out critical methodological challenges, and suggest areas in which synergies are possible.

Existing State of the Art
In this section, we review the existing literature from methodological and application perspectives as laid out in the "Introduction" section. Using the classifications A1-A4, we synthesise the main insights in the extant literature. Where possible, we also categorise the work methodologically, i.e., M1 or M2.

Market-Based Environmental Policies (A1)
Papers studying the impacts of market-based environmental policies on the power sector generally consider two types of instruments: tax (price-based instrument) and C&T (quantity-based instrument). A price-based instrument is incorporated by adding the costs to the objective functions of firms' optimisation problems in proportion to the relevant emission rates. A quantity-based instrument is also referred to as a mass-based C&T system when an emission cap (in tonnes, t, or other mass units) is explicitly specified in policies. Whereas a tax (in $/t or $/lb) is fixed, the allowance or permit price under C&T is endogenous and can fluctuate in response to supply and demand conditions in the permit market, which is also affected by the induced permit demand from the product market, i.e., the power sector. The conditions under which these two instruments are equivalent are well established; see, for example, the classic textbooks [20] and [21]. Our discussion hereafter will focus on C&T because of its popularity in the power sector.
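The contrast between a fixed tax and an endogenous permit price can be made concrete with a minimal sketch. It assumes a single price-taking emitter with a linear marginal abatement cost and no uncertainty; the function names and parameter values are hypothetical and serve only to illustrate one simple case in which the two instruments coincide.

```python
def emissions_under_tax(tax, baseline, slope):
    """Emissions (t) chosen by a price-taking emitter facing a tax ($/t).

    Assumes marginal abatement cost = slope * abatement, so the emitter
    abates until marginal abatement cost equals the tax.
    """
    abatement = min(tax / slope, baseline)
    return baseline - abatement


def permit_price_under_cap(cap, baseline, slope):
    """Permit price ($/t) that clears the market when emissions are capped."""
    abatement = max(baseline - cap, 0.0)
    return slope * abatement


# Hypothetical values: 100 t of unregulated emissions, MAC slope of 0.5 $/t per t.
baseline, slope = 100.0, 0.5
tax = 20.0
cap = emissions_under_tax(tax, baseline, slope)        # cap that mimics the tax
price = permit_price_under_cap(cap, baseline, slope)   # endogenous permit price
print(cap, price)
```

In this deterministic sketch, a cap set at the emissions induced by the tax yields a permit price equal to that tax; the result relies on the stated assumptions.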
Seminal works in environmental economics (M2) that tackle externalities are [23] and [24], which aim to identify the allocation of rights and to design externality controls that are also acceptable to stakeholders, respectively. On this basis, either an allocation of pollution rights (under C&T) or a Pigouvian tax may be established to internalise the cost of the externality. In theory, an environmental tax or C&T level should be set such that it equals the marginal social cost of damage associated with the pollutant. Examples of engineering/OR works (M1) that implement CO2 taxes and C&T as part of more detailed models of the power system include [25], [26], and [11]. Typically, such papers use large-scale problem instances that detail a region's spatio-temporal features in order to identify how generation and transmission investment will evolve over time as environmental constraints become more stringent. Rather than identifying optimal environmental policy per se, their main objective is to capture the evolution of the power sector, viz., in terms of generation and transmission expansion, especially as decarbonisation catalyses intermittent output from VRE.
Another variant of market-based instruments that resembles a C&T program is called a performance-based policy and has received some attention. Similar to a mass-based C&T, the instrument sets a performance baseline. A firm or a polluting source receives a subsidy (or pays a fee) when its performance is better (or worse) than the baseline. The approach in environmental economics (M2) is to model a stylised industry in order to obtain closed-form solutions that could be amenable to comparative statics [27]. Based on benchmarking of performance metrics such as social welfare and CO2 emissions, first- and second-best solutions can be methodically compared. Taking this perspective, seemingly counterintuitive results, such as the propensity of RPS to lower prices, can be rigorously unpicked [28]. In a similar vein, [29] use a simple two-state example to study the theoretical properties of the proposed tradeable performance-based policy (Clean Power Plan) and compare it to a mass-based C&T program. The authors distinguish between two kinds of tradeable performance-based policy, viz., regional or state by state. They find that under a state-by-state policy, power prices across states could be different even without any transmission congestion, reflecting the varying stringency of tradeable performance-based standards between states within an interconnected market. Next, via an M1 perspective, they extend the numerical analysis from a three-node case to the Pennsylvania-New Jersey-Maryland (PJM) Interconnection.
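The subsidy-or-fee logic of a performance baseline can be sketched as an adjustment to a unit's marginal cost. The function below is an illustrative assumption rather than the formulation of [27] or [29]; the plant data, permit price, and baseline rate are hypothetical.

```python
def effective_marginal_cost(fuel_cost, emission_rate, permit_price, baseline_rate=None):
    """Marginal cost ($/MWh) including the carbon-policy term.

    baseline_rate=None: mass-based C&T, every tonne emitted needs a permit.
    baseline_rate=s:    performance standard, only the gap between the unit's
                        rate and the baseline s (t/MWh) is priced, so a unit
                        cleaner than the baseline earns an implicit subsidy.
    """
    if baseline_rate is None:
        priced_rate = emission_rate
    else:
        priced_rate = emission_rate - baseline_rate
    return fuel_cost + permit_price * priced_rate


# Hypothetical plants: (fuel cost in $/MWh, emission rate in t/MWh).
plants = {"gas": (30.0, 0.4), "coal": (20.0, 1.0)}
permit_price, baseline_rate = 25.0, 0.6

for name, (fuel_cost, rate) in plants.items():
    mass_based = effective_marginal_cost(fuel_cost, rate, permit_price)
    rate_based = effective_marginal_cost(fuel_cost, rate, permit_price, baseline_rate)
    print(name, round(mass_based, 2), round(rate_based, 2))
```

With these assumed numbers, the gas unit is cleaner than the baseline, so its effective marginal cost falls below its fuel cost, whereas the coal unit pays a fee on the excess of its rate over the baseline.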
Aspects pertaining to the implementation of C&T policies have received some attention as well. Price-containment mechanisms are commonly applied within C&T programs to manage the volatility of allowance prices. These policies set a permit-price collar specifying the floor and the ceiling of permit prices, below or above which permits will be purchased using cost-containment reserves or injected to the market from emission-containment reserves to stabilise permit prices. Examples include program designs in California [30] and the Mid-Atlantic and Northeastern American states' Regional Greenhouse Gas Initiative (RGGI) [31]. Using a blend of M1 and M2, [32] examine the effects of imposing various price-containment mechanisms on investment decisions and spot-market equilibria in an electricity market. The problem is formulated as a two-stage stochastic program with containment mechanisms represented by complementarity conditions. As per [29], they prove general properties of the optimal solution before implementing detailed problem instances. They conclude that the implied carbon costs to society can be significant when the emission-containment reserve is constantly activated to curb permit-price spikes. Price containment, of course, can also be done through allowance banking. Classic studies in the mould of M2 that address allowance banking typically consider multiple periods within the framework of optimal control, e.g., [33], or explicitly model inter-temporal linkage through coupling constraints, e.g., [34], who also blend M1-M2.
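How a price collar can be represented through complementarity conditions may be sketched for a single-period permit market. The linear permit demand, the function name, and the parameter values below are assumptions for illustration only and do not reproduce the two-stage stochastic program of [32].

```python
def collar_outcome(cap, baseline, slope, floor, ceiling):
    """Permit price and reserve action under a price collar.

    Assumes linear permit demand: demand(p) = baseline - p / slope.
    Returns (price, injected, withdrawn) in $/t and tonnes, satisfying
      injected  >= 0, ceiling - price >= 0, injected  * (ceiling - price) = 0
      withdrawn >= 0, price - floor   >= 0, withdrawn * (price - floor)   = 0
    """
    def demand(p):
        return max(baseline - p / slope, 0.0)

    unconstrained = slope * max(baseline - cap, 0.0)  # clearing price without a collar
    if unconstrained > ceiling:
        # Ceiling binds: extra permits are injected until demand is met at the ceiling.
        return ceiling, demand(ceiling) - cap, 0.0
    if unconstrained < floor:
        # Floor binds: permits are withdrawn until the market clears at the floor.
        return floor, 0.0, cap - demand(floor)
    return unconstrained, 0.0, 0.0


# Hypothetical market: 100 t baseline emissions, collar of [10, 30] $/t.
for cap in (20.0, 60.0, 90.0):
    print(cap, collar_outcome(cap, baseline=100.0, slope=0.5, floor=10.0, ceiling=30.0))
```

In this sketch, total emissions equal the cap plus any injected permits, so a frequently binding ceiling raises emissions above the nominal cap.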
The possibility to align public and private incentives via environmental policy is showcased by [35], who use a two-period equilibrium model (M2) of an electricity industry with a thermal plant and a storage-enabled renewable plant. Absent a CO2 tax, the renewable plant's marginal valuation of storage capacity may actually decrease with storage efficiency. Intuitively, greater efficiency would enable more energy to be moved to the peak period, thereby depressing the price when demand is high. Consequently, the renewable producer would attempt to offset the socially desirable but profit-damaging impact of greater storage efficiency by shifting energy to the off-peak period [36]. In order to remedy this undesirable behaviour, an exogenous CO2 tax on emissions from thermal generation induces a ceteris paribus increase in renewable output in the off-peak period due to the higher marginal cost of preserving stored energy for deployment in the peak period. Thus, when the device efficiency is increased exogenously under a CO2 tax, it becomes less attractive to divert renewable energy to the off-peak period because it is already heavily deployed then. Hence, shifting renewable energy to the peak period appears enticing and enables the marginal valuation of storage to increase with device efficiency over a broader range.
The coordination of energy and environmental policies is another relevant perspective that has received less attention in the literature. Because of an increasing need for the contribution of renewables to power-system adequacy, many countries allow VRE to participate in capacity markets, where remuneration is made to incentivise sufficient investment in generation infrastructure [37]. Thus, VRE may receive a payment from capacity markets in addition to subsidies such as tax credits, FiT, and REC. [38] assess how inaccurate capacity credits of VRE interact with renewable tax credits and REC. They argue that the capacity de-rating practice of VRE adopted in the Electric Reliability Council of Texas (ERCOT) is not effective, yielding a significantly greater capacity value than the actual contribution to system adequacy. This, in turn, leads to over-subsidisation of VRE through the capacity market. Taking the M1 approach with ERCOT data, they find that the largest efficiency loss occurs when over-counting of capacity credits for VRE is coupled with the US renewable tax credits, primarily due to over-subsidisation of solar photovoltaic generation, which is already heavily subsidised.
Another substrand of A1 examines the impact of environmental policy on welfare and emissions when firms are able to exert market power in the electricity market while behaving non-strategically with respect to the market-based environmental policy, i.e., essentially taking the CO2 tax, ETS price, or REC price as exogenous. This perspective acknowledges the persistence of market power in the electricity industry, which is marked by a high concentration of ownership and spatio-temporal opportunities to create scarcity strategically. Analyses of market power in the electricity market alone typically examine the impact that such leverage can have on economic and environmental metrics as the exogenous environmental policy is varied. In an equilibrium model of an oligopolistic electricity industry subject to C&T regulation, [18] demonstrate that when relatively clean plants belong to a few firms, their withholding of generation leads to expanded output by firms with relatively dirty portfolios. As a result, both electricity and permit prices can end up increasing. Taking the M1 approach, they use a 225-node model of the Western Electricity Coordinating Council (WECC) system. In a similar vein, [40] develop a Nash-Cournot model of the New York-Québec power sector subject to a C&T cap that corresponds to a portion of the RGGI coverage for New York. The exertion of market power in this setting effectively involves temporal arbitrage in hydropower by Hydro-Québec and withholding in general by large firms in New York, which causes the permit price to crash even as electricity prices soar. By contrast, a less stringent C&T cap leads to a milder impact on electricity prices from Cournot behaviour as price-taking fossil-fuelled fringe plants can respond to keep prices in check. Meanwhile, treating the EU ETS permit price as an exogenous CO2 tax, [41] examine the impact of permit allocation and Cournot behaviour on electricity prices across 20 European countries. Also using a network-constrained model, they show that higher ETS permit prices boost electricity prices for consumers and result in windfall profits for producers. [42] also treat the EU ETS price as a CO2 tax in a Nash-Cournot model of Nord Pool to identify how strategic hydro operations would be affected by a future climate package. Under a high CO2 price and doubled VRE capacity, temporal arbitrage by large hydro reservoirs [43] would become relatively more profitable as flexible resources would enjoy greater leverage in the presence of more intermittent generation and inhibited response by price-taking fossil-fuelled plants. Still treating the CO2 price as an exogenous tax but endogenising investments in generation capacity, [44]'s open-loop Nash-Cournot analysis of the German power sector confirms the result of [16] that imperfect competition requires lowering the pollution tax.
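The mechanics of treating the permit price as an exogenous CO2 tax in a Nash-Cournot setting can be sketched with a two-firm, single-node example. This is a textbook closed form under assumed linear demand and constant marginal costs, not the network-constrained models of [40]-[44]; all numbers and names are hypothetical.

```python
def cournot_duopoly(a, b, costs, rates, tax):
    """Nash-Cournot outputs for two firms facing an exogenous CO2 tax.

    Assumes inverse demand P = a - b*(q1 + q2), constant marginal costs, and
    an interior equilibrium (both firms produce); raises otherwise.
    """
    # Tax-inclusive marginal costs: fuel cost plus tax times emission rate.
    k1, k2 = (c + tax * r for c, r in zip(costs, rates))
    q1 = (a - 2 * k1 + k2) / (3 * b)
    q2 = (a - 2 * k2 + k1) / (3 * b)
    if q1 <= 0 or q2 <= 0:
        raise ValueError("corner solution: closed form does not apply")
    price = a - b * (q1 + q2)
    emissions = rates[0] * q1 + rates[1] * q2
    return (q1, q2), price, emissions


# Hypothetical firms: coal (cost 10 $/MWh, 1.0 t/MWh) and gas (20 $/MWh, 0.4 t/MWh).
costs, rates = (10.0, 20.0), (1.0, 0.4)
no_tax = cournot_duopoly(100.0, 1.0, costs, rates, tax=0.0)
with_tax = cournot_duopoly(100.0, 1.0, costs, rates, tax=20.0)
print(no_tax)
print(with_tax)
```

Under these assumptions, the tax raises the coal firm's effective marginal cost by more than the gas firm's, so the gas firm expands output while the electricity price rises. Network effects, which drive several of the results discussed above, are deliberately omitted.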
Environmental policies can impact power-system reliability through the types of technologies that are promoted. For example, the intermittence of VRE, e.g., wind, can pose a significant challenge to the operations of the power system. An ISO may launch a new market, such as the CAISO Flexible Ramping Product, in order to enhance the flexibility of the system when facing variation of output from wind. In this context, [45] consider ramping charges to mitigate strategic behaviour by a merchant storage investor (M2), while [46] address ramp-rate withholding (M1). Stylised models in the spirit of M2 also tackle market power in electricity markets when environmental policy is taken as exogenous. Focusing on the Swedish RPS scheme, [47] show that the exercise of market power à la Cournot would cause both electricity and REC prices to soar. In order to mitigate this impact of market power, they propose allowing producers from neighbouring countries to participate in the Swedish RPS. Demonstrating a paradoxical result due to the imposition of a CO2 tax, [48] uses a closed-loop Cournot model of a two-node network with a single transmission line. Absent a CO2 tax, the coal plant exports power to the node with the gas-fired plant, thereby congesting the line. In effect, each plant behaves as a monopolist at its node while anticipating the ISO's optimal dispatch. Thus, each plant solves a bi-level problem, which can be reformulated as a mathematical program with equilibrium constraints (MPEC). Together, the plants' MPECs comprise an equilibrium problem with equilibrium constraints (EPEC). When an exogenous CO2 tax is introduced, it increases the effective marginal cost of generation of the coal plant by more than it does the gas plant's. Consequently, the line becomes uncongested, and the gas plant actually increases its generation, thereby leading to an overall increase in CO2 emissions. Although the finding appears counterintuitive, it can be explained by the fact that the CO2 tax converts two monopolies into a duopoly, i.e., alleviating producers' market power. Hence, environmental policy needs to be designed with care in the presence of market power and network constraints.

Carbon Leakage Stemming from Environmental Policy (A2)
Sub-regional climate policy has been a concern to policymakers as climate-change impacts go beyond local jurisdictional boundaries. The implementation of sub-regional policies creates perverse incentives for polluting generators located in unregulated regions within the same power market to ramp up their output, thereby causing the emission-leakage problem. In essence, polluting producers in unregulated regions are enticed by the ensuing high electricity price in the regulated region. Thus, they stymie the emission-reduction efforts of the regulated region. For example, if the regulated region reduces its own emissions by 100 t but the (partial-coverage) C&T policy induces the unregulated region's emissions to increase by 10 t, then so-called relative leakage of 10% occurs.
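The relative-leakage arithmetic in the example above can be written as a short helper. The function name is hypothetical; the definition follows the 100 t and 10 t example in the text.

```python
def relative_leakage(regulated_reduction_t, unregulated_increase_t):
    """Relative leakage: the increase in unregulated-region emissions (t)
    as a share of the emission reduction achieved in the regulated region (t)."""
    if regulated_reduction_t <= 0:
        raise ValueError("regulated-region reduction must be positive")
    return unregulated_increase_t / regulated_reduction_t


# The example from the text: a 100 t reduction offset by a 10 t increase.
leakage = relative_leakage(100.0, 10.0)
print(f"{leakage:.0%}")  # prints 10%
```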
[49] studies this issue in the RGGI context by formulating a least-cost economic dispatch model of type M1 to quantify the extent of emission leakage under different levels of carbon prices. The findings are also supported by empirical work [50]. Using a bottom-up equilibrium approach (M1) to probe the interface between EU ETS and non-ETS countries in the South-East Europe Regional Electricity Market (SEE-REM), [51] find relative leakage rates between 6.3% (for an ETS price of €50/t) and 40.5% (for an ETS price of €10/t). In other words, a low ETS price would lead to high relative leakage because of the ramping up of polluting generation in non-ETS countries. Subsequent increases to the ETS price would attract incrementally more non-ETS polluting generation until it had exhausted capacity to respond to the ETS price. Thus, the rate of relative leakage decreases as the ETS price increases. Moreover, greater hydropower availability may actually increase emissions in the ETS part of SEE-REM by lowering the electricity price and boosting consumption.
Taking a mix of M1 and M2 approaches, [52] demonstrates how carbon leakage may be mitigated with imperfect competition in the primary electricity market and price-taking behaviour for CO2 permits. Her model is inspired by the California power sector, which generally has a cleaner portfolio than those of its neighbouring states. She first proves analytically via an M2 approach the conditions under which CO2 emissions are reduced as a result of incomplete policy coverage. Next, she conducts a simulation using a more detailed M1 analysis of the California power sector. In particular, she finds that strategic behaviour in a Cournot oligopolistic electricity market subject to incomplete emission policy reduces emission leakage vis-à-vis perfect competition. In the California context, this means that while incomplete regulation in an imperfectly competitive market would achieve 35% of the emission reduction of complete regulation, this number drops to 25% for a perfectly competitive market.
Another strand of this research examines policy design to circumvent emission-leakage problems, including point-of-regulation [53], efficient border adjustment [54], and output-based allowance allocation [55]. Specifically, [53] investigate the consequences of California AB32 and conclude that load- and source-based regulation would be equivalent. They prove solution properties formally (M2) and use numerical examples of their equilibrium model (M1) to quantify the extent of the carbon leakage. [55] similarly find via a bottom-up equilibrium model that California's first-deliverer policy could induce resource reshuffling, thereby calling into question its effectiveness vis-à-vis source-based regulation. In effect, border carbon adjustment can mitigate leakage by dampening the economic incentives for utilisation of polluting technologies, as [54] conclude in their detailed model of type M1 of California and Western North America.
Different regional power markets have also had their C&T programs administratively linked by state governments to explore heterogeneity in abatement costs. Examples include California's agreement to link its C&T to Québec, which is confronting legal challenges brought by the Trump Administration [56]. As the linkage represents a sizeable transfer of wealth between regions, it is subject to fairly detailed analysis [57]. Finally, linkage of a C&T within the power sector to other sectors has also received some attention, e.g., [34] study this in the context of the PJM regional power market using a mix of M1 and M2 through a residual supply curve from other programs or sectors.
The inefficiencies from overlapping or uncoordinated environmental policies in the power sector have also been analysed. This is particularly relevant in the USA, where the federally led effort is in limbo. Examples include policies with different market-based instruments competing in the same regulatory landscape, such as C&T and RPS. [58] blend M1 and M2 perspectives to examine market outcomes when C&T and RPS measures are concurrently implemented, which is fairly common in US states. They use California data to find that making one policy more stringent would weaken the market incentive that the other policy relies upon to attain its intended target. Other policies include, for example, the Internal Revenue Code 45Q introduced by the US government to incentivise deployment of carbon capture, utilisation, and storage (CCUS) along with other eligible projects. It was further expanded under the 2022 Inflation Reduction Act to stimulate CCUS investment. Its interactions with existing C&T and RPS protocols could be incorporated into both M1 and M2 equilibrium frameworks in the context of sector coupling [59]. Thus, careful coordination is needed to mitigate the impacts of the inefficiency. The benefit of such policy coordination is highlighted in [60] and [61], where both papers formulate bi-level problems (primarily M2 and M1, respectively) to optimise the regulation.

Market Power in Permit Markets (A3)
The exercise of market power in the permit markets has received relatively less attention in the literature, while strategic behaviour of dominant firms in wholesale electricity markets remains a concern [39]. Although less investigated, a strand of the literature has focused on the interaction of electricity and permit markets both subject to market power. For example, empirical work by [62] suggests that during 2000 and 2001, oligopolistic firms strategically raised the NOx permit price in the C&T system with grandfathering adopted in California's Regional Clean Air Incentives Market (RECLAIM), which resulted in higher offers into the California electricity market. By contrast, Chapter 5 of [63] argues that the permit price may be manipulated in the opposite direction when the initial permits are auctioned instead of grandfathered. Using a Stackelberg framework with a stylised three-node example (M1), they demonstrate that the leader firm could have an incentive to suppress the permit price to reduce its burden of purchasing the permits, thereby exerting oligopsony market power.
From a methodological perspective, one approach to studying strategic behaviour in the permit markets is a conjectured-price response model, in which a generating firm makes an assumption about how changes in net permit purchases affect the permit price [64]. This method would be sensitive to the modelling assumptions of a firm's belief about the market power it can exercise in the permit market. Another approach that has been proposed is to model dominant or leader (e.g., Stackelberg-type leader) firms that can directly manipulate the permit price in their favour through the permit market. This situation is captured by incorporating the market-clearing condition for permits, usually represented as a complementarity condition, into the constraints of the profit-maximisation problem for the dominant/leader firms [63]. In this line of research, [65] consider a Cournot firms-competitive fringe structure, where Cournot generators behave strategically, taking into account the equilibrium condition for emission permits in a C&T system and the fringe firms' price-taking behaviour. Their closed-form analysis (M2) shows that diverting grandfathered permits from Cournot to fringe producers can mitigate market power by reducing both electricity and permit prices. The result is supported by a simulation analysis of the California electricity market (M1).
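The conjectured-price response idea can be sketched for a single firm that chooses its emissions while believing that the permit price moves with its own net purchases. The quadratic abatement cost, the symbol theta for the conjecture, and all numbers are assumptions for illustration and are not the formulation of [64].

```python
def emissions_with_conjecture(permit_price, baseline, slope, grandfathered, theta):
    """Emissions (t) chosen under a conjectured permit-price response.

    The firm minimises
        0.5*slope*(baseline - e)**2 + price*(e - grandfathered),
    believing the price rises by theta ($/t per t) for each extra permit it buys.
    First-order condition (interior solution assumed):
        -slope*(baseline - e) + price + theta*(e - grandfathered) = 0
    Setting theta = 0 recovers price-taking behaviour.
    """
    return (slope * baseline - permit_price + theta * grandfathered) / (slope + theta)


# Hypothetical firm: 100 t baseline, MAC slope 2, 50 t grandfathered, price 40 $/t.
price_taking = emissions_with_conjecture(40.0, 100.0, 2.0, 50.0, theta=0.0)
strategic = emissions_with_conjecture(40.0, 100.0, 2.0, 50.0, theta=2.0)
print(price_taking - 50.0, strategic - 50.0)  # net permit purchases in each case
```

In this sketch, a net buyer with a positive conjecture purchases fewer permits than a price taker, which illustrates how sensitive the outcome is to the assumed belief theta.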
On the other hand, [66] examine a Stackelberg leader-follower model applied to the PJM Interconnection, in which the largest firm is assumed to play the role of the leader, anticipating the NOx permit equilibrium and the followers' reaction, including that of the ISO. They solve the resulting MPEC (M1) for a 14-node system by combining different algorithms in sequence and suggest that the leader can gain substantial profits by driving up the NOx permit price under a grandfathering scheme. In a similar vein, [67] also use a leader-follower framework to investigate potential carbon leakage in Europe under the EU ETS with a permit auction. They obtain solutions by transforming their MPEC into a mixed-integer quadratic programming (MIQP) problem for a 22-node network of SEE-REM. They demonstrate the propensity of the leader firm to lower the CO2 permit price, thereby increasing ETS emissions and exacerbating carbon leakage.
The interaction of electricity and permit markets under market power has also been examined in the context of RPS policy and a REC market. Amundsen and Nese [68•] model market power in the green-certificate market with a conjectured-price-response assumption (M2). Their analytical model of interactive gaming in the electricity and green-certificate markets indicates that the certificate price may be manipulated to either a low or a high level in equilibrium. A similar result is obtained in [69], who use a dominant firm-competitive fringe framework to explore the interaction of the electricity and REC markets when both are subject to market power. They assume that a non-renewable dominant firm strategically takes account of the market-clearing condition for RECs and the price-taking behaviour of a renewable fringe firm. Their closed-form solutions reveal that the non-renewable firm has an incentive to lower the REC price, even to zero, in order to avoid the burden of REC costs. The lower REC price may lead to underinvestment in renewables in the long run. [19] pursue another direction by studying the market power associated with performance-based emission policy. Based on a leader-follower model that blends M1 and M2, they illustrate that a leader firm with a clean endowment under the tradeable performance-based standard manipulates both electricity and permit prices, thereby resulting in worse market outcomes compared to its C&T counterpart.

Endogenous Policy Targets for Emissions (A4)
Many studies treat environmental policy as exogenous, whereas the assessment of endogenously determined policy targets has received scant attention, partly because the policymaking process tends to be subject to political negotiation rather than economic principles. Indeed, in the real world, policymakers propose legally binding targets for reductions in GHG emissions, e.g., the EU 20-20-20 targets [70], or RPS targets, e.g., California's 60% target of RE sources for the year 2030 [71]. However, the economic foundation of such environmental policy targets has rarely been investigated in the research community.
Notable exceptions include [16] and [14], who use M2-type models to examine how the exercise of market power by polluting firms affects the socially optimal pollution tax. In such models, a welfare-maximising policymaker is the leader and determines the tax in anticipation of profit-maximising firms at the lower level, who take the pollution tax as exogenous. In particular, imperfect competition could invalidate the optimality of a Pigouvian tax on CO2 emissions. Intuitively, this deviation arises because society faces two distortions from a polluting plant operated by a firm with market power: (i) economic welfare losses from price manipulation and (ii) environmental damage from emissions. In general, a firm that exerts market power tends to withhold generation in order to raise prices above perfectly competitive levels. This economic impact is somewhat offset by lower CO2 emissions vis-à-vis perfect competition. Thus, a lower pollution tax (or price) would be preferred under imperfect competition.
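The direction of this adjustment can be seen in a minimal single-firm sketch, which is an illustrative assumption rather than the formulation used in [14] or [16]. Suppose a monopolist produces output q at cost C(q), faces inverse demand P(q) with P'(q) < 0, emits at a constant rate r per unit of output, and pays a tax t per unit of emissions, while D(e) denotes the damage from emissions e = rq. All symbols are introduced here for illustration only.

```latex
\begin{align}
\text{Firm:}\quad & \max_{q}\; P(q)\,q - C(q) - t\,r\,q
  &&\Rightarrow\; P(q) + P'(q)\,q - C'(q) - t\,r = 0, \\
\text{Society:}\quad & \max_{q}\; \int_0^{q} P(s)\,\mathrm{d}s - C(q) - D(r\,q)
  &&\Rightarrow\; P(q) - C'(q) - r\,D'(r\,q) = 0, \\
\text{Matching the two:}\quad & t^{*} = D'(r\,q^{*}) + \frac{P'(q^{*})\,q^{*}}{r} \;<\; D'(r\,q^{*}).
\end{align}
```

Under these assumptions, the welfare-maximising tax falls short of marginal damage by a term that reflects the firm's markup. That term vanishes when the firm is a price taker, in which case the tax reverts to the Pigouvian level.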
In the extant literature on RPS policy, the RPS target has mostly been treated as exogenous. [72] study RE investment incentives using a real options approach under exogenous stochastic processes for electricity prices and support payments. In a Nordic case study, they conclude that RPS policy creates incentives for a large RE project, although the investment timing is delayed. [73] consider a stylised static equilibrium model (M2) for the electricity market taking into account both REC and CO2 emission permits. They perform a comparative analysis and show how RE output and capacity are affected by the exogenous RPS target. With a similar framework, [28] examines the impact of exogenous RPS requirements on the equilibrium electricity price. She finds that modest RPS targets may initially lower the equilibrium electricity price since the REC price serves as a subsidy for RE. However, more stringent RPS targets reduce non-renewable energy (NRE) production, thereby putting upward pressure on the equilibrium electricity price. The aforementioned studies of [68•] and [69] that take account of market power also set RPS targets exogenously.
To frame hierarchical decision making by policymakers and private firms, a strand of this literature develops bi-level models that take the perspective of a policymaker at the upper level who anticipates industry's investment and dispatch decisions at the lower level. In the context of RPS policy, [74] assume that a welfare-maximising policymaker acts as the leader to set the optimal RPS target, which is taken as given by follower power companies. They abstain from modelling power flows in transmission lines and other spatio-temporal features. Their closed-form solutions (M2) indicate that the optimal RPS target for a perfectly competitive electricity industry is higher than that for a centrally planned benchmark. Moreover, the optimal RPS target is lower than that under perfect competition when allowing for market power by the NRE sector. Their results imply that ignoring the interaction between RPS requirements and the market structure may lead to suboptimal RPS targets with welfare losses. Chapter 8 of [63] extends the framework of [74] by incorporating engineering details of power flows and generation-capacity expansion. The authors recast the bi-level model as a mixed-integer quadratically constrained quadratic program (MIQCQP) to obtain the optimal RPS target numerically (M1). A similar analysis for the optimal RPS target can be found in Chapter 6 of [75], which considers other technical aspects such as the ramping constraints of generation units and the temporal factors of seasons and time periods.
In the existing literature, exogenous treatments of environmental policy have also been common for economic instruments other than RPS, e.g., FiT [76], carbon tax [48], C&T [41], and tradeable performance-based standard [77]. By contrast, [78] use a bi-level model (a blend of M1 and M2) with a policymaker at the upper level who endogenously decides the optimal FiT or feed-in-premium (FiP). They show that the FiT and FiP schemes yield similar outcomes in terms of maximised social welfare in a model that includes renewable generation expansion but does not consider transmission constraints. [79] formulate a bi-level model for endogenous tax policy with engineering details of power flows and generation-expansion planning (M1). However, in their model, an upper-level policymaker minimises the total cost of policy intervention such as carbon taxation instead of maximising social welfare. On the other hand, [17] consider a bi-level model in the context of transmission planning and welfare maximisation. Using a stylised two-node model (M2), they prove that a carbon charge lowers social welfare in imperfectly competitive electricity markets as the resulting reduction in consumption facilitates further exercise of market power. In another direction, [80] formulate a two-node bi-level model (M2) to devise a proactive C&T policy by taking account of CO2 emission leakage. From a policymaker's perspective, they derive the socially optimal emission caps for C&T under different regional-coverage policies. Another instance of bi-level modelling, applied to a tradeable performance-based standard, is found in [61], in which a welfare-maximising policymaker regulates the average emission rate of power producers. Their analytical results based on a transmission-constrained power market with an arbitrary number of states (nodes) (M1) indicate that localised heterogeneous regulation can be superior to system-wide homogeneous regulation.
The taxonomy for representative works is summarised in Table 1. Here, "*" denotes entries that overlap methodological boundaries.

Research Gaps and Methodological Requirements
Research on environmental policy in the power sector has burgeoned in recent years due to both climate targets necessitated by regulation and advances in underpinning methodology. Indeed, the advent of market-based mechanisms has generated a need for identifying disparate agents' incentives and responses to policy in a decentralised industry. Such a setup lends itself to analysis by non-cooperative game theory, which has been a mainstay of engineering/OR (M1) and environmental economics (M2) alike. Thus, the envisaged electrification of the wider economy, i.e., sector coupling, could both pose new challenges and create potential for coordinating mechanisms. Consequently, existing methodology will also have to adapt to a new paradigm in order to provide a credible basis for policy evaluation. Here, we identify several research gaps in the extant literature along with corresponding methodological enhancements that would be necessary for bolstering the framework for analysis.

Research Gap RG1 Non-cooperative and cooperative game theory to identify coordination mechanisms, e.g., between sectors and agents
Electrification of the wider economy foresees a greater role for a VRE-enabled power sector. In principle, such so-called sector coupling could not only decarbonise heating, industrial processes, and transport [81] but also leverage such sectors' flexibility, e.g., via load shifting and heat storage, to mitigate the intermittency of VRE output. In case of full coordination over information and resources among sectors, an equilibrium problem in which each perfectly competitive sector pursues its own objective yields the same solution as central planning of all sectors [82]. However, this fortuitous outcome may not hold if either agents within sectors exert market power [83] or there is imperfect coordination over information and resources [84]. Consequently, unlocking the synergies from sector coupling will require more sophisticated methodology in order both to identify conflicts and to propose coordination mechanisms.
Mitridati et al. [85••] take a bi-level approach in which the flexibility of the heating sector can be made available for the power sector. In essence, the day-ahead dispatch of the heating sector in advance of the day-ahead electricity market clearing (as in the Nordic countries) should anticipate the requirements of the power sector for flexibility. However, in practice, the two markets are cleared sequentially, which necessitates an inefficient redispatch of the heating sector in order to accommodate the power sector's realised operations. By contrast, a stochastic bi-level model can proactively schedule the heating sector's operations, which serves as a suitable compromise between the industry's existing sequential market clearing and an idealised integrated solution. A similar concept is devised for improved coordination between day-ahead and real-time dispatch involving the power and natural-gas sectors [86]. In such stochastic bi-level models with technical and spatio-temporal details (M1), solution of large-scale problem instances via either decomposition or directly as mixed-integer linear programs (MILPs) yields insights about price- or quantity-based coordination mechanisms between sectors. Moreover, relaxing the assumption of perfect competition in either sector could be reflected by adjusting the relevant objective functions [87] and adapting the resulting coordination mechanism (M2).
While such non-cooperative game-theoretic approaches can identify instances of imperfect coordination and assess the performance of proposed mechanisms, the field of cooperative game theory [88] formalises the notion of the Shapley value [89] to reward incremental contributions by entities to a cooperative effort. It is, therefore, especially useful when self-interested agents' benefits and costs depend upon the extent of cooperation over the exploitation of shared resources. Instances abound of the application of cooperative game theory in the power sector, e.g., investment in transmission capacity for North Sea off-shore wind projects [90••], community-based heat and power systems [91], and power-sector decarbonisation in the presence of externalities [92]. Extending such frameworks to resolve conflicting objectives over externalities and sharing of resources, e.g., between sectors, could be fruitful in uncovering coordination mechanisms to leverage the benefits of electrification.
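To make the Shapley value concrete, the following minimal sketch computes it for a hypothetical three-sector game by averaging each sector's marginal contribution over all orders in which the coalition could form. The sector names and the savings assigned to each coalition are invented for illustration and are not drawn from [90••], [91], or [92].

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += value(joined) - value(coalition)
            coalition = joined
    n_orders = factorial(len(players))
    return {p: totals[p] / n_orders for p in players}


# Hypothetical savings (arbitrary units) achieved by each coalition of sectors.
SAVINGS = {
    frozenset(): 0.0,
    frozenset({"power"}): 0.0,
    frozenset({"heat"}): 0.0,
    frozenset({"transport"}): 0.0,
    frozenset({"power", "heat"}): 60.0,
    frozenset({"power", "transport"}): 40.0,
    frozenset({"heat", "transport"}): 0.0,
    frozenset({"power", "heat", "transport"}): 120.0,
}


def savings(coalition):
    """Characteristic function: look up the savings of a coalition."""
    return SAVINGS[frozenset(coalition)]


PLAYERS = ("power", "heat", "transport")
phi = shapley_values(PLAYERS, savings)
```

In this toy game, the allocations sum to the savings of the grand coalition, and the sector whose participation unlocks the most value in every ordering receives the largest share. Enumerating all orderings is only practical for a handful of players, so larger applications would need sampling or a more structured characteristic function.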

Research Gap RG2 Bi-level programming solution methods to allow for endogenous environmental regulation of large-scale systems
Environmental problems in the power sector naturally lend themselves to a bi-level framework as a regulator typically wishes to set policy in anticipation of industry's behaviour. While environmental economics (M2) renders such leader-follower problems in terms of a Stackelberg model and derives analytical solutions for the purpose of conducting comparative statics [16,74], such a perspective is limited by its inability to capture the full range of spatio-temporal variability in a power system. For this reason, the engineering/OR literature (M1) aims to reformulate bi-level problems as either MPECs or mathematical programs with primal and dual constraints (MPPDCs), which may subsequently be posed as MILPs, MIQPs, or MIQCQPs.
While MILPs, MIQPs, and MIQCQPs may be tackled directly by commercial solvers such as CPLEX or Gurobi [12,78,93,94], handling realistic test networks becomes a challenge. For example, even when an MPEC comprises an LP at the lower level, the resulting MILP reformulation necessitates tuning parameters, M, to implement disjunctive constraints that resolve complementarity conditions [95]. While the big-M values corresponding to primal constraints of the lower level may be guessed ex ante because they are bounded by natural restrictions, e.g., on capacity, dual variables' big-M values are less straightforward to pin down. In effect, because dual variables are typically related to prices, their values are realised ex post. As [96•] argue, trial-and-error tuning heuristics for big-M values can break down and lead to suboptimal solutions if initial guesses for those corresponding to the dual variables are artificially restrictive. Setting arbitrarily large big-M values may seem like a sensible procedure for bypassing such issues, but this choice can itself be subject to errors as poor scaling can introduce numerical instabilities. As a result, solvers may report "optimal" solutions that may not even be feasible. Instead, special ordered sets of type 1 (SOS1) are proposed by [97], which avoid the use of big-M values. The introduction of storage will only further add to the complexity of solving bi-level problems efficiently as a sufficient number of representative periods and linking constraints need to be retained in order not to wreck the chronology of the time horizon under analysis [98].
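For concreteness, the disjunctive reformulation discussed above can be written out for a single generic complementarity condition. The notation (constraint slack g(x), multiplier λ, binary variable u, and constants M^p and M^d) is introduced here for illustration and is not taken from [95] or [97]; in the SOS1 form, g(x) is assumed to be represented by an explicit slack variable.

```latex
\begin{align}
& 0 \le g(x) \;\perp\; \lambda \ge 0
  && \text{(lower-level complementarity condition)} \\
& 0 \le g(x) \le M^{p}\,u, \qquad 0 \le \lambda \le M^{d}\,(1-u), \qquad u \in \{0,1\}
  && \text{(big-}M\text{ form)} \\
& g(x) \ge 0, \quad \lambda \ge 0, \quad \{\,g(x),\ \lambda\,\}\ \text{declared as an SOS1 set}
  && \text{(SOS1 form, no } M \text{ required)}
\end{align}
```

The big-M form is equivalent to the original condition only if M^p and M^d are valid upper bounds on g(x) and λ at the solution being sought. An M^d that is too small silently excludes solutions with high dual values, i.e., high prices, whereas the SOS1 form enforces that at most one of the pair is non-zero without requiring any bound.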
Besides SOS1, resolution methods such as branch & bound [99], decomposition [100], and parametric programming [101] hold promise for tackling bi-level problems. Of these approaches, decomposition has seen the most widespread application, especially to problems involving strategic behaviour in the power sector. For example, [102] use Benders decomposition to analyse strategic investment by a producer in a power sector under uncertainty. Although they have several hundred scenarios and several dozen nodes in their problem instances, they obtain reliable solutions much faster than by solving a non-decomposed MPEC. They attribute the applicability of Benders decomposition to the fact that a strategic producer's profit function at the upper level is sufficiently convex in its investment decisions. Intuitively, a strategic producer is able to adapt its offers to the electricity market if increased investment leads to the shutdown of more costly units. Such an option is not available to a non-strategic producer, which causes its profit function to be flat with respect to the investment decisions. However, if the situation were reversed, i.e., there were a welfare-maximising agent (such as a regulator) at the upper level and strategic behaviour (by firms) at the lower level, then the convexity of the upper-level objective function in the upper-level decision would not be evident. Thus, adapting Benders or other decomposition methods to reflect this regulatory perspective would be a promising area for future research. Likewise, using the structure of the lower-level problems as in branch & bound and parametric programming to "discard" unviable candidate solutions would enable the efficient solution of large-scale problem instances.

Research Gap RG3 Multi-sector analysis, e.g., by linking bottom-up power-sector and top-down computable general equilibrium (CGE) models, to address broader impacts on the economy due to the energy transition
The fact that most existing C&T programs encompass sectors other than the power sector means that it is imperative to consider the effect of other sectors on environmental policy in the power sector. There are two channels through which other sectors can affect C&T programs. One is through changes in the supply of or demand for allowances, which impact allowance prices. Within a two-sector framework, this can be done by explicitly modelling residual allowance supply/demand curves. For example, [34] develop a model of permit banking under imperfect competition and imperfect inter-temporal arbitrage when the program is linked to another sector. However, the so-called "other sector" is an aggregate of many sectors, so the precise interaction among those sectors cannot be explicitly examined.
The second way is to directly integrate a power-sector model with a top-down economic model such as CGE. The benefit is that interactions among and with sectors other than the power sector can be examined through changes in factor prices. However, such integration is subject to a number of challenges [103]. First, CGE models are calibrated based on input-output (IO) data or a social-accounting matrix, which represents "financial flows" rather than physical flows, e.g., in MWh, among the sectors [104]. Second, CGE models describe a snapshot of the economy with a 1-year resolution. Consequently, coupling them with power-system models based on much finer temporal resolutions poses significant difficulties. Third, the sectoral definition of CGE is not aligned with the power sector's structure. For instance, there is no sector explicitly defined as an ISO in the social-accounting matrix in the IO data. In effect, distribution and transmission are generally lumped together with generation as a single sector in the IO data.
A soft-link approach is commonly used to address this challenging issue. Soft linking involves iteratively solving CGE and corresponding power-sector models until consistent results are realised, e.g., quantities demanded under a common set of prices [105,106]. However, due to inconsistencies in behavioural assumptions and the innate heterogeneity of the models, it is difficult to achieve overall consistency and coherence [107]. An emerging approach is called "integrated" or "hybrid," which entails formulating the market equilibrium (both CGE and the power-sector model) as mixed-complementarity problems [107]. In this respect, [108] recognises the difficulty of constructing databases that integrate macroeconomic data with engineering information. Comparing different coupling approaches, [109] find that being able to integrate CGE and engineering-based models in a coherent and consistent way is one of the biggest strengths of the integrated approach. Nevertheless, they acknowledge that the dimensionality and algebraic complexities remain the main limitations that prevent hybrid models from having a real application. Thus, developing efficient algorithms to address the dimensionality of the resulting large-scale real-system applications will advance our knowledge in this area, especially as electrification integrates the power sector with those for heating, industrial processes, and transport.
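The iterative logic of soft linking can be sketched with two deliberately tiny stand-ins: a "top-down" linear demand response and a "bottom-up" linear marginal-cost curve. Both functions, their parameters, the starting price, and the damping factor are hypothetical and bear no relation to the models in [105,106]; the sketch only illustrates the exchange of prices and quantities until they stop changing.

```python
def top_down_demand(price, a=100.0, b=0.5):
    """Stand-in for a CGE model: electricity demand at a given price."""
    return max(a - b * price, 0.0)


def bottom_up_price(quantity, c0=10.0, c1=0.8):
    """Stand-in for a power-sector model: marginal cost of serving demand."""
    return c0 + c1 * quantity


def soft_link(price0=50.0, damping=0.5, tol=1e-8, max_iter=1000):
    """Pass prices and quantities back and forth until the price settles."""
    price = price0
    for iteration in range(1, max_iter + 1):
        quantity = top_down_demand(price)
        new_price = bottom_up_price(quantity)
        if abs(new_price - price) < tol:
            return new_price, top_down_demand(new_price), iteration
        # Damped update: move only part of the way towards the new price.
        price = (1 - damping) * price + damping * new_price
    raise RuntimeError("soft link did not converge")


price, quantity, iterations = soft_link()
```

In this toy setting the damped update is a contraction, so the iteration settles on a price and quantity that are consistent across both stand-in models. Real model pairs offer no such guarantee, which is one reason why consistency is hard to achieve in practice.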

Research Gap RG4 Fundamental analysis of the role of uncertainty in environmental economics models, e.g., about the abatement cost of emissions
Market-based environmental policies may have different impacts on emission outcomes when regulatory instruments are subject to uncertainty. A well-known discussion in environmental economics (M2) is "price vs. quantity", i.e., the choice between an emission tax and a C&T scheme when the abatement cost of emissions is uncertain to the regulator. If there is no uncertainty in the abatement cost, then both tax and C&T regulation can achieve the same target for emissions [110]. However, the regulator may either overestimate or underestimate the abatement-cost function because of uncertain factors, thereby misidentifying the efficient emission level. In a classic analysis, [111] demonstrates that under uncertainty, a price-based instrument such as a tax is preferred, with smaller efficiency losses, when the marginal cost of abatement is more steeply sloped than the marginal benefit of abatement. By contrast, a quantity-based instrument such as a C&T scheme is preferable when the marginal benefit of abatement is more steeply sloped than the marginal cost of abatement. Economic models for the selection of policy instruments with this type of uncertainty have been extended in several directions, e.g., the case of mixed instruments instead of a uniform policy [112], the case of multiple pollutants [113], and the case where firms can invest in abatement technologies [114].
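The comparison between instruments can be illustrated with a linear-quadratic sketch. Assume, purely for illustration, that the marginal benefit of abatement a is b0 - b*a and that its marginal cost is c0 + c*a + theta, where theta is a zero-mean cost shock with variance sigma2 that firms observe but the regulator does not. The regulator sets either the cap or the tax at the ex-ante optimum. The function names and numerical values below are hypothetical.

```python
def expected_losses(b, c, sigma2):
    """Expected welfare loss of each instrument relative to the ex-post optimum.

    b: slope (in absolute value) of the marginal benefit of abatement
    c: slope of the marginal cost of abatement
    sigma2: variance of the additive shock to marginal cost
    """
    # Cap: abatement stays at the ex-ante optimum, so it misses the
    # ex-post optimum by theta / (b + c).
    loss_quantity = 0.5 * sigma2 / (b + c)
    # Tax: firms equate marginal cost to the tax, so abatement misses the
    # ex-post optimum by theta * b / (c * (b + c)).
    loss_price = 0.5 * sigma2 * b**2 / (c**2 * (b + c))
    return loss_price, loss_quantity


def advantage_of_price(b, c, sigma2):
    """Positive when the tax yields the smaller expected welfare loss."""
    loss_price, loss_quantity = expected_losses(b, c, sigma2)
    return loss_quantity - loss_price


# Marginal cost steeper than marginal benefit, and the reverse case.
steep_cost = advantage_of_price(b=1.0, c=4.0, sigma2=9.0)
steep_benefit = advantage_of_price(b=4.0, c=1.0, sigma2=9.0)
```

Under these assumptions, the difference simplifies to sigma2*(c - b)/(2*c**2), which is positive exactly when marginal cost is steeper than marginal benefit, in line with the result attributed to [111] above.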
M2-type studies that analyse the effect of environmental policies under uncertainty, not limited to uncertainty in abatement costs, mostly use stylised economic models without considering engineering details and the spatio-temporal features of power sectors. It is not obvious whether policy implications regarding uncertainty obtained from M2 approaches directly apply to transmission-constrained electricity markets with an increasing penetration of uncertain and intermittent VRE output. On the other hand, the engineering/OR literature (M1) has explored the issues of uncertainty, e.g., related to VRE, over the past decades in the context of power-system operations and planning. In particular, the methods of stochastic programming [115] and robust optimisation [116] have been extensively applied to the analysis of electricity markets [117–119]. However, most applications are concerned with short- and long-term decision making under uncertainty such as economic dispatch, unit commitment, and investment in generation, distribution, and transmission facilities. The M1 literature has rarely examined the choice of environmental policy instruments and resulting emission outcomes in power markets when subject to uncertainty.
In this context, the M1 framework based on robust optimisation could be extended to incorporate uncertainty in not only demand and VRE output but also the cost of damage from GHG emissions. For example, [11] implement a generation- and transmission-expansion plan via robust optimisation for the Nordic region under increasingly stringent curbs on GHG emissions. They subsequently estimate the expected cost of ignoring robustness by fixing the investment decisions from a stochastic programming model and evaluating them out of sample to demonstrate how lack of backup generation capacity could lead to infeasibilities. Hence, such an M1-type model could serve to devise robust environmental regulation as part of the energy transition.

Conclusions
In this paper, we review the existing literature on market-based environmental policies in power sectors that face the challenges of legally binding targets for reducing GHG emissions. From methodological perspectives, we highlight the wedge between engineering/OR (M1) and environmental economics (M2) approaches in the literature. M1-type studies usually pursue engineering details of power systems, while abstracting away from the economic principles of environmental policies. By contrast, M2-type studies tend to explore the mechanisms and implications of environmental policies using stylised economic models without incorporating the power system's technical attributes. In this context, concerted research efforts should aim at bridging this divide between M1 and M2 to achieve the goal of decarbonising the power sector in reality.
Our literature review specifically identifies four application areas in the literature: implementation of market-based economic instruments such as CO2 taxation, C&T, and RPS schemes (A1); carbon leakage stemming from varying stringency of environmental regulations among jurisdictions (A2); market power in permit markets such as manipulation of ETS and REC prices by large producers (A3); and endogenous policy targets for emissions or performance standards set by a policymaker in anticipation of industry's response (A4). We do not argue that applications are limited to these areas but rather emphasise a recognition of relevant research directions in recent years.
The main insights from A1 to A4 lead us to several research gaps in the extant literature. First, it could be beneficial to develop coordination mechanisms based on non-cooperative and cooperative game theory to enable a VRE-rich power sector that is coupled with other sectors, e.g., heating, industrial processes, and transport (RG1). Second, further advancement of bi-level programming solution methods, e.g., decomposition, would facilitate scaling stylised models to realistic spatio-temporally detailed networks to inform policy for endogenous environmental regulation (RG2). Third, given the expected electrification of the wider economy, it would be useful to enhance multi-sector analysis, e.g., by integrating top-down CGE and bottom-up engineering-based models (RG3). Fourth, to address the fundamental issues of uncertainty in the context of environmental policies, the methods of stochastic programming and robust optimisation could be further extended for the analysis of endogenous environmental regulation, e.g., under uncertain costs of damage from GHG emissions (RG4).
Ultimately, any effort to address these research gaps requires interdisciplinary perspectives and coordination among various research communities. For example, impacts from climate change due to GHG emissions, including wildfires, sea-level rises, and widespread droughts, will need careful assessments from natural-science researchers who study the Earth's response to human activities. These include, for instance, forest ecologists, hydrologists, and atmospheric scientists. Likewise, efficient algorithms to solve large-scale models need to be developed by OR analysts, applied mathematicians, and computer scientists to overcome computational challenges stemming from today's decentralised power systems. Meanwhile, policies and market representations in the analyses to mitigate climate change need to be grounded in economic theory with a well-thought-out practical implementation that, undoubtedly, needs input from economists and policymakers. Thus, one of the grand challenges is to build a research community in which such interdisciplinary efforts can thrive.