Emission Regulation of Markets with Sluggish Supply Structures

I examine regulation in the presence of convex investment costs and technology specific capacity stocks. Announcement of future emission taxes reduces current emissions unless fossil fuels are scarce, in which case the effect is ambiguous. Substantial future emission reductions require action today, because it takes time to build up clean production capacity and phase out dirty capacity. The Pigou tax must be coupled with sector specific investment taxes or subsidies to induce the socially optimal trajectory if the private discount rate differs from the social discount rate. If such investment taxes or subsidies are unavailable, a (time-inconsistent) second-best alternative may be to tax emissions above the Pigouvian level during the transition phase. The theory is complemented with a stylized numerical model of the US electricity market.


Introduction
Committed emissions from existing and proposed energy infrastructure represent more than the entire remaining carbon budget if the 1.5 °C target is to be achieved, and perhaps two-thirds of the remaining carbon budget if mean global warming is to be limited below 2 °C (Tong et al. 2019). Energy infrastructure has substantial (and partly sunk) investment costs and can remain operative for decades once in place. Hence, it is of crucial importance to direct investment away from emission intensive fossil fuels and towards low-emission alternatives if we are to curb global warming.
In this paper, I examine emission regulation and transition dynamics in the presence of long-lived capital in a competitive partial equilibrium model with resource scarcity. The modeling framework includes capital accumulation and convex investment costs to capture the sluggish response to regulation caused by long-lived capital. I use regulation of the electric power industry as an example throughout the paper.
The socially optimal time trajectory can be implemented in competitive equilibrium in a setting with convex investment cost and resource scarcity by a standard Pigouvian emission tax if the firms' private discount rate equals the social discount rate. Otherwise, i.e., if the firms are more impatient than the social planner, production capacity adjusts too slowly to the emission tax. 1 Therefore, a tax or subsidy on investment is needed to implement the socially optimal time trajectory in this case. This provides a rationale for using emission taxes and investment subsidies simultaneously, which is relevant for public policy because many countries have or consider regulation featuring both instruments (see, e.g., IPCC 2012). Further, emission taxes above the Pigouvian level may be optimal if the private discount rate exceeds the social discount rate and an emission tax is the only policy instrument available. The explanation is that a higher emission tax helps speed up the transition, which increases welfare if the firms discount the future too strongly. A caveat is that such policies tend to be time inconsistent. Last, the time lags of environmental policy may be substantial, and I also examine current effects of increasing (or introducing) future emission taxes. 2 A key research question here is whether the green paradox (Sinclair 1992; Sinn 2008) holds when the model includes capital accumulation, convex investment costs and resource scarcity. 3

The analysis highlights that large investments towards a less emission intensive production capacity mix are needed early on. The reason is that current investment in relatively clean production technologies decreases the cost of future emission reductions. This corroborates a key result in Vogt-Schilb et al. (2018), who explicitly model investment in abatement equipment and find that it is optimal to start a long-term emission-reduction strategy with significant and early abatement investment, even if the optimal carbon price starts low and grows progressively over time. 4

Regarding current effects of anticipated future emission taxes, future taxes have three key dynamic effects on current emissions from electricity generation in a model with exhaustible resources and long-lived capital. First, future taxes increase the future cost of combusting fossil fuels. This reduces the profitability of upkeep and investment in emission intensive power plants, which again reduces the demand for fossil fuels. Second, future emission taxes increase future residual demand for low-emission electricity. This increases the profitability of investment in low-emission power plants, which again increases the electricity supply from these plants to the market. This reduces the equilibrium consumption of fossil fuels. Because of convex investment costs, it is cost efficient for the firms to begin adaptation to anticipated future emission taxes immediately. Therefore, emissions may decrease even before the tax has been implemented. Third, owners of scarce fossil resources can pre-empt a future emission tax by accelerating production of fossil energy such that more extraction takes place before the tax is enacted. This is the well-known green paradox (see, e.g., Sinclair 1992; Sinn 2008). Whereas the first two mechanisms reduce the demand for fossil fuels, the third mechanism increases the supply of fossil fuels. Therefore, it is theoretically ambiguous whether the market equilibrium will feature increased or decreased fossil fuel consumption (before the emission tax is implemented), as compared to the case without future taxes.

1 The Stern Review (Stern 2007), and the following discussion about appropriate social discount rates in cost-benefit analysis, indicates that the social discount rate may be below capital market interest rates, at least in the case of climate change (Weitzman 2007; Tol and Yohe 2006).

2 As pointed out by Di Maria et al. (2017), the time lags of environmental policy may be substantial; cf., e.g., the Kyoto Protocol, which was signed in 1997, entered into force in 2005, and had its first commitment period in 2008.

3 One unit of resource (e.g., a barrel of oil) cannot be extracted and sold twice. As shown by Hotelling (1931), intertemporal profit maximization involves that marginal present value profits from extraction are equalized through time. Because future emission taxes reduce profits from future resource extraction, announcement of such taxes causes the resource owners to move extraction forward in time. This mechanism was recognized early on by Sinclair (1992), who pointed out that present value carbon taxes should decline over time, as increasing carbon taxes accelerate emissions. Using similar reasoning, Sinn (2008) argues that demand-side climate policies might increase emissions, at least in the short run, and terms this effect the "green paradox." There is a large literature following up on this phenomenon (see Jensen et al. 2015, for a survey).

4 There are several key differences between the present paper and Vogt-Schilb et al. (2018). For example, the present paper features exhaustible resources, examines dynamic effects of future taxes, and derives the taxes and subsidies that can implement the socially optimal time trajectory in competitive equilibrium.
I illustrate the analytical findings with a stylized numerical model of the US electricity market. The numerical model uses the PATH solver in GAMS to solve the theoretical model as a mixed complementarity problem, given assumptions about quadratic functional forms for utility, environmental damage and production costs. The electricity demand function does not change over time in the numerical simulations, and technical change is either omitted or exogenous. The numerical simulations suggest that the green paradox does not hold in the presence of long-lived capital: early emissions decrease in all cases following announcement of future emission taxes, except when investment costs are very low (the model then collapses towards the standard exhaustible resource model without long-lived capital; see, e.g., Sinclair 1992 and Sinn 2008).
The analysis predicts that forward-looking firms will reduce current use of inputs subject to stringent future regulation. In this respect, it is interesting to observe the current struggle of publicly traded US coal companies. 5 Clearly, there are several factors behind this, like slower economic growth, cheap natural gas and current environmental regulation. Nevertheless, it seems reasonable that bleaker prospects caused by future environmental regulation and increased competition from renewable power also partly explain the investors' vanishing interest in coal. 6

The presence of adjustment costs is an important premise for the present paper. Adjustment costs were recognized early, both in relation to firms' net capital investment decisions (Gould 1968; Lucas 1976) and in relation to changing the number of employees (Holt et al. 1960; Oi 1962). Capital adjustment costs arise, e.g., if the price of capital increases in the rate of investment. Labor adjustment costs include costs related to hiring, training and layoff. Bellofatto and Besfamille (2018) use the probability that a project is finished early and without the need of refinancing as a proxy for administrative capacity. Whereas capital adjustment cost is the type of adjustment cost explicitly modeled in the present paper, the results may be relevant also in the case of labor adjustment cost or administrative capacity constraints. In the macroeconomics literature, Kydland and Prescott (1982) assume that it takes time to install new equipment, and Wickens (2008, p. 33) assumes that the cost of a unit of investment depends on how large it is in relation to the size of the existing capital stock.

5 According to Bloomberg (March 17, 2016), the combined market capitalization of US coal miners since 2011 has plunged from over $70 billion to barely $6 billion. In the past two years, at least six US coal-mining companies have filed for bankruptcy. Their struggle to find rescue in the financial and capital markets underscores Wall Street's vanishing interest in coal companies (http://www.bloomberg.com/news/articles/2016-03-16/coal-s-last-man-standing-dragged-to-the-brink-of-bankruptcy).

6 The International Energy Agency (IEA) states, referring to the 2015 Paris Climate Conference, that climate policy has emerged as a major driver for the future of coal in large parts of the world (http://www.iea.org/Textbase/npsum/mtcmr2015sum.pdf).
The paper also relates to a body of literature which examines resource extraction under capacity constraints, but without pollution (Kemp and Van Long 1980; Amigues et al. 1998; Holland 2003). Particularly relevant to the present paper, Amigues et al. (2015) show that optimal investment in renewables starts before the end of fossil fuel usage in a setting with adjustment costs and endogenous capacity constraints. Further, Coulomb et al. (2019) examine the optimal transition from coal to gas and renewables in a model with capacity constraints and adjustment costs. Similarly to the present paper, they find that different energy sources should be used together in order to smooth out adjustment costs. Whereas the present paper's modeling of the supply side shares many similarities with this body of literature, this paper adds to the literature by examining regulation and including environmental constraints (except for Coulomb et al. 2019, which also models pollution).
Another relevant branch of literature examines optimal use of energy sources, given emission constraints and exhaustible fossil fuels. This literature models renewable energy as a clean backstop and pays limited attention to adjustment costs. A general result without adjustment costs is that the clean backstop is kept on hold until the use of fossil fuels ceases (Chakravorty et al. 2006, 2008; van der Ploeg and Withagen 2012). 7 The transition is different in the present paper, as renewables (and nuclear power) are phased in early on and exploited together with emission intensive fossil fuels along the socially optimal time trajectory.
Section 2.1 characterizes the competitive partial equilibrium. The tax scheme that can implement the socially optimal time trajectory is presented in Sect. 2.2. Section 2.3 examines dynamic effects following announcement of future emission taxes. Section 2.4 investigates second-best taxation when an emission tax is the only policy instrument available and firms discount the future too strongly. The numerical illustration is in Sect. 3, and Sect. 4 concludes.

Theoretical Analysis
Let the vector $x_t = (x^1_t, x^2_t, \dots, x^{\bar{i}}_t)$ denote a representative consumer's consumption bundle of goods $i \in I = \{1, 2, \dots, \bar{i}\}$ in period $t \in T = \{1, 2, \dots, \bar{t}\}$. The associated benefit is given by the increasing and strictly concave utility function $u(x_t)$. Each good $x^i_t$ is produced by a representative firm (or sector) $i$. I assume market clearing such that production of $x^i_t$ equals consumption of $x^i_t$ for all $i \in I$ and $t \in T$. The firms' discount factor lies in $(0, 1]$ and all derivatives are assumed to be finite. One interpretation of this model setup is an economy with concave utility from electricity consumption, where electricity may be derived from $\bar{i}$ energy sources: coal, gas, hydro power, and so forth. I will use this as an example throughout the paper, i.e., we have one representative firm for each type of electricity generation technology.

The investment costs of power generation are essentially capital construction costs and land, including "regulatory costs" for obtaining siting permits, environmental approvals, and so on. These costs may increase substantially in the presence of economy wide capacity constraints, like limited availability of skilled labor or raw materials. I assume that the investment cost function is strictly convex and increasing in investment $y^i_t$, with its minimum, equal to zero, at zero investment. 8 The model framework allows the representative firm to actively reduce capacity faster than capital depreciation ($y^i_t < 0$). 9

Operating costs for power plants include fuel, labor and maintenance costs. I divide these costs into fixed and variable operating costs. Fixed operating and maintenance costs, denoted $f^i(Y^i_t)$, include, e.g., salaries for facility staff and maintenance that is scheduled on a calendar basis.
They do not vary significantly with a plant's electricity generation, but increase in capacity; i.e., we have $f^i_{Y^i_t} > 0$. The variable operating costs include the cost of consumable materials and maintenance that may be scheduled based on the number of operating hours or start-stop cycles of the plant. These costs are captured by the variable cost function $k^i(\cdot)$. Variable operating costs increase in production $x^i_t$ and decrease in the capacity measure $Y^i_t$. This is captured by the first order derivatives $k^i_{x^i_t} > 0$ and $k^i_{Y^i_t} < 0$. 10 Note that abatement within a given technology is modeled as a flow variable; i.e., the current emission intensity does not matter for the future emission intensity. This differs from the emission reductions that may be achieved with a larger share of low-emission technology capacity (e.g., replacing coal capacity with renewables). Electricity generation (flow) abatement is most relevant for fossil fuels, where it may involve, e.g., switching to cleaner types of coal or use of combined cycle power plants or scrubbers (sulfur dioxide). 11 Production capacity evolves following the state equation (1), where the capital depreciation factor lies in $(0, 1]$ and $\bar{Y}^i$ is initial capacity (a constant determined by history).

8 The strict convexity of the investment cost function is a key assumption in the literature on adjustment costs referred to in Sect. 1. It implicitly assumes that at least one factor used for expanding capacity is scarce. Joskow and Parsons (2009) point out that the human and manufacturing infrastructure required to produce major nuclear plant components, perform detailed engineering, and construct new nuclear plants is limited. Hence, a surge in nuclear plant orders will run up against capacity constraints on the supply of key components and labor, leading to higher component manufacturing costs and higher construction costs. Regarding petroleum, Osmundsen et al. (2015) and Skjerpen et al. (2018) find that increased capacity utilization in the rig market increases the rig rates and, hence, the cost of capacity construction in the Gulf of Mexico and on the Norwegian continental shelf, respectively. Last, the modern-day gold rush of oil companies and contractors converging on western Canada's oil-sands markets bogged down as high materials costs and outstripped labor resources forced project delays and budget overruns around the year 2007 (see http://www.enr.com/articles/29338-oilsands-boom-extracts-toll-on-costs?v=preview).

9 The costs of decommissioning power plants depend, e.g., on the extent of environmental remediation required, the physical location of the plant, and the potential salvage value of equipment and scrap; see, e.g., Raimi (2017).

10 [...] because a unique solution requires emission intensities above business as usual to be costly. The requirements for the Hessian matrix associated with the firms' Hamiltonian to be negative definite are conditions on $k^i$ [...].

11 The modeling of abatement within a particular type of technology as a pure flow activity abstracts from the fact that most types of abatement action require some sort of investment. This is common in the economic literature (cf., e.g., Nordhaus 1991, 1992).
Assume that a subset of the representative firms $j \in J = \{\tilde{i}+1, \tilde{i}+2, \dots, \bar{i}\}$ use a scarce resource as an input factor in production ($J \subseteq I = \{1, 2, \dots, \tilde{i}, \tilde{i}+1, \dots, \bar{i}\}$). These firms have an additional term $h^j(S^j_t)\, x^j_t$ added to their variable operating cost function, where the remaining resource stock, $S^j_t$, evolves following the state equation (2). Here $\bar{S}^j$ is an exogenous constant and I have normalized units in (2) such that one unit of production requires one unit of resource. We have the resource stock constraint $S^j_t \ge 0$. Further, resource scarcity implies that unit operating costs decrease in the remaining resource stock; i.e., we have $h^j_{S^j_t} < 0$; e.g., because the cheapest resource deposits are extracted first. Note that firm $j \in J$ is an integrated firm that extracts the fossil fuels needed for electricity generation by itself. I assume that $\lim_{S^j_t \to 0} h^j(S^j_t) = \infty$; i.e., the cost of resource extraction approaches infinity as the resource stock is completely exhausted. 12 Total operating costs are given by equation (3), which is stated separately for firms $i \le \tilde{i}$ and firms $i > \tilde{i}$, for all $t$. For example, the first $i = 1, 2, \dots, \tilde{i}$ may denote firms using non-exhaustible energy sources like renewables and nuclear, whereas the remaining $j = \tilde{i}+1, \tilde{i}+2, \dots, \bar{i}$ are firms combusting fossil energy sources like petroleum and gas. This cost structure implies that unit operating costs, $c^i(z^i_t)/x^i_t$, have the familiar skewed U-shape with minimum at $c^i(z^i_t)/x^i_t = c^i_{x^i_t}(z^i_t)$ (for any given $e^i_t$). 13 Clearly, the relationship between $f^i(\cdot)$ and $k^i(\cdot)$ may differ markedly across technologies. For example, nuclear power plants feature high fixed costs relative to the variable operating costs, as compared with gas-fueled power plants. The strict convexity of the investment cost function and of $k^i(\cdot)$ implies that the cost associated with any given change in the energy mix $x$ may be reduced by increasing the number of time periods during which the change occurs. Specifically, the cost of reducing GHG emissions increases with the speed of emission reductions. I will discuss the results in a setting where the initial capacity mix $(\bar{Y}^1, \bar{Y}^2, \dots, \bar{Y}^{\bar{i}})$ is characterized by too much dirty capacity, such that the socially optimal capacity declines over time for relatively emission intensive production technologies, and increases for relatively clean energies. Whereas this eases the discussion of the results, and arguably is the case for the electricity industry in many countries today, it is not necessary for the validity of the analytical results.

12 A framework where extraction costs increase with accumulated extraction is frequently used in the resource economics literature; see, e.g., Heal (1976), Hanson (1980) and Hoel (2012). Economic exhaustibility is arguably the relevant condition for most scarce energy resources. For example, before enhanced oil recovery (EOR), typically only around 30% of the oil in the reservoir has been recovered and around 70% remains in the ground. In some fields, recovery rates greater than 60% have been achieved using advanced EOR (e.g., Prudhoe Bay in Alaska); see https://www.iea.org/commentaries/whatever-happened-to-enhanced-oil-recovery.

13 We may have idle capacity if the ratio $x^i_t / Y^i_t$ is low. The model abstracts from hourly, daily and seasonal variations in electricity demand (which is relevant for the transition from relatively flexible fossil fuel plants to less flexible renewable or nuclear energy). The capacity measure $Y^i_t$ must account for power plant downtime caused by maintenance or weather conditions (renewables); see "Appendix C" on the numerical model.
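The verbal descriptions of the capacity state equation (1), the resource state equation (2) and total operating costs (3) can be sketched in LaTeX as follows. This is a hedged reconstruction from the surrounding definitions, not the paper's own typesetting: the symbol $\gamma$ for the capital depreciation factor, the timing convention (investment in period $t$ adds to capacity in period $t+1$), and the argument list of $k^i$ are assumptions.

```latex
% Assumed notation: \gamma \in (0,1] is the capital depreciation factor;
% the t+1 timing and the arguments of k^i are illustrative assumptions.
\begin{align}
Y^i_{t+1} &= \gamma\, Y^i_t + y^i_t, \qquad Y^i_1 = \bar{Y}^i, \tag{1} \\
S^j_{t+1} &= S^j_t - x^j_t, \qquad S^j_1 = \bar{S}^j, \tag{2} \\
c^i(z^i_t) &=
\begin{cases}
f^i(Y^i_t) + k^i(x^i_t, Y^i_t, e^i_t), & \forall i \le \tilde{i},\ \forall t, \\
f^i(Y^i_t) + k^i(x^i_t, Y^i_t, e^i_t) + h^i(S^i_t)\, x^i_t, & \forall i > \tilde{i},\ \forall t.
\end{cases} \tag{3}
\end{align}
```

Under this sketch, one unit of production draws down the resource stock by one unit, which matches the normalization stated for (2).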
The emissions stock evolves following the state equation (4), where $\bar{E}$ is a constant determined by history and the stock depreciation factor from one period to the next lies in $[0, 1)$. Environmental damage from emissions is given by a function of the emissions stock that is weakly convex and increasing. 14 Let $p^i_t$ denote the endogenous consumer price on $x^i_t$ (net of taxes). I assume the regulator has access to three regulatory instruments: an emission tax, a tax on investment, and a tax on extraction of exhaustible resources. The representative firms $i \in I$ may be interpreted as representing different sectors of the electric power industry; i.e., the wind power sector, the nuclear power sector, and so forth. I will therefore use the term sector specific taxes when referring to the investment tax and the extraction tax. 15 The investment tax may take three forms: (i) a standard unit tax on investment if the tax rate is positive and $y^i_t > 0$, (ii) a subsidy to decommissioning if the tax rate is positive and $y^i_t < 0$, and (iii) a subsidy on investment if the tax rate is negative. The extraction tax is needed to slow down extraction of scarce resources if the private discount rate is higher than the social discount rate (see Sect. 2.2); i.e., a positive extraction tax is enacted to preserve scarce resources for use in the future.
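A minimal LaTeX sketch of the emissions stock equation (4) described above, under stated assumptions: $\alpha$ is a placeholder symbol for the stock depreciation factor, $e^i_t$ is read as firm $i$'s emissions in period $t$, and the $t+1$ timing is assumed for illustration.

```latex
% Assumed notation: \alpha \in [0,1) is the stock depreciation factor.
\begin{equation}
E_{t+1} = \alpha\, E_t + \sum_{i \in I} e^i_t, \qquad E_1 = \bar{E}. \tag{4}
\end{equation}
```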

Market Equilibrium
The competitive representative firm $i \in I$ maximizes the present value of profits over the remaining time horizon by solving problem (5). In (5), operating costs $c^i(z^i_t)$ plus the extraction tax payment on $x^i_t$ are operating costs including the extraction tax, the investment cost of $y^i_t$ plus the investment tax payment on $y^i_t$ is the cost of investment, including investment taxes, and emissions $e^i_t$ times the emission tax is the emission tax payment. The maximization is subject to equations (1), (2), (3) and the resource constraint $S^j_t \ge 0$. 16 A price-taking representative consumer maximizes net utility by solving problem (6). The associated first order condition is $u_{x^i_t}(x^*_t) \le p^i_t$ for all $i \in I$ and $t \in T$. We have the following result:

Lemma 1 [...] solving (5) and (6) subject to equations (1), (2) and (3), satisfies conditions (7) [...], with $Y^{i,*}_t$ and $S^{i,*}_t$ as given by equations (1) and (2), respectively. We have [...] = [...] = 0.

The weak inequalities are strict if and only if we have a corner solution for the relevant decision variable. 17
Proof See "Appendix B". ◻

We see from Lemma 1 that production of $x^i_t$ increases in capacity $Y^i_t$, marginal utility from consumption and the remaining resource stock ($\forall i > \tilde{i}$), whereas it decreases in production cost and extraction taxes. Note that Lemma 1 implies $p^1_t = p^2_t = \dots = p^{\bar{i}}_t$ if the goods are perfect substitutes. This is relevant if $I$ is a set of electricity producers.
The shadow price on capacity of firm $i$ in period $t$ is endogenous and represents the present value of the change in future profits caused by a marginal increase in current capacity. In the case where optimal production capacity declines towards a new and lower level (faster than capacity depreciation), higher capacity today induces too high fixed operating costs in the future. Hence, the shadow price on capacity is negative. Conversely, it is positive if optimal capacity shifts upwards. 18 For sectors with scarce resources ($\forall i > \tilde{i}$), more extraction today implies less extraction in the future. Hence, the resource owners must not only decide whether to extract the resource, but also when to extract. This consideration is captured by the non-negative shadow price on the remaining resource stock in Lemma 1 (this endogenous shadow price is often referred to as the 'scarcity rent' or 'Hotelling rent'). It is the present value change in future profits caused by a marginal increase in the remaining resource stock $S^j_t$. Lemma 1 implies that each resource owner equalizes marginal discounted profits from extraction over time. Otherwise, the resource owners could increase the present value of their resource by moving production between periods. The resource rents typically vary between the different resources.
Whereas the isolated effect of increased emission taxes is to decrease production (given $e^i_t > 0$), production of $x^i_t$ in competitive equilibrium may increase in the emission tax if $x^i_t$ is a low-emission good. The reason is that the upward shift in each firm's supply cost function, caused by the emission tax, increases in the emission intensity of the firm. Therefore, residual demand and equilibrium production of relatively low-emission goods increase. Equation (7f) in Lemma 1 states the familiar result that marginal costs of emission reductions (i.e., flow abatement for each type of technology) equal the emission tax in the interior solution.
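The firm's problem (5) and the consumer's problem (6) described at the start of this section can be sketched in LaTeX as follows. The structure follows the verbal breakdown of the profit terms; the symbols $\delta$ (private discount factor), $\kappa^i$ (investment cost function), $\tau_t$ (emission tax), $\phi^i_t$ (investment tax) and $\psi^i_t$ (extraction tax) are placeholder names assumed here for illustration, as is the $\delta^{t-1}$ discounting convention.

```latex
% Assumed placeholder symbols: \delta, \kappa^i, \tau_t, \phi^i_t, \psi^i_t.
\begin{align}
&\max_{\{x^i_t,\, y^i_t,\, e^i_t\}} \sum_{t \in T} \delta^{t-1}
\Big[ p^i_t x^i_t - \big( c^i(z^i_t) + \psi^i_t x^i_t \big)
- \big( \kappa^i(y^i_t) + \phi^i_t y^i_t \big) - \tau_t e^i_t \Big], \tag{5} \\
&\max_{x_t} \; u(x_t) - \sum_{i \in I} p^i_t x^i_t. \tag{6}
\end{align}
```

Differentiating (6) with respect to $x^i_t$ gives the consumer's first order condition $u_{x^i_t}(x^*_t) \le p^i_t$ stated in the text.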

The Socially Optimal Tax Scheme
Let welfare $W$ be measured as the present value of utility from consumption net of environmental damages, production costs and investment costs, discounted with the social discount factor, which is at most one and at least as large as the private discount factor. The regulator faces a trade-off in the presence of convex investment costs. On the one hand, fast emission reductions reduce environmental damage. On the other hand, the convexity of investment costs implies that the cost of emission reductions can always be reduced by extending the time horizon over which emission reductions take place. We have the following result:

Proposition 1 [...] (4). Then, the socially optimal time trajectory can be implemented in partial competitive equilibrium with the following taxes: [...], with the firms' shadow prices as given in Lemma 1.

We first examine the case where the social discount factor equals the private discount factor. In this case the socially optimal investment and extraction taxes are both zero, given the optimal emission tax. We observe that the expression for the optimal emission tax is the present value of the stream of future marginal stock damages following one additional unit of emissions along the socially optimal time trajectory. This is sometimes referred to as the social cost of carbon in the case of greenhouse gases.
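The claim that convex investment costs make a slower transition cheaper can be illustrated with a small Python sketch. It assumes a hypothetical quadratic investment cost with parameter `c`, and it ignores discounting and depreciation, so it isolates the convexity effect only and is not the paper's model.

```python
def investment_cost(y, c=1.0):
    """Quadratic, strictly convex investment cost (c is a hypothetical parameter)."""
    return 0.5 * c * y ** 2


def total_cost_even_split(total_change, periods, c=1.0):
    """Undiscounted cost of reaching total_change in equal steps over `periods` periods."""
    step = total_change / periods
    return periods * investment_cost(step, c)


# Cost of building 10 units of capacity over 1, 2, 5 and 10 periods.
costs = {n: total_cost_even_split(10.0, n) for n in (1, 2, 5, 10)}
for n, cost in costs.items():
    print(n, cost)
```

With this functional form the total cost equals 0.5 * c * total_change**2 / periods, so doubling the number of periods halves the undiscounted investment cost. The regulator's trade-off arises because environmental damage accumulates while the transition is stretched out.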
The Pigou tax is only indirectly affected by the explicit modeling of production capacity and convex investment costs (the production capacity mix influences emissions and, hence, the optimal tax). Specifically, the Pigou tax is not reduced during the first years to give firms time to adjust. On the contrary, higher investment costs cause slower development of relatively clean production capacity, which again entails higher emissions, a higher absolute value shadow price on the emissions stock, and higher optimal emission taxes (see also Fig. 2 in Sect. 3.2). More expensive decommissioning of dirty production capacity (e.g., coal) has the same effect (see Figure 10 in "Appendix A").
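The interpretation of the Pigou tax as the present value of future marginal stock damages can be sketched in Python. This is an illustration under assumptions, not the paper's formula: `beta` stands for the social discount factor, `alpha` for the emissions stock depreciation factor, and emissions in one period are assumed to enter the stock in the next period.

```python
def pigou_tax(marginal_damages, beta, alpha):
    """Present value of marginal stock damages caused by one unit emitted now.

    marginal_damages[k] is the marginal damage of the stock k + 1 periods ahead.
    Assumes E_{t+1} = alpha * E_t + emissions_t, so that a unit emitted today
    adds alpha**k to the stock k + 1 periods ahead.
    """
    return sum(beta ** (k + 1) * alpha ** k * md
               for k, md in enumerate(marginal_damages))


# Hypothetical constant marginal damage over three future periods.
flat_damages = [1.0, 1.0, 1.0]
tax_patient = pigou_tax(flat_damages, beta=0.95, alpha=0.5)
tax_impatient = pigou_tax(flat_damages, beta=0.90, alpha=0.5)
print(tax_patient, tax_impatient)
```

A higher social discount factor puts more weight on future damages and therefore raises the tax, in line with the role of the discount rate discussed in this section.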
Assuming an interior solution for [...] Nordhaus (1991; [...]), the present analysis highlights that substantial investments may be necessary early on (see Fig. 1 in Sect. 3.2). The reason is that it takes time to implement the emission reductions necessary to curb global warming. The well-known result that lower (flow) abatement costs reduce the optimal emission tax remains valid in the present model setup (Weitzman 1974). Cheaper abatement also reduces the importance of redirecting investment towards less emission intensive production capacity.
A common assumption when deriving the socially optimal time trajectory is that the firms' private discount rate equals the discount rate of the social planner (see, e.g., Nordhaus 1991, 1992; Golosov et al. 2014). This assumption may be questionable, at least when applied to major environmental challenges like greenhouse gas emissions from power plants and climate change. The Stern Review (Stern 2007), and the following discussion about appropriate social discount rates in cost-benefit analysis, indicates that the social discount rate may be below capital market interest rates, at least in the case of climate change (Weitzman 2007; Tol and Yohe 2006). Indeed, the Stern Review's conclusions about the need for decisive immediate action hinge on the assumption of a near-zero pure time preference discount rate, which is inconsistent with today's marketplace real interest rates and savings rates (Nordhaus 2007). Goulder and Williams (2012) argue that we should distinguish between a social-welfare-equivalent discount rate appropriate for determining whether a given policy would augment social welfare and a finance-equivalent discount rate suitable for determining whether the policy would offer a potential Pareto improvement. 19

The case where the social discount factor is larger than the private discount factor involves nonzero optimal investment taxes in all sectors ($\forall i$) and positive optimal extraction taxes for sectors with scarce resources ($\forall i > \tilde{i}$). The optimal investment tax is the difference between the social planner's and the representative firm's shadow price on capacity $Y^i_t$ along the socially optimal time trajectory. Note that the socially optimal investment tax is positive (negative) if production decreases (increases) over time. For example, Proposition 1 may imply decommissioning subsidies to coal plants, whereas investment in renewable energy is subsidized; see Fig. 3 in the numerical Sect. 3.2.
The optimal extraction tax is the difference between the social planner's and the representative firm's shadow price on scarce resources along the socially optimal time trajectory. This tax is needed because owners of scarce resources put too low a value on the future resource base in their current extraction decisions when the private discount factor is below the social discount factor. Hence, the social planner taxes current extraction to conserve more of the resource stocks for future use.
The following corollary summarizes the above discussion: [...]

Note that the sector specific optimal taxes and subsidies on investment and extraction in general differ between sectors (or technologies), even though all sectors have the same private discount factor.

Dynamic Effects of Future Emission Taxes
In practice, it may be hard for lawmakers to enact an emission tax immediately (Di Maria et al. 2017). In this section, I analyze dynamic effects of future emission taxes. The other taxes (the investment and extraction taxes) are set to zero. Lemma 1 implies that a credible announcement of increased future emission taxes has three key effects in the electricity market:

(a) Reduced fossil fuel demand from power plants. Future emission taxes increase the future cost of burning fossil fuels. The decline in future emission intensive fossil-fueled electricity generation implies that optimal fossil-fueled power plant capacity will be lower in the future. This reduces the profitability of investment in, e.g., coal-fired power plants, and thereby the demand for coal (cf. a lower shadow price on capacity for emission intensive energy in Lemma 1).

(b) Increased supply of electricity generated from (sufficiently) low-emission energy sources. Future emission taxes imply higher supply costs for emission intensive fossil-fueled power plants. Hence, low-emission electricity generation sources, like renewables or nuclear power, gain a competitive advantage when the tax is implemented. This increases the profitability of investing in low-emission electricity generation capacity (cf. a higher shadow price on capacity for low-emission energy in Lemma 1). The associated increase in non-fossil electricity generation capacity reduces the electricity market equilibrium consumption of fossil fuels. 20

(c) Increased current supply of fossil fuels. Future taxes decrease the future value of the fossil fuel resource (cf. a lower Hotelling rent for $\forall i > \tilde{i}$ in Lemma 1). Hence, faster extraction is profitable. This is the well-known (weak) green paradox (see, e.g., Sinclair 1992; Sinn 2008; Gerlagh 2011). In particular, Sinclair (1992) and Sinn (2008) caution against environmental policies that become more stringent with the passage of time, because such policies will accelerate resource extraction and, thereby, accelerate global warming.
Whereas the resource scarcity dynamic (c) suggests that exhaustible fossil fuel extraction accelerates following signaling of future environmental policies, the capacity stock dynamics (a) and (b) have the opposite effect. From a theoretical point of view, it is therefore ambiguous whether current emissions increase or decrease following signaling of stringent future climate policy, given that resource exhaustibility, capacity constraints and convex investment costs are present. The capacity stock mechanisms (a) and (b) strongly dominate the supply side mechanism (c) put forth by the green paradox literature in the numerical analysis in Sect. 3.3 below. One reason is that mechanism (c) only really matters for oil and gas fueled power plants (the resource rent is small for coal). The above discussion suggests that emissions will unambiguously decline following the tax announcement if scarcity (mechanism c) is negligible or non-existent. 21 The mechanisms discussed above act on current production via the shadow prices on capacity (mechanisms a and b) and on the resource stock (mechanism c). Figure 11 in "Appendix A" graphs how these shadow prices are affected by the tax announcement in the numerical simulations.
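Mechanism (c) in isolation can be illustrated with a textbook two-period Hotelling sketch. This is not the model of the present paper: the linear inverse demand p = a - q, costless extraction, full exhaustion of the stock, and all parameter values below are assumptions made purely for illustration, and capacity stocks (mechanisms a and b) are deliberately left out.

```python
def first_period_extraction(a, stock, beta, future_tax):
    """Period-1 extraction in a stylized two-period market with full exhaustion.

    Assumptions (illustrative only): inverse demand p = a - q in both periods,
    zero extraction cost, and resource owners who are indifferent between
    periods, i.e. p1 = beta * (p2 - future_tax), with q2 = stock - q1.
    Solving that arbitrage condition for q1 gives the expression below.
    """
    return (a * (1.0 - beta) + beta * stock + beta * future_tax) / (1.0 + beta)


# Hypothetical parameter values
a, stock, beta = 10.0, 8.0, 0.96
for tax in (0.0, 1.0, 2.0):
    q1 = first_period_extraction(a, stock, beta, tax)
    q2 = stock - q1
    print(f"future tax {tax:.1f}: q1 = {q1:.3f}, q2 = {q2:.3f}")
```

In this sketch, period-1 extraction rises with the announced period-2 tax, which is the weak green paradox. Whether that effect survives once capacity stocks and convex investment costs are added is exactly what the theory leaves ambiguous.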

Second-Best Emission Taxes
The optimal tax scheme given in Proposition 1 involves taxes or subsidies that differ across sectors. In this section, I consider the case where the regulator cannot use these sector specific instruments (they are constrained to zero) and the representative firms' discount rate exceeds the social discount rate. 22 Then, whereas the Pigou tax still perfectly balances environmental damages with (flow) abatement cost, the transition towards a less emission intensive production capacity mix is too slow compared with the socially optimal trajectory (cf., the need for investment taxes and subsidies in Proposition 1 when the private and social discount rates differ). The transition towards a cleaner capacity mix can then be accelerated by announcing a future emission tax that is above the Pigouvian tax.
A policy that announces a future tax above the Pigouvian tax is subject to the mechanisms discussed in Sect. 2.3 (i.e., the emission tax increases from the Pigouvian level to a higher level during the transition). Of particular interest, a higher future emission cost decreases future production from relatively emission intensive power plants, which in turn decreases their shadow price on capacity and, hence, investment in emission intensive production capacity. By the same reasoning, investment in (sufficiently) low-emission production capacity increases. Hence, a future emission tax above the Pigouvian level has effects similar to those of the optimal investment tax in Proposition 1. This suggests that welfare may be increased by implementing a tax above the Pigouvian level during the transition period, given that the private discount rate exceeds the social discount rate and that mechanisms (a) and (b) dominate mechanism (c). Note that this argument only applies to future emission taxes; i.e., the regulator has no incentive to tax emissions above marginal environmental damages in the current time period. On the contrary, taxing current emissions above the Pigouvian level unambiguously reduces welfare, because marginal abatement cost is then larger than marginal environmental damages.
Assume mechanisms (a) and (b) dominate mechanism (c). Then, the regulator faces the following trade-off: On the one hand, a tax above the Pigouvian level increases welfare by accelerating the change in production capacity necessary for the transition towards less emission intensive electricity generation. On the other hand, there is a loss from taxing current emissions above the Pigouvian level, because marginal abatement cost is then higher than marginal environmental damage. Taxing emissions above the Pigouvian level increases welfare if and only if the former effect dominates the latter. These dynamics are examined numerically in Sect. 3.2, where the second-best emission tax trajectory turns out to be markedly above the Pigouvian tax during the transition period; see Fig. 4.
21 The abatement cost of a sector also matters here. Consider, e.g., a sector with relatively high BaU emissions and cheap abatement possibilities. This emission intensive sector may invest in capacity (and hence increase early emissions) because its cheap abatement opportunities give it a comparative advantage when the future emission tax is enacted.
22 An alternative setting is the case where the only available policy instrument is investment subsidies; i.e., the regulator is constrained to a zero emission tax. We see from Lemma 1 that such a policy causes the electricity price to be below the price along the socially optimal time trajectory. The reason is that emission pricing reduces emissions by increasing the operating costs of emission intensive power plants, whereas renewable investment subsidies increase the capacity of low emission power plants. See Abrell et al. (2019) about subsidies to renewables versus emission pricing.
A caveat is that this policy is likely to be time inconsistent. To see this, consider three periods, t ∈ {1, 2, 3}, let the private discount rate exceed the social discount rate, let the sector specific taxes be zero, and suppose mechanisms (a) and (b) dominate mechanism (c). Consider a policy where the regulator in period t = 1 announces an emission tax sequence that equals the Pigou tax given by Proposition 1 in periods 1 and 3, and the Pigou tax plus a small positive constant in period 2. Then, the period t = 1 shadow price on capacity decreases in this constant for relatively emission intensive sectors, and increases in it for (sufficiently) low-emission sectors. Hence, the isolated effect of a positive constant is to increase welfare (unless the constant is too large). In period t = 2, however, the regulator will have an incentive to break the commitment to a positive constant, because in period t = 2 welfare is unambiguously maximized by setting the constant to zero. Therefore, this policy requires that the regulator can credibly commit to policies that, even though they increase present value welfare (8), will be less than optimal in the future time period in which they are enacted. 23

Numerical Illustration: The US Electricity Market
In this section, I substantiate the analytical findings with complementary numerical results based on a stylized model of the US electricity market. The numerical model runs over the time horizon T = {2016, 2017, … , 2115} and uses the PATH solver in GAMS (numerical software) to solve the theory model as a mixed complementarity problem. 24 Sect. 3.1 briefly summarizes the main characteristics of the numerical model, along with the data sources and the estimation and calibration procedures used. The numerical results are given in Sects. 3.2 and 3.3. See "Appendix C" for more details about the numerical model and data sources, including a test of model fit against history.
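To give a feel for what a complementarity condition looks like, the following minimal sketch solves a one-technology, one-period toy problem by hand. It is not the paper's GAMS model and does not call PATH: the linear inverse demand, the constant marginal cost, the function name and all numbers are hypothetical. The point is only that the shadow price on capacity is positive exactly when the capacity constraint binds.

```python
def capacity_equilibrium(a, b, c, capacity):
    """Toy complementarity problem: 0 <= shadow_price  perp  capacity - q >= 0.

    Assumptions (illustrative only): price-taking producers with constant
    marginal cost c, inverse demand p = a - b * q, and a fixed capacity.
    """
    unconstrained_q = max(0.0, (a - c) / b)   # output if capacity never binds
    q = min(unconstrained_q, capacity)        # capacity caps production
    price = a - b * q
    # The shadow price is the margin earned on the last unit when capacity binds
    shadow_price = max(0.0, price - c) if q >= capacity else 0.0
    return q, price, shadow_price


for cap in (5.0, 20.0):
    q, price, shadow = capacity_equilibrium(a=10.0, b=1.0, c=2.0, capacity=cap)
    slack = cap - q
    print(f"capacity {cap}: q = {q}, price = {price}, shadow price = {shadow}, slack = {slack}")
```

In both cases the product of the shadow price and the unused capacity is zero. A solver such as PATH handles many such conditions simultaneously, across technologies and time periods.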

Parameterization and Functional Forms
The United States generated about 4 thousand terawatt hours of electricity in 2016, of which 30% came from coal plants, 34% from natural gas and petroleum, 20% from nuclear power and 15% from renewables. 25 I model electricity from these four energy sources, as well as from coal-fired power plants with carbon capture and storage (CCS). Electricity is a homogeneous good, and electricity generated from the different sources is modeled as perfect substitutes in consumption. I assume throughout that the US government allows increased nuclear energy production.
I use data from the US Energy Information Administration (EIA), IMF and British Petroleum to estimate a quadratic utility function for electricity consumption; see "Appendix C". This utility function yields the linear demand function for US electricity used in the numerical simulations. The electricity demand function does not change over time in the numerical simulations. 26 Real capital depreciation is set to 6% per year. 27 Technology specific investment costs are taken from EIA. Technology specific operating costs are calibrated using historic figures from EIA, estimates of fossil fueled power plant ramp-up costs (Kumar et al. 2012), and figures for remaining fossil fuel reserves from British Petroleum. Supply of electricity generated from fossil fueled power plants, and gas in particular, is modeled as quite flexible, whereas nuclear and renewables must invest in production capacity in order to increase production (by more than a few percent above the 2015 level).
23 Whereas we know from game theory that commitment to strategies involving suboptimal payoffs in single periods may be possible (given that present value payoff is increased by this strategy), the presence of a finite time horizon in the present model setup poses challenges; see, e.g., Osborne and Rubinstein (1994), pp. 134-136 and 155-160.
24 See Dirkse and Ferris (1995) and 'http://www.gams.com/' for information about the PATH solver and GAMS.
25 Petroleum constituted less than 1%. In the "renewables" category we have the following shares: Hydro = 6.5%, biomass = 1.5%, geothermal = 0.4%, solar = 0.9% and wind = 5.6%. Figures are for net electricity generation. Emissions from the electric power industry constituted about 35% of US energy-related CO2 emissions in 2016. See https://www.eia.gov/tools/faqs/faq.cfm?id=427&t=3.
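As a rough guide to how slowly existing capacity disappears at a 6% yearly depreciation rate, the following sketch computes the share of a capacity stock that remains after a given number of years without any investment or decommissioning. Geometric depreciation is an assumption made here for illustration; the paper's own capacity equation is not reproduced in this section, and the function name is hypothetical.

```python
import math

DEPRECIATION = 0.06  # yearly real capital depreciation stated in the text


def remaining_share(years, depreciation=DEPRECIATION):
    """Share of an initial capacity stock left after `years` with zero investment.

    Assumes geometric depreciation: K_t = (1 - depreciation) ** t * K_0.
    """
    return (1.0 - depreciation) ** years


# Years until half of the initial stock has depreciated away
half_life = math.log(0.5) / math.log(1.0 - DEPRECIATION)

print(f"share left after 9 years: {remaining_share(9):.3f}")
print(f"half-life of capacity: {half_life:.1f} years")
```

Under this assumption, roughly half of a plant fleet would still be in place after about 11 years if nothing were decommissioned, which illustrates why the capacity mix responds sluggishly to regulation.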
Emission reductions are possible either by lower electricity consumption or through substitution from fossil energy to renewables, nuclear energy or implementation of CCS. The stylized numerical model treats the emission intensities of each production technology as exogenous constants. 28 The highly stylized quadratic environmental damage function is calibrated such that the Obama Administration's 80% emission reduction target in 2050 (as compared with 2015 emissions) is socially optimal, given the other parameters in the model. 29 I assume social and private discount rates equal to 4% per year, unless otherwise stated.
Figure 1 graphs net investment along the socially optimal time trajectory in the numerical model, and the (undiscounted) emission tax that implements this trajectory in competitive equilibrium (cf., Proposition 1 with equal private and social discount rates). The socially optimal time trajectory is characterized by substantial investment in low-emission electricity generation capacity and decommissioning of fossil fueled power plants. The lion's share of investment occurs during the first years. 30 Whereas CCS plays an important part during the transition towards a clean energy mix, the use of CCS declines along the socially optimal time trajectory as renewable and nuclear capacity increases after a couple of decades. The explanation is that CCS has relatively high operating costs and emissions compared with renewables and nuclear power plants. The results suggest that CCS can reduce the need for expensive investment in renewable capacity in the short term. 31
Figure 2 graphs optimal emission taxes and emissions along the socially optimal time trajectory for different assumptions about the magnitude of investment costs. Emissions and optimal taxes increase in investment cost, because the transition towards a cleaner energy mix slows down when investment costs increase, which in turn implies higher emissions and larger environmental damage. 32
Proposition 1 states that the socially optimal time trajectory can be implemented by a Pigou tax alone only if the private discount factor equals the social discount factor. Otherwise, the Pigou tax must be supplemented with taxes on investment and production. Figure 3 shows the investment taxes/subsidies necessary to induce the socially optimal time trajectory when the private discount factor is 0.9, whereas the social discount factor is 0.96. For comparison, the representative overnight capital costs used in the model calibration are 3636, 6084, 978, 5945 and 2557 USD per kW for coal, CCS, gas, nuclear and renewables, respectively (EIA 2016). The optimal extraction tax is close to zero for coal and small but positive for gas and petroleum. 33
26 The demand function in the numerical simulations is kept at the level corresponding to GDP in 2016 over the whole time horizon, however. The reason is twofold: First, US electricity demand (and emissions) becomes very large in the later modeling years if we extrapolate the trends in electricity demand since the 1950s. Second, the picture has been somewhat different over the last decade or so: From 2005 to 2016, US GDP increased by 16%, whereas US electricity consumption increased by 0.45% (IMF World Economic Outlook Database; EIA November 2017 Monthly Energy Review).
27 Nadiri and Prucha (1993)
29 According to the Obama Administration (US presidential administration from 2009 to 2017), the United States intended to roughly double its pace of carbon pollution reduction, from 1.2% per year on average during the period 2005-2020 to 2.3-2.8% per year on average between 2020 and 2025. This target was grounded in analysis of cost-effective carbon pollution reductions achievable under existing law and was intended to keep the US on the pathway to achieve deep economy-wide reductions of 80% or more by 2050. https://www.whitehouse.gov/the-press-office/2015/03/31/fact-sheet-us-reports-its-2025-emissions-target-unfccc. The Trump administration succeeding Barack Obama has implemented several changes that roll back Obama-era policies aiming to curb climate change, however. See, e.g., https://news.nationalgeographic.com/2017/03/how-trump-is-changing-science-environment/.
30 This result also appeared in a previous version of the numerical model with flow damages only (no emission stock).
31 Coulomb et al. (2019) find that gas can reduce the need for renewables in the short term (their model does not include CCS).
32 We know from Montgomery (1972) that emission taxes and quotas are equivalent in a setting without uncertainty. Figure 2 could therefore also be interpreted as graphing optimal emission quotas for various assumptions about investment costs (the 'Pigou tax' is then the endogenous quota price).
33 I do not calculate the optimal extraction tax for gas explicitly, but the numerical simulations suggest that it is small (the shadow price on gas and petroleum resources is relatively small compared to the other cost elements).
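The text above quotes discount rates (4% per year) in one place and discount factors (0.9 and 0.96, used for Fig. 3) in another. The sketch below converts between the two and compares the weight each factor places on a payoff ten years ahead. The conversion factor = 1 / (1 + rate) is the standard discrete-time convention and is assumed here rather than taken from the paper; the function names are hypothetical.

```python
def rate_to_factor(rate):
    """Yearly discount factor implied by a yearly discount rate (assumed convention)."""
    return 1.0 / (1.0 + rate)


def factor_to_rate(factor):
    """Yearly discount rate implied by a yearly discount factor (assumed convention)."""
    return 1.0 / factor - 1.0


social_factor, private_factor = 0.96, 0.90  # values used for Fig. 3 in the text

print(f"4% rate as a factor: {rate_to_factor(0.04):.4f}")
print(f"private factor 0.90 as a rate: {factor_to_rate(private_factor):.3f}")
print(f"social factor 0.96 as a rate: {factor_to_rate(social_factor):.3f}")

# Weight placed today on a payoff received ten years from now
horizon = 10
print(f"social weight after {horizon} years: {social_factor ** horizon:.3f}")
print(f"private weight after {horizon} years: {private_factor ** horizon:.3f}")
```

Under this convention, a private discount factor of 0.9 corresponds to a discount rate of roughly 11% per year, so firms value a payoff ten years ahead at about half of what the social planner does. This gap is what the investment taxes and subsidies in Fig. 3 are meant to correct.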

The Socially Optimal Time Trajectory
As discussed in the theory section, the regulator may be constrained such that an emission tax is the only available policy instrument. If so, it may increase welfare to tax emissions above the Pigouvian level if the private discount rate exceeds the social discount rate. The theory is ambiguous on this, however, because two effects counteract each other (see Sect. 2.4). Hence, I investigate this topic numerically. Let the regulator implement a tax equal to the Pigou tax given by Proposition 1 plus an extra tax element in each period t ∈ T, where the regulator chooses the level of the extra tax element that maximizes welfare (8). The second-best tax is graphed in Fig. 4. 34 Whereas the cost of the extra tax element in time period s ∈ T , s > 0 , is incurred in the period it is enacted (because marginal abatement cost exceeds the social cost of carbon), the benefits via investment occur over the time interval {2016, 2017, … , s − 1} . For example, a positive extra tax element in 2028 acts on investment in the period 2016-2027, whereas a positive extra tax element in 2017 only acts on investment in 2016. This is why the extra tax element gradually increases over time (before declining as the transition is completed).
Technical change is omitted in the main part of the present paper, but "Appendix A" features two different technology scenarios. In the first scenario, renewable investment cost declines by 5% each year. This implies, e.g., that renewable investment costs are halved by 2030, and only one tenth of baseline cost in 2060. As expected, exogenous technological change implies a delay in investment. Nevertheless, Fig. 8 in "Appendix A" shows that the optimal trajectory features large investments early on also in this technology optimistic scenario. The second scenario features a clean technology breakthrough. The new technology emerges in 2025, has potential to supply 1500 GWh per year at a marginal supply cost equal to half the 2015 electricity price, and the whole US electricity market at a marginal supply cost equal to the 2015 price. Otherwise, it shares characteristics with the renewable energy sector. As expected, the emergence of revolutionary clean technology has a major impact on the socially optimal time trajectory; see Fig. 9 in "Appendix A". 35
Irreversible investment and vintage models are examined by, e.g., Hart (2004), Caparros et al. (2015) and Rozenberg et al. (2019). Whereas I have assumed that decommissioning of capacity is possible in the main analysis, a simulation which approximates irreversible investment by strongly increasing decommissioning costs is included in "Appendix A". As expected, irreversibility implies a larger share of electricity generated by coal-fired power plants, with associated higher environmental damages and emission taxes; see Fig. 10.
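The figures quoted for the first technology scenario can be checked with a short calculation. The sketch assumes that the 5% yearly decline compounds from the first model year, 2016, so that the cost in a given year is 0.95 raised to the number of years elapsed; the base year and the function name are assumptions made for this illustration.

```python
BASE_YEAR = 2016       # first model year (assumed base for the cost decline)
YEARLY_DECLINE = 0.05  # 5% yearly reduction in renewable investment cost


def cost_share(year, base_year=BASE_YEAR, decline=YEARLY_DECLINE):
    """Renewable investment cost in `year` as a share of the baseline cost."""
    return (1.0 - decline) ** (year - base_year)


for year in (2030, 2060):
    print(f"{year}: {cost_share(year):.3f} of baseline investment cost")
```

With these assumptions the cost share is a little under one half in 2030 and about one tenth in 2060, in line with the statement in the text.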

Current Effects of Future Emission Taxes
We know from Sect. 2.3 that it is ambiguous from a purely theoretical perspective whether current emissions will increase or decrease following announcement of a future emission tax. In this section I investigate this topic using numerical analysis.
Consider an emission tax that is announced in the beginning of year 2016. The tax is zero for the period 2016-2024 and equals the Pigou tax thereafter (the Pigou tax in Proposition 1 with the other taxes set to zero). Figure 5 graphs the changes in net investment (investment minus capital depreciation) induced by the tax announcement in the period 2016-2050. As expected, investment in generation capacity from low-emission sources (renewables, nuclear and CCS) increases when the tax is announced. The reason is that future residual demand for electricity from low-emission plants will increase when electricity from coal plants is taxed. In terms of Lemma 1, the emission tax induces a higher future producer price for low-emission plants, with an associated higher shadow price on capacity. Furthermore, it is not profitable to invest in coal-fired power plants in the face of the future emission tax. Therefore, net investment is negative for coal. The results for gas are less clear. The reason is that gas is less emission intensive than coal, but more emission intensive than the other energy sources. The shadow price on the gas resource (i.e., the Hotelling rent) is lower in the tax simulation, as compared with the no-tax case; see Fig. 11 in "Appendix A".
Figure 6 shows changes in electricity production and emissions following announcement of the future tax, as compared to the case with no tax. The lower capacity of coal-fired power plants, implied by the figures for net investment graphed in Fig. 5, causes early production and emissions from coal to decline. In addition, the increased capacity of low-emission power plants crowds out electricity from coal-fired power plants, also in the years before the tax is implemented. The black line in Fig. 6 shows the associated decline in aggregate yearly emissions. Emissions decline in all periods, except for a minuscule increase in 2016, which occurs because the capacity stock mechanisms operate with a one period time lag. 36 Overall, the cumulative decline in emissions over the period 2016-2024, i.e., before the tax is implemented, constitutes 69% of total emissions in 2015. Because the sectors adjust optimally to the future regulation, Figs. 5 and 6 illustrate that immediate action is optimal even when the emission reduction targets are several years into the future.
34 The second-best tax in Fig. 4 is an approximation obtained by comparing welfare levels in the numerical model for different values of the extra tax element.
35 The presence of endogenous technological change in the form of learning by doing pulls in the direction of more early abatement; see, e.g., Bramoullé and Olson (2005) and Kverndokk and Rosendahl (2007).
It is interesting to examine how sensitive the results in Fig. 6 are with respect to the magnitude of investment costs. In a sensitivity analysis I multiply the model baseline investment costs by a scaling factor in {0, 0.05, 0.1, 0.2, 0.4, … , 2} ; see Fig. 12 in "Appendix A". Here, a scaling factor of 0 is the case with free investment in production capacity (no adjustment costs), whereas a scaling factor of 2 indicates that marginal investment costs are doubled. In the numerical model, total emissions from 2016 to 2024 decline unless investment costs are less than 5% of the baseline capital costs calibrated from figures given by the US Energy Information Administration (EIA 2016). The model collapses towards a standard exhaustible resource model without capacity constraints as investment costs approach zero.
Further sensitivity analysis was conducted for remaining fossil fuel reserves, discounting, the fuel shares of generation capacity in 2015, and the time lag between tax announcement and tax implementation (see Fig. 7 in "Appendix A" for the time lag sensitivity). Emissions in the time period between tax announcement and tax implementation decreased in all cases, given that the time lag was two years or more. Total emissions over the period 2016-2024 remained lower in the tax simulation (as compared with the no-tax simulation) even when scarcity costs were multiplied by five, and when initial capacity was adjusted such that all electricity in 2015 was generated from gas and petroleum fired power plants.

Conclusion
This paper examined regulation in the presence of convex investment costs and technology specific capacity stocks. Four key results emerged: First, future emission reductions may require substantial investment in low emission energy sources today, because it takes time to build up clean production capacity and phase out dirty capacity. Second, the Pigou tax must be coupled with technology (or sector) specific investment taxes or subsidies to induce the socially optimal trajectory if the private discount rate differs from the social discount rate. Third, if such investment taxes or subsidies are unavailable, a second-best alternative may be to tax emissions above the Pigouvian level during the transition phase. A caveat is that this second-best tax policy is time-inconsistent, however. Fourth, announcement of future emission taxes reduces current emissions unless fossil fuels are scarce, in which case the effect is ambiguous in theory. The theory was complemented with a stylized numerical model of the US electricity market. The numerical model suggested that early emissions will decrease following the tax announcement in the combined presence of resource scarcity and long-lived capital.
The results in the present paper hinge on the assumption of strictly convex investment costs. Whereas this is reasonable in the presence of economy wide capacity constraints, or expedited construction of power plants (see the literature on adjustment costs cited in the introduction), there exist plausible scenarios where this convexity is non-existent or negligible.
The analysis features several simplifications and the results should be interpreted cautiously. Among them, the paper does not account for endogenous technical change and knowledge accumulation, which will be important in the transition towards a low emission economy (Goulder and Mathai 2000; Popp 2004; Kverndokk and Rosendahl 2007; Acemoglu et al. 2012). Further, the analysis does not feature general equilibrium effects and the environmental damage function in the numerical model is highly stylized. The results were derived under the assumption of perfect information, including knowledge about future prices. Whereas this is a very strong assumption, the economic rationale behind the discussion in Sect. 2.3 is straightforward: Expectations about a future emission tax reduce the incentives to maintain and invest in emission intensive production capacity, and increase the incentives to invest in clean alternatives. The associated change in the production capacity mix (i.e., a larger share of clean capacity) causes emissions to decline. As such, the essential assumption is that the tax announcement can induce an increase in expected future emission taxes. Proposition 1, on the other hand, is not valid without the perfect foresight assumption.
I have assumed that the regulator can commit credibly to future emission taxes. This matters because optimal regulation in the future depends on current investment levels. Specifically, the optimal future emission taxes prescribed by the tax rule in Proposition 1 typically change if the firms do not believe in the future tax levels as announced by the regulator today and, hence, choose different current investment levels than those targeted by the regulator (see, e.g., Kydland and Prescott 1977, on commitment and credibility).

Appendix A: Figures
This appendix presents figures referred to in the text.
I have tested the model fit by running the model from 2005. Figure 7 shows model projections and historic figures for the period 2005-2015. Production levels, investment levels and electricity prices are endogenous. The simulation features actual values for GDP, coal prices and gas prices (I use constant values based on historic averages after 2015 in Sect. 3). All in all the model does reasonably well, but it struggles with the shale gas revolution. Specifically, the supply of coal is too high, whereas the supply of gas is too low in the last years of the sample. Figure 7 also graphs results from a sensitivity analysis w.r.t. the lag between tax announcement and tax implementation (x-axis). We observe that emissions in the time period between tax announcement and tax implementation decrease in all cases, unless the tax is implemented already in the following year, 2017 (cf., the one-period lag caused by Eq. 1).
The left diagram in Fig. 8 replicates Fig. 1 in Sect. 3.2 in the case of fast technical change, modeled as a 5% yearly reduction in the investment cost parameters k^ren_1 and k^ren_2. We observe that, even though a slower change in the energy mix is optimal in the presence of very fast technological progress, a large share of the investment still occurs in the early years. The right diagram in Fig. 8 is a sensitivity analysis w.r.t. resource scarcity (the scarcity parameter c^i_6 is 0, 105 and 210 in the simulations denoted with 0, 1 and 2, respectively). Note that the social cost of carbon decreases in resource scarcity.
Fig. 8 Left diagram: The socially optimal time trajectory with fast renewable technology growth. Net investment by technology (left axis) and optimal emission taxes (right axis). Right diagram: Emissions (bars) and the social cost of carbon (lines) when scarcity is zero (0), baseline (1) or doubled (2)
Fig. 9 The socially optimal time trajectory with a clean technology revolution (FuTech available in 2025). Left diagram: Net investment and optimal tax. Right diagram: Production levels
Figure 9 examines the socially optimal time trajectory in the case of an emerging clean future technology (FuTech). I have omitted CCS and used T = 70 in this simulation. Further, I allow FuTech to produce 25 GWh per year in the period 2016-2024. This was necessary for the numerical model to solve. The left diagram in Fig. 9 replicates Fig. 1 in Sect. 3.2. The right diagram graphs production levels (both with FuTech present). Figure 10 graphs sensitivity results in the case of very high decommissioning costs (k^i_3 = 100,000).
See Table 1 for exact parameter values. All prices and costs are measured in 2016 USD in this paper (except for the estimation above). The numerical model solves the systems of equations in Appendix B (social planner and competitive equilibrium) as mixed complementarity problems, given these functional forms and parameter values.