Appendix
A Optimal Switch to Clean Energy
A.1 Large Local Damage
In this case, coal is used first, and then shale gas. Hence gas is used just before the switch to solar. Using the envelope theorem, the marginal benefit of delaying innovation can be written as:
$$\begin{aligned} \frac{\partial V(T_{b})}{\partial T_{b}}e^{\rho T_{b}}&=\left[ u\left( x_{e}(T_{b})\right) -(c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{b} })x_{e}(T_{b})\right] \nonumber \\&\quad -\left[ u\left( x_{b}\right) -c_{b}x_{b}\right] -(F^{\prime }(T_{b})-\rho F(T_{b}))\nonumber \\&=\pi _{e}(T_{b})-\pi _{b}+(\rho F(T_{b})-F^{\prime }(T_{b} )) \end{aligned}$$
(26)
The functions \(T_{b}\rightarrow (\rho F(T_{b})-F^{\prime }(T_{b}))\) and \(T_{b}\rightarrow \pi _{e}(T_{b})\) are decreasing in \(T_{b}\). For the first one, this follows from the assumptions made on F(.) (\(F^{\prime \prime }(.)>0\)). For the second one, we have:
$$\begin{aligned} \pi _{e}^{\prime }(T_{b})&=\left[ u^{\prime }\left( x_{e}(T_{b})\right) -(c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{b}})\right] \frac{\partial x_{e}(T_{b})}{\partial T_{b}}-\frac{\partial (\lambda _{0} +\theta _{e}\mu _{0}){e^{\rho T_{b}}}}{\partial T_{b}}x_{e}(T_{b})\\&=-\frac{\partial (\lambda _{0}+\theta _{e}\mu _{0}){e^{\rho T_{b}}}}{\partial T_{b}}x_{e}(T_{b})<0 \end{aligned}$$
as the final price of shale gas \(c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{b}}\) increases with \(T_{b}\). Hence \(\frac{\partial V(T_{b} )}{\partial T_{b}}e^{\rho T_{b}}\) is continuous and decreases with \(T_{b}\). It is positive for \(T_{b}\) such that \(x_{e}(T_{b})=x_{b}\) (see footnote 27) and strictly negative when \(T_{b}\) goes to \(+\infty \).
As a result, \(V(T_{b})\) has a unique maximum \(T_{b}^{*}\) which satisfies:
$$\begin{aligned} \pi _{b}-\pi _{e}(T_{b}^{*})=\rho F(T_{b}^{*})-F^{\prime }(T_{b}^{*}) \end{aligned}$$
Since the function \(T_{b}\rightarrow (\rho F(T_{b})-F^{\prime }(T_{b}))\) is decreasing in \(T_{b}\) and the function \(T_{b}\rightarrow \pi _{b}-\pi _{e}(T_{b})\) is increasing in \(T_{b}\), \(T_{b}^{*}\ge 0\) if and only if:
$$\begin{aligned} \rho F(0)-F^{\prime }(0)\ge \pi _{b}-\pi _{e}(0) \end{aligned}$$
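As a purely illustrative check of the monotonicity property used above, suppose the cost of the clean technology declines exponentially, \(F(T_{b})=F_{0}e^{-\gamma T_{b}}\) with \(F_{0},\gamma >0\). This functional form and the symbols \(F_{0}\) and \(\gamma \) are assumptions made here for illustration only. Then:
$$\begin{aligned} \rho F(T_{b})-F^{\prime }(T_{b})&=(\rho +\gamma )F_{0}e^{-\gamma T_{b}}>0\\ \frac{d}{dT_{b}}\left( \rho F(T_{b})-F^{\prime }(T_{b})\right)&=-\gamma (\rho +\gamma )F_{0}e^{-\gamma T_{b}}<0 \end{aligned}$$
so that \(\rho F(T_{b})-F^{\prime }(T_{b})\) is indeed decreasing, and under this assumed form the condition for \(T_{b}^{*}\ge 0\) would read \((\rho +\gamma )F_{0}\ge \pi _{b}-\pi _{e}(0)\).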
A.2 Small Local Damage
The same reasoning applies, except that in this case, shale gas is used first, then coal. Hence coal is used just before the switch to solar occurs. \(V(T_{b})\) has a unique maximum \(T_{b}^{*}\) which satisfies:
$$\begin{aligned} \pi _{b}-\pi _{d}(T_{b}^{*})=\rho F(T_{b}^{*})-F^{\prime }(T_{b}^{*})>0 \end{aligned}$$
\(T_{b}^{*}\ge 0\) if and only if:
$$\begin{aligned} \rho F(0)-F^{\prime }(0)\ge \pi _{b}-\pi _{d}(0) \end{aligned}$$
B Thresholds
B.1 Large Local Damage
If shale gas is used alone, and coal is left under the ground, then the values of \(\lambda _{0}\), \(\mu _{0}\), \(T_{b}\) and \(X_{e}\) must solve the system composed of Eqs. (1), (11), (17) and
$$\begin{aligned} \theta _{e}X_{e}=\overline{Z}-Z_{0} \end{aligned}$$
(27)
which replaces (2). Moreover, to ensure that there exists no incentive to introduce coal at date 0, the initial price of shale gas \(p_{e}(0)\) must be below the initial price of coal, \(p_{d}(0),\) i.e. we must have
$$\begin{aligned} (\theta _{d}-\theta _{e})\mu _{0}\ge c_{e}+d-c_{d}+E^{\prime }(X_{e}) \end{aligned}$$
(28)
If the solution of the above system is such that this condition is satisfied, then shale gas is used alone to get to the ceiling. There exists a threshold value of the ceiling \(\overline{Z}_{1}\) under which only shale gas is used. It is the solution of the system composed of Eqs. (1), (11), (17), (27) and (28), the last equation being taken as an equality.
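Condition (28) can be recovered from the price comparison at date 0. The following is a sketch, assuming that Eq. (11) reads \(\lambda _{0}=E^{\prime }(X_{e})\) (as its differential \(d\lambda _{0}=E^{\prime \prime }(X_{e})dX_{e}\) in Appendix C suggests) and using the price paths \(p_{d}(t)=c_{d}+\theta _{d}\mu _{0}e^{\rho t}\) and \(p_{e}(t)=c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho t}\):
$$\begin{aligned} p_{e}(0)\le p_{d}(0)&\Leftrightarrow c_{e}+d+\lambda _{0}+\theta _{e}\mu _{0}\le c_{d}+\theta _{d}\mu _{0}\\&\Leftrightarrow (\theta _{d}-\theta _{e})\mu _{0}\ge c_{e}+d-c_{d}+E^{\prime }(X_{e}) \end{aligned}$$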
If coal is used alone to get to the ceiling, then the values of \(\mu _{0}\) and \(T_{b}\) must solve the following system:
$$\begin{aligned}&\displaystyle \theta _{d}\int _{0}^{T_{b}}x_{d}(t)dt =\overline{Z}-Z_{0} \end{aligned}$$
(29)
$$\begin{aligned}&\displaystyle \quad \left[ u\left( x_{b}\right) -c_{b}x_{b}\right] -\left[ u\left( x_{d}(T_{b})\right) -(c_{d}+\theta _{d}\mu _{0}e^{\rho T_{b}})x_{d} (T_{b})\right] =\rho F(T_{b})-F^{\prime }(T_{b}) \qquad \end{aligned}$$
(30)
where Eq. (29) is the combination of Eqs. (1) and (2) for \(X_{e}=0,\) and Eq. (30) is Eq. (17) in the case \(X_{e}=0.\) Moreover, we must make sure that there is no incentive to extract shale gas: the final price of coal \(p_{d}(T_{b})\) must be lower than the price of the first unit of shale gas that could be extracted at date \(T_{b}\), \(c_{e}+d+\theta _{e}\mu _{0}e^{\rho T_{b}}.\) Hence we must have:
$$\begin{aligned} (\theta _{d}-\theta _{e})\mu _{0}e^{\rho T_{b}}\le c_{e}+d-c_{d} \end{aligned}$$
(31)
meaning that the marginal gain in terms of pollution of switching from coal to shale gas, evaluated at the carbon value at date \(T_{b}\), is smaller than the marginal cost of the switch. If the solution of the above system is such that this condition is satisfied, then shale gas is never extracted. There exists a threshold value of the ceiling \(\overline{Z}_{2}\), such that if \(\overline{Z}\ge \overline{Z}_{2}\) shale gas is not developed. \(\overline{Z}_{2}\) is the solution of the system composed of Eqs. (29), (30) and (31), the last equation being written as an equality.
For an intermediate ceiling \(\overline{Z}\) such that \(\overline{Z} _{1}<\overline{Z}<\overline{Z}_{2}\), the three phases exist.
Note that these two thresholds cannot coincide, except if \(T_b=0\), which cannot be the case, by assumption.
B.2 Small Local Damage
If shale gas is used alone to get to the ceiling, then \(\lambda _{0}\), \(\mu _{0}\), \(T_{b}\) and \(X_{e}\) must solve the system composed of Eqs. (1), (27), (11) and:
$$\begin{aligned} \left[ u\left( x_{b}\right) -c_{b}x_{b}\right] -\left[ u\left( x_{e}(T_{b})\right) -(c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{b} })x_{e}(T_{b})\right] =\rho F(T_{b})-F^{\prime }(T_{b}) \end{aligned}$$
(32)
Moreover, the final price of shale gas \(p_{e}(T_{b})\) must be lower than the price of the first unit of coal that could be extracted at date \(T_{b}\), \(p_{d}(T_{b})\), i.e. we must have:
$$\begin{aligned} (\theta _{d}-\theta _{e})\mu _{0}e^{\rho T_{b}}>c_{e}+d-c_{d}+E^{\prime } (X_{e})e^{\rho T_{b}} \end{aligned}$$
(33)
meaning that the cost in terms of pollution of switching to coal instead of going directly to solar is higher than the advantage in terms of production costs. It happens for values of the ceiling below \(\overline{Z}_{3}\) defined by (1), (27), (11), (32) and (33) taken as an equality.
For \(\overline{Z}>\overline{Z}_{3},\) the three resources are used.
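Condition (33) is the price comparison between gas and coal evaluated at date \(T_{b}\). As a sketch, assuming that Eq. (11) reads \(\lambda _{0}=E^{\prime }(X_{e})\) and using the price paths \(p_{e}(t)=c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho t}\) and \(p_{d}(t)=c_{d}+\theta _{d}\mu _{0}e^{\rho t}\):
$$\begin{aligned} p_{e}(T_{b})<p_{d}(T_{b})&\Leftrightarrow c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{b}}<c_{d}+\theta _{d}\mu _{0}e^{\rho T_{b}}\\&\Leftrightarrow (\theta _{d}-\theta _{e})\mu _{0}e^{\rho T_{b}}>c_{e}+d-c_{d}+E^{\prime }(X_{e})e^{\rho T_{b}} \end{aligned}$$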
C The Effects of a More Stringent Climate Policy
C.1 Large Local Damage
In this case, Eqs. (1) and (2) may be written as:
$$\begin{aligned}&\displaystyle \int _{T_{e}}^{T_{b}}x_{e}(t)dt =X_{e}\nonumber \\&\displaystyle \int _{0}^{T_{e}}\theta _{d}x_{d}(t)dt+\int _{T_{e}}^{T_{b}}\theta _{e} x_{e}(t)dt=\overline{Z}-Z_{0} \end{aligned}$$
(34)
Using (34), this last equation reads:
$$\begin{aligned} \int _{0}^{T_{e}}x_{d}(t)dt=\frac{1}{\theta _{d}}\left( \overline{Z} -Z_{0}-\theta _{e}X_{e}\right) \end{aligned}$$
(35)
Totally differentiating system (34), (35), (14), (17) and (11) yields:
$$\begin{aligned}&x_{e}(T_{b})dT_{b}-x_{e}(T_{e})dT_{e}+\int _{T_{e}}^{T_{b}}dx_{e}(t)dt=dX_{e}\\&x_{d}(T_{e})dT_{e}+\int _{0}^{T_{e}}dx_{d}(t)dt=\frac{1}{\theta _{d}}\left( d\overline{Z}-\theta _{e}dX_{e}\right) \\&\left[ \theta _{d}\mu _{0}-(\lambda _{0}+\theta _{e}\mu _{0})\right] \rho dT_{e}+(\theta _{d}-\theta _{e})d\mu _{0}-d\lambda _{0}=0\\&-\left[ u^{\prime }\left( x_{e}(T_{b})\right) -(c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{b}})\right] dx_{e}(T_{b})+((d\lambda _{0}+\theta _{e}d\mu _{0})\\&\qquad +(\lambda _{0}+\theta _{e}\mu _{0})\rho dT_{b})e^{\rho T_{b}}x_{e}(T_{b}) =\left( \rho F^{\prime }(T_{b})-F^{\prime \prime }(T_{b})\right) dT_{b}\\&d\lambda _{0}=E^{\prime \prime }(X_{e})dX_{e} \end{aligned}$$
As
$$\begin{aligned} x_{d}(t)&=D(p_{d}(t))\Rightarrow dx_{d}(t)=D^{\prime }(p_{d}(t))dp_{d} (t)=D^{\prime }(p_{d}(t))\theta _{d}e^{\rho t}d\mu _{0}\\ x_{e}(t)&=D(p_{e}(t))\Rightarrow dx_{e}(t)=D^{\prime }(p_{e}(t))dp_{e} (t)=D^{\prime }(p_{e}(t))e^{\rho t}\left( d\lambda _{0}+\theta _{e}d\mu _{0}\right) \end{aligned}$$
the first two equations read equivalently:
$$\begin{aligned} x_{e}(T_{b})dT_{b}-x_{e}(T_{e})dT_{e}+\left[ \int _{T_{e}}^{T_{b}}D^{\prime }(p_{e}(t))e^{\rho t}dt\right] \left( d\lambda _{0}+\theta _{e}d\mu _{0}\right)= & {} dX_{e}\\ x_{d}(T_{e})dT_{e}+\left[ \int _{0}^{T_{e}}D^{\prime }(p_{d}(t))e^{\rho t}dt\right] \theta _{d}d\mu _{0}= & {} \frac{1}{\theta _{d}}\left( d\overline{Z}-\theta _{e}dX_{e}\right) \end{aligned}$$
Besides,
$$\begin{aligned}&\dot{D}(p_{d}(t))=D^{\prime }(p_{d}(t))\dot{p}_{d}(t)=D^{\prime } (p_{d}(t))\theta _{d}\mu _{0}\rho e^{\rho t}\\&\quad \Rightarrow \int _{0}^{T_{e}}D^{\prime }(p_{d}(t))e^{\rho t}dt=\frac{1}{\theta _{d}\mu _{0}\rho }\int _{0}^{T_{e}}\dot{D}(p_{d}(t))dt=\frac{1}{\theta _{d}\mu _{0}\rho }\left[ D(p_{d}(T_{e}))-D(p_{d}(0))\right] \\&\quad =\frac{x_{d}(T_{e})-x_{d}(0)}{\theta _{d}\mu _{0}\rho } \end{aligned}$$
and
$$\begin{aligned} \int _{T_{e}}^{T_{b}}D^{\prime }(p_{e}(t))e^{\rho t}dt=\frac{x_{e}(T_{b} )-x_{e}(T_{e})}{(\lambda _{0}+\theta _{e}\mu _{0})\rho } \end{aligned}$$
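The gas integral follows from the same argument applied to the gas price path. Spelled out, with \(p_{e}(t)=c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho t}\):
$$\begin{aligned}&\dot{D}(p_{e}(t))=D^{\prime }(p_{e}(t))\dot{p}_{e}(t)=D^{\prime }(p_{e}(t))(\lambda _{0}+\theta _{e}\mu _{0})\rho e^{\rho t}\\&\quad \Rightarrow \int _{T_{e}}^{T_{b}}D^{\prime }(p_{e}(t))e^{\rho t}dt=\frac{1}{(\lambda _{0}+\theta _{e}\mu _{0})\rho }\int _{T_{e}}^{T_{b}}\dot{D}(p_{e}(t))dt=\frac{D(p_{e}(T_{b}))-D(p_{e}(T_{e}))}{(\lambda _{0}+\theta _{e}\mu _{0})\rho } \end{aligned}$$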
Hence the first two equations read:
$$\begin{aligned} -x_{e}(T_{e})dT_{e}+x_{e}(T_{b})dT_{b}-dX_{e}+\frac{x_{e}(T_{b})-x_{e}(T_{e} )}{(\lambda _{0}+\theta _{e}\mu _{0})\rho }\left( d\lambda _{0}+\theta _{e}d\mu _{0}\right)= & {} 0\\ x_{d}(T_{e})dT_{e}+\frac{\theta _{e}}{\theta _{d}}dX_{e}+\frac{x_{d} (T_{e})-x_{d}(0)}{\mu _{0}\rho }d\mu _{0}= & {} \frac{1}{\theta _{d}}d\overline{Z} \end{aligned}$$
Using the equality between marginal utilities, the fourth equation simplifies, and we obtain easily:
$$\begin{aligned} A\times \left( \begin{array}{c} dT_{e}\\ dT_{b}\\ dX_{e}\\ d\lambda _{0}\\ d\mu _{0} \end{array} \right) =\left( \begin{array}{c} 0\\ \frac{1}{\theta _{d}}\\ 0\\ 0\\ 0 \end{array} \right) d\overline{Z} \end{aligned}$$
with
$$\begin{aligned} A=\left( \begin{array}{ccccc} -x_{e}(T_{e}) &{} x_{e}(T_{b}) &{} -1 &{} \frac{x_{e}(T_{b})-x_{e}(T_{e})}{(\lambda _{0}+\theta _{e}\mu _{0})\rho } &{} \theta _{e}\frac{x_{e}(T_{b} )-x_{e}(T_{e})}{(\lambda _{0}+\theta _{e}\mu _{0})\rho }\\ x_{e}(T_{e}) &{} 0 &{} \frac{\theta _{e}}{\theta _{d}} &{} 0 &{} \frac{x_{e} (T_{e})-x_{d}(0)}{\mu _{0}\rho }\\ \left[ \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right] \rho &{} 0 &{} 0 &{} 1 &{} \theta _{e}-\theta _{d}\\ 0 &{} (\lambda _{0}+\theta _{e}\mu _{0})\rho x_{e}(T_{b})+z_{1} &{} 0 &{} x_{e} (T_{b}) &{} \theta _{e}x_{e}(T_{b})\\ 0 &{} 0 &{} -z_{2} &{} 1 &{} 0 \end{array} \right) \end{aligned}$$
where
$$\begin{aligned} z_{1}&=-\left( \rho F^{\prime }(T_{b})-F^{\prime \prime }(T_{b})\right) e^{-\rho T_{b}}>0\\ z_{2}&=E^{\prime \prime }(X_{e})>0 \end{aligned}$$
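For the reader's convenience, the fourth row of A can be traced back to the fourth differentiated equation as follows. At date \(T_{b}\), marginal utility equals the gas price, \(u^{\prime }(x_{e}(T_{b}))=c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{b}}\), so the bracket multiplying \(dx_{e}(T_{b})\) vanishes. Dividing the remaining terms by \(e^{\rho T_{b}}\) and collecting them on the left-hand side gives:
$$\begin{aligned} \left[ (\lambda _{0}+\theta _{e}\mu _{0})\rho x_{e}(T_{b})-\left( \rho F^{\prime }(T_{b})-F^{\prime \prime }(T_{b})\right) e^{-\rho T_{b}}\right] dT_{b}+x_{e}(T_{b})d\lambda _{0}+\theta _{e}x_{e}(T_{b})d\mu _{0}=0 \end{aligned}$$
where the second term in the bracket is \(z_{1}\).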
Hence:
$$\begin{aligned}&\rho \theta _{d}\mu _{0}(\lambda _{0}+\theta _{e}\mu _{0})\det A\\&\quad =\theta _{d}\left[ \underbrace{\left( x_{e}(T_{e})-x_{e}(T_{b})\right) }_{>0}x_{d}(0)\theta _{d}\mu _{0}+\underbrace{\left( x_{d}(0)-x_{e} (T_{e})\right) }_{>0}x_{e}(T_{b})\left( \lambda _{0}+\theta _{e}\mu _{0}\right) \right] z_{1}z_{2}\\&\qquad +\rho \left\{ \left[ \underbrace{\left( \theta _{e}x_{e}(T_{b})-\theta _{d}x_{d}(0)\right) }_{<0}\theta _{e}\mu _{0}-x_{d}(0)\theta _{d}\lambda _{0}\right] \underbrace{\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) }_{<0}+x_{e}(T_{e})\theta _{d}\lambda _{0}^{2}\right\} z_{1}\\&\qquad +\rho \theta _{d}x_{d}(0)x_{e}(T_{e})x_{e}(T_{b})\theta _{d}\mu _{0} (\lambda _{0}+\theta _{e}\mu _{0})z_{2}\\&\qquad +\rho ^{2}\theta _{d}(\lambda _{0}+\theta _{e}\mu _{0})x_{e}(T_{b})\left[ x_{e}(T_{e})\lambda _{0}^{2}-x_{d}(0)(\lambda _{0}+\theta _{e}\mu _{0} )\underbrace{\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) } _{<0}\right] \end{aligned}$$
i.e. \(\det A>0.\)
$$\begin{aligned}&A^{-1}\times \left( \begin{array}{c} 0\\ \frac{1}{\theta _{d}}\\ 0\\ 0\\ 0 \end{array} \right) =\frac{1}{\rho \theta _{d}\mu _{0}(\lambda _{0}+\theta _{e}\mu _{0})\det A}\times \\&\quad \left( \begin{array}{c} \mu _{0}\left( \lambda _{0}+\theta _{e}\mu _{0}\right) \left[ \frac{\theta _{d} }{\lambda _{0}+\theta _{e}\mu _{0}}\left( x_{e}(T_{e})-x_{e}(T_{b})\right) z_{1}z_{2}+\rho z_{1}(\theta _{d}-\theta _{e})+\rho x_{e}(T_{b})\left( x_{e}(T_{e})z_{2}\theta _{d}+\rho (\theta _{d}-\theta _{e})\left( \lambda _{0}+\theta _{e}\mu _{0}\right) \right) \right] \\ -\, \rho x_{e}(T_{b})\mu _{0}(\lambda _{0}+\theta _{e}\mu _{0})\left[ -x_{e} (T_{e})\theta _{d}z_{2}+\rho \theta _{e}\underbrace{\left( \lambda _{0} +(\theta _{e}-\theta _{d})\mu _{0}\right) }_{<0}\right] \\ -\, \rho \mu _{0}\left[ -x_{e}(T_{b})z_{1}\theta _{e}\underbrace{\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) }_{<0}+x_{e}(T_{e} )\theta _{d}\lambda _{0}\left( z_{1}+\rho x_{e}(T_{b})(\lambda _{0}+\theta _{e}\mu _{0})\right) \right] \\ -\, z_{2}\rho \mu _{0}\left[ -x_{e}(T_{b})z_{1}\theta _{e}\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) +x_{e}(T_{e})\theta _{d}\lambda _{0}\left( z_{1}+\rho x_{e}(T_{b})(\lambda _{0}+\theta _{e}\mu _{0})\right) \right] \\ -\, \rho \mu _{0}\left( \lambda _{0}+\theta _{e}\mu _{0}\right) \left[ \begin{array}{c} \frac{\theta _{d}\mu _{0}}{\lambda _{0}+\theta _{e}\mu _{0}}(x_{e}(T_{e} )-x_{e}(T_{b}))z_{1}z_{2}+x_{e}(T_{b})z_{1}z_{2}-\rho z_{1}\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) \\ -\rho x_{e}(T_{b})\left[ -x_{e}(T_{e})\theta _{d}\mu _{0}z_{2}+\rho (\lambda _{0}+\theta _{e}\mu _{0})\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) \right] \end{array} \right] \end{array} \right) \end{aligned}$$
As \(\det A>0,\) we deduce:
$$\begin{aligned} \frac{\partial T_{e}}{\partial \overline{Z}}>0,\quad \frac{\partial T_{b} }{\partial \overline{Z}}>0,\quad \frac{\partial X_{e}}{\partial \overline{Z} }<0,\quad \frac{\partial \lambda _{0}}{\partial \overline{Z}}<0,\quad \frac{\partial \mu _{0}}{\partial \overline{Z}}<0 \end{aligned}$$
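As an example of how these signs are read off, consider the third component of the vector above, which gives \(\frac{\partial X_{e}}{\partial \overline{Z}}\). Since \(\lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}<0\) in the large local damage case, the bracket satisfies:
$$\begin{aligned} \underbrace{-x_{e}(T_{b})z_{1}\theta _{e}\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) }_{\ge 0}+\underbrace{x_{e}(T_{e})\theta _{d}\lambda _{0}\left( z_{1}+\rho x_{e}(T_{b})(\lambda _{0}+\theta _{e}\mu _{0})\right) }_{>0}>0 \end{aligned}$$
It is multiplied by \(-\rho \mu _{0}<0\) and by a positive prefactor (as \(\det A>0\)), hence \(\frac{\partial X_{e}}{\partial \overline{Z}}<0\). The fourth component is the third one multiplied by \(z_{2}>0\), which is consistent with \(d\lambda _{0}=E^{\prime \prime }(X_{e})dX_{e}\) and gives \(\frac{\partial \lambda _{0}}{\partial \overline{Z}}<0\).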
C.2 Small Local Damage
In this case, Eqs. (1) and (2) may be written as:
$$\begin{aligned} \int _{0}^{T_{d}}x_{e}(t)dt= & {} X_{e} \end{aligned}$$
(36)
$$\begin{aligned} \int _{T_{d}}^{T_{b}}x_{d}(t)dt= & {} \frac{1}{\theta _{d}}\left( \overline{Z} -Z_{0}-\theta _{e}X_{e}\right) \end{aligned}$$
(37)
Totally differentiating system (36), (37), (16), (21) and (11) yields:
$$\begin{aligned}&\displaystyle x_{e}(T_{d})dT_{d}+\frac{x_{e}(T_{d})-x_{e}(0)}{(\lambda _{0}+\theta _{e}\mu _{0})\rho }\left( d\lambda _{0}+\theta _{e}d\mu _{0}\right) =dX_{e}\\&\displaystyle x_{d}(T_{b})dT_{b}-x_{d}(T_{d})dT_{d}+\frac{x_{d}(T_{b})-x_{d}(T_{d})}{\theta _{d}\mu _{0}\rho }\theta _{d}d\mu _{0}=\frac{1}{\theta _{d}}\left( d\overline{Z}-\theta _{e}dX_{e}\right) \\&\displaystyle -((d\lambda _{0}+\theta _{e}d\mu _{0})+(\lambda _{0}+\theta _{e}\mu _{0})\rho dT_{d})e^{\rho T_{d}}x_{e}(T_{d})+\theta _{d}(d\mu _{0}+\mu _{0}\rho dT_{d})e^{\rho T_{d}}x_{d}(T_{d})=0\\&\displaystyle \theta _{d}(d\mu _{0}+\mu _{0}\rho dT_{b})e^{\rho T_{b}}x_{d}(T_{b})=\left( \rho F^{\prime }(T_{b})-F^{\prime \prime }(T_{b})\right) dT_{b}\\&\displaystyle d\lambda _{0}=E^{\prime \prime }(X_{e})dX_{e} \end{aligned}$$
Using \(x_{e}(T_{d})=x_{d}(T_{d}),\) we obtain:
$$\begin{aligned} A\times \left( \begin{array}{c} dT_{d}\\ dT_{b}\\ dX_{e}\\ d\lambda _{0}\\ d\mu _{0} \end{array} \right) =\left( \begin{array}{c} 0\\ \frac{1}{\theta _{d}}\\ 0\\ 0\\ 0 \end{array} \right) d\overline{Z} \end{aligned}$$
with
$$\begin{aligned} A=\left( \begin{array}{ccccc} x_{d}(T_{d}) &{} 0 &{} -1 &{} \frac{x_{d}(T_{d})-x_{e}(0)}{(\lambda _{0}+\theta _{e}\mu _{0})\rho } &{} \theta _{e}\frac{x_{d}(T_{d})-x_{e}(0)}{(\lambda _{0} +\theta _{e}\mu _{0})\rho }\\ -x_{d}(T_{d}) &{} x_{d}(T_{b}) &{} \frac{\theta _{e}}{\theta _{d}} &{} 0 &{} \frac{x_{d}(T_{b})-x_{d}(T_{d})}{\mu _{0}\rho }\\ \left[ -\theta _{d}\mu _{0}+(\lambda _{0}+\theta _{e}\mu _{0})\right] \rho &{} 0 &{} 0 &{} 1 &{} -(\theta _{d}-\theta _{e})\\ 0 &{} y_{1} &{} 0 &{} 0 &{} \theta _{d}x_{d}(T_{b})\\ 0 &{} 0 &{} -E^{\prime \prime }(X_{e}) &{} 1 &{} 0 \end{array} \right) \end{aligned}$$
where
$$\begin{aligned} y_{1}=-\left( \rho F^{\prime }(T_{b})-F^{\prime \prime }(T_{b})\right) e^{-\rho T_{b}}+\rho x_{d}(T_{b})\theta _{d}\mu _{0}>0 \end{aligned}$$
Denote:
$$\begin{aligned} y_{2}=E^{\prime \prime }(X_{e})\left[ x_{d}(T_{d})\theta _{d}\mu _{0} +x_{e}(0)\left( \lambda _{0}+\left( \theta _{e}-\theta _{d}\right) \mu _{0}\right) \right] \end{aligned}$$
According to (16), we have:
$$\begin{aligned} \lambda _{0}+\left( \theta _{e}-\theta _{d}\right) \mu _{0}=\left( c_{d} -(c_{e}+d)\right) e^{-\rho T_{d}}>0 \end{aligned}$$
which implies that \(y_{2}\) is also positive.
We have
$$\begin{aligned}&-\rho \theta _{d}\mu _{0}(\lambda _{0}+\theta _{e}\mu _{0})\det A\\&\quad =\rho x_{d}(T_{b})^{2}\theta _{d}^{2}\mu _{0}\left\{ \rho (\lambda _{0} +\theta _{e}\mu _{0})(\lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0})+E^{^{\prime \prime }}(X_{e})\left[ x_{d}(T_{d})\theta _{d}\mu _{0}\right. \right. \\&\qquad \left. \left. +\,x_{e}(0)(\lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0})\right] \right\} \\&\qquad +\,y_{1}\rho \left\{ x_{d}(T_{d})\theta _{d}\lambda _{0}^{2}+x_{e}(0)\theta _{e}^{2}\mu _{0}(\lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0})-x_{d}(T_{b} )\theta _{d}(\lambda _{0}+\theta _{e}\mu _{0})(\lambda _{0}\right. \\&\qquad \left. +\,(\theta _{e}-\theta _{d})\mu _{0})\right\} \\&\qquad +\,y_{1}E^{^{\prime \prime }}(X_{e})\theta _{d}\left\{ x_{e}(0)(\lambda _{0}+\theta _{e}\mu _{0})\left( x_{d}(T_{d})-x_{d}(T_{b})\right) +x_{d} (T_{b})\theta _{d}\mu _{0}(x_{e}(0)-x_{d}(T_{d}))\right\} \end{aligned}$$
It is straightforward that the terms on the first and third lines are positive. Let us look at the term on the second line:
$$\begin{aligned}&y_{1}\rho \big \{x_{d}(T_{d})\theta _{d}\lambda _{0}^{2}+x_{e}(0)\theta _{e} ^{2}\mu _{0}(\lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0})\\&\qquad -\,x_{d}(T_{b})\theta _{d}(\lambda _{0}+\theta _{e}\mu _{0})(\lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0})\big \} \end{aligned}$$
Dividing by \(y_{1}\rho >0\), it has the sign of:
$$\begin{aligned}&\lambda _{0}^{2}(\theta _{d}x_{d}(T_{d})-\theta _{d}x_{d}(T_{b}))\\&\quad +\lambda _{0}\mu _{0}(\theta _{e}^{2}x_{e}(0)+\theta _{d}^{2}x_{d} (T_{b})-2\theta _{e}\theta _{d}x_{d}(T_{b}))\\&\quad +\mu _{0}^{2}\theta _{e}(\theta _{e}^{2}x_{e}(0)+\theta _{d}^{2}x_{d} (T_{b})-\theta _{e}\theta _{d}x_{d}(T_{b})-\theta _{e}\theta _{d}x_{e}(0)) \end{aligned}$$
It is straightforward that \(\lambda _{0}^{2}(\theta _{d}x_{d}(T_{d})-\theta _{d}x_{d}(T_{b}))>0\). Moreover
$$\begin{aligned} \lambda _{0}\mu _{0}(\theta _{e}^{2}x_{e}(0)+\theta _{d}^{2}x_{d}(T_{b} )-2\theta _{e}\theta _{d}x_{d}(T_{b}))=\lambda _{0}\mu _{0}x_{d}(T_{b})(\theta _{d}-\theta _{e})^{2}+\lambda _{0}\mu _{0}\theta _{e}^{2}(x_{e}(0)-x_{d}(T_{b})) \end{aligned}$$
(38)
and
$$\begin{aligned} \mu _{0}^{2}\theta _{e}(\theta _{e}^{2}x_{e}(0)+\theta _{d}^{2}x_{d}(T_{b} )-\theta _{e}\theta _{d}x_{d}(T_{b})-\theta _{e}\theta _{d}x_{e}(0))=\mu _{0} ^{2}\theta _{e}(\theta _{d}-\theta _{e})(\theta _{d}x_{d}(T_{b})-\theta _{e} x_{e}(0)) \end{aligned}$$
(39)
so that, regrouping the last two terms (38) and (39), one gets:
$$\begin{aligned}&\lambda _{0}\mu _{0}\left( \theta _{e}^{2}x_{e}(0)+\theta _{d}^{2}x_{d} (T_{b})-2\theta _{e}\theta _{d}x_{d}(T_{b})\right) \\&\qquad +\mu _{0}^{2}\theta _{e}\left( \theta _{e}^{2}x_{e}(0)+\theta _{d}^{2}x_{d}(T_{b})-\theta _{e} \theta _{d}x_{d}(T_{b})-\theta _{e}\theta _{d}x_{e}(0)\right) \\&\quad =\lambda _{0}\mu _{0}x_{d}(T_{b})(\theta _{d}-\theta _{e})^{2}+\lambda _{0} \mu _{0}\theta _{e}^{2}(x_{e}(0)-x_{d}(T_{b}))\\&\qquad +\mu _{0}^{2}\theta _{e}(\theta _{d}-\theta _{e})(\theta _{d}x_{d}(T_{b})-\theta _{e}x_{e}(0))\\&\quad =\lambda _{0}\mu _{0}x_{d}(T_{b})(\theta _{d}-\theta _{e})^{2}+\lambda _{0} \mu _{0}\theta _{e}^{2}(x_{e}(0)-x_{d}(T_{b}))\\&\qquad +\mu _{0}^{2}\theta _{e}(\theta _{d}-\theta _{e})((\theta _{d}-\theta _{e})x_{d}(T_{b})-\theta _{e}(x_{e} (0)-x_{d}(T_{b})))\\&\quad =\lambda _{0}\mu _{0}x_{d}(T_{b})(\theta _{d}-\theta _{e})^{2}+\lambda _{0} \mu _{0}\theta _{e}^{2}(x_{e}(0)-x_{d}(T_{b}))+\mu _{0}^{2}\theta _{e}(\theta _{d}-\theta _{e})^{2}x_{d}(T_{b})\\&\qquad -\mu _{0}^{2}\theta _{e}^{2}(\theta _{d} -\theta _{e})(x_{e}(0)-x_{d}(T_{b}))\\&\quad =\mu _{0}x_{d}(T_{b})(\theta _{d}-\theta _{e})^{2}(\lambda _{0}+\theta _{e} \mu _{0})+\mu _{0}\theta _{e}^{2}(x_{e}(0)-x_{d}(T_{b}))(\lambda _{0}+\mu _{0}(\theta _{e}-\theta _{d})) \end{aligned}$$
which is positive. As a result:
$$\begin{aligned} \det A<0 \end{aligned}$$
We also obtain:
$$\begin{aligned} A^{-1}\times & {} \left( \begin{array}{c} 0\\ \frac{1}{\theta _{d}}\\ 0\\ 0\\ 0 \end{array} \right) =\frac{1}{\theta _{d}(\lambda _{0}+\theta _{e}\mu _{0})\det A}\\&\left( \begin{array}{c} y_{1}\left[ E^{\prime \prime }(X_{e})(x_{e}(0)-x_{d}(T_{d}))\theta _{d} +\rho \left( \theta _{d}-\theta _{e}\right) (\lambda _{0}+\theta _{e}\mu _{0})\right] /\rho \\ -x_{d}(T_{b})\theta _{d}\left[ \rho (\lambda _{0}+\theta _{e}\mu _{0})\left( \lambda _{0}+\left( \theta _{e}-\theta _{d}\right) \mu _{0}\right) +y_{2}\right] \\ y_{1}\left[ x_{d}(T_{d})\theta _{d}\lambda _{0}-x_{e}(0)\theta _{e}\left( \lambda _{0}+\left( \theta _{e}-\theta _{d}\right) \mu _{0}\right) \right] \\ y_{1}E^{\prime \prime }(X_{e})\left[ x_{d}(T_{d})\theta _{d}\lambda _{0} -x_{e}(0)\theta _{e}\left( \lambda _{0}+\left( \theta _{e}-\theta _{d}\right) \mu _{0}\right) \right] \\ y_{1}\left[ \rho (\lambda _{0}+\theta _{e}\mu _{0})\left( \lambda _{0}+\left( \theta _{e}-\theta _{d}\right) \mu _{0}\right) +y_{2}\right] \end{array} \right) \end{aligned}$$
As \(\det A<0,\) we deduce:
$$\begin{aligned} \frac{\partial T_{d}}{\partial \overline{Z}}<0,\quad \frac{\partial T_{b} }{\partial \overline{Z}}>0,\quad \frac{\partial X_{e}}{\partial \overline{Z} }\text { ambiguous},\quad \frac{\partial \lambda _{0}}{\partial \overline{Z}}\text { ambiguous},\quad \frac{\partial \mu _{0}}{\partial \overline{Z}}<0 \end{aligned}$$
\(\frac{\partial X_{e}}{\partial \overline{Z}}\) and \(\frac{\partial \lambda _{0} }{\partial \overline{Z}}\) have the same sign as \(x_{e}(0)\theta _{e}\left( \lambda _{0}+\left( \theta _{e}-\theta _{d}\right) \mu _{0}\right) -x_{d} (T_{d})\theta _{d}\lambda _{0}.\) It is negative when \(\theta _{e}=0,\) and positive when \(\theta _{e}=\theta _{d}.\)
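The two limiting cases can be checked directly by evaluating the expression above:
$$\begin{aligned} \theta _{e}=0:&\quad -x_{d}(T_{d})\theta _{d}\lambda _{0}<0\\ \theta _{e}=\theta _{d}:&\quad x_{e}(0)\theta _{d}\lambda _{0}-x_{d}(T_{d})\theta _{d}\lambda _{0}=\theta _{d}\lambda _{0}\left( x_{e}(0)-x_{d}(T_{d})\right) >0 \end{aligned}$$
The second inequality uses \(x_{e}(0)>x_{e}(T_{d})=x_{d}(T_{d})\), which holds because the price of shale gas rises between dates 0 and \(T_{d}\) and demand is decreasing in price.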
D Low Price Elasticity of Demand
Step 1. Expenditure pD(p) is continuous and increasing in p. By Lagrange's mean value theorem, denoting \(p_{T_{b}}\equiv p(T_{b})\) and \(x_{T_{b}}=D(p(T_{b}))\), there exists a price \(p_{i}\in ]c_{b},p_{T_{b}}[\) such that:
$$\begin{aligned} p_{T_{b}}x_{T_{b}}=c_{b}x_{b}+(D(p_{i})+p_{i}D^{\prime }(p_{i}))(p_{T_{b} }-c_{b}) \end{aligned}$$
The elasticity of demand at price \(p_{i}\) is \(\epsilon _{i}=-\frac{p_{i}D^{\prime }(p_{i})}{D(p_{i})}\) so that the above equation can be rewritten as:
$$\begin{aligned} \frac{x_{T_{b}}}{D(p_{i})}=\frac{c_{b}x_{b}}{p_{T_{b}}D(p_{i})}+(1-\epsilon _{i})\left( 1-\frac{c_{b}}{p_{T_{b}}}\right) \end{aligned}$$
or equivalently:
$$\begin{aligned} \frac{x_{T_{b}}}{D(p_{i})}-1=\frac{c_{b}}{p_{T_{b}}}\left( \frac{x_{b} }{D(p_{i})}-1\right) -\epsilon _{i}\left( 1-\frac{c_{b}}{p_{T_{b}}}\right) \end{aligned}$$
As \(\frac{x_{T_{b}}}{D(p_{i})}-1<0\) and \(\frac{c_{b}}{p_{T_{b}}}\left( \frac{x_{b}}{D(p_{i})}-1\right) >0\), denoting \(\epsilon =\max _{i}(\epsilon _{i})\), it follows that:
$$\begin{aligned} \frac{x_{T_{b}}}{D(p_{i})}-1&=O(\epsilon ) \end{aligned}$$
(40)
$$\begin{aligned} \frac{x_{b}}{D(p_{i})}-1&=O(\epsilon )\frac{p_{T_{b}}}{c_{b}} \end{aligned}$$
(41)
Similarly, applying Lagrange's mean value theorem between prices \(c_{e}\) and \(c_{b}\), one gets, with \(p_{j}\in ]c_{e},c_{b}[\):
$$\begin{aligned} \frac{x_{b}}{D(p_{j})}-1&=O(\epsilon ) \end{aligned}$$
(42)
$$\begin{aligned} \frac{x_{c_{e}}}{D(p_{j})}-1&=O(\epsilon )\frac{c_{b}}{c_{e}} \end{aligned}$$
(43)
Hence, if the price elasticity of demand is such that \(\epsilon \frac{c_{b}}{c_{e}}=O(\zeta )\), then \(\frac{x_{b}}{D(c_{e})}-1=O(\zeta )\).
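To illustrate the order of magnitude involved, consider a hypothetical isoelastic demand \(D(p)=ap^{-\epsilon }\) with \(a>0\) and \(0<\epsilon <1\). This functional form is an assumption made here for illustration and is not required by the argument. Expenditure \(pD(p)=ap^{1-\epsilon }\) is then increasing in p, the elasticity is constant and equal to \(\epsilon \), and, with \(x_{b}=D(c_{b})\):
$$\begin{aligned} \frac{x_{b}}{D(c_{e})}=\left( \frac{c_{b}}{c_{e}}\right) ^{-\epsilon }=e^{-\epsilon \ln (c_{b}/c_{e})}=1-\epsilon \ln \left( \frac{c_{b}}{c_{e}}\right) +O(\epsilon ^{2}) \end{aligned}$$
Since \(0<\ln (c_{b}/c_{e})\le \frac{c_{b}}{c_{e}}-1\) for \(c_{b}>c_{e}\), the deviation from 1 is at most of order \(\epsilon \frac{c_{b}}{c_{e}}\) in this example, in line with the bound above.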
Step 2. Recall that:
$$\begin{aligned} (u(x_{b})-c_{b}x_{b})-(u(x_{T_{b}})-p_{T_{b}}x_{T_{b}})=\rho F(T_{b} )-F^{\prime }(T_{b}) \end{aligned}$$
(44)
\(\rho F(T_{b})-F^{\prime }(T_{b})\) is decreasing with \(T_{b}\) as \(F^{\prime \prime }>0,\) so that \(\rho F(T_{b})-F^{\prime }(T_{b})<\rho F\left( \frac{\overline{Z}}{\theta _{d}x_{c_{e}}}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}x_{c_{e}}}\right) .\) Using Eq. (43), it follows that \(\forall c_{b},c_{e}\), there exists \(\epsilon \) such that \(\rho F(T_{b})-F^{\prime }(T_{b})\le \rho F\left( \frac{\overline{Z} }{\theta _{d}x_{b}}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) \). Equation (44) thus implies that:
$$\begin{aligned} (u(x_{b})-c_{b}x_{b})-(u(x_{T_{b}})-p_{T_{b}}x_{T_{b}})\le \rho F\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) \end{aligned}$$
so that
$$\begin{aligned} 0\le p_{T_{b}}x_{T_{b}}-c_{b}x_{b}\le \rho F\left( \frac{\overline{Z} }{\theta _{d}x_{b}}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) \end{aligned}$$
so that
$$\begin{aligned} 0\le \frac{p_{T_{b}}x_{T_{b}}}{c_{b}x_{b}}-1\le \frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) }{c_{b}x_{b}} \end{aligned}$$
and thus
$$\begin{aligned} 1\le \frac{p_{T_{b}}}{c_{b}}\le \left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) -F^{\prime }\left( \frac{\overline{Z} }{\theta _{d}x_{b}}\right) }{c_{b}x_{b}}\right] \frac{x_{b}}{x_{T_{b}}} \end{aligned}$$
Substituting the inequality above into Eq. (41), it follows that:
$$\begin{aligned} \frac{x_{b}}{D(p_{i})}-1\le O(\epsilon )\left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) }{c_{b}x_{b}}\right] \frac{x_{b} }{x_{T_{b}}} \end{aligned}$$
which can be rewritten, multiplying both sides by \(\frac{x_{T_{b}}}{x_{b}}\):
$$\begin{aligned} \frac{x_{T_{b}}}{D(p_{i})}-\frac{x_{T_{b}}}{x_{b}}\le O(\epsilon )\left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) }{c_{b}x_{b}}\right] \end{aligned}$$
For an arbitrarily small \(\zeta \), one can find \(\epsilon \) such that \(\epsilon \left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}x_{b} }\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) }{c_{b}x_{b}}\right] \le \zeta \). As a result, if \(\epsilon \left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}x_{b}}\right) }{c_{b}x_{b}}\right] =O(\zeta )\), then, using Eq. (40): \(\frac{x_{T_{b}}}{x_{b}} =\frac{x_{T_{b}}}{D(p_{i})}+O(\zeta )=1+O(\zeta )\).
Hence, \(\forall \zeta ,c_{e},c_{b},x_{b},\overline{Z}\), there exists \(\epsilon \) such that, if the elasticity of demand is always below \(\epsilon \), then \(\forall p\in [c_{e},p_{T_{b}}]\):
$$\begin{aligned} \frac{x_{p}}{x_{e}}=1+O(\zeta ) \end{aligned}$$
For a small local damage, we have shown in Appendix C.2 that \(\frac{dX_{e}}{d\overline{Z}}\) has the sign of \(x_{e}(0)\theta _{e}\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) -x_{d}(T_{d})\theta _{d}\lambda _{0}\). Since, for a sufficiently low elasticity of demand, \(x_{e}(0)=x_{d} (T_{d})+O(x_{e}(0)\zeta )\), it follows that \(\frac{dX_{e}}{d\overline{Z}}\) has the sign of \(-x_{e}(0)((\theta _{d}-\theta _{e})(\lambda _{0}+\theta _{e}\mu _{0})+O(\zeta \theta _{d}\lambda _{0}))<0\).
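The leading term of this sign can be verified by setting \(x_{d}(T_{d})=x_{e}(0)\) in the expression from Appendix C.2, \(x_{e}(0)\theta _{e}\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) -x_{d}(T_{d})\theta _{d}\lambda _{0}\):
$$\begin{aligned} x_{e}(0)\left[ \theta _{e}\left( \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right) -\theta _{d}\lambda _{0}\right]&=x_{e}(0)\left[ -(\theta _{d}-\theta _{e})\lambda _{0}-\theta _{e}(\theta _{d}-\theta _{e})\mu _{0}\right] \\&=-x_{e}(0)(\theta _{d}-\theta _{e})(\lambda _{0}+\theta _{e}\mu _{0}) \end{aligned}$$
which is negative as long as \(\theta _{d}>\theta _{e}\).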
E Cheaper Renewable Energy
E.1 Large Local Damage
Using the same steps as above one can define:
$$\begin{aligned} A\times \left( \begin{array}{c} dT_{e}\\ dX_{e}\\ d\lambda _{0}\\ d\mu _{0} \end{array} \right) =\left( \begin{array}{c} -x_e(T_b)\\ 0\\ 0\\ 0 \end{array} \right) dT_b \end{aligned}$$
with
$$\begin{aligned} A=\left( \begin{array}{ccccc} -x_{e}(T_{e}) &{} -1 &{} \frac{x_{e}(T_{b})-x_{e}(T_{e})}{(\lambda _{0}+\theta _{e}\mu _{0})\rho } &{} \theta _{e}\frac{x_{e}(T_{b} )-x_{e}(T_{e})}{(\lambda _{0}+\theta _{e}\mu _{0})\rho }\\ x_{e}(T_{e}) &{} \frac{\theta _{e}}{\theta _{d}} &{} 0 &{} \frac{x_{e} (T_{e})-x_{d}(0)}{\mu _{0}\rho }\\ \left[ \lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}\right] \rho &{} 0 &{} 1 &{} \theta _{e}-\theta _{d}\\ 0 &{} -z_{2} &{} 1 &{} 0 \end{array} \right) \end{aligned}$$
where
\(z_{2} =E^{\prime \prime }(X_{e})>0\)
Hence:
\(\frac{dX_e}{dT_b}\) has the sign of \(x_e(T_e)\lambda _0 + x_d(0)(\theta _d \mu _0-(\lambda _0+\theta _e \mu _0))>0\)
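The positivity of this expression can be seen term by term. In the large local damage case, coal and gas prices are equal at the switching date \(T_{e}\), so that, using the price paths \(p_{d}(t)=c_{d}+\theta _{d}\mu _{0}e^{\rho t}\) and \(p_{e}(t)=c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho t}\):
$$\begin{aligned} \theta _{d}\mu _{0}-(\lambda _{0}+\theta _{e}\mu _{0})=(c_{e}+d-c_{d})e^{-\rho T_{e}}>0 \end{aligned}$$
Both \(x_{e}(T_{e})\lambda _{0}\) and \(x_{d}(0)(\theta _{d}\mu _{0}-(\lambda _{0}+\theta _{e}\mu _{0}))\) are therefore positive.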
E.2 Small Local Damage
$$\begin{aligned} A\times \left( \begin{array}{c} dT_{d}\\ dX_{e}\\ d\lambda _{0}\\ d\mu _{0} \end{array} \right) =\left( \begin{array}{c} 0\\ -x_{d}(T_{b})\\ 0\\ 0 \end{array} \right) dT_b \end{aligned}$$
with
$$\begin{aligned} A=\left( \begin{array}{ccccc} x_{d}(T_{d}) &{} -1 &{} \frac{x_{d}(T_{d})-x_{e}(0)}{(\lambda _{0}+\theta _{e}\mu _{0})\rho } &{} \theta _{e}\frac{x_{d}(T_{d})-x_{e}(0)}{(\lambda _{0} +\theta _{e}\mu _{0})\rho }\\ -x_{d}(T_{d}) &{} \frac{\theta _{e}}{\theta _{d}} &{} 0 &{} \frac{x_{d}(T_{b})-x_{d}(T_{d})}{\mu _{0}\rho }\\ \left[ -\theta _{d}\mu _{0}+(\lambda _{0}+\theta _{e}\mu _{0})\right] \rho &{} 0 &{} 1 &{} -(\theta _{d}-\theta _{e})\\ 0 &{} -E^{\prime \prime }(X_{e}) &{} 1 &{} 0 \end{array} \right) \end{aligned}$$
Hence:
\(\frac{dX_e}{dT_b}\) has the sign of \(\theta _dx_d(T_b)\lambda _0 + \theta _ex_e(0)(\theta _d \mu _0-(\lambda _0+\theta _e \mu _0))\)
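Taking the expression above as given, its two terms can be signed separately. In the small local damage case, Appendix C.2 shows that \(\lambda _{0}+(\theta _{e}-\theta _{d})\mu _{0}>0\), hence:
$$\begin{aligned} \underbrace{\theta _{d}x_{d}(T_{b})\lambda _{0}}_{>0}+\underbrace{\theta _{e}x_{e}(0)\left( \theta _{d}\mu _{0}-(\lambda _{0}+\theta _{e}\mu _{0})\right) }_{\le 0} \end{aligned}$$
The overall sign thus depends on the relative size of the two terms. In the limiting case \(\theta _{e}=0\), only the first term remains and the expression is positive.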
F Switch to Clean Energy in the Moratorium Case
Using the envelope theorem:
$$\begin{aligned} \frac{\partial \widetilde{V}(T_{b})}{\partial T_{b}}e^{\rho T_{b}}=\left[ u\left( \widetilde{x}_{d}(T_{b})\right) -(c_{d}+\theta _{d}\widetilde{\mu }_{0}e^{\rho T_{b}})\widetilde{x}_{d}(T_{b})\right] -\pi _{b}-(F^{\prime }(T_{b})-\rho F(T_{b}))\quad \end{aligned}$$
(45)
F.1 Large Local Damage
To avoid confusion between the optimum and the moratorium case, let us denote in this Appendix by \(x_{e}^{*}(t)\) the optimal extraction of shale gas, by \({\mu }_{0}^{*}\) the optimal initial shadow price of carbon and by \(\lambda _{0}^{*}\) the optimal initial scarcity rent of shale gas (for a date of the switch to solar \(T_{b}^{*}\), by definition of the optimum). Recall that the variables in the moratorium case are denoted with a tilde; for instance, \(\widetilde{{\mu }}_{0,T_{b}}\) denotes the initial shadow price of carbon in the moratorium case for a date of the switch to solar \(T_{b}\).
According to Eq. (17), in the case of a large local damage, \(T_{b}^{*}\) is such that:
$$\begin{aligned} \pi _{b}-\left[ u\left( x_{e}^{*}(T_{b}^{*})\right) -(c_{e} +d+(\lambda _{0}^{*}+\theta _{e}\mu _{0}^{*})e^{\rho T_{b}^{*}} )x_{e}^{*}(T_{b}^{*})\right] =\rho F(T_{b}^{*})-F^{\prime }({T} _{b}^{*}) \end{aligned}$$
(46)
Substituting Eq. (46) into Eq. (45) gives:
$$\begin{aligned} \left. \frac{\partial V(T_{b})}{\partial T_{b}}\right| _{T_{b}^{*} }e^{\rho T_{b}^{*}}&=\left[ u\left( \widetilde{x}_{d,T_{b}^{*} }(T_{b}^{*})\right) -(c_{d}+\theta _{d}\widetilde{\mu }_{0,T_{b}^{*} }e^{\rho T_{b}^{*}})\widetilde{x}_{d,T_{b}^{*}}(T_{b}^{*})\right] \\&-\left[ u\left( x_{e}^{*}(T_{b}^{*})\right) -(c_{e}+d+(\lambda _{0}^{*}+\theta _{e}\mu _{0}^{*})e^{\rho T_{b}^{*}})x_{e}^{*} (T_{b}^{*})\right] \end{aligned}$$
This expression is strictly positive if and only if:
$$\begin{aligned} c_{e}+d+(\lambda _{0}^{*}+\theta _{e}\mu _{0}^{*})e^{\rho T_{b}^{*} }>c_{d}+\theta _{d}\tilde{\mu }_{0,T_{b}^{*}}e^{\rho T_{b}^{*}} \end{aligned}$$
(47)
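The equivalence with (47) rests on the fact that consumer surplus is decreasing in the price. Writing \(S(p)\equiv u(D(p))-pD(p)\) (a notation introduced here for convenience), and using \(u^{\prime }(D(p))=p\):
$$\begin{aligned} S^{\prime }(p)=\left[ u^{\prime }(D(p))-p\right] D^{\prime }(p)-D(p)=-D(p)<0 \end{aligned}$$
The expression above is the difference between \(S\) evaluated at the moratorium coal price and \(S\) evaluated at the optimal gas price, both at date \(T_{b}^{*}\). It is therefore strictly positive if and only if the former price is strictly lower than the latter.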
At the date \(T_{e}^{*}\) of the switch from coal to gas at the optimum, the prices of coal and gas are equal. Hence:
$$\begin{aligned} c_{e}+d-c_{d}=\left( \theta _{d}{\mu }_{0}^{*}-(\lambda _{0}^{*} +\theta _{e}\mu _{0}^{*})\right) e^{\rho T_{e}^{*}}>0 \end{aligned}$$
(48)
As \(T_{b}^{*}>T_{e}^{*}\), it implies that \(c_{e}+d-c_{d}<\left( \theta _{d}{\mu }_{0}^{*}-(\lambda _{0}^{*}+\theta _{e}\mu _{0}^{*})\right) e^{\rho T_{b}^{*}}\). Using this inequality and assuming that inequality (47) holds, we get that
$$\begin{aligned} {\mu }_{0}^{*}>\widetilde{\mu }_{0,T_{b}^{*}} \end{aligned}$$
(49)
This last inequality implies that more coal is extracted between dates 0 and \(T_{e}^{*}\) in the moratorium case than in first best. Equality (48) and inequality (49) imply that
$$\begin{aligned} c_{e}+d+(\lambda _{0}^{*}+\theta _{e}\mu _{0}^{*})e^{\rho T_{e}^{*} }>c_{d}+\theta _{d}\widetilde{\mu }_{0,T_{b}^{*}}e^{\rho T_{e}^{*}} \end{aligned}$$
(50)
Together with Assumption (47), inequality (50) implies that for all t in \(\left[ T_{e}^{*},T_{b}^{*}\right] \), \(c_{e}+d+(\lambda _{0}^{*}+\theta _{e}\mu _{0}^{*})e^{\rho t}>c_{d}+\theta _{d}\widetilde{\mu }_{0,T_{b}^{*}}e^{\rho t}\). This last inequality implies that extraction of coal in the moratorium case is higher than extraction of gas at the optimum between dates \(T_{e}^{*}\) and \(T_{b}^{*}\). Since coal is more polluting than gas, pollution is also higher in the moratorium case than at the optimum between \(T_{e}^{*}\) and \(T_{b}^{*}\). There is a contradiction: emissions are higher at all dates in the moratorium case than at the optimum, which contradicts the fact that the ceiling \(\overline{Z}\) must not be violated in either case.
F.2 Small Local Damage
Call \(x_{d}^{*}(t)\) optimal extraction path of coal and \({\mu }_{0}^{*}\) the optimal shadow price of carbon. By definition \(T_{b}^{*}\) is such that (small local damage):
$$\begin{aligned} \pi _{b}-\left[ u\left( {x}_{d}^{*}({T}_{b}^{*})\right) -(c_{d} +\theta _{d}{\mu }_{0}^{*}e^{\rho {T}_{b}^{*}}){x}_{d}^{*}(T_{b}^{*})\right] =\rho F(T_{b}^{*})-F^{\prime }({T}_{b}^{*}) \end{aligned}$$
(51)
As a result, substituting (51) into (45) gives:
$$\begin{aligned} \left. \frac{\partial V(T_{b},\overline{Z})}{\partial T_{b}}\right| _{T_{b}^{*}}= & {} \left[ u\left( \widetilde{x}_{d,T_{b}^{*}}(T_{b}^{*})\right) -(c_{d}+\theta _{d}\widetilde{\mu }_{0,T_{b}^{*}}e^{\rho T_{b}^{*}})\widetilde{x}_{d,T_{b}^{*}}(T_{b}^{*})\right] \\&-\left[ u\left( {x}_{d}^{*}({T}_{b}^{*})\right) -(c_{d}+\theta _{d}{\mu } _{0}^{*}e^{\rho {T}_{b}^{*}}){x}_{d}^{*}(T_{b}^{*})\right] \end{aligned}$$
so that:
$$\begin{aligned} \left. \frac{\partial V(T_{b},\overline{Z})}{\partial T_{b}}\right| _{T_{b}^{*}}>0\Leftrightarrow \widetilde{\mu }_{0,T_{b}^{*}}<{\mu } _{0}^{*} \end{aligned}$$
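The equivalence rests on the net surplus \(u(D(p))-pD(p)\) falling as the consumer price \(p\) rises. The following is a minimal numerical sketch of that sign argument, assuming a linear demand \(D(p)=a-p\); the demand form, the function names and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def net_surplus(p, a=10.0):
    """u(D(p)) - p*D(p) for the hypothetical linear demand D(p) = a - p,
    i.e. u(x) = a*x - x**2/2, valid for 0 <= p <= a."""
    x = a - p
    return (a * x - 0.5 * x * x) - p * x

def marginal_value_sign(mu_tilde, mu_star, c_d=1.0, theta_d=1.0, rho=0.05, T_b=20.0):
    """Moratorium surplus minus optimum surplus at date T_b.

    mu_tilde and mu_star are the initial shadow prices of carbon in the
    moratorium case and at the optimum (illustrative values only)."""
    growth = math.exp(rho * T_b)
    p_tilde = c_d + theta_d * mu_tilde * growth  # final coal price, moratorium
    p_star = c_d + theta_d * mu_star * growth    # final coal price, optimum
    return net_surplus(p_tilde) - net_surplus(p_star)
```

Under these assumptions, `marginal_value_sign(0.5, 1.0)` is positive and `marginal_value_sign(1.0, 0.5)` is negative, in line with the stated equivalence.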
F.3 Low Elasticity of Demand
Assume that \(\widetilde{\mu }_{0,{T}_{b}^{*}}<\mu _{0}^{*}\). Then more coal is extracted between dates \(T_{d}^{*}\) and \(T_{b}^{*}\) in the moratorium case than in first best. Before \(T_{d}^{*}\), only gas is extracted in first best and only coal in the moratorium case. For the ceiling constraint to be satisfied in both cases, it must be the case that:
$$\begin{aligned} \theta _{e}\int _{0}^{T_{d}^{*}}D\left( c_{e}+d+(\lambda _{0}^{*} +\theta _{e}\mu _{0}^{*})e^{\rho t}\right) dt>\theta _{d}\int _{0} ^{T_{d}^{*}}D\left( c_{d}+\theta _{d}\widetilde{\mu }_{0,T_{b}^{*} }e^{\rho t}\right) dt \end{aligned}$$
(52)
We have:
$$\begin{aligned} D(c_{d})\ge D\left( c_{d}+\theta _{d}\widetilde{\mu }_{0,T_{b}^{*}}e^{\rho t}\right) \ge D\left( c_{d}+\theta _{d}\widetilde{\mu }_{0,T_{b}^{*} }e^{\rho T_{b}^{*}}\right) \equiv x(\widetilde{p}(T_{b}^{*})) \end{aligned}$$
Step 1. Expenditure pD(p) is continuous and increasing with p. By the mean value theorem (Lagrange's theorem), there exists a price \(p_{i}\in ]c_{b},\widetilde{p}({T_{b}^{*}})[\) such that:
$$\begin{aligned} \widetilde{p}({T_{b}^{*}})D(\widetilde{p}({T_{b}^{*}}))=c_{b} x_{b}+(D(p_{i})+p_{i}D^{\prime }(p_{i}))(\widetilde{p}({T_{b}^{*}})-c_{b}) \end{aligned}$$
The elasticity of demand at price \(p_{i}\) is \(\epsilon _{i}=-\frac{p_{i}D^{\prime }(p_{i})}{D(p_{i})}\) so that the above equation can be rewritten as:
$$\begin{aligned} \frac{D(\widetilde{p}({T_{b}^{*}}))}{D(p_{i})}=\frac{c_{b}x_{b}}{\widetilde{p}({T_{b}^{*}})D(p_{i})}+(1-\epsilon _{i})\left( 1-\frac{c_{b} }{\widetilde{p}({T_{b}^{*}})}\right) \end{aligned}$$
or:
$$\begin{aligned} \frac{D(\widetilde{p}({T_{b}^{*}}))}{D(p_{i})}-1=\frac{c_{b}}{\widetilde{p}({T_{b}^{*}})}\left( \frac{x_{b}}{D(p_{i})}-1\right) -\epsilon _{i}\left( 1-\frac{c_{b}}{\widetilde{p}({T_{b}^{*}})}\right) \end{aligned}$$
As \(\frac{D(\widetilde{p}({T_{b}^{*}}))}{D(p_{i})}-1<0\) and \(\frac{c_{b} }{\widetilde{p}({T_{b}^{*}})}\left( \frac{x_{b}}{D(p_{i})}-1\right) >0\), denoting \(\epsilon =\max _{i}(\epsilon _{i})\), it comes that:
$$\begin{aligned} -\epsilon<-\epsilon \left( 1-\frac{c_{b}}{\widetilde{p}({T_{b}^{*}} )}\right)<\frac{D(\widetilde{p}({T_{b}^{*}}))}{D(p_{i})}-1<0 \end{aligned}$$
so that:
$$\begin{aligned} \frac{D(\widetilde{p}({T_{b}^{*}}))}{D(p_{i})}-1&=O(\epsilon ) \end{aligned}$$
(53)
$$\begin{aligned} \frac{x_{b}}{D(p_{i})}-1&=O(\epsilon )\frac{\widetilde{p}({T_{b}^{*}} )}{c_{b}} \end{aligned}$$
(54)
Similarly, applying the mean value theorem between prices \(c_{d}\) and \(c_{b}\), one gets, with \(p_{j}\in ]c_{d},c_{b}[\):
$$\begin{aligned} \frac{x_{b}}{D(p_{j})}-1&=O(\epsilon ) \end{aligned}$$
(55)
$$\begin{aligned} \frac{D({c_{d})}}{D(p_{j})}-1&=O(\epsilon )\frac{c_{b}}{c_{d}} \end{aligned}$$
(56)
So that for any arbitrarily small \(\zeta \), if the price elasticity of demand is such that \(\epsilon \frac{c_{b}}{c_{d}}=O(\zeta )\), then:
$$\begin{aligned} \frac{x_{b}}{D(c_{d})}-1=O(\zeta ) \end{aligned}$$
(57)
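Step 1 can be checked numerically. The sketch below assumes an isoelastic demand \(D(p)=p^{-\epsilon }\) (a hypothetical functional form chosen because the mean value price \(p_{i}\) then has a closed form); it verifies the rearranged identity that precedes Eq. (53) and illustrates that demand ratios approach 1 as \(\epsilon \) becomes small. Function names and parameter values are illustrative assumptions.

```python
import math

def demand(p, eps):
    """Hypothetical isoelastic demand D(p) = p**(-eps), constant elasticity eps."""
    return p ** (-eps)

def mean_value_price(c_b, p_tilde, eps):
    """Price p_i in (c_b, p_tilde) at which the mean value theorem holds for
    the expenditure p*D(p) = p**(1-eps); closed form for isoelastic demand."""
    slope = (p_tilde ** (1 - eps) - c_b ** (1 - eps)) / (p_tilde - c_b)
    # Derivative of expenditure is (1-eps) * p**(-eps); invert it for p.
    return (slope / (1 - eps)) ** (-1 / eps)

def identity_gap(c_b, p_tilde, eps):
    """Left side minus right side of the identity
    D(p~)/D(p_i) - 1 = (c_b/p~)(x_b/D(p_i) - 1) - eps (1 - c_b/p~)."""
    p_i = mean_value_price(c_b, p_tilde, eps)
    x_b = demand(c_b, eps)
    lhs = demand(p_tilde, eps) / demand(p_i, eps) - 1
    rhs = (c_b / p_tilde) * (x_b / demand(p_i, eps) - 1) - eps * (1 - c_b / p_tilde)
    return lhs - rhs
```

With these assumptions, `identity_gap(2.0, 3.0, 0.1)` is zero up to floating-point error, and `demand(3.0, 0.01) / demand(1.0, 0.01)` differs from 1 by roughly `0.01 * math.log(3.0)`.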
Step 2. Recall that:
$$\begin{aligned} (u(x_{b})-c_{b}x_{b})-\left[ u(D(\widetilde{p}(T_{b}^{*})))-\widetilde{p}(T_{b}^{*})D(\widetilde{p}(T_{b}^{*}))\right] =\rho F(T_{b}^{*})-F^{\prime }(T_{b}^{*}) \end{aligned}$$
(58)
But \(\rho F(T_{b}^{*})-F^{\prime }(T_{b}^{*})\) is decreasing with \(T_{b}^{*}\) (as \(F^{\prime \prime }>0\)), so that \(\rho F(T_{b}^{*})-F^{\prime }(T_{b}^{*})\le \rho F\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d} D(c_{d})}\right) \). Equation (58) thus implies that:
$$\begin{aligned} (u(x_{b})-c_{b}x_{b})-(u(D(\widetilde{p}(T_{b}^{*}))-\widetilde{p} (T_{b}^{*})D(\widetilde{p}(T_{b}^{*}))\le \rho F\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) \end{aligned}$$
so that:
$$\begin{aligned} 0\le \widetilde{p}(T_{b}^{*})D(\widetilde{p}(T_{b}^{*}))-c_{b}x_{b} \le \rho F\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) \end{aligned}$$
so that:
$$\begin{aligned} 0\le \frac{\widetilde{p}(T_{b}^{*})D(\widetilde{p}(T_{b}^{*}))}{c_{b}x_{b}}-1\le \frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}D(c_{d} )}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) }{c_{b}x_{b}} \end{aligned}$$
and thus:
$$\begin{aligned} 1\le \frac{\widetilde{p}(T_{b}^{*})}{c_{b}}\le \left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) }{c_{b}x_{b}}\right] \frac{x_{b}}{D(\widetilde{p}(T_{b}^{*}))} \end{aligned}$$
Substituting the equation above in Eq. (54), it comes that:
$$\begin{aligned} \frac{x_{b}}{D(p_{i})}-1\le O(\epsilon )\left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) }{c_{b}x_{b}}\right] \frac{x_{b}}{D(\widetilde{p}(T_{b}^{*}))} \end{aligned}$$
which implies, multiplying both sides by \(\frac{D(\widetilde{p}(T_{b}^{*}))}{x_{b}}\):
$$\begin{aligned} \frac{D{(\widetilde{p}(T_{b}^{*}))}}{D(p_{i})}-\frac{D{(\widetilde{p} (T_{b}^{*}))}}{x_{b}}\le O(\epsilon )\left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) }{c_{d}D(c_{d})}\right] \end{aligned}$$
For an arbitrarily small \(\zeta \), one can find \(\epsilon \) such that \(\epsilon \left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}D(c_{d} )}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) }{c_{d}D(c_{d})}\right] \le \zeta \). As a result, if \(\epsilon \left[ 1+\frac{\rho F\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) -F^{\prime }\left( \frac{\overline{Z}}{\theta _{d}D(c_{d})}\right) }{c_{d}D(c_{d})}\right] =O(\zeta )\), then, using Eq. (53):
$$\begin{aligned} \frac{D{(\widetilde{p}(T_{b}^{*}))}}{x_{b}}=\frac{D{(\widetilde{p} (T_{b}^{*}))}}{D(p_{i})}+O(\zeta )=1+O(\zeta ) \end{aligned}$$
so that, \(\forall \zeta ,c_{d},c_{b},D(c_{d}),\overline{Z}\), there exists \(\epsilon \) such that, if the elasticity of demand is always below \(\epsilon \), then \(\forall p\in [c_{d},\widetilde{p}(T_{b}^{*})]\):
$$\begin{aligned} \frac{D(p)}{D(c_{d})}=1+O(\zeta ) \end{aligned}$$
The exact same reasoning gives that: \(\forall p\in [c_{e},{p} (T_{b}^{*})]\):
$$\begin{aligned} \frac{D(p)}{D(c_{d})}=1+O(\zeta ) \end{aligned}$$
A necessary condition for inequality (52) to hold is that:
$$\begin{aligned} \theta _{e}D\left( c_{e}+d+(\lambda _{0}^{*}+\theta _{e}\mu _{0}^{*})e^{\rho T_{b}^{*}}\right) >\theta _{d}D(c_{d}) \end{aligned}$$
But for any arbitrarily small \(\zeta \), if \(\epsilon \) small enough,
$$\begin{aligned} \theta _{e}D(c_{e}+d+(\lambda _{0}^{*}+\theta _{e}\mu _{0}^{*})e^{\rho T_{b}^{*}})=\theta _{e}D(c_{d})+O(\zeta ) \end{aligned}$$
For \(\theta _{d}>\theta _{e}\), one can choose \(\zeta \) such that \(\theta _{e}D(c_{d})+O(\zeta )<\theta _{d}D(c_{d})\) so that inequality (52) does not hold for \(\epsilon \) small enough.
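The role of the elasticity in this last step can be illustrated numerically. The sketch below again assumes an isoelastic demand \(D(p)=p^{-\epsilon }\) and treats the gas price as a single hypothetical number `p_gas`; both are assumptions made for illustration only.

```python
def demand(p, eps):
    """Hypothetical isoelastic demand D(p) = p**(-eps)."""
    return p ** (-eps)

def necessary_condition_holds(theta_e, theta_d, p_gas, c_d, eps):
    """Check theta_e * D(p_gas) > theta_d * D(c_d), the necessary condition
    for inequality (52), at an illustrative gas price p_gas."""
    return theta_e * demand(p_gas, eps) > theta_d * demand(c_d, eps)
```

For example, with `theta_e=0.5`, `theta_d=1.0`, `p_gas=0.5` and `c_d=1.0`, the condition holds for an elastic demand (`eps=2.0`) but fails for an inelastic one (`eps=0.05`): when demand barely responds to price, the lower emission intensity of gas cannot be offset by higher gas consumption.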
G Optimal Ceiling: Moratorium Versus Optimum
We consider the case of a large local damage.
For ease of notation, call \(p_{b}=\widetilde{p}(\widetilde{T}_{b})\) the optimal final price of coal under the moratorium constraint. We first show that if this final price is given, and \(X_{e}\) is exogenous, the value of \(\mu _{0}\) decreases with \(X_{e}\). At \(X_{e},p_{b}\) given, the price path is defined by the following system of equations:
$$\begin{aligned} \theta _{d}\int _{0}^{T_{e}}D(c_{d}+\theta _{d}\mu _{0}e^{\rho t})dt&=\overline{Z}-Z_{0}-\theta _{e}X_{e} \end{aligned}$$
(59)
$$\begin{aligned} \int _{T_{e}}^{T_{b}}D(c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho t})dt&=X_{e}\end{aligned}$$
(60)
$$\begin{aligned} c_{d}+\theta _{d}\mu _{0}e^{\rho T_{e}}&=c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{e}}\end{aligned}$$
(61)
$$\begin{aligned} c_{e}+d+(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{b}}&=p_{b} \end{aligned}$$
(62)
For ease of notation, we denote \(p_{T_{e}}=c_{e}+d+(\lambda _{0}+\theta _{e} \mu _{0})e^{\rho T_{e}}\) and \(p_{0}=c_{d}+\theta _{d}\mu _{0}\). We make the substitution \(u=T_{b}-t\) in the integral of Eq. (60). We get:
$$\begin{aligned} \int _{0}^{T_{b}-T_{e}}D(c_{e}+d+(p_{b}-c_{e}-d)e^{-\rho u})du=X_{e} \end{aligned}$$
(63)
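The change of variables \(u=T_{b}-t\) maps the gas-phase integral over \([T_{e},T_{b}]\) onto \([0,T_{b}-T_{e}]\), with the price written backward from the final price \(p_{b}\). A minimal numerical check follows, assuming an isoelastic demand and arbitrary parameter values (both are illustrative assumptions; `rent` stands for \(\lambda _{0}+\theta _{e}\mu _{0}\)).

```python
import math

def demand(p, eps=0.5):
    """Hypothetical isoelastic demand D(p) = p**(-eps)."""
    return p ** (-eps)

def trapezoid(f, a, b, n=20000):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def gas_extraction_both_ways(c_e=1.0, d=0.5, rent=0.2, rho=0.05, T_e=5.0, T_b=25.0):
    """Cumulative gas extraction computed in calendar time t and in
    time-to-switch u = T_b - t; the two values should coincide."""
    p_b = c_e + d + rent * math.exp(rho * T_b)  # final gas price at T_b
    in_time = trapezoid(
        lambda t: demand(c_e + d + rent * math.exp(rho * t)), T_e, T_b)
    backward = trapezoid(
        lambda u: demand(c_e + d + (p_b - c_e - d) * math.exp(-rho * u)),
        0.0, T_b - T_e)
    return in_time, backward
```

Both returned values agree up to floating-point error, which is the content of the substitution: only \(p_{b}\) and the length \(T_{b}-T_{e}\) of the gas phase matter for cumulative gas extraction.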
Differentiating Eq. (63) gives:
$$\begin{aligned} d(T_{b}-T_{e})x(T_{e})=dX_{e} \end{aligned}$$
(64)
Equations (61) and (62) give \(p_{0}=c_{d}+(p_{T_{e} }-c_{d})e^{-\rho T_{e}},\) so that:
$$\begin{aligned} dp_{0}=dp_{T_{e}}e^{-\rho T_{e}}-\rho (p_{T_{e}}-c_{d})e^{-\rho T_{e}}dT_{e} \end{aligned}$$
so that:
$$\begin{aligned} \frac{dp_{0}e^{\rho T_{e}}}{\rho (p_{T_{e}}-c_{d})}=\frac{dp_{T_{e}}}{\rho (p_{T_{e}}-c_{d})}-dT_{e} \end{aligned}$$
(65)
We make the substitution \(u=T_{e}-t\) in the integral of Eq. (59). We get:
$$\begin{aligned} -\,\theta _{d}\int _{T_{e}}^{0}D(c_{d}+\theta _{d}\mu _{0}e^{\rho (T_{e} -u)})du=\overline{Z}-Z_{0}-\theta _{e}X_{e} \end{aligned}$$
which can be rewritten as:
$$\begin{aligned} \theta _{d}\int _{0}^{T_{e}}D(c_{d}+(p_{T_{e}}-c_{d})e^{-\rho u})du=\overline{Z}-Z_{0}-\theta _{e}X_{e} \end{aligned}$$
(66)
Differentiating Eq. (66), one gets:
$$\begin{aligned} \theta _{d}x(0)dT_{e}+\left( \theta _{d}\int _{0}^{T_{e}}\frac{dD(c_{d} +(p_{T_{e}}-c_{d})e^{-\rho u})}{dp}e^{-\rho u}du\right) dp_{T_{e} }=-\,\theta _{e}dX_{e} \end{aligned}$$
which, since the derivative of the price \(c_{d}+(p_{T_{e}}-c_{d})e^{-\rho u}\) with respect to u is \(-\rho (p_{T_{e}}-c_{d})e^{-\rho u}\), can be rewritten as:
$$\begin{aligned} \theta _{d}x(0)dT_{e}-\left( \theta _{d}\int _{0}^{T_{e}}\frac{dD(c_{d} +(p_{T_{e}}-c_{d})e^{-\rho u})}{du}du\right) \frac{dp_{T_{e}}}{\rho (p_{T_{e}}-c_{d})}=-\,\theta _{e}dX_{e} \end{aligned}$$
The integral equals \(x(0)-x(T_{e})\), which gives:
$$\begin{aligned} \theta _{d}x(0)\left[ dT_{e}-\frac{dp_{T_{e}}}{\rho (p_{T_{e}}-c_{d})}\right] +\frac{\theta _{d}x(T_{e})dp_{T_{e}}}{\rho (p_{T_{e}}-c_{d})}=-\,\theta _{e} dX_{e} \end{aligned}$$
(67)
Using Eqs. (65) and (67), we get:
$$\begin{aligned} \frac{\theta _{d}x(0)dp_{0}e^{\rho T_{e}}}{\rho (p_{T_{e}}-c_{d})}=\frac{\theta _{d} x(T_{e})dp_{T_{e}}}{\rho (p_{T_{e}}-c_{d})}+\theta _{e}dX_{e} \end{aligned}$$
(68)
But \(p_{T_{e}}=c_{e}+d+(p_{b}-c_{e}-d)e^{-\rho (T_{b}-T_{e})}\), so that, for a given \(p_{b}\):
$$\begin{aligned} dp_{T_{e}}=-(p_{b}-c_{e}-d)e^{-\rho (T_{b}-T_{e})}\rho d(T_{b}-T_{e}) \end{aligned}$$
Using Eq. (64):
$$\begin{aligned} dp_{T_{e}}=-(p_{b}-c_{e}-d)e^{-\rho (T_{b}-T_{e})}\rho \frac{dX_{e}}{x(T_{e} )} \end{aligned}$$
(69)
Since \((p_{b}-c_{e}-d)e^{-\rho (T_{b}-T_{e})}=(\lambda _{0}+\theta _{e}\mu _{0})e^{\rho T_{e}}\) and \(p_{T_{e}}-c_{d}=\theta _{d}\mu _{0}e^{\rho T_{e}}\), Eq. (68) can be rewritten as:
$$\begin{aligned} \frac{\theta _{d}x(0)dp_{0}e^{\rho T_{e}}}{\rho (p_{T_{e}}-c_{d})}= & {} \left[ -\frac{\theta _{d}(p_{b}-c_{e}-d)e^{-\rho (T_{b}-T_{e})}}{p_{T_{e}}-c_{d}}+\theta _{e}\right] dX_{e}=\left[ -\theta _{d}\frac{\lambda _{0}+\theta _{e} \mu _{0}}{\theta _{d}\mu _{0}}+\theta _{e}\right] dX_{e}\\= & {} -\frac{\lambda _{0} }{\mu _{0}}dX_{e} \end{aligned}$$
As a result \(\frac{dp_{0}}{dX_{e}}<0\), which gives that:
$$\begin{aligned} \frac{d\mu _{0}}{dX_{e}}<0 \end{aligned}$$
So that for a given final price \(p_{b}\), \(\mu _{0}\) is higher with a moratorium than without. This is a sufficient condition to prove that the initial shadow cost of pollution is higher in the moratorium case than at the optimum, as we showed that the final price in the moratorium case is in fact higher than in the optimum. As a result, the optimal damage is higher in the moratorium case than at the optimum.