1 Introduction

Mitigation of future climate change requires stringent curtailment of CO2 emissions, and carbon taxes have emerged as a tool of choice in numerous policy proposals (including the EU Green Deal) to implement these reductions (Carattini et al. 2019). Although emission reductions are their primary aim, few of the existing carbon taxes have been evaluated for their effectiveness in reducing emissions at observed tax rates. Existing studies that do look at the impact of taxes on emissions predominantly focus on single-sector emissions (such as transportation) rather than assessing the aggregate impact. However, the impact on aggregate emissions is an important metric since anthropogenic climate change is driven by cumulative CO2 emissions, mitigation of which requires aggregate reductions.

Here I assess the effectiveness of a carbon tax in reducing emissions by studying the impacts of the introduction of the first major carbon tax in North America: the carbon tax in British Columbia (BC), Canada, introduced in 2008. The tax has been widely hailed as a success, both in popular perception and in its impact on consumer demand for fossil fuels (including natural gas) and its lack of impact on economic output (Murray and Rivers 2015). Nevertheless, no formal empirical impact estimates assess whether the tax has been measurably effective in reducing aggregate CO2 emissions since its introduction. Existing studies that find large effects of carbon taxes in other countries tend to be overly optimistic in their assessment by focusing on individual sectors alone. Unlike other carbon taxes, such as the Swedish tax which predominantly affects transportation (and appears to have led to significant transport emission reductions, see Andersson 2019), the BC carbon tax lends itself to an assessment of the impact of the introduction of a carbon tax on aggregate emissions due to its broad coverage, including roughly 70% of greenhouse gas emissions (Murray and Rivers 2015), with comparatively few exemptions.Footnote 1

I study the emission response to the introduction of a carbon tax using two established approaches based on econometric policy analysis and then introduce a new econometric method to detect unknown interventions based on the detection of structural breaks. First, I assess the impact of the tax by imposing the tax as a known policy intervention in a panel difference-in-differences specification using province-level aggregated data, accounting for the small number of treated units using randomisation inference, as well as by using spatially-gridded emissions at a 1-degree spatial resolution. Second, I study the impacts of the introduction of the tax using synthetic control methods. Third, due to the myriad of potential interventions affecting CO2 emissions, I propose a new econometric approach that agnostically detects treatment interventions (such as the introduction of the carbon tax) based on the detection of breaks using machine learning in panel fixed effects difference-in-differences models.

Following the maximFootnote 2 that when the title is a question, the answer is ‘no’, I find that the effect of the introduction of a carbon tax is imprecisely estimated and that the tax has not ‘yet’ had a large and statistically significant impact on aggregate CO2 emissions. The qualifier ‘yet’ is justified because the tax has led to some significant sector-specific emission reductions. In particular, the tax is associated with an estimated long-run 19% decrease in transportation-related emissions including personal vehicle traffic (when using difference-in-differences), and an average reduction of 5% when using synthetic control. I estimate the (short-run) semi-elasticity of transportation emissions with respect to the carbon tax rate as roughly a 0.2% reduction in emissions per dollar increase in the carbon tax rate, with a long-run estimate of 0.9% per dollar (albeit not accounting for introduction effects). The semi-elasticity of aggregate emissions, however, cannot be accurately estimated due to the large uncertainty around the point estimate. The absence of a statistically significant effect on aggregate emissions can be reconciled with sector-specific reductions by the fact that some sectors (such as stationary combustion including manufacturing and heat production) in the taxed regions saw a (statistically insignificant) rise in emissions relative to control regions.

The identifying assumption needed in the present analysis is that there was no shock concurrent with the carbon tax that affected BC emissions relative to the control provinces conditional on the control variables. Naturally, it will never be possible to obtain perfect identification of the impact of carbon taxes on a large (country or province level) scale, though I would argue that the absence of perfect identification should not prohibitFootnote 3 the study of carbon pricing given its importance and prevalence as a policy tool to mitigate climate change. While perhaps a political surprise at the time, the carbon tax was not randomly assigned to the province of BC. However, if the introduction of a tax already reflects a higher willingness to reduce carbon emissions in the province, then policy-evaluation estimates yield an upward-biased estimate of the magnitude of the tax impact on emissions. Further, the results are robust to varying specifications of the control group.

Notably, the absence of a statistically significant effect of the introduction of the carbon tax on aggregate emissions does not imply there has been no effect. This particularly applies to the BC carbon tax, for which only a few years of data post-tax introduction exist, raising concerns about statistical power to detect an effect (though comparable studies such as Andersson 2019 rely on similarly short samples). However, given the praise carbon taxes have received in both literature and policy, the absence of a statistically detectable effect in aggregate emissions is sobering. Any tax-induced aggregate emission reductions do not appear to be detectable in the data and thus are far from climate policy targeting net-zero emissions by mid-century. Regardless of whether the absence of a statistically significant reduction stems from a short sample, low tax rate, or incomplete coverage, the resulting implication for climate change is the same: there has not been a substantial detectable change in aggregate emissions in response to the introduction of this tax.

Conventional difference-in-differences and synthetic control approaches test whether the tax (or comparable policy) was effective in reducing emissions, but do not tell us whether other (previously unknown) interventions had significant effects. I therefore introduce a break-detection approach using machine learning in fixed effects panel estimation that allows for the detection of treated units and treatment dates without prior knowledge of their occurrence. Break detection is commonly applied in time series econometrics to detect policy interventions (and recently also in regression discontinuity designs, see Porter and Yu 2015). However, time series applications generally do not consider control groups, making it difficult to identify causal effects. The approach proposed here presents an agnostic way of identifying shocks that affected individual units in the panel while using untreated units as controls. Intuitively, a standard difference-in-differences or synthetic control estimator will tell us whether a particular intervention mattered, while the approach proposed here tells us which (previously unknown) interventions were statistically significant and when they occurred. Specifically, I propose to use indicator saturation methods (Castle et al. 2015) to detect breaks in individual fixed effects in panels, which correspond to treatment interaction dummy variables. When applied to CO2 emissions, the introduction of the BC carbon tax is detected without prior knowledge as a structural break in transportation emissions; however, neither the tax nor carbon pricing in other provinces is detected in aggregate emissions. Instead, break detection identifies sudden changes in aggregate emissions in untaxed provinces, which can be attributed to the closure of emission-intense industries as well as efficiency improvements in electricity generation.

Overall, the absence of a tax effect on aggregate emissions and the lack of breaks coinciding with carbon-pricing policy interventions suggest that existing carbon prices are set too low to achieve significant aggregate emission reductions in the (albeit short) time frame since their introduction, consistent with evidence from cross-country carbon pricing and policy studies (see e.g. Metcalf and Stock 2020; Rafaty et al. 2020; Eskander and Fankhauser 2020). At current rates, a carbon tax alone is likely insufficient to induce substantial aggregate emission reductions and must be supplemented with complementary technology-pushing policies, ideally targeting sensitive intervention points (Farmer et al. 2019).

1.1 Background: The BC Carbon Tax and Carbon Pricing

British Columbia introduced a carbon tax on CO2 emissions in 2008 at $10 CAD per tonne emitted. While not randomly assigned, the introduction was unexpected as the government had not campaigned on the introduction of a carbon tax. Additionally, the tax was introduced rapidly: it was first proposed in February 2008 and implemented by July. The tax rate was initially set to rise by $5 CAD per year. This increase continued until 2012, after which the rate remained at $30 CAD until 2018, when it was raised again to $35 CAD per tonne of CO2 emitted. The tax is designed to be revenue-neutral, with tax revenue returned to the public through tax credits and a dividend, making it a popular scheme with the public (Harrison and Peet 2012).

BC was the only province in Canada operating under its own carbon tax scheme, though the Canadian government introduced a federal carbon tax in 2019 (which falls outside of the sample under analysis here), and two other provinces have implemented major formal measures to curb emissions that fall within the sample available hereFootnote 4: Alberta uses a carbon price for industrial emitters above an emissions threshold (introduced in 2007), and Quebec operates under a cap and trade system (introduced in 2013, which appears to have resulted in some emission reductions when measured at the plant level, see Hanoteau and Talbot 2019). These alternative schemes raise some questions about the optimal choice of control groups. Since the carbon pricing policies in Alberta and Quebec do not take the form of a carbon tax, the main results presented here include Alberta and Quebec in the control sample. Robustness checks omitting these provinces from the control group show that the results are unaffected by the inclusion of these two regions.

1.1.1 The Effectiveness of Carbon Taxes in Reducing Emissions

Analyses of the effectiveness of carbon taxes and prices have predominantly concentrated on metrics other than CO2 emissions themselves. Where studies have assessed the impact on emissions, they primarily concentrate on specific sub-sectors. Green (2021) provides a comprehensive meta analysis of carbon pricing on emissions, finding mixed evidence of their effectiveness. Metcalf and Stock (2020) study the effects of carbon pricing in the European Union finding marginal reductions in emissions. Rafaty et al. (2020) assess the effect of carbon prices (rather than the introduction of a tax) on emissions in a sample of OECD countries, finding that while carbon pricing reduced emissions growth, current carbon prices are too low to show substantial emission reductions. This matches the results obtained by Lin and Li (2011) who study the introduction of a carbon tax in countries in Northern Europe finding significant impacts on per capita growth in emission reductions in only a small subset of countries, and the findings by Best et al. (2020) who find some emission reduction in response to carbon pricing. Assessing a single sector on a local level, Davis and Kilian (2011) study the effects of US gasoline taxes on carbon emissions finding a significant tax effect on vehicle emissions. This effect of a significant reduction in transportation emissions is also identified by Andersson (2019) who studies the introduction of a carbon tax in Sweden using synthetic controls but does not consider aggregate emissions due to the many exemptions in emission-intense sectors.

1.1.2 The Effectiveness of the BC Carbon Tax Across Different Measures

Murray and Rivers (2015) provide an early review of the literature on the general impacts of the BC carbon tax, though the empirical papers surveyed therein do not formally test the effect of the tax on emissions. There exist, however, a range of papers studying the impacts on employment and fuel demand. Xiang and Lawley (2018) and Gulati and Gholami (2015) find significant reductions in residential and commercial natural gas demand following the introduction of the tax. Rivers and Schaufele (2015) as well as Lawley and Thivierge (2016) and Antweiler and Gulati (2016) show that the carbon tax led to a reduction in gasoline demand. Gittens (2018) provides evidence of the channel of this effect through incomplete pass-through of the tax. Saberian (2018) studies unintended consequences of the carbon tax, finding that the introduction of the tax led consumers to substitute to diesel cars, thus increasing local pollution. Yip (2018) investigates the impact on labour markets and finds a small increase in unemployment of less-educated workers. Yamazaki (2017) assesses the employment impacts, finding a positive effect, though Azevedo et al. (2018) also assess the employment impact of the tax, finding heterogeneous responses across industries but no effect on aggregate employment. These opposite effects can be reconciled by concerns about the parallel trend assumptions in Yamazaki (2017) raised by Azevedo et al. (2018). Comparing econometric estimates to those of a general equilibrium model, Carbone et al. (2019) find that employment shifted from more carbon-intense to less carbon-intense sectors.Footnote 5

An early report around the time of the introduction of the tax predicted a 5% reduction in aggregate emissions using numerical simulations (BC 2008), though the accuracy of this prediction has not been assessed. Elgie and McClay (2013) use data up until 2011 to compute means and construct a non-parametric difference-in-differences comparison without reporting standard errors. Their comparison finds a 9% reduction in per capita emissions following the introduction of the tax; however, due to the lack of standard errors these results do not provide a formal test.

Ahmadi and Yamazaki (2020) study the emission impacts of the BC carbon tax in the manufacturing sector at the plant level, finding marginal emission reductions in response to the tax. The authors also raise concerns about analysing the BC carbon tax using difference-in-differences since the introduction of the carbon tax coincides with the Great Recession. The concern is valid when not accounting for underlying economic conditions; however, when controlling for changes in GDP, as implemented here and in the wider literature (see Metcalf 2019; Lin and Li 2011; Rafaty et al. 2020), this is less of a concern. Shortly after the first release of the present paper, Metcalf (2019) also reported results on the impacts of the BC carbon tax using difference-in-differences, finding (marginally) significant aggregate emission reductions. However, these apparent reductions are not robust to the inclusion of dynamics in the model, choosing a subset of the available years, or controlling for the small number of individuals in the panel. A further discussion contrasting the results is provided in Sect. 3.1.2.

The present paper makes three contributions to the literature. First, using difference-in-differences and synthetic control methods I show that there is little evidence of aggregate reductions in CO2 emissions in response to the introduction of a carbon tax. These results are robust to the choice of the control group, the specification of the standard errors, randomisation inference, using spatially-gridded or province-level data, accounting for biases in dynamic panel estimates, or the use of synthetic control methods to address concerns about non-parallel trends. Second, I show that despite the lack of an aggregate effect, there are heterogeneous effects across sectors, with some (such as transportation) exhibiting significant, large reductions in emissions consistent with existing evidence on carbon taxes. In addition, I also provide sector-level estimates of the emission elasticities with respect to the tax rate. Third, I introduce an econometric break-detection method to assess the presence of treatment interventions, allowing for an agnostic approach to discover when interventions took place, and who was affected by these interventions. This identifies the closure of emission-intense mills combined with efficiency improvements in electricity generation as likely effective measures in reducing emissions in three provinces.

2 Data

I obtain annual time series on provincial aggregate and sectoral CO2 emissions from the National Inventory Report of Environment and Climate Change Canada (2018) from 1990 to 2016. Sector-level CO2 emissions are categorised using the IPCC (Intergovernmental Panel on Climate Change) classificationFootnote 6 and comprise stationary combustion sources including electricity generation; transportation including personal vehicles; fugitive emissions from coal, oil, and gas extraction; industrial processes including cement production; agriculture; and waste processing.

Provincial GDP (in constant 2007 dollars) and population data is available from Statistics Canada (2018) and included as socio-economic control variables to account for differences in population and economic development across provinces. Specifically, GDP allows for a control of differentiated impacts across provinces during the 2008/2009 recession.Footnote 7 Due to lack of available data, the Northwest Territories, Nunavut, and the Yukon are omitted from the sample, resulting in a (small) panel of N = 10 Canadian provinces over T = 27 years, one of which (BC) received the intervention of a carbon tax in 2008. The total regional sample spans 270 province-year observations. The annual regional emissions series are shown in percentage deviations relative to their 1990–2007 mean in Fig. 1 (for aggregate & transportation emissions—the single largest emitting sector in the taxed region) and for all available sectors in Fig. 2, together with the sectoral share of emissions over time for the taxed region. Summary statistics are shown in Table 1 for BC and provinces included in the control group.

Fig. 1

Percentage change in province-level CO2 emissions relative to the 1990–2007 average for aggregate emissions (left) and transportation emissions (right, the single largest emitting sector in the taxed region). The blue solid line corresponds to the average across included provinces other than British Columbia (BC), which is shown in red. The carbon tax is shown as a purple dashed line. Blue shading corresponds to the 95% confidence interval of the ‘rest of Canada’ average in each period. [Canadian provinces are abbreviated as follows: British Columbia (bc), Alberta (ab), Ontario (on), Newfoundland and Labrador (nl), Nova Scotia (ns), New Brunswick (nb), Saskatchewan (sk), Quebec (qc), Prince Edward Island (pe), Manitoba (mb).]

Fig. 2

Percentage change in province-level sector-specific CO2 emissions relative to the 1990–2007 average. Sector definitions correspond to IPCC emission categories. Blue solid line shows the average across included provinces other than British Columbia which is shown in red. Blue shading corresponds to the 95% confidence interval of the ‘rest of Canada’ average in each period. Lower right panel shows the share of emissions in the taxed province (BC) by sector over time

Table 1 Province-level summary statistics for taxed region (BC) and control provinces 1990–2016

Naturally, working with a single treated province and a small number of post-treatment observations raises concerns about inference and statistical power. In particular, there are concerns about inference based on difference-in-differences with a single treated region (Conley and Taber 2011). I use multiple approaches to address this in the difference-in-differences model, while also estimating the impact of the introduction of the carbon tax using synthetic control methods suitable for policy analysis with a single treated unit (Abadie and Gardeazabal 2003; Abadie et al. 2010). In the difference-in-differences setup, in addition to conventional estimates I report cluster-robust bootstrap standard errors (Cameron et al. 2008).Footnote 8 I also apply randomisation inference (see MacKinnon and Webb 2019), comparing the estimated tax coefficients to all possible coefficients when a hypothetical placebo tax is estimated at every other time point and province throughout the sample.

To increase the number of available cross-sectional units and to assess the impacts of the tax on local emissions, I use spatially-gridded emissions data available from EDGAR (2017) from 1970 until 2012. To reduce concerns about local measurement errors I aggregate the gridded emissions from the available 0.1-degree spatial resolution to 1 degree (approx. 111 km at the equator). This results in a total sample of 34,230 spatio-temporal observations spanning 815 grid-cells over 42 years, of which 4,515 observations fall within the taxed region. The grid-level difference in CO2 emissions is shown in Fig. 3. No annual GDP or population series are available on a gridded basis. Instead, I rely on lights-at-night as socio-economic controls in the gridded model (Zhang et al. 2016). Lights data are merged across 11 satellites by matching the means of overlapping years and averaged to 1-degree spatial resolution to match the emission data. Since my analysis focuses on Canadian provinces below 60°N, concerns about measurement above 60° (Feir et al. 2018) do not apply here.
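The aggregation from 0.1-degree to 1-degree cells described above can be sketched as a block sum over 10 × 10 groups of fine cells. This is a minimal illustration on a hypothetical random tile, not the EDGAR processing used for the paper; the function name `aggregate_grid` is invented here, and the sketch assumes each fine cell holds a total quantity (for fluxes per unit area, an area-weighted mean would be needed instead).

```python
import numpy as np

def aggregate_grid(fine, factor=10):
    """Sum a (lat, lon) array of fine cells into coarse cells of factor x factor."""
    n_lat, n_lon = fine.shape
    if n_lat % factor or n_lon % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    # Split each axis into (coarse index, position within coarse cell)
    blocks = fine.reshape(n_lat // factor, factor, n_lon // factor, factor)
    return blocks.sum(axis=(1, 3))

# Hypothetical 2 x 3 degree tile at 0.1-degree resolution (values are random)
rng = np.random.default_rng(0)
fine = rng.random((20, 30))
coarse = aggregate_grid(fine)  # one value per 1-degree cell
```

Summing preserves the tile total, so the coarse grid can be checked against the fine grid before it is merged with the lights-at-night controls.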

Fig. 3

Change in CO2 emissions at a 1-degree spatial grid-level between the post-tax (2008–2012) and the pre-tax (1970–2007) average. Top panel shows the change in aggregate CO2 emissions, lower panel shows the change in transportation CO2 emissions. British Columbia (BC) which saw the introduction of a carbon tax in 2008 is shown with gray-dashed outline

Emissions are measured imprecisely, which may result in substantial measurement errors in the dependent variable. Nevertheless, I do not expect measurement errors to present a major obstacle in the present analysis. First, measurement error in the dependent variable does not bias the coefficient estimates (though it may amplify power concerns by increasing the noise). Second, I conduct the analysis using two different datasets to support the overall results: the first stems from the national inventory reports for Canada, and the second is the spatially-resolved EDGAR emission estimates, which use a standardised methodology across different countries. Even though EDGAR uses national records, the emission estimates are standardised across countries, which may reduce some of the measurement error from national inventories. While some measurement error is unavoidable, an analysis of emissions rather than other measures of fossil fuel use is warranted since compliance with climate agreements (e.g. the Paris Agreement) is primarily assessed based on emissions reported in national inventories.

3 Estimating the Emissions Impact of the BC Carbon Tax

3.1 Difference-in-Differences

To study the impact of the carbon tax on CO2 emissions I estimate the difference-in-differences panel fixed effects model (1) and test the coefficient \(\beta\) of the variable of interest, \(Ta{x}_{i,t}\), which is a dummy variable equal to one for BC following the introduction of the carbon tax in British Columbia (in 2008), and zero otherwise. The per-tonne carbon tax has increased since its introduction; thus the single dummy variable captures the average impact of the introduction of the carbon tax. Section 3.1.4 also considers the semi-elasticity of emissions with respect to the tax by specifying \(Ta{x}_{i,t}\) in (1) as the tax rate per tonne emitted. Modelling the incremental price increase using multiple intervention dummies, while feasible, would face even lower power than the present analysis due to the short time span of available observations. Additional control variables in (1) include population, real GDP, and lagged emissions (to address potential autocorrelation), together with year fixed effects common across all provinces \({\lambda }_{t}\) and province fixed effects \({\mu }_{i}\):

$$\log(Emiss)_{i,t}=\lambda_{t}+\mu_{i}+\beta\,Tax_{i,t}+\alpha_{1}\log(Emiss)_{i,t-1}+\gamma_{1}\log(Pop)_{i,t}+\gamma_{2}\log(GDP)_{i,t}+\epsilon_{i,t} \tag{1}$$

Due to the inclusion of lagged emissions, the parameter \(\beta\) captures the contemporaneous impact of the tax. To compute the estimated long-run effect of the tax I consider the equilibrium tax effect given by \({\beta }_{Eqbm}=\beta /(1-{\alpha }_{1})\). Standard errors of the estimate of \({\beta }_{Eqbm}\) can be obtained using the delta method. The inclusion of an autoregressive term models persistence in emissions and does not force the tax to have an immediate effect on the level of the dependent variable. Formal tests for autoregressive dynamics support the inclusion of one lag in (1): diagnostic tests suggest little evidence of residual autocorrelation, and longer lags are not statistically significant in most specifications and do not affect the overall results (see Appendix A). Additionally, the dynamic specification in (1) allows for tax effects on the growth rate instead of the level: if the tax predominantly affects the growth rate rather than the level, we expect the coefficient on lagged emissions to be close to one (see Appendix A for additional discussion).

Difference-in-differences relies on parallel trends in the treatment and control groups. While parallel trends are not formally testable, it is common to estimate the model (1) including leads of the treatment dummy to assess whether parallel trends are rejected in the pre-treatment periods. In the present context with a single treated province, this amounts to a test of a single-period indicator (a zero–one dummy variable for a single period and province) in the pre-treatment periods. Single indicators can be estimated without bias; however, the estimates will not be consistent (Hendry and Santos 2005). Visual inspection of the data suggests no obvious pre-trends (Fig. 1). Formal tests for pre-trends based on leads of the tax treatment potentially identify one period ahead of the tax as an outlier, but there is no obvious evidence of overall pre-trends (Appendix B). For robustness, Sect. 3.2 reports the results when using synthetic control methods (where results are consistent with findings when using difference-in-differences), and Sect. 3.3 considers break detection using an estimator robust to outliers (Fig. 4).
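The equilibrium effect \(\beta /(1-{\alpha }_{1})\) and its delta-method standard error can be sketched as follows. The function name `long_run_effect`, the coefficient values, and the covariance matrix are hypothetical placeholders chosen for illustration; they are not estimates from this paper.

```python
import numpy as np

def long_run_effect(beta, alpha, cov):
    """Return beta / (1 - alpha) and its delta-method standard error.

    cov is the 2x2 covariance matrix of the estimates (beta_hat, alpha_hat).
    """
    lr = beta / (1.0 - alpha)
    # Gradient of beta / (1 - alpha) with respect to (beta, alpha)
    grad = np.array([1.0 / (1.0 - alpha), beta / (1.0 - alpha) ** 2])
    se = float(np.sqrt(grad @ cov @ grad))
    return lr, se

# Hypothetical inputs (illustrative only)
beta_hat, alpha_hat = -0.05, 0.75
cov_hat = np.array([[1e-4, 0.0],
                    [0.0, 4e-4]])
lr, se = long_run_effect(beta_hat, alpha_hat, cov_hat)
```

Because the gradient grows as \({\alpha }_{1}\) approaches one, long-run estimates become very imprecise when emissions are highly persistent, which is one reason to report both the short-run and the equilibrium effects.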

Fig. 4

Top panel: estimation results—immediate impact estimates (\(\widehat{\beta }\) in Eq. (1)) expressed in exact percentage change (standard errors clustered at province level transformed using the delta method). Lower panel: share of emissions by sector in 2016

Earlier empirical evidence suggests a significant decline in gasoline and residential natural gas demand in response to the introduction of the carbon tax. To assess the emission impact at the sectoral level, beyond aggregate effects, I estimate model (1) for each sector defined by the IPCC using disaggregated CO2 emissions. All panel models are estimated using the plm and lfe packages in R (Croissant and Millo 2008; Gaure 2013).
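As a rough illustration of the estimator behind model (1), and not the plm or lfe code used for the results, the following Python sketch fits a dynamic two-way fixed effects regression by least squares with province and year dummies on simulated data. All parameter values are hypothetical, the population and GDP controls are omitted for brevity, and unit 0 merely plays the role of the treated province.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, t0 = 10, 27, 18          # 10 provinces, 27 years, tax from year index 18
beta, alpha = -0.05, 0.7       # hypothetical "true" parameters for the simulation

mu = rng.normal(0.0, 1.0, N)   # province fixed effects
lam = rng.normal(0.0, 0.1, T)  # year fixed effects
tax = np.zeros((N, T))
tax[0, t0:] = 1.0              # unit 0 plays the role of the treated province

# Simulate log emissions from the dynamic model
y = np.zeros((N, T))
y[:, 0] = mu / (1.0 - alpha)
for t in range(1, T):
    noise = rng.normal(0.0, 0.01, N)
    y[:, t] = mu + lam[t] + beta * tax[:, t] + alpha * y[:, t - 1] + noise

# Stack observations for t = 1..T-1 (one year is lost to the lag)
dep = y[:, 1:].ravel()
lag = y[:, :-1].ravel()
treat = tax[:, 1:].ravel()
unit = np.repeat(np.arange(N), T - 1)
year = np.tile(np.arange(1, T), N)

D_unit = np.eye(N)[unit]          # one dummy per province
D_year = np.eye(T)[year][:, 2:]   # year dummies, first usable year omitted

X = np.column_stack([treat, lag, D_unit, D_year])
coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
beta_hat, alpha_hat = coef[0], coef[1]
long_run_hat = beta_hat / (1.0 - alpha_hat)
```

The dummy-variable form is numerically equivalent to a within transformation for a panel this small; it is used here only because it keeps the design matrix explicit.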

3.1.1 Aggregate Difference-in-Differences Estimation Results

The effect of the introduction of the BC carbon tax is imprecisely estimated, and the tax does not appear to have led to a detectable statistically significant reduction in aggregate CO2 emissions. While the point estimate of the effect of the introduction of the carbon tax on log aggregate CO2 emissions is negative (− 0.009, se = 0.008, corresponding to a − 0.9% change, with a long-run estimate of − 5% with se = 4%),Footnote 9 the coefficient is not statistically different from zero (Table 2, left column and Fig. 4). These results are unaffected by the exclusion of provinces that themselves have carbon-pricing schemes from the control group (Alberta and Quebec, see Table A1 in Appendix C), as well as by accounting for potential biases in dynamic panel estimates (Nickell 1981); I report Arellano-Bond (1991) GMM estimates of the dynamic panel in Appendix A.
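For reference, the conversion from a coefficient on a dummy variable in a log-linear model to an exact percentage change, with a first-order delta-method standard error, can be written as below. This is the standard transformation and is assumed here to be the one underlying the percentages reported; applied to the reported coefficient of − 0.009 with se = 0.008 it gives a change of roughly − 0.9%.

$$\%\Delta = 100\left(e^{\widehat{\beta }}-1\right),\qquad se\left(\%\Delta \right)\approx 100\,e^{\widehat{\beta }}\,se(\widehat{\beta }),\qquad 100\left(e^{-0.009}-1\right)\approx -0.90,\qquad 100\,e^{-0.009}\times 0.008\approx 0.79$$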

Table 2 The effect of the carbon tax on CO2 emissions (difference-in-differences estimation)

The main results of the difference-in-differences estimation in Table 2 (and Fig. 4) are reported using standard errors clustered at the province level. Even though there are concerns with clustered standard errors when the number of clusters is small, the results here are robust to specifying conventional, clustered, or cluster-robust bootstrap standard errors (Cameron et al. 2008; see Appendix D). To further alleviate concerns about the construction of standard errors with few treated units, I also use randomisation inference (see e.g. MacKinnon and Webb 2019; Buchmueller et al. 2011 on permutation tests). I estimate the above model (1) where the tax variable is modified to consider every time period in every control province as a placebo treatment. This results in 270 estimates of treatment coefficients and their t-statistics. I then compare the estimate of the actual BC carbon tax treatment effect (coefficient and t-statistic) to the distribution of estimated placebo tax coefficients and t-statistics and compute the resulting p-values. The results are robust to using randomisation inference (see Fig. 5). Comparing the actual carbon tax treatment estimates to the placebo coefficients and t-statistics shows that the estimated aggregate tax effect and resulting t-statistics do not appear uncommon when compared to all placebo treatments. In other words, estimating a hypothetical treatment on provinces that were not subject to the tax results in estimated coefficients that commonly show a stronger response to the placebo treatment than the actual estimated treatment effect.
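The comparison step of the randomisation inference described above can be sketched as follows. The placebo statistics would in practice come from re-estimating model (1) once per placebo treatment; here they are replaced by hypothetical random draws, and the two-sided p-value based on absolute values is one common convention assumed for illustration rather than the exact rule used for Fig. 5.

```python
import numpy as np

def randomisation_p_value(actual, placebos):
    """Share of placebo statistics at least as extreme (in absolute value) as the actual one."""
    placebos = np.asarray(placebos, dtype=float)
    return float(np.mean(np.abs(placebos) >= abs(actual)))

# Hypothetical placebo t-statistics standing in for the re-estimated models
rng = np.random.default_rng(2)
placebo_t = rng.standard_normal(270)

p_weak = randomisation_p_value(-1.1, placebo_t)    # an unremarkable actual t-statistic
p_strong = randomisation_p_value(-4.0, placebo_t)  # an extreme actual t-statistic
```

A large p-value means the actual estimate is typical of the placebo distribution, which is the pattern reported for aggregate emissions; a small one corresponds to an estimate in the tails, as reported for transportation.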

Fig. 5

Randomisation Inference—top panel shows results for aggregate emissions. Bottom panel shows results for transportation emissions. Left panel shows the distribution of placebo treatment estimates together with the actual treatment estimate (red). Right panel shows the distribution of placebo t-statistics, together with the actual treatment t-statistic. Dashed lines denote the 2.5th and 97.5th percentiles. Dotted lines denote the 5th and 95th percentiles. (Color figure online)

3.1.2 Reconciling the Results with Metcalf (2019)

Metcalf (2019) estimates a static model of log CO2 emissions in a difference-in-differences fixed effects panel and finds marginally significant reductions in emissions as a result of the tax. These results, however, are not robust and can be directly reconciled with the insignificant findings here: they rely on omitting controls for serial correlation in the data and not accounting for the small number of clusters when constructing standard errors. Introducing either of these adjustments results in insignificant effects of the tax on aggregate emissions (see Appendix E).

3.1.3 Sectoral Analysis

How can the absence of a tax effect on aggregate emissions be reconciled with existing evidence of a significant decline in BC gasoline demand? To answer this question, I turn to sector-level emission data, which disaggregate CO2 emissions into transportation (Transp.); stationary combustion sources (statEnergy); fugitive emissions from coal, oil and gas extraction (Fugit.); industrial processes (Indust); agriculture (Agric.); and waste processing (Waste). Emission series are shown together in Fig. 2. Fugitive emissions (which only make up a small fraction of overall emissions, approx. 3% in 2016) here act as a placebo test as they are entirely exempt from the tax. Greenhouse gas operations in agriculture are exempt, as well as non-fossil fuel greenhouse gas emissions, such as industrial processes and landfills; thus we expect small effects (if any at all) in the agricultural and industry sectors.

A formal test for poolability confirms the heterogeneity of coefficients across sectors. I estimate a panel across sectors including sector-fixed effects and sector-year fixed effects together with interaction terms of the carbon tax (as well as each socio-economic control) with 6 sector dummy variables. An F-test for the joint significance of interaction terms strongly rejects the null hypothesis of the model being poolable across sectors (p < 0.001). This supports the analysis on a sector-by-sector basis. In particular, not only does the tax effect vary across sectors, but so do the relationships between sector-specific emissions and socio-economic controls. I therefore estimate the difference-in-differences model (1) for each sector, with the results reported in Table 2 (right columns) and Fig. 4. Notably, population and GDP are insignificant in many of the sectors, suggesting a potential de-coupling of emissions from population and economic trends in Canada, in contrast to results found in global studies (see e.g. Best et al. 2020).

There is evidence that the introduction of the tax has led to significant reductions in emissions in at least one out of the 6 sectors. Specifically, there is a significant decline in emissions from transportation (including vehicle traffic), the single largest source of BC emissions (making up roughly 51% of emissions in 2016, see Fig. 4 lower panel). The estimated effect of the carbon tax on transportation emissions is −5% (se = 1%) in the short run, with an estimated equilibrium effect of −19% (se = 3%).Footnote 10 The significant reductions in transportation emissions are further supported when using randomisation inference instead of conventional inference (see Fig. 5): the estimated tax impact in BC lies below the majority of all possible placebo treatment estimates.
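
As a rough consistency check on the short-run and equilibrium estimates, and assuming that model (1) contains a single lagged dependent variable with coefficient \(\alpha_1\) as in Eq. (2), the equilibrium effect is the short-run coefficient scaled by the standard long-run multiplier of a first-order autoregressive model. The implied persistence below is back-of-the-envelope arithmetic from the rounded estimates, not a reported estimate:

$$\beta_{LR}=\frac{\beta}{1-\alpha_{1}}\quad\Rightarrow\quad\alpha_{1}=1-\frac{\beta}{\beta_{LR}}\approx 1-\frac{-0.05}{-0.19}\approx 0.74$$

A persistence of this order would mean that roughly a quarter of the eventual adjustment occurs in the first year.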

The magnitude of the transportation emission reductions in BC is comparable to the 6–9% reduction in transportation emissions in Sweden following the introduction of a carbon tax (Andersson 2019). The magnitude of this effect is perhaps surprising given that the Swedish carbon tax (at roughly $130 USD) is more than four times the current BC carbon tax at $35 CAD, and was close to double the BC rate for the first years following its introduction. This may suggest that the emission semi-elasticity with respect to carbon taxes is higher in BC than in Sweden, though a formal comparison cannot be made as Andersson (2019) does not provide estimates of the emission elasticity with respect to the tax (for BC emission semi-elasticities see Sect. 3.1.4).

The difference-in-differences estimation also suggests a reduction in agricultural emissions; however, these impacts are more uncertain and not robust across methods. Specifically, the reduction in agricultural emissions is not statistically significant when using bootstrap standard errors (see Appendix D), when using randomisation inference (see Appendix F), or when removing provinces that operate under carbon prices from the control group (see Appendix C). The point estimate of the carbon tax effect in industry is negative, consistent with the plant-level findings of Ahmadi and Yamazaki (2020), but imprecisely estimated here at the province level.

Why are the apparent sectoral reductions in transportation (and perhaps agriculture) not detectable in aggregate emissions? First, some sectors (such as stationary combustion) have seen an increase in emissions (albeit not statistically significant) relative to the control provinces. This is also apparent in the synthetic control analysis which shows an increase in emissions of observed relative to synthetic BC. Thus, aggregate emissions appear unchanged relative to the control group. Second, the sample of treated post-tax observations is small. There are only 9 years of emissions data available following the introduction of the tax, thus any aggregate effect may simply be drowned out by the variability of emissions.

3.1.4 The Semi-elasticity of Emissions with Respect to the Tax Rate

The above difference-in-differences specification models the introduction of the tax using a single dummy variable. Here I report estimates of the response of emissions to the rate of the tax by estimating model (1) using the tax rate per tonne emitted. Consistent with above results, the uncertainty around aggregate emissions is too high to confidently provide an estimate of the semi-elasticity of aggregate emissions with respect to the tax. Notably, the estimation results here do not differentiate between an introduction effect and a price effect (see Rafaty et al. 2020).

The emission semi-elasticity (combining introduction and price effects, see Rafaty et al. 2020) for transportation emissions is statistically significant and estimated at approximately a 0.2% (se = 0.1%) reduction in emissions per dollar increase in the tax rate (Table 3), with a long-run estimate of −0.9% per dollar. This estimate captures the net effect of both the introduction of the tax and the marginal increases of the tax rate on emissions themselves, implicitly accounting for the pass-through of the tax (such as through gasoline prices in the case of transport) and subsequently the response of those emitting CO2 (such as consumers purchasing gasoline). Using the estimates in Table 3 allows us to quantify the net impact of the tax on emissions. For example, the $5 increase in the per-tonne carbon tax in 2018 corresponds to an approximate $0.02 increase in the retail gasoline price (from 6.7 to 8.9 cents per litre). Using the estimates from Lawley and Thivierge (2016),Footnote 11 this should result in a predicted 3.2% reduction in gasoline demand, which is of a similar order of magnitude as the emission impact implied by the estimates in Table 3: a 1% ($5 × 0.002) contemporaneous reduction in transportation emissions, with a long-run reduction of approximately 4%.Footnote 12 The estimated long-run impact on transportation emissions is thus close to the estimated direct impact on gasoline fuel consumption.
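
The arithmetic behind this worked example can be reproduced in a few lines. The following is a minimal Python sketch that uses only the rounded semi-elasticities quoted in the paragraph above; the function and variable names are illustrative and do not come from the paper.

```python
# Back-of-the-envelope check of the worked example in the text.
# Inputs are the rounded semi-elasticities quoted above (Table 3 is not reproduced here).
SHORT_RUN_PER_DOLLAR = -0.002  # -0.2% per $1/tonne, contemporaneous
LONG_RUN_PER_DOLLAR = -0.009   # -0.9% per $1/tonne, long run


def predicted_change(tax_increase, semi_elasticity):
    """Approximate proportional change in emissions for a tax change in $/tonne."""
    return tax_increase * semi_elasticity


short_run = predicted_change(5.0, SHORT_RUN_PER_DOLLAR)
long_run = predicted_change(5.0, LONG_RUN_PER_DOLLAR)
print(f"short run: {short_run:.1%}, long run: {long_run:.1%}")
```

With the rounded inputs the long-run figure comes out at about 4.5%, in line with the approximate 4% quoted in the text; the small gap presumably reflects rounding of the reported coefficients.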

Table 3 Emission elasticity with respect to the carbon tax (using the tax rate)

3.1.5 Local Emissions Using Spatially-Gridded Data

Province-level aggregation of emissions results in a small number of observations in the sample. To increase the number of available cross-sectional units and also identify local impacts I use spatially-gridded emissions data. Unfortunately, no consistent spatially-gridded annual GDP or population controls exist for Canada for the time period matching the emission series. However, lights at night data are available and have been used in past geo-spatial analyses as proxies for local population and economic activity (see e.g. Donaldson and Storeygard 2016). Thus, I estimate panel models (1) using gridded emissions with lights at night as a socio-economic control, including lagged autoregressive terms, grid-cell and year-fixed effects, as well as the tax intervention. Standard errors are clustered at the province (treatment) level (using gridded observations results in a large number of individual grid cells in each cluster). As an additional robustness check (and given the estimation results strongly supporting growth effects) I also report the regression results when modelling the first difference of log emissions.

To assess locally-varying impacts of the tax, I interact the tax intervention dummy with a full set of individual fixed effects for the treatment region (each grid-cell within BC is allowed to deviate from the common tax coefficient). I assess both individual significance of the interaction terms (to identify regional variation in tax impacts) as well as overall significance of the global hypothesis of no tax impact using the Simes (1986) modified Bonferroni correction for multiple testing. Future work may explore a spatial regression discontinuity design across the borders of the taxed region, though region-specific challenges (such as the Rocky Mountains) and low population density near the borders may reduce the informativeness of this approach.
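
The global test mentioned above can be illustrated with a minimal Python sketch of the standard Simes (1986) procedure, which rejects the global null if any ordered p-value satisfies p_(i) ≤ iα/n. The p-values below are hypothetical and are not taken from the gridded estimates; the function name is an assumption made for illustration.

```python
def simes_global_test(p_values, alpha=0.05):
    """Simes (1986) test of the global null that all individual nulls hold.

    Rejects if p_(i) <= i * alpha / n for at least one ordered p-value p_(i).
    Returns (reject, simes_p) with simes_p = min_i n * p_(i) / i.
    """
    ordered = sorted(p_values)
    n = len(ordered)
    simes_p = min(n * p / rank for rank, p in enumerate(ordered, start=1))
    return simes_p <= alpha, simes_p


# Hypothetical grid-cell p-values (not results from the paper).
example = [0.20, 0.012, 0.018, 0.60, 0.45]
reject, simes_p = simes_global_test(example)
bonferroni_p = min(1.0, len(example) * min(example))
print(reject, simes_p, bonferroni_p)
```

In this example the plain Bonferroni-adjusted p-value exceeds 0.05 while the Simes statistic does not, which shows why the Simes modification is the less conservative of the two.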

Using spatially-gridded data confirms the absence of a significant tax effect on aggregate emissions. The coefficient on the tax intervention is not statistically different from zero in the gridded model (Table 4) of aggregate emissions while controlling for local socio-economic development using satellite-derived lights-at-night observations. Nevertheless, there is some evidence of a regional response to the tax. The tax effect is statistically significant and negative for aggregate emissions for some regions (Fig. 6) and we reject the global null hypothesis of no effect when correcting the grid-based tests for multiple testing (using Simes 1986). Some regions systematically show a decline in emissions relative to others when allowing the effect of the tax to vary over space; a notable cluster is in the North East of the province, which constitutes a large oil and gas field. The introduction of the carbon tax coincides with the 'Great Recession', so the estimated reduction in emissions following the tax may reflect a downturn in a particular emission-intense industry in the treated region that is independent of the tax. Using spatially-disaggregated transportation emissions also confirms the findings obtained with aggregate data. The tax has led to a significant reduction in transportation emissions, with an estimated immediate reduction of 1.3%. In levels, this effect is notably smaller than the estimate obtained when using province-level data (Sect. 3.1.1). This difference in magnitude may be driven by the different temporal coverage of the samples: the spatially-gridded sample only spans up until 2012 compared to 2016 for the province-level data. Additionally, the autoregressive coefficient in the spatially-gridded panel is close to one, supporting an analysis in first differences, and thus perhaps capturing an effect on the growth rate of transportation emissions.

Table 4 Gridded Emissions Estimates when estimating model (1) using spatially-gridded aggregate emissions (left) and transportation emissions (right)
Fig. 6

Individual Significance of Tax Effect: local grid-cell specific t-statistic of the tax within British Columbia when estimating model (1) using spatially-gridded observations (from Sect. 3.1.5) and grid-specific tax coefficients. Aggregate emissions are shown in the left panel, transportation emissions in the right panel. Blue denotes a significant negative tax effect on emissions, red a significant positive tax effect on emissions. Light blue and light red/pink denote insignificant negative/positive point estimates

3.2 Synthetic Control

To further alleviate concerns about difference-in-differences with a single treated unit, I investigate the effect of the introduction of the carbon tax using synthetic controls (Abadie and Gardeazabal 2003; Abadie et al. 2010). This approach is suitable for a single treated region and relaxes the (difficult to verify) parallel trend assumption underlying difference-in-differences by constructing a synthetic (estimated) version of the treated region, and subsequently comparing the synthetic (untaxed) region to the observed region under the tax. The approach is appealing in the case of a single treated region, but the small number of control regions means that inference in synthetic control is less straightforward. I apply the permutation test (Abadie et al. 2010) where each control region is treated as a placebo and assigned a placebo intervention in 2008, when the carbon tax was introduced. Subsequently I compare the difference between observed BC and synthetic BC to the difference in every placebo region using the ratio of pre- and post-treatment mean square prediction errors. Nevertheless, the construction of p-values based on the observed deviations of the placebo-permutation study here is only marginally informative as only 9 untreated regions are available. In other words, the difference between observed and synthetic BC can only be compared to at most 9 placebo regions. Synthetic control is implemented here using the Synth package in R (Abadie et al. 2011).
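
The limited informativeness of the permutation test with few placebo regions can be seen in a minimal Python sketch. It assumes the ratio is formed as post-treatment over pre-treatment mean squared prediction error and that the treated region is included in the ranking; the gaps are hypothetical numbers, and the sketch is not the Synth implementation used in the paper.

```python
def mspe(gaps):
    """Mean squared prediction error of a list of gaps (observed minus synthetic)."""
    return sum(g * g for g in gaps) / len(gaps)


def mspe_ratio(gaps, n_pre):
    """Post-treatment MSPE divided by pre-treatment MSPE for one region."""
    return mspe(gaps[n_pre:]) / mspe(gaps[:n_pre])


def permutation_p_value(ratios, treated):
    """Share of regions whose ratio is at least as large as the treated region's."""
    treated_ratio = ratios[treated]
    at_least = sum(1 for r in ratios.values() if r >= treated_ratio)
    return at_least / len(ratios)


# Hypothetical gaps: 3 pre-treatment and 2 post-treatment years per region.
gaps = {
    "treated": [0.01, -0.01, 0.01, -0.05, -0.06],
    "placebo_a": [0.02, -0.02, 0.01, 0.02, -0.01],
    "placebo_b": [0.01, 0.01, -0.02, 0.01, 0.02],
}
ratios = {name: mspe_ratio(g, n_pre=3) for name, g in gaps.items()}
p_value = permutation_p_value(ratios, "treated")
print(ratios, p_value)
```

Under this convention the smallest attainable p-value is one over the number of regions, so with BC and 9 placebo regions it cannot fall below 0.1.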

Synthetic control results using province-level data are consistent with the estimates from the analysis using panel difference-in-differences (Fig. 7 and Appendix G, which also reports the weights on control provinces). There is little noticeable difference in aggregate emissions between BC and synthetic BC. The gap in emissions between synthetic and observed BC ranges from −3% to +3% (Fig. 7, top panel). Even though there are only a small number of control provinces and therefore the permutation test is of limited use, the results also support the absence of effects on aggregate emissions (Appendix G). The synthetic control analysis also confirms sector-level results of the difference-in-differences estimation. Transportation emissions in observed BC are around 3–15% lower than synthetic BC (Fig. 7, lower panel) with an average effect of −5% over the treated period. The apparent re-bound towards the end of the sample could be driven by the absence of a planned increase in the carbon tax from 2012 onwards. The original announcement of the tax aimed for a steady increase of the tax by $5 per tonne of CO2 per year; however, the tax was instead held constant from 2012 to 2018. Synthetic control also shows that agriculture is not well-approximated by the empirical model (see Appendix G), similar to difference-in-differences, as evidenced by the high standard error when using bootstrap cluster-robust errors.

Fig. 7

Synthetic control analysis of the BC carbon tax for aggregate (top) and transport (bottom) emissions. Left panels show synthetic and observed log CO2 emissions in BC, middle panels show the difference (gap) between log emissions of synthetic and observed BC. Right panels show the gap for the permutation across all control regions

3.3 Break-Detection to Assess Policy Impacts

An alternative approach to imposing the policy intervention a-priori as in conventional difference-in-differences or synthetic control is to ask the data-driven question: have there been sudden changes in emissions, and if so, do these align with policy interventions? I propose to use break-detection methods to discover level-shifts in individual fixed effects \({\mu }_{i}\) in a panel model to identify treatment time and treated units (Castle et al. 2015). Such level shifts are equivalent to treatment dummy variables in a difference-in-differences specification where the period before the shift is the pre-treatment period, the period after the shift is the post-treatment period, and the unit for which the shift has been detected corresponds to a treated unit. The retained step-shifts identify unit-specific heterogeneous treatment effects (see Pretis and Schwarz 2022 for more detail). Thus, this method allows treatment dates and units to be identified without prior knowledge of treatment timing or assignment.

While standard difference-in-differences approaches act as a test of whether a known intervention had an effect, using break detection in a similar context allows us to identify what mattered, and whether any of the detected effects coincide with known interventions (such as the introduction of a carbon tax). Thus, difference-in-differences answers the question of whether a known policy mattered, while break detection answers the question of what mattered (which might have been unknown previously). The challenge is then to attribute the detected breaks to interventions that occurred during the identified time periods.

Detecting breaks with subsequent policy (or event) attribution has become common practice in time series analysis (and recently in regression discontinuity designs, see e.g. Porter and Yu 2015, or Hansen 2017); however, it has not been used in panels to detect treatment. In time series, for example, in early work Perron (1989) detects breaks in GNP time series attributable to the Great Depression and an oil price shock. More recently, Hendry (2018) employs a break detection approach in a single time series to detect policy interventions in UK CO2 emissions, Pretis et al. (2015) use break detection to evaluate the accuracy of climate models, and Mukanjari and Sterner (2018) use break detection as a modified univariate event-study method. Williams and Reade (2016) detect jumps in Twitter data aligned with news-worthy events, and Ericsson (2017) uses breaks to identify periods of varying forecast accuracy in forecasts of the US debt. Estrada et al. (2013) employ break detection to assess whether the Montreal Protocol significantly affected CFC emissions and subsequently temperatures, and Schneider et al. (2017) as well as Pretis et al. (2016), and Pretis (2020) use break detection to identify volcanic eruptions in temperature records. Focusing on policy interventions, Apergis and Lau (2015) assess whether breaks in Australian electricity markets can be attributed to the introduction of climate policy. In a micro-econometric setting, Piehl et al. (2003) use break detection to evaluate policy on youth homicide reduction. However, these existing time series applications of break detection to evaluate policy (and interventions in general) do not have control groups, thus making a causal interpretation difficult.

Here I generalise break detection to a fixed-effects panel setting. This is comparable to a difference-in-differences event study where every time period is considered a potential event and every unit is considered as potentially receiving treatment while un-treated units without breaks act as controls. The approach is also comparable to the regression discontinuity design in Porter and Yu (2015) where every level of the running variable may provide a discontinuity which is unknown a-priori. In follow-up work, we further show the equivalence between structural breaks and heterogeneous treatment effects in Pretis and Schwarz (2022) with applications to EU transport emissions in Koch et al. (2022).

3.3.1 Break Detection Approach to Detecting Treatment

The starting point for the proposed approach here is to specify a general (fully-saturated) model allowing for a treatment dummy for every time period and every unit. This corresponds to a model with more variables than observations, which can be reduced to a specific model using variable selection algorithms from machine learning (for example general-to-specific selection or shrinkage based estimators such as the LASSO, see Tibshirani 1996). The general model is given by Eq. (2):

$$\log(\mathrm{Emiss})_{i,t}=\lambda_{t}+\mu_{i}+\beta\,\mathrm{Tax}_{i,t}+\alpha_{1}\log(\mathrm{Emiss})_{i,t-1}+\gamma_{1}\log(\mathrm{Pop})_{i,t}+\gamma_{2}\log(\mathrm{GDP})_{i,t}+\sum_{j=2}^{T}\delta_{i,j}1_{\{t\ge j\}}+\epsilon_{i,t} \tag{2}$$

where \(1_{\{t\ge j\}}\) denotes an indicator function for the interaction of each t and i, in other words the difference-in-differences estimator for each time period and unit. This allows for a step-function (capturing treatment) at time t = j for any unit i with heterogeneous treatment magnitude given by the estimated coefficients \(\delta_{i,j}\) at any point in time. As many units are likely 'untreated', the coefficient matrix of all \(\delta_{i,j}\) will be sparse (zero coefficients for most units and time periods) and we can rely on machine learning model selection methods for sparse models with more variables than observations. Here I use indicator saturation methods (Castle et al. 2015; implemented in R using the 'gets' package, see Pretis et al. 2018a, b), though the LASSO and related estimators are also feasible.
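
To make the saturating set of step indicators concrete, the following minimal Python sketch builds the indicators \(1_{\{t\ge j\}}\) for every unit and every candidate break date j = 2, …, T. It only illustrates the design matrix, not the selection algorithm; the panel dimensions and the function name are hypothetical.

```python
def step_indicators(n_units, n_periods):
    """Build the saturating set of step indicators 1{t >= j} for each unit.

    Rows are unit-period observations (i, t) with t = 1, ..., T.
    Columns are (unit, break date) pairs with break dates j = 2, ..., T.
    """
    columns = [(i, j) for i in range(n_units) for j in range(2, n_periods + 1)]
    rows = [(i, t) for i in range(n_units) for t in range(1, n_periods + 1)]
    matrix = [
        [1 if (i == unit and t >= j) else 0 for (unit, j) in columns]
        for (i, t) in rows
    ]
    return rows, columns, matrix


# Tiny illustrative panel (hypothetical sizes, chosen only for readability).
rows, columns, matrix = step_indicators(n_units=2, n_periods=4)
print(len(rows), "observations,", len(columns), "candidate step indicators")
for (i, t), line in zip(rows, matrix):
    print(f"unit {i}, t={t}:", line)
```

The N(T−1) step indicators, together with the N unit fixed effects, the time effects, and the controls, exceed the NT available observations, which is why a selection step is needed rather than a single least-squares fit.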

Using a general-to-specific indicator saturation approach, the general model (2) is reduced to a specific one (removing treatment dummy variables) using automated model selection at a chosen target significance level p, which determines the expected number of false positives (controlling for multiple testing), given by \(pN(T-1)\), where \(NT\) denotes the panel sample size in a balanced panel.Footnote 13 Johansen and Nielsen (2009, 2016) provide the asymptotic theory for simple impulse indicators and Nielsen and Qian (2018) extend the results to step-indicators in time series, showing that the false-positive rate converges to the chosen target significance level p. Thus, the false-detection rate is easily controlled by altering p. For example, setting p = 0.05 implies that, in expectation, 5% of the candidate treatment indicators are retained spuriously. The final model then includes a subset of \(k_i\) of all possible break variables for each unit i. The detected break times here are denoted by \(T_{i,j}\):

$$\log(\mathrm{Emiss})_{i,t}=\lambda_{t}+\mu_{i}+\beta\,\mathrm{Tax}_{i,t}+\alpha_{1}\log(\mathrm{Emiss})_{i,t-1}+\gamma_{1}\log(\mathrm{Pop})_{i,t}+\gamma_{2}\log(\mathrm{GDP})_{i,t}+\sum_{j=1}^{k_{i}}\delta_{i,j}1_{\{t\ge T_{i,j}\}}+\epsilon_{i,t} \tag{3}$$

Note that many units may not exhibit a significant break (i.e. treatment) at all, so \(k_i = 0\) for most i. Intuitively, this approach searches over zero–one dummy variables at every point in time for each i, generalised here to a panel setting where the retained indicator can be interpreted as treatment affecting unit i in period \(T_{i,j}\).

As an illustration, consider a province that implemented a policy in a particular year leading to a significant reduction in emissions relative to all other provinces. A researcher may, however, not have been aware of this policy and so is unable to estimate a conventional difference-in-differences or synthetic control model. The proposed approach assesses every period for every province as a potential treatment intervention, where significant interventions are retained and insignificant ones are removed from the model (at a pre-specified level of significance p which determines the false-positive rate). Having identified the units in which breaks occur together with the estimated break dates, break detection requires ex-post attribution of detected interventions. If the timing coincides with the previously unknown policy intervention, this can provide evidence for its effectiveness. Naturally, had this policy been common knowledge, then it could have been tested in a conventional difference-in-differences (or synthetic control) approach.

The proposed break detection method can be interpreted in two ways. First, as a robustness check for conventional difference-in-differences analysis where a known policy to be tested is 'forced' into the model and not selected over, while selecting over potential unknown treatment. This allows the known policy to be evaluated while simultaneously testing whether there have been other significant interventions affecting the outcome variable, since omitting relevant variables (such as previously unknown other policy interventions) may bias the estimates of the intervention being tested. This follows the theory-embedding approach of Hendry and Johansen (2015) where some variables are not selected over, while simultaneously selecting over a full set of break-variables, a subset of which can describe any sudden change that may have been unknown a-priori. Second, break detection can act as a purely data-driven method to detect significant interventions without imposing any single intervention from the outset. Here this allows for detection of abnormal periods of emissions relative to the remaining control group.

Break detection using step-indicator functions has high power when the underlying data generating process is well-approximated by a step-shift change. However, some emission policies may have induced more gradual changes, such as a trend change in CO2 emissions. These more gradual changes can be detected by searching for breaks in the first difference of the series (where a step-shift in first differences corresponds to a trend break in the level). An alternative is to look for breaks in individual trend functions in the first place, as for example recently employed by Walker et al. (2019), who assess previously unknown changes in prescription patterns of medical doctors. For the present application I focus on step-shifts in the individual fixed effects, which risks missing some of the more gradual changes (as perhaps induced in Ontario following the phase-out of coal in the 2000s).
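
The correspondence between a step-shift in first differences and a trend break in the level can be checked with a minimal Python sketch. The growth rates and break date below are hypothetical and serve only to illustrate the mechanics.

```python
from itertools import accumulate


def step(n_periods, break_at):
    """Step indicator 1{t >= break_at} for t = 1, ..., n_periods."""
    return [1 if t >= break_at else 0 for t in range(1, n_periods + 1)]


# Hypothetical first differences of log emissions:
# growth of 2% per period before the break, 3 points lower from period 5 onwards.
growth = [0.02 - 0.03 * s for s in step(8, break_at=5)]

# Cumulating the differences recovers the log level (relative to period 0).
level = list(accumulate(growth))

# The slope of the level changes once, at the break date.
slopes = [round(later - earlier, 10) for earlier, later in zip(level, level[1:])]
print(slopes)
```

The level series keeps rising until the break and then declines at a constant rate, so a single retained step indicator in the differenced model describes a change in trend rather than a one-off level shift.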

3.3.2 Break Detection: Illustrative Monte Carlo Simulation

To investigate the properties of the proposed break detection approach I consider a simple Monte Carlo exercise (details provided in Appendix H) that closely matches the sample of Canadian emissions. I simulate a single treated unit with varying levels of treatment (in standard deviations of the error term) in a sample of T = 27 periods and N = 10 units. Simulation results in Fig. 8 show the detection frequency of the unknown intervention compared to the null rejection frequency when the intervention is known and imposed in a conventional difference-in-differences panel. The results demonstrate that treatment can be identified without prior knowledge of its occurrence at a rate that is close to the optimal case where the intervention is known and tested using difference-in-differences (see also Appendix H for more details). For example, consider a treatment intervention of 2SD of the error term in Fig. 8: the null-rejection frequency of the known difference-in-differences intervention dummy of 98% (at a 1% level of the test, blue in Fig. 8) is close to the detection frequency of 97% when the treatment is unknown (at a false-detection rate of 1%, red in Fig. 8). Simulation results also confirm that the false-positive rate of detected interventions is easily controlled by adjusting the level of significance of selection (dashed lines in Fig. 8). The simulations in Appendix H further show that the indicator saturation implementation of treatment detection outperforms the LASSO both in terms of power of detecting treatment and in controlling the false-positive rate.

3.3.3 Break Detection Estimates for CO2 Emissions

Break detectionFootnote 14 identifies three periods of significant province-level reductions in aggregate CO2 emissions (Table 5 and Fig. 9); however, these coincide neither with the BC carbon tax, nor with the introduction of carbon prices and trading in Alberta or Quebec. A significant decline in emissions is instead detected in New Brunswick in 2012 (95% CI: ± 1 year), Nova Scotia in 2012 (95% CI: ± 3 years), and Newfoundland and Labrador in 1999 (95% CI: ± 0 years). The detected step-shift breaks are robust to allowing for the relationship between emissions and control variables to also change over time (when allowing for possible breaks in the coefficients on control variables, the identical treatment events are detected and no breaks on random variables are detected, see Appendix I). Attributing the detected step-shift breaks ex-post to possible 'treatment interventions', the decline in emissions in New Brunswick around 2012 is likely driven by reduced use of fossil-fuel driven power generators (see CBC 2012). In Nova Scotia, the detected break can be attributed to the closure of emission-intense pulp and paper mills together with their respective power plantsFootnote 15 in 2012 (Govt. of Canada, 2014), as well as with electricity policies put into place in 2015 (Prov. of Nova Scotia, 2017), which falls within the ± 3 year approximate 95% confidence interval on the detected break date (Hendry and Pretis 2016). The reduction in emissions in Newfoundland and Labrador in 1999 relative to control provinces coincides with increases in hydroelectric capacity for electricity generation (Newfoundland Power Emission Reduction Report, 2001). The detected breaks together with their predicted counterfactuals over 5 years are shown in Fig. 9. The counterfactual is constructed as the dynamic forecast conditional on the observed control variables in the absence of the detected break.
Together these results show that industry-level interventions appear to have led to significant reductions in emissions while policy interventions in the form of carbon pricing have not had a detectable impact on aggregate emissions (Fig. 9).
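
The construction of a dynamic counterfactual forecast can be sketched in Python for a stylised first-order autoregressive model. This is a simplified illustration rather than the paper's procedure: the fixed effects and control variables are collapsed into a constant intercept, and all parameter values are hypothetical.

```python
def dynamic_path(y_start, alpha, intercept, shifts):
    """Iterate y_t = intercept + alpha * y_{t-1} + shift_t forward from y_start."""
    path, y_prev = [], y_start
    for shift in shifts:
        y_prev = intercept + alpha * y_prev + shift
        path.append(y_prev)
    return path


# Hypothetical values, not estimates from the paper.
alpha, intercept, delta, horizon = 0.7, 1.5, -0.05, 5
y_before_break = 5.0  # last pre-break value of log emissions

# Path with the detected step shift switched on from the break date onwards.
with_break = dynamic_path(y_before_break, alpha, intercept, [delta] * horizon)
# Counterfactual path: same starting value and dynamics, step shift switched off.
counterfactual = dynamic_path(y_before_break, alpha, intercept, [0.0] * horizon)

gaps = [w - c for w, c in zip(with_break, counterfactual)]
print([round(g, 4) for g in gaps])
```

In this stylised setting the gap after h periods equals delta(1 − alpha^h)/(1 − alpha), so the immediate impact widens gradually towards the long-run effect delta/(1 − alpha).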

Table 5 Break detection results for aggregate CO2 emissions at a significance level of detection of p = 0.025
Fig. 8

Simulation results comparing a known difference-in-differences intervention to detecting this intervention when it is unknown using break detection. The figure shows the detection/rejection frequency and false positive rate of break-detection (unknown intervention, red) and difference-in-differences (known intervention, blue) for different magnitudes of the treatment intervention in SD of the error term, including the no-treatment case (gray shaded). The correct rejection frequency (i.e. identification of treatment) in the unknown treatment break detection approach (red solid lines) is close to the optimal known treatment case (blue solid lines). The false-positive rate of the break detection approach (red dashed lines) is stable and closely matches the nominal level of significance of selection (p = 0.01 left, p = 0.025 middle, and p = 0.05 right). Simulation results are based on 5000 replications. Full results are also given in Appendix Table H1. (Color figure online)

Fig. 9

Detected negative breaks (interventions) in the panel fixed effects model of log aggregate emissions (3). Detected breaks (relative to the control group) are shown as dashed vertical coloured lines for the three provinces in which interventions are identified. Break magnitude is shown as the estimated immediate impact. Three negative breaks (reductions in emissions) are identified and shown in bold: New Brunswick in 2012 (± 1 year 95% CI), Nova Scotia in 2012 (± 3 years 95% CI), and Newfoundland and Labrador in 1999. The counterfactual, given by a dynamic forecast conditional on the control variables in the absence of the detected break, is shown as dashed lines for 5 years following the break. The introduction of the BC carbon tax is shown as a gray dotted line but is not detected as a break for BC in the model. Full estimation results are reported in Appendix I

Nevertheless, the introduction of the BC carbon tax is detected without prior knowledge of its occurrence in transportation emissions. Without a-priori specifying the BC carbon tax to be included in the model, break detection identifies the introduction of the tax as a significant reduction in BC transportation emissions in 2009 (95% CI: ± 4 years, see Appendix I). The detected break has an estimated magnitude of −5% (se = 2%), which is nearly identical to the difference-in-differences estimate for transportation emissions that imposes the intervention from the outset.

4 Conclusion

Using difference-in-differences, synthetic control, and a new econometric break detection approach that identifies previously unknown treatment assignment and timing, I find evidence of some sectors having significantly reduced their CO2 emissions in response to the introduction of a carbon tax in British Columbia, Canada. In particular, there is evidence for a significant reduction in emissions from transportation, the single largest source of emissions in the taxed region. However, there is no statistically significant effect of the introduction of the carbon tax on the aggregate level of CO2 emissions, highlighting the heterogeneity in emission elasticities across sectors as well as the importance of command-and-control policy interventions beyond carbon pricing. These results are consistent across all methods employed.

The proposed panel econometric break detection approach identifies the BC carbon tax as having reduced transportation emissions. Detected significant reductions in aggregate CO2 emissions across Canadian provinces, in turn, do not coincide with the implementation of the BC carbon tax, nor the carbon pricing schemes in Alberta and Quebec. Instead, the detected interventions in aggregate emissions reflect the closures of emission-intense industries (pulp and paper with accompanying power-generating facilities) and reforms in electricity generation in untaxed provinces such as New Brunswick and Nova Scotia transitioning away from fossil-fuel driven electricity generation. More generally, the break detection approach based on machine learning provides a new way to identify successful policy in settings where multiple (known and unknown) interventions have taken place.

How representative are the results from British Columbia for carbon taxes as a whole? The BC carbon tax enjoys a high degree of public acceptance and, with its relatively few exemptions, is a good candidate for the analysis of the tax impact on aggregate emissions. The effectiveness of the tax in reducing emissions in the transportation sector is encouraging, particularly as BC relies heavily on individual motor vehicles due to long driving distances and limited public transport. This bodes well for the introduction of carbon taxes in comparable regions (such as the US). The significant impact of the tax on transportation emissions is also consistent with the impact of the Swedish carbon tax (Andersson 2019). BC, however, is a special case with regard to the tax impact on electricity generation: the province relies predominantly on hydro power for electricity generation, so it is not surprising to see little impact of the carbon tax on emissions related to electricity production.

A likely explanation for the absence of a detectable, statistically significant tax effect on aggregate emissions is that the carbon tax is too low to result in rapid changes in the short time frame since its introduction, given the heterogeneous emission elasticities across sectors. The break detection results also highlight that carbon taxes are not the only available tool to curb emissions: command-and-control approaches, closures of industries, and (policy-mandated) efficiency improvements can lead to significant aggregate emission reductions. These findings also support the work of Eskander and Fankhauser (2020), who argue that climate policies have been limited in their large-scale effectiveness. Pricing carbon at currently observed rates is unlikely to be sufficient on its own to achieve major emission reductions, and policy mixes will be needed.

Overall, the present results also highlight the importance of monitoring the effectiveness of policy tools based on their target metric—small changes in gasoline demand, transportation emissions, or natural gas use are insufficient when the required policy target is aggregate CO2 emissions due to the cumulative nature of atmospheric CO2 concentrations.