1 Introduction

It is well understood that, in order to achieve the international climate targets agreed in Paris, global greenhouse gas emissions need to decrease rapidly in the coming years (IPCC 2014). To achieve this goal, market-based instruments, such as carbon taxes, cap-and-trade systems or fossil fuel subsidy reforms, are frequently recommended by leading economists such as Nicholas Stern and Joseph Stiglitz (High-Level Commission on Carbon Prices 2017). Economic theory highlights that these instruments are environmentally effective and economically efficient (Pigou 1920; Nordhaus 1991; Pearce 1991).Footnote 1 In 2018, 51 carbon pricing schemes, such as carbon taxes or cap-and-trade systems, were implemented or planned, covering 20 percent of global greenhouse gas emissions (World Bank and Ecofys 2018), albeit at carbon prices well below those considered to be in line with the targets of the Paris Agreement. Numerous countries also apply taxes or levies on fossil fuel use, for instance for transportation or heating. Even though these are not directly proportional to the carbon content, they nevertheless provide an incentive to reduce greenhouse gas emissions. The G20 committed to phasing out fossil fuel subsidies at the Pittsburgh summit in 2009 (G20 Leaders Statement 2009), and several reforms have been enacted in recent years (IEA and OECD 2017).Footnote 2

The distributional impacts of carbon and energy taxes, however, strongly influence their political acceptability (Baumol and Oates 1988; Baranzini et al. 2000; Tiezzi 2005). Regressive distributional impacts harm vulnerable groups and decrease the likelihood of policies being implemented and sustained (Parry 2015). Social equity concerns can thus quickly dominate the public debate if energy prices increase (Shammin and Bullard 2009). For example, the incidence and the distributional impacts of the repealed Australian carbon tax were subject to public and academic debate (Rahman 2013; Sajeewani et al. 2015). The progressive Nigerian fuel and petrol subsidy reform in 2012 even resulted in mass protests and strikes, which led to a partial reinstatement of the subsidies (Soile and Mu 2015; Lockwood 2015; Dorband et al. 2017).Footnote 3

The literature on the distributional impacts of climate policies provides ambiguous results. Many studies find an overall tendency for regressive impacts (Araar et al. 2011; Gonzalez 2012). Others detect mostly regressive findings for developed countries while developing countries show an inconsistent picture with a tendency towards proportional or progressive impacts (Verde and Tol 2009; Wang et al. 2016). Nevertheless, progressive impacts have also been shown for developed countries like Australia (Sajeewani et al. 2015), Canada (Dissou and Siddiqui 2014) and Spain (Labandeira et al. 2009).

Previous literature reviews provide initial insights but do not systematically explain outcome heterogeneity, i.e. what drives the differences in the results of studies. Wang et al. (2016) have conducted the most comprehensive literature review on the distributional impacts of carbon prices so far. They consider distributional impacts across households differing by income, location and demographic characteristics. This broad scope provides valuable insights into various dimensions of distributional impacts. However, a common problem of most literature reviews is the lack of explicit or transparent selection and evaluation criteria for their study sets, as well as of rigorous methods for analyzing the observed variation, which exposes them to the criticism of subjectivity and a lack of validity. The literature on meta-analysis is littered with examples that show how traditional literature reviews and vote-counting approaches can be misleading and inconsistent in their assessment of the state of the art (Stanley and Doucouliagos 2012; Ringquist 2013).

We focus our analysis on distributional impacts across household income groups. This narrow scope allows the use of a meta-analysis to quantitatively determine the sources of variation in the study outcomes.Footnote 4 Thus far, meta-analyses have mainly been applied in the fields of education and medicine, but organizations like the Campbell Collaboration or the Collaboration for Environmental Evidence have tried to establish rigorous quality standards and mainstream such work in the social and environmental sciences. In fact, there is an increasing volume of meta-analyses in social science including environmental economics (Moeltner et al. 2007; Nelson and Kennedy 2009; Tunçel and Hammitt 2014).

This study applies an ordered probit meta-analysis framework to 53 original studies covering economy-wide and transport sector climate mitigation policies, providing 183 effects in 39 countries. We analyze all market-based policies that affect the price of fossil fuels, regardless of whether these are put into place for climate change mitigation (e.g. a carbon tax, an explicit price on carbon) or for a different purpose, such as generating public revenues (e.g. excise taxes on fuels). We also include studies that address fossil fuel subsidy removals, as the absence of a Pigouvian tax can also be regarded as constituting a (so-called 'post-tax') subsidy (Coady et al. 2017). We include moderator variables accounting for different policies, modeled economic effects, and countries, while controlling for publication bias and a time trend. We find a significantly increased likelihood of progressive study outcomes for lower income countries and transport policies. The same applies to study designs considering indirect effects, demand-side adjustments of consumers, or lifetime income proxies. In contrast, we find that subsidy reforms are not inherently more progressive than carbon pricing instruments.

We structure the remainder of this paper as follows: Section 2 elaborates our four key hypotheses with respect to theory and literature findings. Section 3 describes the data selection process, explains the variables and introduces the quantitative model. Section 4 presents the main results while Sect. 5 discusses and concludes the findings.

2 Hypotheses

Based on economic theory and previous research findings, we expect that the policy type, the affected sectors and the modeled economic effects systematically influence the outcomes of studies assessing distributional impacts. The following paragraphs discuss factors that might drive the results and subsequently develop hypotheses about the estimated impact.

First, literature reviews show mostly regressive impacts in developed countries. Developing countries, however, show an inconsistent picture with a tendency towards proportional or progressive impacts (Verde and Tol 2009; Wang et al. 2016). These findings could be explained by low carbon intensities of the consumption baskets of poor households in lower income countries, resulting from a higher share of subsistence consumption, a low access to modern energy services, or the lack of affordability of energy. In fact, Flues and van Dender (2017) demonstrate a negative correlation between the energy affordability risk and GDP, for 20 OECD countries.

Second, literature reviews strongly suggest progressive outcomes for reforms that decrease or abolish fossil fuel subsidies (Anand et al. 2013; Clements et al. 2013; Coady et al. 2017) while carbon pricing policies show ambiguous impacts (Wang et al. 2016). Fossil fuel subsidies have primarily been implemented in developing countries (Coady et al. 2017). Currently implemented fossil fuel subsidies are mostly regressive as they especially benefit well-organized interest groups while disadvantaging low-income households that spend relatively little on energy (Inchauste and Victor 2017). Small groups of powerful and highly profiting actors have a greater incentive to organize and influence a legislative process than a large group of individuals with low payoffs (Oye and Maxwell 1994). The political economy in combination with the consumption baskets of households in developing countries might thus explain the progressive literature findings for subsidy reforms.

Third, Wang et al. (2016) report a tendency towards progressive outcomes for transport sector policies.Footnote 5 Others, however, show proportional or regressive outcomes in the United States (Casler and Rafiqui 1993; Chernick and Reschovsky 1997; Metcalf 1999; Chernick and Reschovsky 2000; Williams et al. 2015), Germany (Nikodinoska and Schröder 2016) and six other European countries (Sterner 2012). Sterner (2012) argues that the lower car ownership rate in low-income countries makes fuel a luxury product. Santos and Catchesides (2005), however, also find a lower car ownership rate for low-income households in the United Kingdom, resulting in an inverted U-shaped relationship between income and incidence. The efficiency of the public transport system as well as indirect fuel expenditures on public transport could additionally influence the results (Datta 2010). Nevertheless, Kpodar (2006) and Ziramba (2009) find no impact of indirect expenditures.

Finally, we compare the modeling of indirect effects, demand-side adjustments of consumers, general equilibrium effects and studies that apply lifetime income proxies. We thus complement the previous discussion on policy and country impacts by considering different study designs and their corresponding modeled economic effects (see Sect. 3). Distributional analyses at least consider direct effects, i.e. the price increase of all goods that directly contain \(\hbox {CO}_2\), such as gasoline. The following paragraphs discuss the potential impact of additional economic effects on the study outcomes.

Indirect distributional effects are caused by price changes of goods in the consumption basket due to \(\hbox {CO}_2\) emissions embedded in their value chain. Considering indirect effects might influence the distributive impact in both directions. Generally, their impact depends on the relative difference in \(\hbox {CO}_2\) intensities between the consumption baskets of low- and high-income households (Anand et al. 2013). Hassett et al. (2009) provide evidence that indirect effects mitigate regressivity in the United States. Other authors show that indirect effects increase regressivity, as low-income households tend to spend large fractions of their incomes on energy-intensive food and public transport (Jacobsen et al. 2003; da Silva Freitas et al. 2016).

Modeling demand-side adjustments of consumers could also influence the study outcomes in either direction. The impact depends on differences in the demand elasticities between low- and high-income households. Zhang (2015) shows larger demand-side adjustments for richer households and argues that low-income households are required to focus on their basic needs and are hence less responsive to price signals. In contrast, West and Williams (2004) show larger demand-side adjustments for low-income households, which results in more progressive outcomes. Their study, however, only considers transport fuel taxes.

We expect more progressive outcomes for studies that capture general equilibrium effects. Several studies find general equilibrium effects to foster progressive outcomes (Rausch et al. 2011; Dissou and Siddiqui 2014; Vandyck and Van Regemorter 2014; Beck et al. 2015; Sajeewani et al. 2015; da Silva Freitas et al. 2016). Dissou and Siddiqui (2014) show that carbon taxes particularly affect the capital-intensive energy industry. This decreases the capital income of rich households and thus makes the distributive effect more progressive. Fullerton and Heutel (2011), however, highlight the sensitivity of the results to parameter values.

Using lifetime income proxies, rather than annual household incomes, is hypothesized to increase progressivity. Several literature findings based on lifetime incomes show more progressive outcomes for excise and transport taxes (Poterba 1989, 1991; Bull et al. 1994; Lyon and Schwab 1995; Hassett et al. 2009). The permanent income hypothesis (Friedman 1957) assumes that households smooth their consumption over their lifetime. Accordingly, lifetime income proxies consider that low annual incomes in isolated years do not necessarily correspond to low welfare as, for instance, elderly people and students tend to live on savings or loans. The magnitude of the effect (Fullerton and Rogers 1993), as well as the most suitable lifetime income proxy (Metcalf 1999; Chernick and Reschovsky 2000), are widely debated.

Based on this discussion, we hypothesize an increasing share of progressive study outcomes for first, low-income countries, second, subsidy reforms and third, transport sector policies. We also expect more progressive findings for studies that model general equilibrium effects or use lifetime income proxies. Studies that consider indirect and demand-side effects could either provide more progressive or more regressive findings.

3 Methodology

This section first explains how studies included in the meta-analysis were selected. It then provides an overview of the sample, including the dependent variables from the literature and explanatory variables that were either derived from the studies themselves, or drawn from external sources. It finally describes the empirical strategy to assess determinants of distributional outcomes.

3.1 Data Selection

We follow Ringquist (2013) for the structure of the data selection process. This process comprises identifying relevant study authors and keywords, developing a search strategy, considering additional citations and defining study selection criteria, which allow us to identify and classify studies as potentially relevant, relevant and, finally, acceptable. For literature identification we conduct a query search in the Web of Science and Scopus literature databases. We connect three groups of keywords with Boolean operators, filtering for research on \(\hbox {CO}_2\)-related (carbon, \(\hbox {CO}_2\), gasoline, emission, environment, ecologic, energy) pricing policies (tax, allowance, subsidy, policy, price) that investigates distributional impacts (distribution, regressive, progressive, incidence, inequality, household income). We exclude findings from unrelated research fields by permitting characteristic keywords (see “Appendix Search Query” for details). The literature search, restricted to literature written in English, identified 1023 studies. In the first step, we exclude 856 studies with titles indicating irrelevant research questions, leaving 167 potentially relevant studies.

For the next steps of the selection process we apply the following study selection criteria. First, we exclude 61 studies because of differing research questions, replicating findings of previous studies including double hits, unavailability or insufficient quality. Second, we only select quantitative studies, thus excluding 34 studies that provide qualitative results or apply theoretical models. Third, we exclude 46 studies with an incomparable scope, i.e. studies pricing multiple pollutants beyond \(\hbox {CO}_2\), imposing sectoral restrictions apart from transport, only including effects with revenue recycling schemes or only concentrating on urban or rural households. Last, we only select countries or large regions, thus excluding 8 studies for single cities and supranational unions.

Fig. 1 Study selection process. Notes: The figure shows the number of retained and excluded studies during the study selection process. It distinguishes between study origin (query search and references) and study evaluation step (identified, potentially relevant, relevant and accepted)

We employ these selection criteria successively to the abstract and the full text of the 167 (potentially) relevant studies, resulting in 36 acceptable studies. In order to supplement our sample by grey literature and literature from other databases, we subsequently screen the references of all acceptable studies from the query search to identify further relevant studies. Based on this reference search, we identify another 35 relevant studies, resulting in another 17 acceptable studies. The final sample comprises 53 original studies with 183 effects. Figure 1 provides an overview of the selection process. Further details are documented in the codebook, which is available upon request.
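The selection flow can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable names are illustrative. It assumes that the exclusion counts of the four selection criteria cover both the query-search studies and the studies added through the reference search, which is the reading under which the reported numbers reconcile.

```python
# Counts as reported in the text; variable names are illustrative only.
identified = 1023
excluded_by_title = 856
potentially_relevant = identified - excluded_by_title  # studies screened further

# Additional relevant studies found by screening references.
relevant_from_references = 35

# Exclusions per selection criterion.
# Assumption: these counts span query-search and reference-search studies.
exclusions = {
    "research question, replication, availability, quality": 61,
    "qualitative or theoretical": 34,
    "incomparable scope": 46,
    "single cities or supranational unions": 8,
}

screened = potentially_relevant + relevant_from_references
accepted = screened - sum(exclusions.values())

# Accepted studies by origin, as reported.
accepted_from_query = 36
accepted_from_references = 17

print(potentially_relevant, screened, accepted)
```

Under this reading, the accepted total equals the sum of the studies accepted from the query search and from the reference search.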

3.2 Sample Overview

The final sample comprises 53 studies with 183 effects in total. The original study author names, the publication years, the number of included effects per study and the percentage share of included effects per study relative to the 183 total effects are listed in the “Appendix Study Overview”. All studies were published between 1991 and 2017, with an average publication year of 2007.Footnote 6 Most original studies report several effects, which account for alternative policies, different model setups or multiple countries. The number of effects per study is thus unevenly distributed, with Flues and Thomas (2015) providing 22.4% of the sample, Sterner (2012) 14.2% and Hassett et al. (2009) 6.6%, while each of the other studies contributes less than 5%. The 53 studies include 46 peer-reviewed journal articles (126 effects) and 7 articles from grey literature (57 effects).

Figure 2 shows the number of effects and the percentage share of the total sample for each country included. The effects per country are also unevenly distributed, with the United States (30.6%), the United Kingdom (6.6%) and Germany (4.9%) contributing the largest shares of the sample. Grouping the effects by World Bank country income levels provides 144 effects for high-income countries and 39 effects for low, lower-middle and upper-middle income countries.

Fig. 2 Country sample overview. Notes: The figure shows the number of effects (left y-axis) and their share in the sample in percent (right y-axis) per country (x-axis)

3.3 Dependent Variable

The ordered categorical variable Distributional impact captures the progressive, proportional or regressive distributive impact of each effect included. We only aim to explain whether a policy is progressive, regressive or proportional, without addressing the size of this effect, as the inequality measures applied in the original studies are not quantitatively comparable. The methods suggested by the meta-analysis literature to harmonize different effect size metrics are not applicable to this study.Footnote 7 We also tried to subsample studies with identical inequality metrics, but unfortunately the sample sizes became too small to conduct a quantitative analysis. Section 5 discusses the implications of abstracting from the effect size. Neglecting the effect size increases the significance and validity of the results, as it allows us to examine a larger sample of original studies. The coding decision relies either directly on quantitative inequality measures or on the interpretation of the original study authors. The 183 effects comprise 52 progressive, 13 proportional and 118 regressive outcomes (see Table 1).
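As a minimal sketch of this coding step, the snippet below maps the three outcome labels to ordered integer codes and derives the sample shares from the reported counts. The integer coding follows the one used for the ordered probit model in Sect. 3.5 (progressive 0, proportional 1, regressive 2); the dictionary and variable names are illustrative assumptions.

```python
# Ordered coding of the dependent variable (as in Sect. 3.5).
OUTCOME_CODES = {"progressive": 0, "proportional": 1, "regressive": 2}

# Outcome counts as reported in the text.
counts = {"progressive": 52, "proportional": 13, "regressive": 118}

total_effects = sum(counts.values())

# Share of each outcome in the sample, in percent, rounded to one decimal.
shares = {
    label: round(100.0 * n / total_effects, 1)
    for label, n in counts.items()
}

# Print outcomes in their coded order.
for label in sorted(counts, key=OUTCOME_CODES.get):
    print(OUTCOME_CODES[label], label, counts[label], shares[label])
```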

Table 1 Variable summary statistics

3.4 Moderator Variables

Moderator variables are hypothesized to systematically influence the outcomes of the original studies (Ringquist 2013). We include moderator variables that allow us to test the hypotheses developed in Sect. 2. The policy and the country moderator variables account for differences in the presumed distributional impact, while the economic effect variables implicitly capture different study designs. We also control for a potential publication bias and a time trend. Table 1 summarizes the variables included. We exclude effects that model revenue recycling schemes as those are either too context-specific for designing reasonable moderator variables or, in case of lump-sum, completely offset prior regressive findings, which leads to a perfect predictor. This particularly applies to effects in studies using computable general equilibrium (CGE) models, which our analysis only considers if results are explicitly reported without the impact of revenue recycling schemes.

Furthermore, we test the bivariate relationship between the moderator variables and the dependent variable. For the binary moderator variables we conduct a two-proportion z-test. Similarly, we conduct a correlation analysis for the continuous moderator variables. The results of the two tests indicate an overall suitable selection of moderator variables. Further analysis, however, requires a multiple regression analysis as the bivariate tests ignore potential correlations between the moderator variables. The remainder of this section briefly explains the moderator variables included. More details about individual moderator variables and the bivariate analyses, including their results tables, are provided in “Appendix Detailed Moderator Variable Description”.
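For illustration, a two-proportion z-test for a binary moderator can be sketched as follows. The pooled standard error and the two-sided p-value are assumptions about the test variant, which the text does not specify, and the counts in the usage example are hypothetical rather than taken from the sample.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for equal proportions, using the pooled proportion."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value


# Hypothetical split: progressive outcomes among effects with the
# moderator set to one (20 of 50) versus set to zero (32 of 133).
z, p_value = two_proportion_z_test(20, 50, 32, 133)
print(z, p_value)
```

A positive z here would indicate a higher share of progressive outcomes in the group with the moderator set to one. As the text notes, such a bivariate test ignores correlations between moderators.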

Policy Variables We include two variables controlling for policy differences: The Subsidy variable distinguishes between subsidy reforms and carbon pricing schemes. The Transport variable compares policies that target only the transport sector with economy-wide policies. Generally, we only include effects increasing the burden for households, i.e. resulting from increasing or introducing energy or carbon prices as well as decreasing or removing existing subsidies.

Economic Effect Variables We include four moderator variables which account for different economic effects: Indirect, Demand-side, General equilibrium and Lifetime income. The first three variables correspond to the model types used in the original studies while lifetime income proxies reflect differences in the underlying data. We explicitly include moderator variables on the modeled economic effects and not on the model type. This method allows us to extract more information from the original studies. Many authors, for example, using Input-Output models separately report both the direct and the indirect distributive impact. We however disregard information on the impact of the different model types themselves.

Each model type at least considers direct effects. We identify and include three major groups of more advanced models in the literature: Input-Output models, micro-simulation models and CGE models.Footnote 8 The Indirect variable covers the joint impact of direct and indirect effects and comprises findings from Input-Output and CGE models. The Demand-side variable covers demand-side changes of different income groups, which are considered by micro-simulation and CGE models. The General equilibrium variable covers the long-term general equilibrium effects, and thus the income source side, which are only analyzed by CGE models. The Lifetime income variable accounts for effects considering lifetime income proxies as opposed to annual household incomes.

Context Variables The Publication type variable distinguishes between peer-reviewed journal articles and grey literature. The Publication year variable accounts for a potential time trend of study outcomes.

Country Variables We address the panel structure of our dataset by including time-fixed country dummies and time-variant variables. Our main specification includes 38 \((N-1, N=39)\) single country dummies that account for unobservable time-fixed country effects. It also includes three time-variant country variables: the GDP per capita, the Gini and the Poverty gap variable (see “Appendix Detailed Moderator Variable Description” for more details). These variables control for the country income and its distribution. For additional robustness checks, we group the countries based on the World Bank country income level classifications, namely high, upper-middle, lower-middle and low-income countries. The country data originates from the World Bank dataset for the years 1990 to 2014 (World Bank 2017).Footnote 9
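The \(N-1\) dummy coding can be sketched as below: one country serves as the reference category and receives no column, so 39 countries yield 38 dummies. The function name, the choice of the alphabetically first country as reference, and the four-row example are hypothetical; the text does not state which country serves as reference.

```python
def country_dummies(countries, reference=None):
    """Build N-1 dummy columns for a list of country labels.

    Returns the kept column names and one row of 0/1 indicators per label.
    Assumption: the alphabetically first country is the default reference.
    """
    levels = sorted(set(countries))
    if reference is None:
        reference = levels[0]
    kept = [level for level in levels if level != reference]
    rows = [[1 if country == level else 0 for level in kept]
            for country in countries]
    return kept, rows


# Hypothetical example with three countries across four effects.
kept, rows = country_dummies(["Germany", "India", "Mexico", "Germany"])
print(kept)
print(rows)
```

Effects from the reference country have zeros in every dummy column, so each dummy coefficient is read relative to that country.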

3.5 Ordered Probit Model

The bivariate analyses indicate a significant impact of most moderator variables on the dependent variable (see “Appendix Detailed Moderator Variable Description”). Identifying the isolated influence of each moderator variable, however, requires a regression analysis. The ordered categorical dependent variable with the outcomes progressive, proportional and regressive suggests the application of an ordered probit model. The approach is based on Greene (2012) and methodologically similar to the meta-analyses of Waldorf and Byun (2005), Card et al. (2009) and Wehkamp et al. (2018).

This ordered probit model uses a continuous latent variable \(y^*\) to measure the unobserved effect size of each original study. We assume \(y^*\) to be correlated with the three observed distributional effects: progressive (\(y=0\)), proportional (\(y=1\)) and regressive (\(y=2\)). Suppressing the observation-specific index, the relationship between \(y^*\) and the moderator variables X is assumed to follow a linear regression model of the form

$$y^*= X\beta + \epsilon$$

with \(y^*\) potentially varying between \(-\infty\) and \(\infty\) and \(\epsilon\) being a normally distributed error term. The observed distributional impact y is linked to the underlying latent variable \(y^*\) by

$$\begin{aligned}&y = 0 \quad \text {if}\quad y^*<0\\&y = 1 \quad \text {if}\quad 0< y^*< \mu _1\\&y = 2 \quad \text {if}\quad \mu _1 < y^*\\\end{aligned}$$

where \(\mu _1\) is an unknown threshold parameter simultaneously estimated with \(\beta\).

The probability of estimating a progressive (\(y=0\)), proportional (\(y=1\)) or regressive (\(y=2\)) distributional effect is given by

$$\begin{aligned}&P(y=0|X) = \Phi (-X\beta )\\&P(y=1|X) = \Phi (\mu _1 - X\beta ) - \Phi (-X\beta )\\&P(y=2|X) = 1-\Phi (\mu _1 - X\beta )\\\end{aligned}$$

where \(\Phi\) denotes the standard normal cumulative distribution function. We estimate the parameters by the maximum likelihood method with the previously described probabilities entering the likelihood function. The beta coefficients in combination with the p-value provide the direction and the significance of the effect; a positive \(\beta\) coefficient suggests that the respective moderator variable X increases the probability of obtaining a regressive outcome (\(P(y=2)\)). Vice versa, a negative \(\beta\) coefficient suggests that the respective moderator variable X increases the probability of finding a progressive outcome (\(P(y=0)\)). The coefficients have an ambiguous effect on the probability of finding a proportional outcome (\(P(y=1)\)). The marginal effects at means show the magnitude of the probability change for the three possible outcomes induced by the moderator variables. The pseudo-\(R^2\) is reported as a measure of fit (McFadden 1974).
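The three outcome probabilities and the resulting log-likelihood can be written out directly. The following is a minimal sketch, not the estimation code used in the study: the coefficient values, the threshold and the three observations are hypothetical, and the first moderator is assumed to be a constant.

```python
from math import erf, log, sqrt


def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def outcome_probabilities(xb, mu1):
    """Return P(y=0), P(y=1), P(y=2) for linear index xb and threshold mu1 > 0."""
    p_progressive = Phi(-xb)
    p_proportional = Phi(mu1 - xb) - Phi(-xb)
    p_regressive = 1.0 - Phi(mu1 - xb)
    return p_progressive, p_proportional, p_regressive


def log_likelihood(observations, beta, mu1):
    """Sum of log probabilities of the observed outcomes.

    observations: list of (x, y) pairs, x a list of moderators, y in {0, 1, 2}.
    """
    total = 0.0
    for x, y in observations:
        xb = sum(b * value for b, value in zip(beta, x))
        total += log(outcome_probabilities(xb, mu1)[y])
    return total


# Hypothetical threshold; a lower index shifts mass towards progressive outcomes.
mu1 = 0.3
p_low = outcome_probabilities(-0.5, mu1)
p_high = outcome_probabilities(0.5, mu1)

# Hypothetical data: x = [constant, dummy moderator], y = observed outcome.
data = [([1.0, 0.0], 2), ([1.0, 1.0], 0), ([1.0, 0.0], 1)]
ll = log_likelihood(data, beta=[0.4, -1.0], mu1=mu1)
print(p_low, p_high, ll)
```

Maximum likelihood estimation would search for the values of \(\beta\) and \(\mu _1\) that maximize this function. Comparing `p_low` and `p_high` shows the sign interpretation from the text: a lower index raises \(P(y=0)\) and lowers \(P(y=2)\).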

We conduct several sensitivity analyses and specification tests as proposed by the best-practice guideline for future meta-analysis by Nelson and Kennedy (2009). First, we impose cluster-robust standard errors by country to address non-independence of observations. Second, our dataset contains only a few observations, and thus little time variation, for several countries, which imposes the risk of multicollinear time-fixed and time-variant variables. We thus alter our model by assuming fixed effects for country income groups instead of single countries, and also by omitting country fixed effects to investigate their overall impact. Furthermore, we test several combinations of the time-variant country variables. Third, we test the validity of the ordered probit model specification by conducting significantly progressive and regressive probit regressions. Fourth, we use a jackknife method to identify the impact of single countries on the results (Gould 1995). Fifth, we present our findings for carbon pricing policies only, i.e. excluding effects for subsidy reforms. Sixth, we test whether simulated policies provide systematically different results than actually implemented policies. Finally, we test for multicollinearity using the variance inflation factors and for the joint significance of the variable groups using the likelihood-ratio test. “Appendix Robustness Checks” provides more details about the sensitivity analyses and specification tests, while “Appendix Regression Results Overview” contains the regression coefficients without subsidy reforms.
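The jackknife check by country can be illustrated with a generic leave-one-group-out loop. This is a sketch under stated assumptions: the estimator used here is the share of progressive outcomes, a simple stand-in for re-estimating the ordered probit model, and the records are hypothetical.

```python
def leave_one_group_out(records, estimator, group_key="country"):
    """Re-run an estimator once per group, each time dropping that group."""
    groups = sorted({record[group_key] for record in records})
    return {
        group: estimator([r for r in records if r[group_key] != group])
        for group in groups
    }


def share_progressive(records):
    """Stand-in estimator: share of effects coded as progressive (y = 0)."""
    return sum(r["outcome"] == 0 for r in records) / len(records)


# Hypothetical effects: country label and coded outcome (0, 1 or 2).
records = [
    {"country": "A", "outcome": 0},
    {"country": "A", "outcome": 0},
    {"country": "A", "outcome": 2},
    {"country": "B", "outcome": 2},
    {"country": "B", "outcome": 2},
    {"country": "C", "outcome": 0},
    {"country": "C", "outcome": 1},
]

full_sample = share_progressive(records)
jackknife = leave_one_group_out(records, share_progressive)
print(full_sample, jackknife)
```

A country whose omission moves the estimate far from the full-sample value has a strong influence on the result.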

4 Results

Table 2 shows the regression results of our main ordered probit model specification which includes the single country dummies and robust standard errors clustered by countries. The first column provides the estimated coefficients, the subsequent three columns present the marginal effects at mean for the three possible original study outcomes. A negative coefficient indicates an increased probability of a progressive study outcome, but conveys no information on the magnitude of this increased probability. Hence, we include marginal effects at mean. For binary variables they indicate by how many percentage points the likelihood of an outcome differs when the binary variable is one compared to when it is zero. For continuous variables, they indicate by how many percentage points the likelihood of an outcome changes by a change of one unit, taking the mean variable value as the starting point.
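The two kinds of marginal effects described above can be sketched as follows. For a binary moderator, the effect is the difference in predicted probabilities between one and zero with all other moderators held at their means. For a continuous moderator, it is the derivative of each probability at the mean index. All numerical values are hypothetical and are not the estimates from Table 2.

```python
from math import erf, exp, pi, sqrt


def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def phi(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def probabilities(xb, mu1):
    """P(progressive), P(proportional), P(regressive) for index xb."""
    return Phi(-xb), Phi(mu1 - xb) - Phi(-xb), 1.0 - Phi(mu1 - xb)


def binary_marginal_effect(beta_k, xb_other, mu1):
    """Probability changes when a dummy switches from 0 to 1.

    xb_other is the index of all other moderators evaluated at their means.
    """
    p_off = probabilities(xb_other, mu1)
    p_on = probabilities(xb_other + beta_k, mu1)
    return tuple(on - off for on, off in zip(p_on, p_off))


def continuous_marginal_effect(beta_k, xb_mean, mu1):
    """Derivatives of the three probabilities at the mean index."""
    d_progressive = -phi(-xb_mean) * beta_k
    d_regressive = phi(mu1 - xb_mean) * beta_k
    d_proportional = -(d_progressive + d_regressive)
    return d_progressive, d_proportional, d_regressive


# Hypothetical coefficient, mean index and threshold.
effects = binary_marginal_effect(beta_k=-1.2, xb_other=0.8, mu1=0.3)
slopes = continuous_marginal_effect(beta_k=-0.05, xb_mean=0.8, mu1=0.3)
print([round(100.0 * e, 1) for e in effects])  # in percentage points
print(slopes)
```

Because the three probabilities always sum to one, the three marginal effects sum to zero. A negative coefficient raises the probability of a progressive outcome and lowers that of a regressive one.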

Table 2 Ordered probit results

Figure 3 additionally plots the coefficients for the most relevant alternative model specifications, i.e. regressions with single country dummies, group country dummies and no country dummies. For all three regression types we show the results with and without the three time-variant country variables (“Baseline” and “No Country Variables”). General findings from all robustness checks are discussed in Sect. 4.5. For a better overview we report the 38 coefficients of the single country dummies separately in the “Appendix Country Dummy Coefficients”.

The results confirm our hypotheses of a significantly increased likelihood for progressive study outcomes of transport policies, within lower income countries and for studies applying lifetime income proxies. In contrast, we show that studies on subsidy reforms are not inherently more progressive than carbon pricing instruments. The regression results show no impact of studies considering general equilibrium effects, while modeling indirect effects and demand-side adjustments of consumers provide more progressive study outcomes. The next subsections discuss the results for the different variable groups in detail.

4.1 Policy Variables

We hypothesize that the two policy variables Subsidy and Transport will foster progressive outcomes; the Transport coefficient indeed indicates a significantly higher likelihood of progressive outcomes while the Subsidy coefficient is insignificant. Both findings are highly robust among most other model specifications (see Fig. 3).

The insignificant finding for the Subsidy coefficient sharply contrasts with other literature findings but supports standard economic theory: as subsidies are equal to negative taxes (Varian 2009), the impact of removing subsidies should not be systematically different from that of taxes or cap-and-trade systems, after controlling for all other influences. The finding is robust over all other specifications, with one notable exception: the regression with no country dummies and no country variables shows a highly significant negative coefficient, indicating more progressive results for subsidies, as previously expected. As noted in Sect. 2, energy subsidies have primarily been implemented in developing countries (Coady et al. 2017). Accordingly, our sample only includes subsidy policies in non-high-income countries, such as India, Mali, Mexico, Nigeria, Poland and Turkey. We thus reason that the country variables capture the progressive impact of subsidy reforms.

Fig. 3 Results overview. Notes: The figure plots the coefficients and confidence intervals (90 and 95%) for the three main model specifications with “Single Country Dummies”, “Group Country Dummies” and “No Country Dummies”, either including the three time-variant country variables (“Baseline”) or excluding them (“No Country Variables”)

The Transport coefficient indicates a significantly and substantially increased likelihood of progressive outcomes, as hypothesized. The marginal effects at means show a 44.7% increase in the likelihood of progressive outcomes and a 55.9% decrease in the likelihood of regressive outcomes, at the 1% and 5% significance levels (see Table 2). Hence, a progressive impact of a policy in the transport sector is 44.7 percentage points more likely than that of an economy-wide policy. Transport sector policies thus largely contribute to the overall share of progressive findings in our sample. Most robustness checks confirm this finding, though the magnitude of the effect decreases for regressions without single country dummies. Again, one notable exception is the regression with no country dummies and no country variables, which shows an insignificant coefficient. This finding corresponds with the ambiguous literature outcomes, which mostly show progressive but also regressive impacts in primarily high-income countries.
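To illustrate how a marginal effect at means translates a probit coefficient into changes in outcome probabilities, the following minimal sketch computes the discrete change for a dummy variable in a three-category ordered probit. The category ordering (progressive < proportional < regressive, so that negative coefficients favour progressive outcomes), the cutpoints, the linear index and the coefficient are all hypothetical illustration values and not estimates from Table 2.

```python
from math import erf, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def category_probs(index, cut_low, cut_high):
    """Return (P(progressive), P(proportional), P(regressive)).

    Assumed ordering: progressive < proportional < regressive, so a
    lower latent index makes progressive outcomes more likely.
    """
    p_progressive = norm_cdf(cut_low - index)
    p_proportional = norm_cdf(cut_high - index) - norm_cdf(cut_low - index)
    p_regressive = 1.0 - norm_cdf(cut_high - index)
    return p_progressive, p_proportional, p_regressive


def dummy_effect_at_means(index_at_means, beta_dummy, cut_low, cut_high):
    """Change in each category probability when the dummy switches 0 -> 1,
    holding all other moderators at their sample means."""
    base = category_probs(index_at_means, cut_low, cut_high)
    switched = category_probs(index_at_means + beta_dummy, cut_low, cut_high)
    return tuple(s - b for s, b in zip(switched, base))


# Hypothetical values (NOT estimates from the paper).
effects = dummy_effect_at_means(
    index_at_means=0.5, beta_dummy=-1.2, cut_low=-0.5, cut_high=0.3
)
print([round(e, 3) for e in effects])
```

Because the three category probabilities always sum to one, the three changes sum to zero. A gain in the progressive category is therefore matched by losses in the other categories, which is why the effects on progressive and regressive outcomes reported above differ in magnitude whenever the proportional category also shifts.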

4.2 Economic Effect Variables

We hypothesize a progressive impact of the Lifetime income and the General equilibrium variables while being inconclusive about the Indirect and the Demand-side variables. Table 2 confirms that the application of Lifetime income proxies increases the likelihood of progressive findings. Progressive findings are also more likely in studies including Indirect and Demand-side effects. The General equilibrium coefficient is insignificant and hence does not support our hypothesis.

The marginal effects at means for the Lifetime income variable indicate that progressive outcomes are 42.6% more likely and regressive outcomes 49.9% less likely. The results confirm the theory and are supported by the robustness checks. The magnitude of the coefficient, however, decreases for all regressions without single country dummies, though its significance improves from the 10% to the 5% level.

The marginal effects for the Indirect variable indicate that progressive outcomes are 21.4% more likely. Regressive outcomes are 25% less likely at the 5% significance level. Other model specifications consistently show coefficients of slightly smaller magnitudes at mostly the same significance level. Previous literature findings show both increasing and decreasing regressivity of indirect effects (see Sect. 2). Our results suggest that richer households have more \(\hbox{CO}_2\)-intensive consumption baskets.

The Demand-side variable increases the likelihood of progressive outcomes by 26.4%, while regressive outcomes are 30.9% less likely. Robustness checks including single country dummies mostly show significant coefficients at the 5% or 10% level, except when standard errors are clustered by study. Without the single country dummies, the coefficients become insignificant. The progressive effect of the Demand-side variable is thus sensitive to the modeling of unobserved country characteristics. Though our findings suggest larger elasticities for low-income households, additional and country-specific research is recommended.

The General equilibrium coefficient remains insignificant across most model specifications. This finding does not support our hypothesis. One explanation is the small number of general equilibrium effects included, in combination with our categorical dependent variable. CGE models are the only model type capturing general equilibrium effects. Many CGE models in the literature, however, include revenue recycling schemes, which we exclude from this analysis. Our sample thus only contains 12 effects from CGE models, of which 50% show regressive outcomes (see “Appendix Detailed Moderator Variable Description”). The ordered categorical dependent variable only considers the overall outcome, i.e. regressive, proportional or progressive. We thus do not account for changes within each category, e.g. from strongly to weakly regressive. Therefore, we do not account for the presumably progressive source side effects within those six overall regressive outcomes. We further elaborate on the implications of using a categorical dependent variable in Sect. 5.

Summing up, including a wider range of economic effects mostly fosters more progressive outcomes. The economic effects either reflect the application of more sophisticated model types or a different data base using lifetime income proxies.

4.3 Context Variables

Table 2 shows neither a publication bias nor a time trend. The Publication Type coefficients remain insignificant across model specifications including single country dummies. The robustness checks without single country dummies, however, indicate a publication bias towards more progressive outcomes. The Publication Year coefficients are insignificant across most model specifications, though there are two significant coefficients with opposite signs. The two-proportion z-test results suggest a progressive publication bias and a time trend towards more progressive outcomes (see “Appendix Detailed Moderator Variable Description”). In fact, the grey literature included primarily investigates developing countries. Furthermore, research on developing countries has been increasing over recent years. The findings suggest that the country variables, and especially the single country dummies, account for both trends.
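The two-proportion z-test referred to above can be sketched as follows. The pooled-variance form of the test and the counts in the example are assumptions made for illustration; they are not the figures reported in the appendix.

```python
from math import erf, sqrt


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for equality of two proportions (pooled variance).

    Returns the z statistic and the two-sided p-value.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / std_err
    # Standard normal CDF evaluated at |z| via the error function.
    cdf_abs_z = 0.5 * (1.0 + erf(abs(z) / sqrt(2.0)))
    p_value = 2.0 * (1.0 - cdf_abs_z)
    return z, p_value


# Hypothetical counts: progressive effects in grey literature (20 of 40)
# versus peer-reviewed articles (30 of 120). Not the paper's data.
z_stat, p_val = two_proportion_z_test(20, 40, 30, 120)
print(round(z_stat, 2), round(p_val, 4))
```

Such a test compares raw shares only. It cannot separate a publication effect from the fact that grey literature concentrates on developing countries, which is the role of the country dummies in the regressions.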

4.4 Country Variables

The regression results support our hypothesis of more progressive study outcomes for countries with lower income levels. Our main regression includes 38 single country dummies and three country variables accounting for time-fixed and time-variant country characteristics, respectively. The interpretation of the results of this variable group requires a particularly detailed investigation of the regression outputs.

Table 2 shows a significantly negative coefficient for the Poverty gap variable as expected. The finding indicates a higher likelihood of progressive outcomes for very poor or unequal countries. The coefficient, however, becomes small or insignificant for regressions without single country dummies. The finding is further sensitive to the countries included (see Sect. 4.5). The Gini coefficient is insignificant for all regressions. The GDP per capita coefficients are mostly insignificant in regressions with single country dummies which contradicts our hypothesis (see “Appendix Regression Results Overview”).

An increased likelihood of progressive impacts in lower income countries is, however, clearly indicated by additional model specifications. The insignificant GDP per capita coefficients can be explained by the small temporal variation of the country variables, as the sample includes only a few observations for particularly low-income countries. This limited temporal variation induces multicollinearity between the time-variant country variables and the time-fixed single country dummies. The coefficients for the single country dummies and the country variables are thus inefficiently estimated in the main model specification. We address this problem by estimating another model that replaces the single country dummies with country group dummies, and another version that excludes the time-variant country variables. All model specifications without single country dummies, i.e. with country group dummies or without any country dummies, show significantly positive GDP per capita coefficients, which implies more regressive study outcomes for richer countries. The regression coefficients for our specification with country group dummies but without country variables confirm this finding: the three group dummy coefficients (upper-middle, lower-middle and low) are significantly negative and increase in magnitude for decreasing income levels of the country groups.
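The collinearity between slowly changing country variables and single country dummies can be made concrete with a small diagnostic. Regressing a country variable on a full set of country dummies yields the country means as fitted values, so the resulting R² equals the between-country share of the variable's variance. The sketch below computes this share and the implied variance inflation factor; the country labels and GDP values are hypothetical.

```python
from collections import defaultdict


def between_country_share(countries, values):
    """R^2 from regressing `values` on a full set of country dummies.

    With one dummy per country the fitted values are the country means,
    so R^2 = 1 - within-country sum of squares / total sum of squares.
    """
    overall_mean = sum(values) / len(values)
    by_country = defaultdict(list)
    for country, value in zip(countries, values):
        by_country[country].append(value)
    total_ss = sum((v - overall_mean) ** 2 for v in values)
    within_ss = sum(
        (v - sum(vs) / len(vs)) ** 2 for vs in by_country.values() for v in vs
    )
    return 1.0 - within_ss / total_ss


# Hypothetical GDP per capita (thousand USD) observed in different study years.
countries = ["A", "A", "A", "B", "B", "C"]
gdp = [50.0, 51.0, 52.0, 10.0, 11.0, 2.0]

r_squared = between_country_share(countries, gdp)
vif = 1.0 / (1.0 - r_squared)  # variance inflation factor for the GDP variable
print(round(r_squared, 4), round(vif, 1))
```

When nearly all variation lies between countries, the dummies absorb almost everything the country variable could explain, and its coefficient is estimated imprecisely. Coarser country group dummies leave more variation for the country variables to explain.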

4.5 Robustness Checks

We conduct several additional analyses to validate our findings.

First, we address non-independence of observations by imposing cluster-robust standard errors by country for every regression. Additionally, we test the sensitivity of the standard errors to the clustering decision by imposing cluster-robust standard errors by study. Results are reported in “Appendix Regression Results Overview”. Clustering by study shows broadly similar significance levels for most coefficients. Notable exceptions are the insignificant coefficient for the Demand-side variable and the significant coefficients for the Publication Year and the GDP per capita variables. We conclude that the clustering decision has a slight influence on the results. The overall findings, however, remain unchanged.
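Why the clustering decision matters can be illustrated with the simplest possible case. The sketch below computes the cluster-robust (sandwich, without small-sample correction) standard error of a sample mean, i.e. an intercept-only least-squares model, which serves here as a stand-in for the ordered probit estimator used in the paper. The outcome values and the country and study labels are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cluster_robust_se_of_mean(values, clusters):
    """Cluster-robust standard error of the sample mean.

    For an intercept-only model the sandwich estimator reduces to
    sqrt(sum over clusters of (sum of residuals in cluster)^2) / n.
    """
    n = len(values)
    mean = sum(values) / n
    residual_sums = defaultdict(float)
    for value, cluster in zip(values, clusters):
        residual_sums[cluster] += value - mean
    meat = sum(s ** 2 for s in residual_sums.values())
    return sqrt(meat) / n


# Hypothetical effects: outcomes are similar within countries but not within studies.
outcomes = [1.0, 1.2, 0.9, -1.0, -1.1, -0.8, 0.1, -0.3]
by_country = ["US", "US", "US", "UK", "UK", "UK", "IN", "IN"]
by_study = ["s1", "s2", "s3", "s1", "s2", "s3", "s4", "s4"]

se_country = cluster_robust_se_of_mean(outcomes, by_country)
se_study = cluster_robust_se_of_mean(outcomes, by_study)
print(round(se_country, 3), round(se_study, 3))
```

In this constructed example the residuals are strongly correlated within countries, so clustering by country yields a much larger standard error than clustering by study. The point estimate is identical in both cases; only the inference changes.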

Second, we test the influence of single countries on the results by conducting jackknife regressions (Gould 1995). The jackknife method performs N regressions, each leaving out all observations of the jth country, where j = 1, 2, ..., N indexes the countries (N = 39). “Appendix Jackknife Findings” shows the distribution of the N jackknife coefficients for each moderator variable, including fitted normal distributions. The coefficients outside the 99% confidence interval unsurprisingly mostly correspond to countries with large numbers of effects, i.e. the United States and the United Kingdom. Most coefficients, however, remain similar in sign or overall magnitude, except for the Subsidy and the Poverty gap coefficients. Omitting Brazil or Poland strongly influences these two coefficients, as the sample contains just a few effects from lower income countries while both variables only have few positive observations. These two outlier countries have no impact on jackknife regressions for model specifications without single country dummies.Footnote 10
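The leave-one-country-out procedure can be sketched as follows. The function accepts any estimator, so the ordered probit regression is replaced here by a deliberately simple stand-in, namely the difference in the share of progressive outcomes between transport and non-transport effects. The records and field names are hypothetical.

```python
def leave_one_country_out(rows, estimate):
    """Re-estimate a statistic N times, each time dropping one country.

    rows: list of dicts that each contain a "country" key.
    estimate: callable mapping a list of rows to a number.
    Returns a dict {omitted country: estimate without that country}.
    """
    countries = sorted({row["country"] for row in rows})
    return {
        omitted: estimate([row for row in rows if row["country"] != omitted])
        for omitted in countries
    }


def progressive_share_gap(rows):
    """Stand-in estimator: share of progressive outcomes among transport
    effects minus the share among all other effects."""
    transport = [r["progressive"] for r in rows if r["transport"] == 1]
    other = [r["progressive"] for r in rows if r["transport"] == 0]
    return sum(transport) / len(transport) - sum(other) / len(other)


# Hypothetical effects (NOT the paper's sample).
effects = [
    {"country": "US", "transport": 1, "progressive": 1},
    {"country": "US", "transport": 0, "progressive": 0},
    {"country": "US", "transport": 0, "progressive": 0},
    {"country": "UK", "transport": 1, "progressive": 0},
    {"country": "UK", "transport": 0, "progressive": 0},
    {"country": "IN", "transport": 1, "progressive": 1},
    {"country": "IN", "transport": 0, "progressive": 1},
]

jackknife = leave_one_country_out(effects, progressive_share_gap)
for country, value in jackknife.items():
    print(country, round(value, 3))
```

Comparing each leave-one-out estimate with the full-sample estimate shows which countries drive a result. With few effects per country, omitting a single country can shift the estimate considerably, as in the cases of Brazil and Poland described above.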

Third, we analyze how excluding subsidy reforms affects our findings. “Appendix Regression Results Overview” shows that most of our coefficients remain largely similar. The findings of regressions including subsidy reforms are thus confirmed and hence also valid for carbon pricing policies only.

Fourth, we test whether policy simulations provide different results than actually implemented policies. We find no impact for most model specifications. Yet, selected specifications show a higher likelihood of regressive study outcomes for simulations, while the Transport coefficient turns insignificant. The simulation and transport variables are strongly correlated (−0.78), as most studies only simulate economy-wide carbon prices, whereas gasoline taxes are actually implemented.Footnote 11

Finally, we investigate the validity of our ordered probit model specification by conducting probit regressions on significantly regressive and progressive outcomes, reported in “Appendix Regression Results Overview”. The coefficients of the significantly regressive probit regression are close to the ordered probit model coefficients. The significantly progressive probit coefficients are mostly opposite in sign. The findings indicate a valid ordered probit model specification.

5 Discussion and Conclusion

Market-based climate mitigation policies often raise concerns about potentially adverse distributional impacts. Emission reductions at the expense of the poorest would aggravate poverty and undo progress for human development. Such inequitable outcomes could also severely undermine the political feasibility of market-based mitigation policies. Understanding the distributional implications of climate policy is hence crucial for the design of just and effective climate policies.

This study carries out a meta-analysis of the existing literature to systematically explain differences in distributional outcomes of carbon pricing. We employ an ordered probit analysis on 53 original studies and analyze how country-specific factors, the type of policy under study as well as the modeling approach affect the likelihood of regressive or progressive outcomes.

We find a significantly increased likelihood of progressive study outcomes within lower income countries and for transport policies. The same applies to study designs considering indirect effects, demand-side adjustments of consumers or lifetime income proxies. In contrast, there is no evidence to support the hypothesis that subsidy reforms are inherently more progressive than carbon pricing. These insights are directly relevant for policy makers in the initial stages of policy design, when deciding which policy options to explore in more detail. Nevertheless, policies that are implemented should be subject to detailed analysis that takes the country-specific context into account instead of relying on general patterns that hold across countries.

The interpretation of the results should particularly consider the following limitations of the analysis. Disregarding the effect size of overall regressive, proportional or progressive distributional impacts influences the regression coefficients. Our methodology does not account for differences within outcome categories, for example between strongly and weakly regressive effects. Smaller changes in the distributional impact within single studies, which are mostly driven by the economic effect variables, are thus ignored. This results in downward biased and less significant coefficients, as illustrated by the General equilibrium coefficient. Likewise, treating similar distributional impacts between studies equally, irrespective of their magnitudes, might ambiguously influence the size and significance of the coefficients. Estimating the effect size using subsamples with common and thus quantitatively comparable inequality metrics, however, suffers from too few observations to be representative.

Finally, the small number of effects for lower income countries decreases the accuracy of our findings. Our analysis shows a large impact of two lower income countries on two variables (see Sect. 4.5). A higher proportion of effects for lower income countries in combination with a larger total sample would reduce the impact of outliers, allow for more refined moderator variables, and thus provide more precise insights. We thus recommend that future research place an emphasis on distributional impacts in lower income countries. The robustness checks, however, confirm the overall validity of our findings.

It should be noted that even progressive policies increase consumer prices, which raises the risk of poverty for low-income households. In the most extreme cases this may lead to public resistance, as illustrated by the example of Nigeria in 2012. The risk of poverty can, however, be offset by suitable revenue recycling schemes that compensate poor households (van Heerden et al. 2006). Revenue recycling can also provide various other benefits. For example, using revenues to reduce distorting income taxes can potentially lead to more employment, higher individual welfare and higher GDP growth (Pearce 1991; Goulder 1995; Pezzey and Park 1998). Revenues can also be used for public investments in infrastructure, providing access to water, sanitation, electricity, telecommunications and transport (Jakob et al. 2016). Climate policies in combination with a targeted use of revenues thus have the potential to simultaneously mitigate climate change and address additional sustainable development goals. However, public debates frequently focus on the distributional impact of consumer expenditures and thus underestimate or ignore the usually progressive impact of revenue recycling schemes, even if simultaneously proposed. Our results could therefore be interpreted as a proxy for publicly perceived distributional impacts of climate mitigation policies, though they are economically incomplete. Distributional impacts of different revenue recycling schemes are thus an interesting avenue for further research, but beyond the scope of this paper due to unresolvable methodological challenges (see Sect. 3.4).

This study contributes to an increased understanding of the distributional impacts and the potential benefits of climate mitigation policies, which may further support their implementation. Thus far, there has been a widespread belief that consumption taxes, and particularly environmental taxes, would impose a disproportionate burden on the poor. However, more than one third of the effects included in this sample are progressive or proportional. Hence, distributional outcomes of market-based climate policies depend on a variety of (often country-specific) factors. This kind of research may thus help prevent actors with vested interests, such as investors fearing stranded assets or workers fearing job losses (Vogt-Schilb and Hallegatte 2017), from instigating public opposition against unwanted policies.