Distributional Impacts of Carbon Pricing: A Meta-Analysis

Understanding the distributional impacts of market-based climate policies is crucial to design economically efficient climate change mitigation policies that are socially acceptable and avoid adverse impacts on the poor. Empirical studies that examine the distributional impacts of carbon pricing and fossil fuel subsidy reforms in different countries arrive at ambiguous results. To systematically determine the sources of variation between these outcomes, we apply an ordered probit meta-analysis framework. Based on a comprehensive, systematic and transparent screening of the literature, our sample comprises 53 empirical studies containing 183 effects in 39 countries. Results indicate a significantly increased likelihood of progressive distributional outcomes for studies on lower income countries and transport sector policies. The same applies to study designs that consider indirect effects, demand-side adjustments of consumers or lifetime income proxies.


Introduction
It is well understood that, in order to achieve international climate targets as agreed in Paris, global greenhouse gas emissions need to decrease rapidly in the upcoming years (IPCC 2014). To achieve this goal, market-based instruments, such as carbon taxes, cap-and-trade systems or fossil fuel subsidy reforms, are frequently recommended by leading economists such as Nicholas Stern and Joseph Stiglitz (High-Level Commission on Carbon Prices 2017). Economic theory highlights that these instruments are environmentally effective and economically efficient (Pigou 1920; Nordhaus 1991; Pearce 1991). In 2018, 51 carbon pricing schemes, such as carbon taxes or cap-and-trade systems, were implemented or planned, covering 20 percent of global greenhouse gas emissions (World Bank and Ecofys 2018), albeit at carbon prices well below those considered to be in line with the targets of the Paris Agreement. Numerous countries also apply taxes or levies on fossil fuel use, for instance for transportation or heating. Even though these are not directly proportional to the carbon content, they nevertheless provide an incentive to reduce greenhouse gas emissions. The G20 committed to phasing out fossil fuel subsidies at the 2009 Pittsburgh summit (G20 Leaders Statement 2009), and several reforms have been enacted in recent years (IEA and OECD 2017).
The distributional impacts of carbon and energy taxes, however, strongly influence their political acceptability (Baumol and Oates 1988; Baranzini et al. 2000; Tiezzi 2005). Regressive distributional impacts harm vulnerable groups and decrease the likelihood of policies being implemented and sustained (Parry 2015). Social equity concerns can thus quickly dominate the public debate if energy prices increase (Shammin and Bullard 2009). For example, the incidence and the distributional impacts of the repealed Australian carbon tax were subject to public and academic debate (Rahman 2013; Sajeewani et al. 2015). The progressive Nigerian fuel and petrol subsidy reform in 2012 even resulted in mass protests and strikes, which led to a partial reimplementation (Soile and Mu 2015; Lockwood 2015; Dorband et al. 2017).
The literature on the distributional impacts of climate policies provides ambiguous results. Many studies find an overall tendency for regressive impacts (Araar et al. 2011; Gonzalez 2012). Others detect mostly regressive findings for developed countries, while developing countries show an inconsistent picture with a tendency towards proportional or progressive impacts (Verde and Tol 2009; Wang et al. 2016). Nevertheless, progressive impacts have also been shown for developed countries like Australia (Sajeewani et al. 2015), Canada (Dissou and Siddiqui 2014) and Spain (Labandeira et al. 2009).
Previous literature reviews provide initial insights but do not systematically explain outcome heterogeneity, i.e. what drives the differences in study results. Wang et al. (2016) have conducted the most comprehensive literature review on distributional impacts of carbon prices so far. They consider distributional impacts across households differing by income, location and demographic characteristics. This broad scope provides valuable insights into various dimensions of distributional impacts. However, a common problem ailing most literature reviews is the lack of explicit or transparent selection and evaluation criteria for their study sets, as well as of rigorous methods for analyzing observed variation, which exposes them to the criticism of subjectivity and a lack of validity. The literature on meta-analysis is littered with examples that show how traditional literature reviews and vote-counting approaches can be misleading and inconsistent in their assessment of the state of the art (Stanley and Doucouliagos 2012; Ringquist 2013).
We focus our analysis on distributional impacts across household income groups. This narrow scope allows the use of a meta-analysis to quantitatively determine the sources of variation in the study outcomes. Thus far, meta-analyses have mainly been applied in the fields of education and medicine, but organizations like the Campbell Collaboration or the Collaboration for Environmental Evidence have tried to establish rigorous quality standards and mainstream such work in the social and environmental sciences. In fact, there is an increasing volume of meta-analyses in social science including environmental economics (Moeltner et al. 2007; Nelson and Kennedy 2009; Tunçel and Hammitt 2014).
This study applies an ordered probit meta-analysis framework to 53 original studies covering economy-wide and transport sector climate mitigation policies, providing 183 effects in 39 countries. We analyze all market-based policies that affect the price of fossil fuels, regardless of whether these are put into place for climate change mitigation (e.g. a carbon tax, an explicit price on carbon) or a different purpose, such as generating public revenues (e.g. excise taxes on fuels). We also include studies that address fossil fuel subsidy removals, as the absence of a Pigouvian tax can also be regarded as constituting a (so-called 'post-tax') subsidy (Coady et al. 2017). We include moderator variables accounting for different policies, modeled economic effects, and countries, while controlling for a publication bias and a time trend. We find a significantly increased likelihood of progressive study outcomes for lower income countries and transport policies. The same applies to study designs considering indirect effects, demand-side adjustments of consumers, or lifetime income proxies. In contrast, we find that subsidy reforms are not inherently more progressive than carbon pricing instruments.
We structure the remainder of this paper as follows: Section 2 elaborates our four key hypotheses with respect to theory and literature findings. Section 3 describes the data selection process, explains the variables and introduces the quantitative model. Section 4 presents the main results, while Sect. 5 discusses the findings and concludes.

Hypotheses
Based on economic theory and previous research findings, we expect that the policy type, the affected sectors and the modeled economic effects systematically influence the outcomes of studies assessing distributional impacts. The following paragraphs discuss factors that might drive the results and subsequently develop hypotheses about the estimated impact.
First, literature reviews show mostly regressive impacts in developed countries. Developing countries, however, show an inconsistent picture with a tendency towards proportional or progressive impacts (Verde and Tol 2009; Wang et al. 2016). These findings could be explained by low carbon intensities of the consumption baskets of poor households in lower income countries, resulting from a higher share of subsistence consumption, limited access to modern energy services, or the lack of affordability of energy. In fact, Flues and van Dender (2017) demonstrate a negative correlation between the energy affordability risk and GDP for 20 OECD countries.
Second, literature reviews strongly suggest progressive outcomes for reforms that decrease or abolish fossil fuel subsidies (Anand et al. 2013;Clements et al. 2013;Coady et al. 2017) while carbon pricing policies show ambiguous impacts (Wang et al. 2016). Fossil fuel subsidies have primarily been implemented in developing countries (Coady et al. 2017). Currently implemented fossil fuel subsidies are mostly regressive as they especially benefit well-organized interest groups while disadvantaging low-income households that spend relatively little on energy (Inchauste and Victor 2017). Small groups of powerful and highly profiting actors have a greater incentive to organize and influence a legislative process than a large group of individuals with low payoffs (Oye and Maxwell 1994). The political economy in combination with the consumption baskets of households in developing countries might thus explain the progressive literature findings for subsidy reforms.
Third, Wang et al. (2016) review a tendency towards progressive outcomes for transport sector policies. Others, however, show proportional or regressive outcomes in the United States (Casler and Rafiqui 1993; Chernick and Reschovsky 1997; Metcalf 1999; Chernick and Reschovsky 2000; Williams et al. 2015), Germany (Nikodinoska and Schröder 2016) and six other European countries (Sterner 2012). Sterner (2012) argues that the smaller car ownership rate in low-income countries makes fuel a luxury product. Santos and Catchesides (2005), however, also find a lower car ownership rate for low-income households in the United Kingdom, resulting in an inverse U-shaped relationship between income and incidence. The efficiency of the public transport system as well as indirect fuel expenditures on public transport could additionally influence the results (Datta 2010). Nevertheless, Kpodar (2006) and Ziramba (2009) find no impact of indirect expenditures.
Finally, we compare the modeling of indirect effects, demand-side adjustments of consumers, general equilibrium effects and studies that apply lifetime income proxies. We thus complement the previous discussion on policy and country impacts by considering different study designs and their corresponding modeled economic effects (see Sect. 3). Distributional analyses at least consider direct effects, i.e. the price increase of all goods that directly contain CO2, such as gasoline. The following paragraphs discuss the potential impact of additional economic effects on the study outcomes.
Indirect distributional effects are caused by price changes of goods in the consumption basket due to CO2 emissions embedded in their value chains. Considering indirect effects might influence the distributive impact in both directions. Generally, their impact depends on the relative difference of CO2 intensities in the consumption baskets between low- and high-income households (Anand et al. 2013). Hassett et al. (2009) provide evidence that indirect effects mitigate regressivity in the United States. Other authors show that indirect effects increase regressivity as low-income households tend to spend large fractions of their incomes on energy-intensive food and public transport (Jacobsen et al. 2003; da Silva Freitas et al. 2016).
Modeling demand-side adjustments of consumers could also ambiguously influence the study outcomes. The impact depends on differences in the demand elasticities between low- and high-income households. Zhang (2015) shows larger demand-side adjustments for richer households and argues that low-income households are required to focus on their basic needs and are hence less responsive to price signals. On the contrary, West and Williams (2004) show larger demand-side adjustments for low-income households, which results in more progressive outcomes. Their study, however, only considers transport fuel taxes.
We expect more progressive outcomes for studies that capture general equilibrium effects. Several studies find general equilibrium effects to foster progressive outcomes (Rausch et al. 2011; Dissou and Siddiqui 2014; Vandyck and Van Regemorter 2014; Beck et al. 2015; Sajeewani et al. 2015; da Silva Freitas et al. 2016). Dissou and Siddiqui (2014) show that carbon taxes particularly affect the capital-intensive energy industry. This decreases the capital income of rich households and thus makes the distributive effect more progressive. Fullerton and Heutel (2011), however, highlight the sensitivity of the results to parameter values.
Using lifetime income proxies, rather than annual household incomes, is hypothesized to increase progressivity. Several literature findings based on lifetime incomes show more progressive outcomes for excise and transport taxes (Poterba 1989, 1991; Bull et al. 1994; Lyon and Schwab 1995; Hassett et al. 2009). The permanent income hypothesis (Friedman 1957) assumes that households smooth their consumption over their lifetime. Accordingly, lifetime income proxies consider that low annual incomes in isolated years do not necessarily correspond to low welfare as, for instance, elderly people and students tend to live on savings or loans. The magnitude of the effect (Fullerton and Rogers 1993), as well as the most suitable lifetime income proxy (Metcalf 1999; Chernick and Reschovsky 2000), are widely debated.
Based on this discussion, we hypothesize an increasing share of progressive study outcomes for first, low-income countries, second, subsidy reforms and third, transport sector policies. We also expect more progressive findings for studies that model general equilibrium effects or use lifetime income proxies. Studies that consider indirect and demand-side effects could either provide more progressive or more regressive findings.

Methodology
This section first explains how studies included in the meta-analysis were selected. It then provides an overview of the sample, including the dependent variables from the literature and explanatory variables that were either derived from the studies themselves, or drawn from external sources. It finally describes the empirical strategy to assess determinants of distributional outcomes.

Data Selection
We follow Ringquist (2013) for the structure of the data selection process. This process comprises identifying relevant study authors and keywords, developing a search strategy, considering additional citations and defining study selection criteria, which allow us to identify and classify studies as potentially relevant, relevant and, finally, acceptable. For literature identification we conduct a query search in the Web of Science and the Scopus literature databases. We connect three groups of keywords with boolean operators, filtering for research on CO2-related (carbon, CO2, gasoline, emission, environment, ecologic, energy) pricing policies (tax, allowance, subsidy, policy, price) investigating the distributional impacts (distribution, regressive, progressive, incidence, inequality, household income). We exclude findings from unrelated research fields by permitting characteristic keywords (see "Appendix Search Query" for details). The literature search, restricted to literature written in English, identified 1023 studies. In the first step, we exclude 856 studies with titles indicating irrelevant research questions, leaving 167 potentially relevant studies.
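The combination of the three keyword groups can be sketched as follows. This is only an illustration: the helper names `or_group` and `build_query` and the generic `OR`/`AND` syntax are assumptions, and the actual query strings, including the keywords used to filter out unrelated research fields, are those documented in "Appendix Search Query".

```python
# Keyword groups as listed in the text. The query syntax below is an
# illustrative assumption, not the database-specific query used in the study.
CARBON_TERMS = ["carbon", "CO2", "gasoline", "emission",
                "environment", "ecologic", "energy"]
POLICY_TERMS = ["tax", "allowance", "subsidy", "policy", "price"]
DISTRIBUTION_TERMS = ["distribution", "regressive", "progressive",
                      "incidence", "inequality", "household income"]


def or_group(terms):
    """Join the terms of one group with OR, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Connect the keyword groups with AND so every group must match."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(CARBON_TERMS, POLICY_TERMS, DISTRIBUTION_TERMS)
print(query)
```

A study is returned only if it matches at least one keyword from each of the three groups.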
For the next steps of the selection process we apply the following study selection criteria. First, we exclude 61 studies because of differing research questions, replicating findings of previous studies including double hits, unavailability or insufficient quality. Second, we only select quantitative studies, thus excluding 34 studies that provide qualitative results or apply theoretical models. Third, we exclude 46 studies with an incomparable scope, i.e. studies pricing multiple pollutants beyond CO2, imposing sectoral restrictions apart from transport, only including effects with revenue recycling schemes or only concentrating on urban or rural households. Last, we only select countries or large regions, thus excluding 8 studies for single cities and supranational unions.
We apply these selection criteria successively to the abstract and the full text of the 167 (potentially) relevant studies, resulting in 36 acceptable studies. In order to supplement our sample with grey literature and literature from other databases, we subsequently screen the references of all acceptable studies from the query search to identify further relevant studies. Based on this reference search, we identify another 35 relevant studies, resulting in another 17 acceptable studies. The final sample comprises 53 original studies with 183 effects. Figure 1 provides an overview of the selection process. Further details are documented in the codebook, which is available upon request.
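The counts reported for the selection process can be reconciled with a short tally. The sketch below rests on one assumption that the text does not state explicitly: that the four exclusion counts (61, 34, 46 and 8) refer to the combined pool of the 167 query hits and the 35 studies from the reference search. Under that reading the numbers add up to the final sample of 53 studies.

```python
# Counts taken from the text.
identified = 1023
excluded_by_title = 856
potentially_relevant = identified - excluded_by_title  # query search pool

from_reference_search = 35

# ASSUMPTION: these exclusions apply to query hits and reference-search
# hits together; the text does not break them down by source.
exclusions = {
    "differing question / replication / unavailable / quality": 61,
    "qualitative or theoretical": 34,
    "incomparable scope": 46,
    "single cities or supranational unions": 8,
}

screened_pool = potentially_relevant + from_reference_search
acceptable = screened_pool - sum(exclusions.values())

print(potentially_relevant, screened_pool, acceptable)
```

The result also matches the split reported in the text: 36 acceptable studies from the query search plus 17 from the reference search.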

Sample Overview
The final sample comprises 53 studies with 183 effects in total. The original study author names, the publication years, the number of included effects per study and the percentage share of included effects per study relative to the 183 total effects are listed in the  Figure 2 shows the number of effects and the percentage share of the total sample for each country included. The effects are also unevenly distributed across countries, with the United States (30.6%), the United Kingdom (6.6%) and Germany (4.9%) contributing the largest shares of the sample. Grouping the effects by World Bank country income levels provides 144 effects for high-income countries and 39 effects for low, lower-middle and upper-middle income countries.

Dependent Variable
The ordered categorical variable Distributional impact captures the progressive, proportional or regressive distributive impact of each effect included. We only aim to explain whether a policy is progressive, regressive or proportional, without addressing the size of this effect, as the inequality measures applied in the original studies are not quantitatively comparable. The methods suggested by the meta-analysis literature to harmonize different effect size metrics are not applicable to this study. We also tried to subsample studies with identical inequality metrics, but unfortunately the sample sizes became too small to conduct a quantitative analysis. Section 5 discusses the implications of abstracting from the effect size. Neglecting the effect size increases the significance and validity of the results as it allows us to examine a larger sample of original studies. The coding decision relies either directly on quantitative inequality measures or on the interpretation of the original study authors. The 183 effects comprise 52 progressive, 13 proportional and 118 regressive outcomes (see Table 1).
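As a small worked example, the outcome counts above translate into sample shares as follows. The integer codes follow the ordering used later for the ordered probit model (progressive = 0, proportional = 1, regressive = 2); the variable names are illustrative only.

```python
# Ordered coding of the dependent variable (progressive < proportional < regressive).
OUTCOME_CODES = {"progressive": 0, "proportional": 1, "regressive": 2}

# Outcome counts as reported in the text.
counts = {"progressive": 52, "proportional": 13, "regressive": 118}

total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, code in OUTCOME_CODES.items():
    print(f"y = {code} ({name}): {counts[name]} effects, {shares[name]}%")
```

Roughly two thirds of the coded effects are regressive, which is the baseline against which the moderator variables are evaluated.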

Moderator Variables
Moderator variables are hypothesized to systematically influence the outcomes of the original studies (Ringquist 2013). We include moderator variables that allow us to test the hypotheses developed in Sect. 2. The policy and the country moderator variables account for differences in the presumed distributional impact, while the economic effect variables implicitly capture different study designs. We also control for a potential publication bias and a time trend. Table 1 summarizes the variables included. We exclude effects that model revenue recycling schemes as those are either too context-specific for designing reasonable moderator variables or, in case of lump-sum, completely offset prior regressive findings, which leads to a perfect predictor. This particularly applies to effects in studies using computable general equilibrium (CGE) models, which our analysis only considers if results are explicitly reported without the impact of revenue recycling schemes.
Furthermore, we test the bivariate relationship between the moderator variables and the dependent variable. For the binary moderator variables we conduct a two-proportion z-test. Similarly, we conduct a correlation analysis for the continuous moderator variables. The results of the two tests indicate an overall suitable selection of moderator variables. Further analysis, however, requires a multiple regression analysis as the bivariate tests ignore potential correlations between the moderator variables. The remainder of this section briefly explains the moderator variables included. More details about individual moderator variables and the bivariate analyses, including their results tables, are provided in "Appendix Detailed Moderator Variable Description".
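A two-proportion z-test of the kind mentioned above can be sketched as follows. The pooled-variance, two-sided form shown here is a common textbook variant and is assumed for illustration; the text does not specify which variant was used. The counts in the example are hypothetical and are not taken from the study's bivariate results tables.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-sided z-test for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# HYPOTHETICAL split: progressive outcomes among effects where a binary
# moderator equals one (20 of 40) versus zero (32 of 143).
z, p_value = two_proportion_z_test(20, 40, 32, 143)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

Such a test compares only one moderator at a time, which is why the multiple regression analysis is still required.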
Policy Variables. We include two variables controlling for policy differences: The Subsidy variable distinguishes between subsidy reforms and carbon pricing schemes. The Transport variable compares policies targeting only the transport sector with economy-wide policies. Generally, we only include effects increasing the burden for households, i.e. resulting from increasing or introducing energy or carbon prices as well as decreasing or removing existing subsidies.
Economic Effect Variables. We include four moderator variables which account for different economic effects: Indirect, Demand-side, General equilibrium and Lifetime income. The first three variables correspond to the model types used in the original studies, while lifetime income proxies reflect differences in the underlying data. We explicitly include moderator variables on the modeled economic effects and not on the model type. This method allows us to extract more information from the original studies. Many authors using Input-Output models, for example, separately report both the direct and the indirect distributive impact. We, however, disregard information on the impact of the different model types themselves. Each model type at least considers direct effects. We identify and include three major groups of more advanced models in the literature: Input-Output models, micro-simulation models and CGE models. The Indirect variable covers the joint impact of direct and indirect effects and comprises findings from Input-Output and CGE models. The Demand-side variable covers demand-side changes of different income groups, which are considered by micro-simulations and CGE models. The General equilibrium variable covers the long-term general equilibrium effects and thus the income source side, which are only analyzed by CGE models. The Lifetime income variable accounts for effects considering lifetime income proxies as opposed to annual household incomes.
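The correspondence between model types and economic effect variables described above can be written down as a small lookup. This is a sketch of the coding logic only: the function name `code_effect`, the `direct_only` flag (for effects that report the direct impact separately, as some Input-Output studies do) and the dictionary keys are hypothetical and do not come from the study's codebook.

```python
# Effects captured by each model type, following the description in the text.
EFFECTS_BY_MODEL = {
    "input-output": {"indirect"},
    "micro-simulation": {"demand_side"},
    "cge": {"indirect", "demand_side", "general_equilibrium"},
}

MODERATORS = ("indirect", "demand_side", "general_equilibrium", "lifetime_income")


def code_effect(model_type, direct_only=False, lifetime_income=False):
    """Return 0/1 moderator dummies for a single effect (hypothetical helper)."""
    # An effect reporting only the direct impact gets no model-based dummies.
    modeled = set() if direct_only else EFFECTS_BY_MODEL.get(model_type, set())
    dummies = {name: int(name in modeled) for name in MODERATORS}
    # Lifetime income depends on the underlying data, not on the model type.
    dummies["lifetime_income"] = int(lifetime_income)
    return dummies


print(code_effect("input-output"))
print(code_effect("input-output", direct_only=True))
print(code_effect("cge", lifetime_income=True))
```

Coding effects rather than model types is what lets one Input-Output study contribute both a direct-only effect and a direct-plus-indirect effect.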
Context Variables. The Publication type variable distinguishes between peer-reviewed journal articles and grey literature. The Publication year variable accounts for a potential time trend of study outcomes.
Country Variables. We address the panel structure of our dataset by including time-fixed country dummies and time-variant variables. Our main specification includes 38 (N − 1, N = 39) single country dummies that account for unobservable time-fixed country effects. It also includes three time-variant country variables: the GDP per capita, the Gini and the Poverty gap variable (see "Appendix Detailed Moderator Variable Description" for more details). These variables control for the country income and its distribution. For additional robustness checks, we group the countries based on the World Bank country income level classifications, namely high, upper-middle, lower-middle and low-income countries. The country data originate from the World Bank dataset for the years 1990 to 2014 (World Bank 2017) and are adjusted as further described in "Appendix Detailed Moderator Variable Description".

Ordered Probit Model
The bivariate analyses indicate a significant impact of most moderator variables on the dependent variable (see "Appendix Detailed Moderator Variable Description"). Identifying the isolated influence of each moderator variable, however, requires a regression analysis. The ordered categorical dependent variable with the outcomes progressive, proportional and regressive suggests the application of an ordered probit model. The approach is based on Greene (2012) and methodologically similar to the meta-analyses of Waldorf and Byun (2005), Card et al. (2009) and Wehkamp et al. (2018). This ordered probit model uses a continuous latent variable $y^{*}$ to measure the unobserved effect size of each original study. We assume $y^{*}$ to be correlated with the three observed distributional effects: progressive ($y = 0$), proportional ($y = 1$) and regressive ($y = 2$). Suppressing the observation-specific index, the relationship between $y^{*}$ and the moderator variables $X$ is assumed to follow a linear regression model of the form
$$y^{*} = X\beta + \varepsilon \tag{1}$$
with $y^{*}$ potentially varying between $-\infty$ and $\infty$ and $\varepsilon$ being a normally distributed error term. The observed distributional impact $y$ is linked to the underlying latent variable $y^{*}$ by
$$y = \begin{cases} 0 & \text{if } y^{*} \le 0 \\ 1 & \text{if } 0 < y^{*} \le \mu_1 \\ 2 & \text{if } y^{*} > \mu_1 \end{cases} \tag{2}$$
where $\mu_1$ is an unknown threshold parameter simultaneously estimated with $\beta$.
The probability of estimating a progressive ($y = 0$), proportional ($y = 1$) or regressive ($y = 2$) distributional effect is given by
$$\begin{aligned} P(y = 0 \mid X) &= \Phi(-X\beta) \\ P(y = 1 \mid X) &= \Phi(\mu_1 - X\beta) - \Phi(-X\beta) \\ P(y = 2 \mid X) &= 1 - \Phi(\mu_1 - X\beta) \end{aligned} \tag{3}$$
where $\Phi$ denotes the standard normal cumulative distribution function. We estimate the parameters by the maximum likelihood method with the previously described probabilities entering the likelihood function. The $\beta$ coefficients in combination with the p-value provide the direction and the significance of the effect; a positive coefficient suggests that the respective moderator variable $X$ increases the probability of obtaining a regressive outcome ($P(y = 2)$). Vice versa, a negative coefficient suggests that the respective moderator variable $X$ increases the probability of finding a progressive outcome ($P(y = 0)$). The coefficients have an ambiguous effect on the probability of finding a proportional outcome ($P(y = 1)$). The marginal effects at means show the magnitude of the probability change for the three possible outcomes induced by the moderator variables. The pseudo-$R^2$ is reported as a measure of fit (McFadden 1974).
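The outcome probabilities and the role of the coefficient sign can be illustrated numerically. The sketch below assumes the standard three-category ordered probit of Greene (2012), with the first threshold normalized to zero and a single free threshold `mu1`, so that P(y = 0) = Φ(−Xβ), P(y = 2) = 1 − Φ(μ₁ − Xβ) and P(y = 1) is the remainder. All numerical values are hypothetical and are not estimates from this study.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def outcome_probabilities(xb, mu1):
    """Return (P(y=0), P(y=1), P(y=2)) for linear index xb and threshold mu1 > 0."""
    p_progressive = PHI(-xb)
    p_regressive = 1.0 - PHI(mu1 - xb)
    p_proportional = 1.0 - p_progressive - p_regressive
    return p_progressive, p_proportional, p_regressive


def log_likelihood(indices, outcomes, mu1):
    """Sum of log P(y_i | xb_i); estimation maximizes this over beta and mu1."""
    return sum(log(outcome_probabilities(xb, mu1)[y])
               for xb, y in zip(indices, outcomes))


# HYPOTHETICAL values chosen for illustration only.
BASE_INDEX = 0.9        # X*beta with a binary moderator set to zero
BETA_MODERATOR = -0.8   # negative coefficient on that binary moderator
MU1 = 0.3               # threshold between proportional and regressive

without = outcome_probabilities(BASE_INDEX, MU1)
with_moderator = outcome_probabilities(BASE_INDEX + BETA_MODERATOR, MU1)

# Discrete change in each outcome probability when the moderator switches on.
effect = [after - before for before, after in zip(without, with_moderator)]
print([round(100 * e, 1) for e in effect])  # percentage points
```

With a negative coefficient the probability of a progressive outcome rises and that of a regressive outcome falls, while the three changes sum to zero because the probabilities always add up to one.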
We conduct several sensitivity analyses and specification tests as proposed by the best-practice guidelines for meta-analyses by Nelson and Kennedy (2009). First, we impose cluster-robust standard errors by country to address non-independence of observations. Second, our dataset contains only a few observations, and thus little time variation, for several countries, which poses the risk of multicollinearity between time-fixed and time-variant variables. We thus alter our model by assuming fixed effects for country income groups instead of single countries and also by omitting country fixed effects to investigate their overall impact. Furthermore, we test several combinations of the time-variant country variables. Third, we test the validity of the ordered probit model specification by conducting significantly progressive and regressive probit regressions. Fourth, we use a jackknife method to identify the impact of single countries on the results (Gould 1995). Fifth, we present our findings for carbon pricing policies only, i.e. excluding effects for subsidy reforms. Sixth, we test whether simulated policies provide systematically different results than actually implemented policies. Finally, we test for multicollinearity using the variance inflation factors and for the joint significance of the variable groups using the likelihood-ratio test. "Appendix Robustness Checks" provides more details about the sensitivity analyses and specification tests; "Appendix Regression Results Overview" contains the regression coefficients without subsidy reforms.
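The idea behind the country-wise jackknife can be sketched generically: the estimation is repeated once per country, each time leaving that country's effects out, and the resulting estimates are compared with the full-sample estimate. In the sketch below the estimator is deliberately replaced by a simple stand-in (the share of progressive outcomes) rather than the ordered probit model, and the records are a hypothetical toy dataset.

```python
def leave_one_country_out(records, estimator):
    """Re-run `estimator` once per country with that country's records removed."""
    countries = sorted({record["country"] for record in records})
    return {
        country: estimator([r for r in records if r["country"] != country])
        for country in countries
    }


def share_progressive(records):
    """Stand-in estimator (NOT the ordered probit): share of progressive outcomes."""
    return sum(r["outcome"] == "progressive" for r in records) / len(records)


# HYPOTHETICAL toy data; country labels and outcomes are made up.
toy_effects = [
    {"country": "A", "outcome": "regressive"},
    {"country": "A", "outcome": "regressive"},
    {"country": "A", "outcome": "progressive"},
    {"country": "B", "outcome": "progressive"},
    {"country": "C", "outcome": "regressive"},
]

full_sample = share_progressive(toy_effects)
jackknife = leave_one_country_out(toy_effects, share_progressive)
print(full_sample, jackknife)
```

A country whose omission shifts the estimate strongly, relative to the others, is one that drives the full-sample result.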

Results
Table 2 shows the regression results of our main ordered probit model specification, which includes the single country dummies and robust standard errors clustered by countries. The first column provides the estimated coefficients; the subsequent three columns present the marginal effects at means for the three possible original study outcomes. A negative coefficient indicates an increased probability of a progressive study outcome, but conveys no information on the magnitude of this increased probability. Hence, we include marginal effects at means. For binary variables they indicate by how many percentage points the likelihood of an outcome differs when the binary variable is one compared to when it is zero. For continuous variables, they indicate by how many percentage points the likelihood of an outcome changes with a change of one unit, taking the mean variable value as the starting point. Figure 3 additionally plots the coefficients for the most relevant alternative model specifications, i.e. regressions with single country dummies, group country dummies and no country dummies. For all three regression types we show the results with and without the three time-variant country variables ("Baseline" and "No Country Variables"). General findings from all robustness checks are discussed in Sect. 4.5. For a better overview we report the 38 coefficients of the single country dummies separately in the "Appendix Country Dummy Coefficients".
The results confirm our hypotheses of a significantly increased likelihood of progressive study outcomes for transport policies, for lower income countries and for studies applying lifetime income proxies. In contrast, we show that studies on subsidy reforms are not inherently more progressive than carbon pricing instruments. The regression results show no impact of studies considering general equilibrium effects, while modeling indirect effects and demand-side adjustments of consumers provides more progressive study outcomes. The next subsections discuss the results for the different variable groups in detail.

Policy Variables
We hypothesize that the two policy variables Subsidy and Transport will foster progressive outcomes; the Transport coefficient indeed indicates a significantly higher likelihood of progressive outcomes while the Subsidy coefficient is insignificant. Both findings are highly robust among most other model specifications (see Fig. 3).
The insignificant finding for the Subsidy coefficient sharply contrasts with other literature findings but supports standard economic theory: as subsidies are equal to negative taxes (Varian 2009), the impact of removing subsidies should not be systematically different from that of taxes or cap-and-trade systems, after controlling for all other influences. The finding is robust over all other specifications with one notable exception: the regression with no country dummies and no country variables shows a highly significant negative coefficient, indicating more progressive results for subsidies as previously expected. Again, energy subsidies have primarily been implemented in developing countries (Coady et al. 2017). Accordingly, our sample only includes subsidy policies in non-high-income countries, such as India, Mali, Mexico, Nigeria, Poland and Turkey. We thus reason that the country variables capture the progressive impact of subsidy reforms.
The Transport coefficient indicates a significantly and substantially increased likelihood of progressive outcomes, as hypothesized. The marginal effects at means show a 44.7 percentage point increase in the likelihood of progressive outcomes and a 55.9 percentage point decrease in the likelihood of regressive outcomes, at the 1% and 5% significance levels (see Table 2). Hence, a progressive impact is 44.7 percentage points more likely for a policy in the transport sector than for an economy-wide policy, and a regressive impact is 55.9 percentage points less likely. Transport sector policies thus largely contribute to the overall share of progressive findings in our sample. Most robustness checks confirm this finding, though the magnitude of the effect decreases for regressions without single country dummies. Again, one notable exception is the regression with no country dummies and no country variables, which shows an insignificant coefficient. This finding corresponds with the ambiguous literature outcomes, which mostly show progressive but also regressive impacts in primarily high-income countries.

Economic Effect Variables
We hypothesize a progressive impact of the Lifetime income and the General equilibrium variables while being inconclusive about the Indirect and the Demand-side variables. Table 2 confirms that the application of Lifetime income proxies increases the likelihood of progressive findings. Progressive findings are also more likely in studies including Indirect and Demand-side effects. The General equilibrium coefficient is insignificant and hence does not support our hypothesis.
The marginal effects at means for the Lifetime income variable indicate an increasing likelihood of progressive outcomes by 42.6%. Regressive outcomes are 49.9% less likely. The results confirm the theory and are supported by the robustness checks. The magnitude of the coefficient, however, decreases for all regressions without single country dummies, though the significance level increases from 10% to 5%.
The marginal effects for the Indirect variable indicate an increasing likelihood for progressive outcomes by 21.4%. Regressive outcomes are 25% less likely at the 5% significance level. Other model specifications consistently show coefficients of slightly smaller magnitudes at mostly the same significance level. Previous literature findings show both increasing and decreasing regressivity of indirect effects (see Sect. 2). The results suggest more CO2-intensive consumption baskets of richer households.
The Demand-side variable increases the likelihood of progressive outcomes by 26.4% while regressive outcomes are 30.9% less likely. Robustness checks including single dummy variables show mostly significant coefficients at the 5 or 10% level except when standard errors are clustered by studies. Without the single country dummies the coefficients become insignificant. The progressive effect of the Demand-side variable is thus sensitive to the modeling of unobserved country characteristics. Though our findings suggest larger elasticities for low-income households, additional and country-specific research is recommended.
The General equilibrium coefficient remains insignificant over most model specifications. This finding strictly contradicts our hypothesis. One explanation would be the small number of general equilibrium effects included, in combination with our categorical dependent variable; CGE models are the only model type capturing general equilibrium effects. Many CGE models in the literature, however, include revenue recycling schemes which we exclude from this analysis. Our sample thus only contains 12 effects from CGE models of which 50% show regressive outcomes (see "Appendix Detailed Moderator Variable Description"). The ordered categorical dependent variable only considers the overall outcome, i.e. regressive, proportional or progressive. We thus do not account for changes within each category, e.g. from strongly to weakly regressive. Therefore, we do not account for the presumably progressive source side effects within those six overall regressive outcomes. We further elaborate the implications of using a categorical dependent variable in Sect. 5.
Summing up, including a wider range of economic effects mostly fosters more progressive outcomes. The economic effects either reflect the application of more sophisticated model types or a different data base using lifetime income proxies.

Context Variables
Table 2 shows neither a publication bias nor a time trend. The Publication Type coefficients remain insignificant over model specifications including single country dummies. The robustness checks without single country dummies, however, indicate a publication bias towards more progressive outcomes. The Publication Year coefficients are insignificant over most model specifications though there are two significant coefficients with opposite signs. The two-proportion z-test results suggest a progressive publication bias and a time trend towards more progressive outcomes (see "Appendix Detailed Moderator Variable Description"). In fact, the grey literature included primarily investigates developing countries. Furthermore, research on developing countries has been increasing over recent years. The findings suggest that the country variables, and especially the single country dummies, account for both trends.

Country Variables
The regression results support our hypothesis of more progressive study outcomes for countries with lower income levels. Our main regression includes 38 single country dummies and three country variables accounting for time-fixed and time-variant country characteristics, respectively. The interpretation of the results of this variable group requires a particularly detailed investigation of the regression outputs. Table 2 shows a significantly negative coefficient for the Poverty gap variable as expected. The finding indicates a higher likelihood of progressive outcomes for very poor or unequal countries. The coefficient, however, becomes small or insignificant for regressions without single country dummies. The finding is further sensitive to the countries included (see Sect. 4.5). The Gini coefficient is insignificant for all regressions. The GDP per capita coefficients are mostly insignificant in regressions with single country dummies which contradicts our hypothesis (see "Appendix Regression Results Overview").
An increased likelihood of progressive impacts in lower income countries is, however, clearly indicated by additional model specifications. The insignificant GDP per capita coefficients can be explained by the small temporal variation of the country variables, as the sample includes only a few observations for particularly low-income countries. The reduced temporal variation renders the time-variant country variables and the time-fixed single country dummies multicollinear. The coefficients for the single country dummies and the country variables are thus inefficient for the main model specification. We address this problem by estimating another model that replaces the single country dummies with country group dummies and another version which excludes the time-variant country variables. All model specifications without single country dummies, i.e. with country group dummies or without any country dummies, show significantly positive GDP per capita coefficients, which implies more regressive study outcomes for richer countries. The regression coefficients for our specification with country group dummies but without country variables confirm this finding; the three group dummy coefficients (upper-middle, lower-middle and low) are significantly negative and increase in magnitude for decreasing income levels of the country groups.

Robustness Checks
We conduct several additional analyses to validate our findings.
First, we address non-independence of observations by imposing cluster-robust standard errors by country for every regression. Additionally, we test the sensitivity of the standard errors to the clustering decision by imposing cluster-robust standard errors by study. Results are reported in "Appendix Regression Results Overview". Clustering by study shows broadly similar significance levels for most coefficients. Notable exceptions are the insignificant coefficient for the Demand-side variable and the significant coefficients for the Publication Year and the GDP per capita variables. We conclude that the clustering decision has a slight influence on the results. The overall findings, however, remain unchanged.
Second, we test the influence of single countries on the results by conducting jackknife regressions (Gould 1995). The jackknife method performs N regressions, each leaving out all observations of the jth country, where j = 1, 2, ..., N indexes the countries (N = 39). "Appendix Jackknife Findings" shows the distribution of the N jackknife coefficients for each moderator variable including fitted normal distributions. The coefficients outside the 99% confidence interval unsurprisingly mostly correspond to countries with large numbers of effects, i.e. the United States and the United Kingdom. Most coefficients, however, remain similar in sign or overall magnitude, except for the Subsidy and the Poverty gap coefficients. Omitting Brazil or Poland strongly influences these two coefficients as the sample contains just a few effects from lower income countries while both variables only have few positive observations. These two outlier countries have no impact on jackknife regressions for model specifications without single country dummies. 10
Third, we analyze how excluding subsidy reforms affects our findings. "Appendix Regression Results Overview" shows that most of our coefficients remain largely similar. Findings of regressions including subsidy reforms are thus confirmed and hence also valid for carbon pricing policies only.
Fourth, we test whether policy simulations provide different results than actually implemented policies. We find no impact for most model specifications. Yet, selected specifications show a higher likelihood of regressive study outcomes for simulations, while the Transport coefficient turns insignificant. The simulation and transport variables are strongly correlated (−0.78), as most studies only simulate economy-wide carbon prices, whereas gasoline taxes are actually implemented. 11
Finally, we investigate the validity of our ordered probit model specification by conducting probit regressions on significantly regressive and progressive outcomes, reported in "Appendix Regression Results Overview". The coefficients of the significantly regressive probit regression are close to the ordered probit model coefficients. The significantly progressive probit coefficients are mostly opposite in sign. The findings indicate a valid ordered probit model specification.

10 In addition, we test whether the three studies with the highest number of observations influence our findings by manually removing each study and subsequently conducting the six main regressions. The Lifetime income variable becomes insignificant when removing the Flues and Thomas (2015) study, which can be attributed to the fact that the study indeed provides a large number of effects that distinguish between distributional impacts with and without considering lifetime incomes. This highly cited study has been published by OECD Publishing and has thus not undergone an academic, but an OECD internal peer-review process, which comprises double checks by other departments and by member country delegates.
11 We thank an anonymous reviewer for suggesting that we test this interesting potential impact on the results.

Discussion and Conclusion
Market-based climate mitigation policies often raise concerns about potentially adverse distributional impacts. Emission reductions at the expense of the poorest would aggravate poverty and undo progress for human development. Such inequitable outcomes could also severely undermine the political feasibility of market-based mitigation policies. Understanding the distributional implications of climate policy is hence crucial for the design of just and effective climate policies.
This study carries out a meta-analysis of the existing literature to systematically explain differences in distributional outcomes of carbon pricing. We employ an ordered probit analysis on 53 original studies and analyze how country-specific factors, the type of policy under study as well as the modeling approach affect the likelihood of regressive or progressive outcomes.
We find a significantly increased likelihood of progressive study outcomes within lower income countries and for transport policies. The same applies to study designs considering indirect effects, demand-side adjustments of consumers or lifetime income proxies. In contrast, there is no evidence to support the hypothesis that subsidy reforms are inherently more progressive than carbon pricing. These insights bear direct relevance for policy makers in the initial stages of policy design for the decision which possible policy options should be explored in more detail. Nevertheless, policies that are implemented should be subject to detailed analysis that takes the country-specific context into account instead of relying on general patterns that hold across countries.
The interpretation of the results should particularly consider the following limitations of the analysis. Disregarding the effect size of overall regressive, proportional or progressive distributional impacts influences the regression coefficients. Our methodology does not account for differences within outcome categories, for example between strongly and weakly regressive effects. Smaller changes in the distributional impact within single studies, which are mostly driven by the economic effect variables, are thus ignored. This results in downward biased and less significant coefficients, as illustrated by the General equilibrium coefficient. Likewise, treating similar distributional impacts between studies equally, irrespective of their magnitudes, might ambiguously influence the size and significance of the coefficients. Estimating the effect size using subsamples with common and thus quantitatively comparable inequality metrics, however, suffers from too few observations to be representative.
Finally, the small number of effects for lower income countries decreases the accuracy of our findings. Our analysis shows a large impact of two lower income countries on two variables (see Sect. 4.5). A higher proportion of effects on lower income countries in combination with a larger total sample would reduce the impact of outliers, allow for more refined moderator variables, and thus provide more precise insights. We thus recommend future researchers to put an emphasis on distributional impacts in lower income countries. The robustness checks, however, confirm the overall validity of our findings.
It should be noted that even progressive policies increase consumer prices, which raises the risk of poverty for low-income households. In the most extreme cases this may lead to public resistance as illustrated by the example of Nigeria in 2012. The risk of poverty can, however, be offset by suitable revenue recycling schemes that compensate poor households (van Heerden et al. 2006). Revenue recycling can also provide various other benefits. For example, using revenues to reduce distorting income taxes can potentially lead to more employment, higher individual welfare and higher GDP growth (Pearce 1991;Goulder 1995;Pezzey and Park 1998). Revenues can also be used for public investments in infrastructure, providing access to water, sanitation, electricity, telecommunications and transport (Jakob et al. 2016). Climate policies in combination with a targeted use of revenues thus have the potential to simultaneously mitigate climate change and address additional sustainable development goals. However, public debates frequently focus on the distributional impact of consumer expenditures and thus underestimate or ignore the usually progressive impact of revenue recycling schemes, even if simultaneously proposed. Our results could therefore be interpreted as a proxy for publicly perceived distributional impacts of climate mitigation policies, though they are economically incomplete. Distributional impacts of different revenue recycling schemes are thus an interesting avenue for further research, but beyond the scope of this paper due to unresolvable methodological challenges (see Sect. 3.4).
This study contributes to an increased understanding of the distributional impacts or the potential benefits of climate mitigation policies, which may further support their implementation. Thus far, there has been a widespread belief that consumption taxes, and environmental taxes in particular, would especially burden the poor. However, more than one third of the effects included in this sample are progressive or proportional. Hence, distributional outcomes of market-based climate policies depend on a variety of (often country-specific) factors. This kind of research may thus help prevent actors with vested interests, such as investors fearing stranded assets or workers fearing job losses (Vogt-Schilb and Hallegatte 2017), from instigating public opposition against unwanted policies.
Acknowledgements Open Access funding enabled and organized by Projekt DEAL. We gratefully acknowledge financial support by the German Federal Ministry of Education and Research (funding code 03EK3046B). The authors thank Nicolas Koch, Franziska Holz, Steven Sexton as well as participants of research seminars at the MCC and the 6th WCERE 2018 in Gothenburg for valuable comments and suggestions. This paper has not been submitted elsewhere in identical or similar form, nor will it be during the first three months after its submission to the Publisher.

Funding Bundesministerium für Bildung und Forschung (BMBF).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Search Query
We use different combinations of keywords to comprehensively identify a broad set of original studies for our analysis. Unsuitable categories (for Scopus) and keywords indicating unsuitable categories (for both literature databases) are directly excluded. Adapting the two search queries to the respective syntaxes gives:
Web of Science
TS = (((carbon OR CO2 OR fuel OR gasoline OR emission* OR environment* OR ecologic* OR energy) NEAR/3 ("tax" OR "taxes" OR "taxation" OR allowance* OR subsid* OR polic* OR pric*)) NEAR/10 (distribut* OR regressive OR progressive OR incidence OR inequality OR (household* NEAR/1 income*))) NOT TS = ("smart grid" OR biomass OR (distribut* NEAR/1 (energ* OR network* OR spatial)) OR "power plant" OR "natural gas" OR health OR solar OR hydropower OR software OR wireless OR "computer" OR forest) NOT WC = ("engineering electrical electronic" OR "thermodynamics" OR zoology OR oceanography OR "engineering civil" OR "computer science theory methods")
Scopus
TITLE-ABS-KEY(((carbon OR CO2 OR fuel OR gasoline OR emission* OR environment* OR ecologic* OR energy) W/3 (("tax" OR "taxes" OR "taxation" OR allowance* OR subsid* OR polic* OR pric*)) W/10 (distribut* OR regressive OR progressive OR incidence OR inequality OR "household income"))) AND NOT TITLE-ABS-KEY("smart grid" OR biomass OR (distribut* W/1(energ* OR network* OR spatial)) OR "power plant" OR "natural gas" OR health OR solar OR hydropower OR software OR wireless OR "computer" OR forest) AND (EXCLUDE (SUBJAREA,"COMP") OR EXCLUDE (SUBJAREA,"MATH") OR EXCLUDE (SUBJAREA,"CENG"))

Study Overview
See Table 3.

Detailed Moderator Variable Description
Policy Variables The Subsidy variable includes all effects of studies modeling subsidy reforms. For this variable we allow for policies on single fuels, while carbon taxes and cap-and-trade systems only consider economy-wide policies. The variable implicitly abstracts from differences between carbon tax policies and cap-and-trade systems and specific policy designs though both have been widely debated in the literature (Parry et al. 1999;Parry 2004;Leach 2009;Dissou and Karnizova 2016;Shinkuma and Sugeta 2016). The Transport variable includes all effects of studies on the transport sector alone. This includes higher prices for petrol, diesel or liquefied petroleum gas (LPG) explicitly used for transport purposes. To ensure comparability with other policies, we only include distributional impacts on all households, irrespective of whether they own a car.
Country Variables and Data We measure the GDP per capita variable in steps of 1000 US$ in constant 2010 US$. The Gini coefficient, as commonly applied to measure the distribution of income and wealth, takes values between 0 and 1 if the measure is nonnegative. A higher Gini coefficient indicates a larger inequality. The Poverty gap variable measures the mean shortfall from the poverty line of 3.10 US$ of 2011 PPP. It therefore simultaneously captures the number of people below the poverty line as well as their distance to it. A value of 0 indicates that no household can be found below the poverty line. The higher the value, the larger the number of households in poverty or the depth of their poverty. Further information on the poverty line methodology can be found at: http://iresearch.worldbank.org/PovcalNet/methodology.aspx.
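As an illustration of the two inequality measures described above, the following sketch computes both from a list of per-capita incomes. It is not the World Bank's implementation: the function names are hypothetical, the poverty gap is returned as a fraction of the poverty line rather than as a percentage, and non-poor households are counted with a shortfall of zero.

```python
def poverty_gap_index(incomes, poverty_line=3.10):
    """Mean shortfall from the poverty line, as a fraction of that line.

    Households at or above the line contribute a shortfall of zero, so the
    index reflects both how many households are poor and how poor they are.
    """
    shortfalls = [max(poverty_line - y, 0.0) / poverty_line for y in incomes]
    return sum(shortfalls) / len(shortfalls)


def gini(incomes):
    """Gini coefficient of nonnegative incomes via the rank-weighted sum."""
    ys = sorted(incomes)
    n = len(ys)
    total = sum(ys)
    weighted = sum(rank * y for rank, y in enumerate(ys, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical daily incomes in 2011 PPP US$ (illustrative only).
sample = [1.55, 3.10, 6.20, 12.40]
print(poverty_gap_index(sample), gini(sample))
```

A population in which everyone has the same income yields a Gini coefficient of 0, and a population with no household below the line yields a poverty gap of 0, matching the interpretation given in the text.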
The four country dummies by income level refer to the GNI per capita in US$ using the Atlas methodology. We use data from the World Bank World Development Indicators for the years 1990-2014 (World Bank 2017). Further information on the dataset can be obtained from: https://data.worldbank.org/data-catalog/world-development-indicators. We adjust our coded data or the dataset to consistently match the World Bank data for the countries included. First, we match the country data with the publication year of the original study's underlying household survey data, unless the authors provide an explicit reference year. As our dataset only contains data from 1990-2014, we truncate the reference year or household data publication year accordingly. Second, the dataset lacks time-consistent data on the Gini coefficients and the poverty gap. We fill the gaps with the next available datapoint in the future. If there is no available datapoint, we use the last available datapoint. Third, as there is no available data for British Columbia and Taiwan we use data for Canada and China, respectively. Further information on the coding and the data are documented in the codebook which is available upon request.
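The year-matching and gap-filling rules above can be sketched as follows. This is one reading of the description, not the authors' code: the function names and the dictionary layout are hypothetical, and "last available datapoint" is interpreted as the most recent earlier year with data.

```python
def clamp_year(year, first=1990, last=2014):
    """Truncate a reference year to the period covered by the dataset."""
    return min(max(year, first), last)


def fill_value(series, year):
    """Look up a country indicator for a reference year with gap filling.

    series maps year -> value, with None (or a missing key) marking a gap.
    Rule 1: use the value for the (truncated) year if it exists, otherwise
            the next available datapoint in the future.
    Rule 2: if no later datapoint exists, fall back to the last available
            earlier datapoint (an interpretation of the text).
    """
    year = clamp_year(year)
    available = sorted(y for y, v in series.items() if v is not None)
    later = [y for y in available if y >= year]
    if later:
        return series[later[0]]
    earlier = [y for y in available if y < year]
    if earlier:
        return series[earlier[-1]]
    return None  # no data at all for this country and indicator


# Hypothetical Gini series with gaps (values are illustrative only).
gini_series = {1995: 0.40, 2000: None, 2005: 0.35}
print(fill_value(gini_series, 2000))  # falls forward to the 2005 value
```

Falling forward first and backward only as a last resort mirrors the order in which the two rules are stated in the text.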
Bivariate z-test We test the bivariate relationship between the moderator variables and the dependent variable. For the binary moderator variables we conduct a two-proportion z-test. The test results indicate whether using the variable significantly changes the proportion of progressive, proportional or regressive study outcomes. Similarly, we conduct a correlation analysis for the continuous moderator variables. The results indicate sign and significance of the correlation between the moderator variables and the dependent variable. Table 4 shows the results for both tests.
The two-proportion z-test results indicate a significant impact of more than half of the binary moderator variables on the proportion of study outcomes. For instance, the share of progressive findings for studies modeling transport policies increases to 36.2% compared to 20.2% for studies on economy-wide policies. The correlation analysis shows a significant correlation between most continuous variables and the dependent variable. The results of the two tests indicate a reasonable selection of moderator variables. The bivariate tests, however, ignore potential correlations between the moderator variables.
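For readers unfamiliar with the two-proportion z-test, a minimal sketch of the standard pooled-variance version is given below. Whether the authors used exactly this variant is an assumption, and the counts in the example are hypothetical; they are not the counts underlying Table 4.

```python
from math import erf, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def two_proportion_z(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions (pooled variance).

    x1 of n1 effects in group 1 and x2 of n2 effects in group 2 fall into
    the outcome category of interest (e.g. progressive).
    Returns the z statistic and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - norm_cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 20 of 50 effects vs. 20 of 100 effects are progressive.
z, p = two_proportion_z(20, 50, 20, 100)
print(z, p)
```

The test compares one binary moderator at a time, which is why it cannot separate the influence of correlated moderators.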
Additional Moderator Variables We exclude several potentially interesting moderator variables on policies and the study design. In particular, we neglect policy variables on revenue recycling schemes, levels of pricing and the impact of single fuels. The modeled revenue recycling schemes in the literature are too context-specific to be aggregated to homogeneous groups. The impact of different pricing levels is especially relevant for CGE models covering demand-side and income side effects. The small number of CGE models included, however, prevents us from determining their quantitative impact. Covering the impact of single fuels would allow us to conduct a more disaggregated analysis. The distributional impact of single fuels is, however, too rarely and inconsistently reported to provide robust findings. We further exclude moderator variables on the study design for different household equivalence scales, lifetime income measures and inequality measurement units. The reasons for exclusion are: scarce reporting of equivalence scales; heterogeneous lifetime income proxies; and too few literature sources comparing different inequality measures. The following references explicitly discuss or compare the impact of the excluded moderator variables: revenue recycling schemes (Speck 1999;Rausch et al. 2011;Mathur and Morris 2014;Williams et al. 2015); level of pricing (Dissou and Siddiqui 2014;Grottera et al. 2017); single fuels (Casler and Rafiqui 1993;Jacobsen et al. 2003); equivalence scales (Grainger and Kolstad 2010); lifetime income measures (Bull  Hassett et al. 2009) and inequality measurement units (Cornwell and Creedy 1996;Nikodinoska and Schröder 2016).

Original Study Model Details
Input-Output models cover direct and indirect price changes of different product categories. The indirect impact accounts for higher prices of goods and services using carbon intensive intermediate inputs by applying a static input-output matrix. This approach commonly assumes that levies are fully passed through to the final consumers. The assumption of inelastic demand corresponds to the short term incidence of price increases (Hassett et al. 2009;Feng et al. 2010;Anand et al. 2013). As the income side is neglected, those models do not capture long-term impacts but may be useful as an approximation to the true effect.
Micro-simulation models account for demand-side changes by considering consumer choices. The consumer demand is elastic with consumers maximizing their utility for given preferences, prices and budgets. Commonly used micro-simulation models are almost ideal demand systems (AIDS) (Deaton and Muellbauer 1980;West and Williams 2004;Tiezzi 2005;Rosas-Flores et al. 2017); its more flexible quadratic specification (QAIDS) (Banks et al. 1997;Brännlund and Nordström 2004;Nikodinoska and Schröder 2016); or more recently the exact affine stone index (EASI) demand system (Lewbel and Pendakur 2009;Tovar Reaños and Wölfing 2017).
CGE models cover direct and indirect price changes, demand-side changes of consumers and producers, and long-term general equilibrium effects. This approach considers policy effects on the source side of income in addition to the use side. CGE models assume explicit functional forms of demand and supply functions, and use exogenous parameters for demand elasticities and elasticities of substitution between production sectors (Hassett et al. 2009). Linked models, such as Input-Output and micro-simulations (Creedy and Sleeman 2006) or CGE models and micro-simulations are further extensions (Labandeira et al. 2009;Vandyck and Van Regemorter 2014).

Robustness Checks
This part of the appendix gives a comprehensive overview of the sensitivity analyses and specification tests conducted in this study. First, we address non-independence of observations as a common problem in meta-analysis (Ringquist 2013). Non-independence of observations generally occurs if at least one country or original study provides multiple effects (Ringquist 2013) which also applies to our analysis (see "Appendix Study Overview" and Fig. 2). It potentially causes correlated results within countries or studies. Though estimators are not biased or inconsistent they potentially become inefficient (Waldorf and Byun 2005). We account for that problem by imposing cluster-robust standard errors by country for the subsequent estimations. Additionally, we conduct one regression with cluster-robust standard errors by study to test the impact of the clustering decision.
Second, we conduct several robustness checks on the country modeling. Figure 2 shows five or fewer effects for 32 countries. Countries with few observations have a low time variation and thus pose the risk of multicollinear time-fixed and time-variant variables. For an alternative model specification, we create country groups based on the level of income that replace the single country dummies. Grouping the countries increases the number of effects per dummy variable but assumes similar fixed effects for all countries within the respective income group. For another model specification we exclude all time-variant country variables, which leaves the respective country dummies to solely account for country differences. Finally, we exclude all dummy variables to investigate the overall influence of the time-fixed effects. For all three regression types ("Single Country Dummies", "Group Country Dummies" and "No Country Dummies") we show the results with and without the three time-variant country variables (Fig. 3 "Baseline" and "No Country Variables"). In addition, we test different combinations of the three time-variant variables.
Third, we test the validity of the ordered probit model specification. For a valid ordered probit specification the regression coefficients of a significantly regressive probit regression (1=regressive, 0=proportional or progressive) should be similar to the ordered probit coefficients. The regression coefficients for a significantly progressive probit regression (1=progressive, 0=proportional or regressive) should be similar in magnitude but opposite in sign (Wehkamp et al. 2018). We conduct the two probit regressions without country dummies because including single country dummies results in infinite iterations.
Fourth, we use a jackknife method to identify the impact of single countries on the results (Gould 1995). The descriptive analysis shows unequally distributed effects per country (see Fig. 2), which is a common problem in meta-analyses (Ringquist 2013). The jackknife method performs N regressions, each leaving out all observations of the jth country, where j = 1, 2, ..., N indexes the countries (N = 39). The method thus provides N coefficients for each moderator variable. Jackknife regression coefficients that largely deviate from the ordered probit coefficient indicate a highly influential country or study.
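The leave-one-country-out logic can be sketched generically. The example below is illustrative only: it substitutes a simple statistic (the share of progressive effects) for the ordered probit regression, and the country labels, record layout and function names are hypothetical.

```python
def jackknife_by_group(records, group_key, estimator):
    """Re-estimate a statistic once per group, each time omitting that group.

    Returns a dict mapping the omitted group to the resulting estimate, so
    that estimates far from the full-sample value flag influential groups.
    """
    groups = sorted({r[group_key] for r in records})
    return {
        g: estimator([r for r in records if r[group_key] != g])
        for g in groups
    }


def share_progressive(records):
    """Stand-in estimator: share of effects coded 0 (progressive)."""
    return sum(r["outcome"] == 0 for r in records) / len(records)


# Hypothetical effects: outcome 0 = progressive, 1 = proportional, 2 = regressive.
effects = [
    {"country": "A", "outcome": 2},
    {"country": "A", "outcome": 2},
    {"country": "A", "outcome": 1},
    {"country": "B", "outcome": 0},
    {"country": "C", "outcome": 0},
    {"country": "C", "outcome": 2},
]

full_sample = share_progressive(effects)
estimates = jackknife_by_group(effects, "country", share_progressive)
print(full_sample, estimates)
```

In this toy sample, omitting country A, which contributes the most effects, moves the estimate furthest from the full-sample value, which parallels the observation that countries with many effects tend to produce the outlying jackknife coefficients.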
Finally, we test for multicollinearity using the variance inflation factors and the joint significance of the variable groups using the likelihood-ratio test. The variance inflation factors for model specifications without single country dummies are rather small (< 6.08), indicating no problems with multicollinearity. The context variables are the only group of variables that fail the likelihood-ratio test (p > 0.397). The other variable groups are at least significant at the 5% significance level. The pseudo-R² values range from 0.51 for the main regression to 0.13 for the regression without country dummies or variables.
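The three diagnostics mentioned above follow from standard textbook formulas, sketched below. The paper does not state which pseudo-R² is reported; McFadden's definition is assumed here, and all numerical inputs are hypothetical rather than taken from the regressions.

```python
def lr_statistic(ll_unrestricted, ll_restricted):
    """Likelihood-ratio statistic for a group of coefficients.

    Compare against a chi-squared distribution whose degrees of freedom
    equal the number of coefficients set to zero in the restricted model.
    """
    return 2.0 * (ll_unrestricted - ll_restricted)


def mcfadden_pseudo_r2(ll_model, ll_null):
    """McFadden pseudo-R2 (assumed definition): 1 - lnL(model) / lnL(null)."""
    return 1.0 - ll_model / ll_null


def variance_inflation_factor(r_squared_j):
    """VIF of regressor j, given the R2 from regressing it on the others."""
    return 1.0 / (1.0 - r_squared_j)


# Hypothetical log-likelihoods and auxiliary R2 (illustrative only).
lr = lr_statistic(ll_unrestricted=-60.0, ll_restricted=-62.0)
r2 = mcfadden_pseudo_r2(ll_model=-60.0, ll_null=-100.0)
vif = variance_inflation_factor(0.5)
print(lr, r2, vif)
```

A small likelihood-ratio statistic relative to its chi-squared reference means that dropping the variable group barely worsens the fit, which is the sense in which a group "fails" the test.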

Regression Results Overview
Tables 5 and 6 show the coefficients of the main regression as well as of the robustness checks without the single country dummy coefficients. Table 7 shows the results without including effects for subsidy reforms.
Notes to Table 7: The table shows our regression coefficients without effects for subsidy reforms and thus only for carbon pricing. Cluster-robust standard errors in parentheses. Dependent variable: distributional impact (0 = progressive, 1 = proportional, 2 = regressive). * p < 0.10, ** p < 0.05, *** p < 0.01

Jackknife Findings
See Fig. 4.
Fig. 4 Jackknife country coefficients. Notes: The figure shows the frequency (y-axis) of binned regression coefficients (x-axis) for each moderator variable a-k using the jackknife method over countries. The distributions thus indicate the sensitivity of the regression coefficients to single countries. The x-axis scales are adjusted to the range of coefficient values.

Country Dummy Coefficients
See Table 8.