Introduction

Understanding the link between the shadow economy (informal or underground economy) and carbon dioxide emissions gains relevance as the climatic problem worsens. Indeed, researching this link helps quantify the extent of carbon dioxide emissions associated with informal economic activities, providing a more comprehensive picture of a country’s carbon footprint and allowing policymakers to design incentives and a proper intervention to achieve zero emissions. Comprehension of the connection between the shadow economy and CO2 emissions is crucial to contribute to global efforts to combat climate change while addressing social and economic aspects of the informal economy.

This work focuses on the environmental Kuznets curve (EKC) controlling for the shadow economy, using a broad panel of 145 countries from 1991 to 2017. The EKC posits an inverted U-shaped relationship between economic growth and environmental degradation. In the early stages of a country’s development, the need to increase material production to reduce poverty and hunger takes center stage, leading to increased pollutant emissions. Later, as incomes rise, people become more concerned about environmental protection, and governments establish institutions and enact laws to protect the environment, resulting in lower environmental pressures.

The shadow or non-observed economy refers to all economic activities hidden from official authorities. According to OECD (2014), these activities can be grouped into five broad classes: (i) underground production, which represents the legal product that is deliberately concealed from public authorities; (ii) illegal production; (iii) informal sector production; (iv) production of households for own-final use; (v) statistical underground, which refers to unrecorded production activities due to deficiencies in statistical systems.

The size of the shadow economy varies widely across countries and even over time. Therefore, there is no easy way to measure it. Nevertheless, several attempts have been made to obtain a value for the dimension of the phenomenon, such as the works of Elgin et al. (2021) and Medina and Schneider (2019).

The underground economy causes measurement errors in the true value of the economic activity, leading to biases in gross domestic product. This issue is particularly relevant as countries adopt diverse practices for estimating and including underground activities in measured gross domestic product (OECD 2012). Typically, developed countries attempt to estimate the shadow economy and include it in their gross domestic product. However, poorer countries often lack the will and the resources to do that. As a result, the shadow economy treatment often varies among countries at similar development levels.

In sum, our contribution to the literature on the EKC hypothesis involves the combination of several features that were not considered in past studies. First, we assess the EKC hypothesis for a large panel of 145 countries over 27 years, accounting for the heterogeneity problem in longitudinal data by forming homogeneous country groups based on the fixed-effects estimates. This innovation clarifies situations where EKC hypotheses do not hold, contributing to shed light on the Altman and Bland (1995) warn “absence of evidence is not evidence of absence.” Second, we include the shadow economy as an explanatory variable to control for the measurement errors in economic activity. Third, we use time as a proxy for technological evolution, which is crucial because late-developing countries can access modern and clean production techniques. Finally, we examine the stability of the marginal effects of the covariates on carbon emissions using the panel transition regression model.

The remaining of this study is organized as follows: “Literature survey, theoretical background, and research hypothesis” section surveys the literature. “Data and methods” section presents the data and the methods used. “Empirical results” section reveals the results. Finally, “Discussion” section discusses the results, and “Conclusion and policy implications” section presents the conclusion and policy implications.

Literature survey, theoretical background, and research hypothesis

Literature survey

Concerns about environmental degradation have led to the development of literature on the factors and policies that affect environmental quality, usually represented by CO2 emissions (Sultana et al. 2022; Shirazi and Šimurina 2022; Işık et al. 2021; Al-Mulali and Ozturk 2015). This section summarizes the recent studies on the determinants of environmental quality, including economic growth, energy consumption, and institutional arrangements in a jurisdiction, e.g., policies, regulations, acts, and informal economy, which are first summarized in Table 1. Then, the theoretical framework for linking the shadow economy, the controlling variables, and the \({\mathrm{CO}}_{2}\) emissions are presented. Finally, the research hypotheses are organized.

Table 1 Representative works on the determinants of environmental quality

As seen in Table 1, the studies on the relationships between environmental quality and the shadow economy are largely unknown and ignored, although the informal economy encompasses various pollution-intensive activities that intensify the environmental impacts (Wang et al. 2019). Remarkably, the existing literature mainly refers to fossil fuel consumption and GDP per capita as the main determinants of \({\mathrm{CO}}_{2}\) emissions (Adekoya et al. 2022; Ahmad et al. 2021b; Alvarado et al. 2018, among others). The studies commonly present results to assess whether the economies experience the EKC hypothesis (Churchill et al. 2018). Likewise, other determinants of environmental deterioration are resource rents, oil prices, industry value-added, urbanization, net inflow of foreign direct investment, trade openness, and technological advances (Adekoya et al. 2022; Su et al. 2022; Chandio et al. 2021; Ponce and Alvarado 2019).

Accordingly, most of the explanations for environmental degradation considered in the existing literature are associated with the activities through the formal economy. Therefore, this research explicitly focuses on the potential non-linear impact of the shadow economy on the \({\mathrm{CO}}_{2}\) emissions (Sultana et al. 2022; Köksal et al. 2020; Zhou 2019). Besides controlling for variables such as economic growth, manufacturing value-added, urbanization, and fossil fuel rents, the effect of technological progress on the \({\mathrm{CO}}_{2}\) emissions is also investigated. This contribution is handled by disentangling the direct impact of time on emissions from its indirect effects (González et al. 2005), which is rare in the recent literature.

Theoretical background

Linking the shadow economy and the CO2 emissions

The role of factors related to the shadow economy, such as the informal economy, corruption, and deregulation in environmental pollution, has been studied by a limited number of articles, which is recognized as relevant and in need of innovative investigation (Ortiz et al. 2022; Chu 2022; Tillaguango et al. 2021; Shao et al. 2021).

The first strand of reviewed articles refers to the deregulation effect and concludes that the shadow economy increases pollution and damages environmental sustainability. Through the nexus between environmental regulation and pollution, the literature has explained some features of the shadow economy that underlie its relationship to environmental quality (Sultana et al. 2022; Ahmad et al. 2022, among others).

Through the nexus between environmental regulation and pollution, Ortiz et al. (2022) apply a panel data set of the “spatial autoregressive models,” “spatial lag models,” and “Durbin spatial models” and show that pollution has a large spatial dependence on environmental pollution processes. Pang et al. (2021) first utilize the “Multiple Indicators Multiple Causes” technic to calculate the shadow economy. Then, they used the “Generalized Method of Moments” and found a statistically significant bidirectional positive relationship between the shadow economy and pollution across 31 provinces of China from 2005 to 2015.

From the other viewpoint, Canh et al. (2021) emphasize that informal economic activities are less environmentally friendly and more energy-intensive, given that environmental rules and regulations are largely ignored. They indicate that compliance with the official requirements is stronger in the formal sector than in the informal sector. Hence, the formal sector avoids costly rules and regulations by outsourcing parts of the production process to the informal sector. Also, Baksi and Bose (2016) identify that the informal sector can act as an origin of leakage by allowing the formal sector to facilitate the outsourcing of polluting production.

From another perspective, Bento et al. (2018) show that energy taxes may create welfare gains through easing substitution from the shadow economy. They mention two mechanisms by which the informal sector may lower the costs of energy taxes and environmental policies: (i) energy taxes impose indirect costs on the informal sector as informal firms order energy sources through the formal sector, e.g., natural gas, electricity, and gasoline, and (ii) a revenue-neutral shift in tax bases toward energy sources that could mitigate the tax burden on the goods substituted via the informal sector. These mechanisms are determined to have remarkable potential in the welfare-improving substitution from the informal sector to the formal sector. Accordingly, a well-judged policy portfolio is required to reduce the damaging effect of the informal sector on environmental quality.

The second category of recent studies is associated with the scale, composition, and technique effects mechanisms where the underground economy is related to environmental pollution (Chu 2022; Sohail et al. 2021; Nkengfack et al. 2021).

In this respect, Chu (2022) employs an “advanced panel quantile regression” to study the effect of the shadow economy on the ecological footprint of the OECD economies from 1995 to 2015. The empirical findings indicate that due to the composition and technique effects, the shadow economy negatively impacts the mid to high quantiles of the ecological footprint. Furthermore, technological innovation exerts a favorable effect on lowering the ecological footprint, but it is non-monotonic across quantiles. The study concludes that higher informality mitigates the favorable impact of technological innovation on environmental quality.

In a group of countries with highly informal economies and as a consequence of the composition effect, Sohail et al. (2021) found a significantly negative nexus between the informal economy and \({\mathrm{CO}}_{2}\) emissions in Nepal and Bangladesh. However, the beneficial and detrimental role of the informal economy is suggested through a non-linear relationship in Sri Lanka and Nepal, respectively. Also, Nkengfack et al. (2021) find that higher informal activities in Sub-Saharan African countries with lower middle income reduce the \({\mathrm{CO}}_{2}\) emissions in the long run.

In conclusion, the nexus between the shadow economy and environmental degradation can be non-monotonic when associated with the pollution level. Besides, taking into account the two distinct impacts of informality, it is required to investigate the comprehensive effect of the shadow economy on the \({\mathrm{CO}}_{2}\) emissions, especially from the left to the right tail of the emissions’ conditional distribution. To this end, the impacts of explanatory variables on low, medium, and high quantiles of \({\mathrm{CO}}_{2}\) emissions are examined in this paper by the “panel quantile regression model” (Canay 2011). Furthermore, the evolution of the covariates’ slopes over time is analyzed by the “panel smooth transition regression model” (González et al. 2005).

Linking the controlling variables and the CO2 emissions

In parallel, there is a growing literature that examines the role of economic growth on \({\mathrm{CO}}_{2}\) emissions (Sultana et al. 2022; Ahmad et al. 2022; Onifadea et al. 2021, among others). In this regard, Li and Wei (2021) use the panel smooth transition regression method (PSTR) and indicate the existence of non-linear relationships between economic growth, innovation, trade openness, financial development, and carbon emissions.

Wang and Wei (2020) also apply the PSTR method and show that emerging economies associate with a strict level of environmental regulation, which will cause serious “green paradox” effects and delay economic development. Furthermore, they illustrate that the rebound effects of technology on energy consumption increase the CO2 emissions in the OECD countries. Notably, technological improvements enhance economic growth and promote efficiency, decrease the cost of energy consumption, and, therefore, increase energy usage, which is called the rebound effect (Lin and Du 2015). Panayotou (2016) utilizes a set of “reduced form single-equation” models and shows that the potential inverted U-shaped relationship is justified through the “scale, composition, and technique effects.”

Particularly, scholars underscore that economic growth requires more input volume, producing more natural resource usage. In other words, during the inefficient industrialization process, the environmental consequences are neglected, which refers to the scale effect and leads to increasing pollution (Ahmed et al. 2020a; Zafar et al. 2019; Banday and Aneja 2019). However, as economic growth increases, the structure of an economy changes called the composition effect, which gradually expands the cleaner activities, producing lower pollution (Dinda 2004) due to more efficient human capital accumulation. At high levels of economic growth, the underdeveloped and developing technologies are replaced by the developed and cleaner technologies, called the technique effect, which can enhance the environmental quality (Bano et al. 2018).

Concerning other controlling variables, urbanization and industrialization require a higher volume of inputs that leads to more natural resource usage in both consumption and production activities (Langnel and Amegavi 2020), which causes more natural resource extraction (Ahmed et al. 2020a) and \({\mathrm{CO}}_{2}\) emissions (Su et al. 2022; Langnel et al. 2021; Ullah et al. 2021; Ahmed et al. 2020b; Danish 2019).Footnote 1

Finally, most studies argue that postponing the switch from the carbon-based fuel portfolio to clean sources does not reduce fossil fuel consumption and lowers the \({\mathrm{CO}}_{2}\) emissions, suggesting the resource rents raise the environmental degradation (Adekoya et al. 2022; Ali et al. 2021; Langnel et al. 2021; Shahzad et al. 2021; Ullah et al. 2021; Ebrahimi Salari et al. 2021; Nathaniel and Khan 2020; Zafar et al. 2020; Ahmed et al. 2020a,b; Destek and Sinha 2020; Danish 2019; Alola et al. 2019; Ozcan et al. 2019; Li and Sun 2018; Ulucak and Lin 2017). On the other hand, Ulucak and Khan (2020) reveal that renewable energy and resource rents reduce the ecological footprint across the BRICS economies, suggesting their interesting role in reducing environmental degradation.

From the survey of the reviewed literature, it is detected that the contradictions remain in the nexus between the controlling variables, such as economic growth, urbanization, technological progress (Trend), fossil fuel rents, and environmental degradation, which necessitate further investigation. Accordingly, this study fills in the gap and aims to explore new insights by considering homogeneous groups of countries to derive relevant policy implications adapted to the different groups that promote their sustainable economic growth with a low environmental cost.

Research hypothesis

Underlying gross domestic product measurement errors lead to biases in assessing the relationship between economic activity and environmental degradation. This problem is particularly acute for less developed countries, where these measurement errors tend to be higher. Thus, it is crucial to include estimates of the underground economy when testing the EKC hypothesis, which leads us to our first hypothesis.

  • Hypothesis 1. The shadow economy affects carbon dioxide emissions, and we expect it to cause an increase in emissions for poorer countries for which gross domestic product measurement errors are typically high.

To have a big picture of the role of the shadow economy in environmental degradation behavior, the researchers can use large panels of countries. However, the indiscriminate inclusion of countries in panel estimations leads to huge heterogeneity that blurs the quality of these estimations, allowing conclusions that do not cope well with the idiosyncratic features of countries. The computationally intensive econometric technique proposed by Lin and Ng (2012) was used to circumvent the problem of heterogeneity in longitudinal data, which allows for identifying groups with homogeneous parameters. Indeed, this technique can be used to identify groups of countries with homogeneity of parameters for an econometric specification made a priori.

A proper assessment of the EKC hypothesis requires the inclusion of control variables that account for the different characteristics of countries. Otherwise, we might incur an omitted variables problem that leads to biased estimates and erroneous conclusions. We choose manufacturing value-added as a percentage of GDP, the urbanization rate, and fossil fuel rents as a portion of GDP as control variables. We expect higher urbanization and industrialization to translate into higher CO2 emissions, as production activities require more inputs and resources. Regarding fossil fuel rents, we conjecture that countries that rely heavily on fossil fuels have few incentives to promote their transitions to renewable energy sources. Thus, we expect their pollution levels to be higher.

  • Hypothesis 2. Manufacturing value-added as a percentage of GDP, the urbanization rate, and fossil fuel rents as a fraction of GDP contribute to raising CO2 emissions.

Another neglected factor in prior studies focusing on the EKC hypothesis is time. Past research implicitly assumes that countries that share the same values for GDP and other control variables are expected to have the same level of CO2 emissions. In our view, this conjecture is incorrect, as late-developing countries benefit from more advanced and clean technologies that allow them to achieve the same GDP level while polluting less. This fact leads us to formulate our third hypothesis.

  • Hypothesis 3. Time, as a proxy for technological development, mitigates carbon dioxide emissions, given the development level of countries.

Time may also influence the linkages between the explanatory variables and carbon dioxide emissions. As technology evolves, the marginal impact of manufacturing, urbanization, and other variables on emissions may change. Thus, we resort to the panel smooth transition regression model, proposed by González et al. (2005), to evaluate whether the marginal effects of the regressors on emissions are time varying.

  • Hypothesis 4. The marginal effects of the explanatory variables on carbon dioxide emissions change over time.

Data and methods

In this section, the data and method are explained in detail.

Data

In this research, we select the annual carbon dioxide emissions in metric tonnes per capita (CO2), retrieved from the World Bank Open Data (2021), as our measure of environmental degradation. The extensive use of this variable in studies focusing on the environmental Kuznets curve motivates our choice. We consider several drivers of emissions, listed in Table 2.

Table 2 Data description

The availability of the shadow economy variable drove our choice of the list of countries under study and the period. Medina and Schneider (2019) define the shadow or informal economy as all economic activities hidden from official authorities for monetary, regulatory, and institutional reasons. The unobserved economy is noticeably difficult to measure, as the entities that engage in these activities want them to remain unnoticed. In general terms, there are two different types of methods to assess the size of the informal economy: direct approaches and indirect approaches. The first type relies on surveys and samples, and their results are sensitive to the manner in which the questions are formulated and the willingness of the respondents to reveal their activities, rendering them unreliable truthfully.

Indirect methods resort to indicators to assess the size of the shadow economy, such as the difference between national expenditure and income, the difference between the actual and official labor force, and electricity consumption, among others. Medina and Schneider (2019) use the “Multiple Indicators, Multiple Causes” method (MIMIC), which is more robust than single indicator methods, as it considers several causes for the extent of the shadow economy and various effects. This method links the observable causes to their effects on an unobservable variable to estimate the size of the shadow economy. Specifically, they consider eight causes for the informal economy (trade openness, GDP per capita, unemployment rate, size of government, fiscal freedom, rule of law, control of corruption, and government stability) and three effects (currency as a fraction of broad money, labor force participation, and the growth rate of GDP per capita). Using the MIMIC method with the causes and effects described above, these authors obtain annual estimates for the shadow economy covering 157 countries from 1991 to 2017. We excluded twelve countries that did not have information on all the variables pertinent to this study for at least 10 years. Thus, our final sample includes annual data for 145 countries from 1991 to 2017.

Figure 1 reveals the world mean and standard deviation of three shadow/informal economy measures estimated by Medina and Schneider (2019) and Elgin et al. (2021). One can see a trend for decreasing the average and the dispersion in the phenomena. The decoupling of estimates starts with the event of the financial crisis of 2007 and is more pronounced for the estimates (Shadow) of Medina and Schneider (2019) than for the ones (DGE and MIMIC) of Elgin et al. (2021).

Fig. 1
figure 1

Three measures of the shadow economy. Shadow denotes the size of the shadow economy in % of GDP as estimated by Medina and Schneider (2019); DGE denotes dynamic general equilibrium model-based (DGE) estimates of informal output (% of official GDP), and MIMIC multiple indicators multiple causes model-based (MIMIC) estimates of informal output (% of official GDP), both as estimated by Elgin et al. (2021); this figure was created by the authors based on the data from Medina and Schneider (2019) and Elgin et al. (2021)

Table 3 shows the descriptive statistics for all the countries in the sample, group 1 and group 2. The following section describes the method used to create the country groups. Finally, in Appendix A, we offer the complete country list and groups 1 and 2 composition. Curiously, even though the country selection procedure is based exclusively on the data and includes no aprioristic view about the groups’ compositions, despite its smaller size (42 countries in group 1 versus 103 in group 2), group 1 includes most countries of the European Union plus the UK, and the largest developed economies, such as the US and Japan.

Table 3 Descriptive statistics

The average per capita CO2 emissions are substantially higher for group 1 countries than for group 2. The first country group is wealthier, and the size of its informal economy, measured in constant 2010 USD, is higher. However, its average size relative to the gross domestic product is lower (16.24% and 20.38% for groups 1 and 2, respectively). Group 1 countries have a more urban population and a slightly higher gross domestic product and manufacturing share. There is a sizable difference regarding fossil fuel rents, those of the first group more than tripling of the second group.

Methods

We begin this subsection by detailing preliminary tests to assess the data properties. Then, we present the main model and estimation methods: (i) Canay (2011) panel quantile regression, (ii) panel fixed effects, and (iii) panel smooth transition regression (González et al. 2005). Finally, we describe the procedure proposed by Lin and Ng (2012) to deal with the potential problem of parameter heterogeneity, which involves forming country groups based on the fixed-effects estimates.

Preliminary tests

We perform the following tests to check the properties of the variables used in our analysis:

  1. (I)

    The Shapiro–Wilk normality test (Royston 1983). This test verifies whether the variables are normally distributed. The null hypothesis states they are.

  2. (II)

    The variance inflation factor (VIF) is used to evaluate the degree of multicollinearity between the explanatory variables. High multicollinearity may render parameter estimates unstable.

  3. (III)

    Cross-sectional dependence test (Pesaran 2021). According to the null hypothesis, there is no correlation between different units. If this hypothesis does not hold, the traditional fixed effects estimator is inefficient, and the standard errors are biased.

  4. (IV)

    Panel unit root test (CIPS) analyzes the stationarity of the variables (Pesaran 2007). The null hypothesis of this test states that the series is non-stationary. Spurious regression is a common consequence of using integrated variables.

  5. (V)

    Cointegration test (Pedroni 1999). According to the null hypothesis, variables are not cointegrated. A set of integrated variables of order one is cointegrated if a linear combination of these variables is stationary. Cointegration avoids the spurious regression problem and renders parameter estimates super-consistent.

  6. (VI)

    The Hausman test (Hausman 1978) compares the fixed and random effects estimates. Under the null hypothesis, the random effects estimator is consistent and efficient. However, if the unobserved effect is correlated with the independent variables, the null hypothesis fails, and the fixed effects estimator should be chosen.

Panel quantile regression

Next, we proceed to the estimation stage. We resort to panel quantile regression as our main estimation method. This method has several advantages compared to traditional least-squares: (i) it is robust in the presence of outliers, which may generate large swings in least-square parameter estimates; (ii) it works well with non-normally distributed data; and (iii) we can get a thorough image of the impact of the carbon emission’ drivers on all its distribution because, unlike least-square methods, quantile regression does not model the conditional mean exclusively.

We conjecture the relationship between per capita carbon dioxide emissions and their drivers following the specification below:

$${CO2}_{i,t}={{\alpha }_{i}+X}_{i,t}^{\prime}\beta \left({U}_{i,t}\right)$$
(1)

where the index \(i=\mathrm{1,2},\dots ,145\) identifies the country; \(t=\mathrm{1991,1995},\dots ,2017\) corresponds to the observation year; \({\alpha }_{i}\) is the country’s fixed effect; \({X}_{i,t}{\prime}=\left[{1, GDP}_{i,t}, {GDP2}_{i,t}, {Shadow}_{i,t}, {Manuf}_{i,t}, {Rent}_{i,t},{Urban}_{i,t},{trend}_{t}\right]\) is the explanatory variables’ vector; \(\beta \left({U}_{i,t}\right)=\left({\beta }_{0},{\beta }_{1},\dots ,{\beta }_{7}\right)\) is the coefficients’ vector; and \({U}_{i,t}\) is a uniformly distributed random variable on the interval \(\left[\mathrm{0,1}\right].\)

The relationships between CO2 emissions and economic growth, the share of manufacturing in GDP, and urbanization are extensively documented in the literature. However, including the remaining covariates in our model deserves further justification. First, countries that draw high rents from fossil fuels may be unwilling to invest in environmentally friendly technologies and renewable energy sources. They have vast non-renewable energy resources to ensure their energetic security and develop their economies. Thus, we expect the variable Rent to affect CO2 emissions positively. The inaccuracy of gross domestic product explains the inclusion of the shadow economy among the covariates as a measure of the true level of economic activity. Even though countries should include all non-observable economic activities in their GDP estimates, some choose not to.Footnote 2 Many others do not exhaustively measure the shadow economyFootnote 3 (United Nations Economic Commission for Europe 2008; Gyomay 2014). Also, the least developed countries often lack the means and/or the willingness to obtain reliable estimates of the underground economy. Finally, we believe time is a very important and often neglected variable in environmental Kuznets curve studies. The traditional approach to the environmental Kuznets curve assumes that two countries that share the same characteristics (GDP and other control variables) in two different years should have the same carbon dioxide emissions. This conjecture is incorrect, as late developers have access to innovative and eco-friendly technologies. Furthermore, the rising worldwide pressure to reduce greenhouse gas emissions to mitigate global warming is likely to drive a decrease in carbon emissions in the most recent years, ceteris paribus. Thus, we expect the passage of time to contribute to decreasing emissions.

Simple quantile regression may lead to inconsistent parameter estimates in Eq. (1) when the independent variables are correlated with the unobserved fixed effects. Canay (2011) designed a simple two-step estimation procedure to overcome this problem. Let \({u}_{i,t}\equiv\) \({X}_{i,t}^{\prime}\left[\beta \left({U}_{i,t}\right)-{\beta }_{\mu }\right]\), where \({\beta }_{\mu }\) is the conditional mean of \(\beta \left({U}_{i,t}\right)\). From Eq. (1), we get

$${CO2}_{i,t}={{\alpha }_{i}+X}_{i,t}{\prime}{\beta }_{\mu }+{u}_{i,t}$$
(2)

The method proposed by Canay to estimate the quantile τ’s parameters proceeds as follows:

  1. (1)

    Get a consistent estimate of \({\beta }_{\mu }\), using Eq. (2), and let \({\widehat{\alpha }}_{i}\equiv {T}^{-1}\sum\limits_{t=1}^{T}\left[{CO2}_{i,t}-{X}_{i,t}^{\prime}{\widehat{\beta }}_{\mu }\right]\), where \({\widehat{\beta }}_{\mu }\) is the estimate of \({\beta }_{\mu }\).

  2. (2)

    Estimate the coefficients of the explanatory variables for quantile τ by solving the following problem:

    $$\widehat{\beta }\left(\tau \right)\equiv {argmin}_{\beta }\frac{1}{T\times N}\sum\limits_{t=1}^{T}\sum\limits_{i=1}^{N}\left[{\rho }_{\tau }\left({\widehat{CO2}}_{i,t}-{X}_{i,t}^{\prime}\beta \right)\right]$$
    (3)

where \({\widehat{CO2}}_{i,t}={\mathrm{CO}2}_{i,t}-{\widehat{\alpha }}_{i}\), \({\rho }_{\tau }\) is the check function for quantile τ, T, and N denote the number of years and the number of countries in our sample, respectively. Canay (2011) shows that the estimates obtained using this method are consistent and asymptotically normal. Therefore, we use the fixed-effects estimates in the first stage of our application. Then, we compute the standard errors through a simple bootstrap procedure involving 1000 replications.

We compare the fixed-effects estimates for Eq. (1) with the quantile regression results.

Panel smooth transition regression

After confirming the impact of time on carbon dioxide emissions, we dig further into the analysis and study the channels through which time affects emissions. Note that it may change the sensitivity of emissions to the covariates. As technology advances, manufacturing may become less carbon-intensive, and improving building techniques may make dwellings more energy-efficient, thus weakening the linkage between these variables and emissions. Assessing the evolution of the GDP coefficients is also interesting, as late-developing countries may follow different paths to achieve the same level of well-being as early developers.

Beyond the direct effect of time on carbon dioxide emissions, it may also affect the strength of the linkages between the covariates and CO2. That is, as technology evolves and populations increase their demands for a reduction in emissions, the marginal effect of manufacturing and urbanization on emissions may diminish as industries and construction change to cleaner production techniques. To assess the indirect impact of time on emissions from its indirect effects, we resort to the panel smooth transition regression (PSTR) model (González et al. 2005). We may see this model as a generalization of the panel threshold regression (PTR) model developed by Hansen (1999), which allows for abrupt changes in the regression coefficients that depend on the value of an observed variable. However, the PSTR model is a nonlinear homogeneous panel model that contemplates rough parameter changes contrary to Hansen’s PTR. In this research, we use the following PSTR specification:

$${CO2}_{i,t}={{\alpha }_{i}+X}_{i,t}^{\prime}{\beta }_{0}+\sum\nolimits_{j=1}^{r}{\mathrm{X}}_{i,t}^{\prime}{\beta }_{j}{g}_{j}\left({q}_{i,t}^{\left(j\right)};{\gamma }_{j};{c}_{j}\right)+{u}_{i,t}$$
(4)

where \({X}_{i,t}^{\prime}\) is the covariates vector for country i in year t; \({\alpha }_{i}\) denotes the fixed individual effect for country i; \({\beta }_{0}\) is the constant part of the coefficients’ vector for the covariates; \({\beta }_{j}\) is the coefficients’ vector corresponding to the transition function j; \({g}_{j}\left({q}_{i,t};{\gamma }_{j};{c}_{j}\right)=\left[1+\mathrm{exp}\left(-{\gamma }_{j}\left({q}_{i,t}^{\left(j\right)}-{c}_{j}\right)\right)\right]\) is the logistic transition function j; \({q}_{i,t}\) is the corresponding transition variable; and \({\gamma }_{j}\) and \({c}_{j}\) are the slope and location parameters for the transition function j, respectively. Next, we select the trend as the transition variable, allow for a maximum of three regimes (r = 2), and estimate Eq. (4) by nonlinear least squares using the MATLAB code provided by Fouquau et al. (2008) and Colletaz and Hurlin (2006). Finally, we choose the optimal number of regimes using the homogeneity and the no remaining heterogeneity tests (see González et al. (2005) for further details).

Lin and Ng (2012) estimation procedure

Parameter heterogeneity is a problem that plagues most macro panels with a large number of countries, such as ours and compromises the validity of the estimates. It is well known that ignoring parameter heterogeneity in a panel data framework leads to inconsistent parameter estimates (Campello et al. 2019). A method to circumvent this problem is to assume complete parameter heterogeneity. However, this solution turns it into a time series estimation that forfeits the benefits of the panel data structure. Lin and Ng (2012) propose an alternative solution, which assumes there are various groups of countries with homogeneous slope parameters within each group but allows for slope heterogeneity for different groups. This method establishes a compromise between full parameter homogeneity and full parameter heterogeneity and maintains the advantages of using a panel data structure without suffering from the estimation bias that afflicts fixed-effects estimates for the full sample. The procedure designed by these authors takes an agnostic view of the number and composition of the groups and uses a modified K-means algorithm to attain conditional clustering. First, the algorithm requires choosing the number of groups, G, and the random assignment of all countries to one of the groups. Then, we repeat the following two stages until convergence is achieved:

  1. 1-

    Estimate the fixed effects slope coefficients, \({\beta }_{g}\), for each group.

  2. 2-

    Assign country i to group g', where g' is the solution to the following minimization problem:

$$g'={argmin}_g\sum\nolimits_{t=1}^T\left[{C\overset\dots O2}_{i,t}-{\overset\dots{X'}}_{i,t}{\widehat\beta}_g\right]^2$$
(5)

where \({C\overset\dots O2}_{i,t}\) and \({\overset\dots{X'}}_{i,t}\) are the demeaned dependent variable and independent variables vector for the country i in year t, respectively, and \({\widehat{\beta }}_{g}\) denotes the estimate of the slope for group g. Step 2 must be applied to every country in the sample.

In this study, we consider a maximum of five groups and repeat the algorithm described above one million times for each choice of G, as Lin and Ng (2012) show that the final estimates are sensitive to the initial group allocations. Then we compute the following modified BIC criterion for each replication and number of groups (one million times five, which equals five million):

$$BIC\left(\widetilde{G}\right)=ln\left[\frac{1}{NT}{\sum }_{g=1}^{\widetilde{G}}\sum\nolimits_{i\in {I}_{g}}\sum\nolimits_{t=1}^{T}{\left[{C\overset\dots{O2'}}_{i,t}-{\overset\dots{X'}}_{i,t}^{\prime}{\widehat{\beta }}_{g}\right]}^{2}\right]+\widetilde{G}K\frac{{c}_{NT}ln\left(NT\right)}{NT}+\left(\widetilde{G}-1\right)\frac{ln\left({N}^{2}\right)}{{N}^{2}}$$
(6)

where K denotes the number of regressors, \({c}_{NT}=\sqrt{min\left(N,T\right)}\), and \({\widehat{\beta }}_{g}\) is the vector of estimates for group g. Finally, we choose the group compositions and corresponding estimates that minimize this criterion.

Robustness check

Panel data estimates are sensitive to outliers, which may produce distorted estimates and lead to erroneous conclusions. To assess whether outliers drive our results, we control for their presence by introducing dummy variables and repeating all the estimations. We consider an observation an outlier whenever the estimated residual is more than three standard deviations away from the mean.

Empirical results

In the first part of this section, we show the results of the preliminary tests.Footnote 4 Then, we analyze the fixed effects and quantile regression estimates. Finally, in the last part, we present a figure that depicts the time variation of the parameter estimates implied by the PSTR model.

Preliminary tests

Table 4 shows the Shapiro–Wilk test for normality. We reject the null hypothesis of normality for all variables. Furthermore, our primary method (quantile regression) is robust even when the data distribution is not normal. Thus, the absence of normality is not a problem.

Table 4 The Shapiro–Wilk test for normality

All variance inflation factors (Table 5) are lower than 10, except for GDP. This finding is unsurprising: by construction, we should expect a high correlation between GDP and GDP2. Moreover, the mean–variance inflation factor is well below the commonly accepted threshold of six. Thus, we need not be concerned that serious multicollinearity affects our estimates.

Table 5 Variance inflation factor

Cross-sectional dependence is a common feature of all variables (Table 6). This result is hardly surprising. The rising interdependence between economies implies that shocks propagate swiftly across the world. An undesirable consequence is that standard errors from the regular fixed effects estimator are biased. Thus, we resort to Driscoll and Kray’s (1998) standard errors in the fixed-effects estimates and bootstrapped standard errors in the panel quantile regressions.

Table 6 The Pesaran CD test

Cross-sectional dependence affects the size and power of the first-generation panel unit root tests. Therefore, we choose to test the order of integration of the variables through the Pesaran (2007) panel unit root test robust to this feature. In the specification without a trend, only the variable Manuf is stationary, whereas when a trend is included, none is (Table 7).

Table 7 Panel unit root test (CIPS test)

When the variables are not stationary, regression methods may find fallacious relations between the variables driven by their trending nature. This phenomenon, known as spurious regression, may lead to erroneous conclusions about the impact of the covariates on the dependent variable. This problem is absent when the non-stationary variables are co-integrated when a linear combination of the variables is stationary. In this scenario, the estimates are super-consistent (Hamilton 1994). Therefore, we use the Pedroni (1999) cointegration test to ensure that our estimates are reliable. All versions of this test reject the null hypothesis of no cointegration (Table 8). Thus, we do not need to worry about drawing wrong inferences from our estimates because the errors are stationary.

Table 8 Pedroni cointegration test

Finally, the Hausman test (Table 9) strongly rejects the null hypothesis that the random effects estimator is consistent. Thus, we must resort to the fixed effects estimator.

Table 9 Hausman test

Fixed effects and panel quantile regression

Table 10 displays the full sample estimation results. The fixed-effects estimates for the gross domestic product do not contradict the environmental Kuznets curve hypothesis, but they do not offer strong support either: the coefficients have the expected sign but are not statistically significant. The shadow economy increases CO2 emissions, which confirms hypothesis 1. The manufacturing share, fossil fuel rents, and urbanization also raise CO2 emissions (hypothesis 2), confirming our conjecture that countries with vast fossil fuel resources have fewer incentives to invest in green technologies and renewable energy. The passage of time contributes to the mitigation of environmental degradation, driven by the adoption of innovative clean technologies and the increasing pressure to combat global warming (hypothesis 3). The quantile regression estimates for the gross domestic product confirm the hump-shaped relation between emissions and economic growth in the two highest quantiles. The remaining coefficients’ signs are coherent with the fixed-effects estimates, except for Rent in the 10th quantile. The impact of GDP, the shadow economy, manufacturing share, urbanization, and fossil fuel rents on CO2 increases as we move from the left to the right tail of the emissions’ conditional distribution. On the contrary, the mitigating effect of the trend of environmental degradation is stronger in the middle quantiles.

Table 10 Estimates for_CO2

The split of countries into two groups (Table 11) uncovers the hump-shaped pattern between CO2 emissions and economic growth characteristic of the environmental Kuznets curve, which was concealed by parameter heterogeneity in the complete sample. However, the turning points of emissions, 41,555 USD for group 1 and 72,062.5 USD for group 2, are well above the current GDP of most countries. We can also observe that the CO2 emissions of countries from group 1 are more sensitive to economic growth than those of group 2. The differing impact of the unobserved economy on CO2 for the two groups—negative for group 1 and positive for group 2—deserves further study. The remaining coefficients agree with the complete sample estimates, except for the manufacturing coefficient for group 2. The effects of manufacturing and urbanization on CO2 emissions are stronger for group 1 than group 2, while the reverse happens for fossil fuel rents. Combining these two later results is evidence of the resource curse for group 2 (e.g., Fuinhas and Marques 2013). The passage of time mitigates emissions the most for group 1. The quantile regression estimates agree with the fixed-effects estimates in sign and significance, except for Rent (10th and 25th quantile for group 1) and Trend (90th quantile for group 2).

Table 11 Estimates for CO2 for the two groups

Appendix B shows the fixed effects and quantile estimates for the full sample (Table 12) and the two groups (Table 13), when we use dummy variables to control for outliers. These new estimates show slight changes relative to the previous ones, which asserts the robustness of our results.

Panel smooth transition regression

Figure 2 depicts the evolution of the covariates’ slopes over time, implied by the PSTR estimates. In all cases, three regimes were chosen (r = 2). The linearity and no remaining heterogeneity test results and the final model estimates are available from the authors on request. For the complete sample (blue line), we observe a weakening tendency for the linkage between CO2 emissions and GDP, whereas the impact of fossil fuel rents strengthens over time, which is consistent with hypothesis 4. The other slopes remain broadly stable. For group 2 (green line), we detect the same tendencies for GDP and Rent, but the environmental degradation caused by the manufacturing share disappeared in the most recent years. Indeed, the direct mitigating effect of variable Trend (i.e., the effect of time not mediated by changes in other covariates' slopes), identified in Table 9, completely vanishes. The only clear patterns for group 1 (red line) are the increasing impacts of manufacturing and fossil fuel rents. Moreover, unlike group 2, the effect of time on CO2 emissions was more pronounced and showed a decreasing intensity in the middle of the 2000s. It does not operate through changes in the sensitivities of CO2 emissions to the remaining covariates.

Fig. 2
figure 2

The evolution of the covariates’ slopes over time implied by the PSTR estimates. The blue line represents the full sample, the red line corresponds to the first group and the green line to the second group

Figure 3 in Appendix B shows the evolution of the estimates when we control for outliers. We only observe minor changes relative to the Fig. 2 in the main text. Thus, our conclusions remain essentially unchanged.

Discussion

The adverse environmental impacts of the shadow economy in the second group of countries, which has lower average per capita CO2 emissions than group 1, are evident in all quantiles of the empirical findings, confirming our first hypothesis. Furthermore, similar results are detected by Pang et al. (2021) and Zhou (2019) in China, Baksi and Bose (2021) in developing countries, and Canh et al. (2021) but are inconsistent with Sultana et al. (2022), Chu (2022), and Sohail et al. (2021). The underlying theoretical mechanism of the noxious impacts is that the scale effect dominates the composition and technique effects. Specifically, the informal sectors mainly contain small, unlicensed agents characterized by underdeveloped or developing technologies and high CO2 emissions, which are hard to discover and regulate. This matter facilitates increasing underground production, especially through manufacturing activities, and hence is considered one of the main contributors to the improper environmental pressures in such economies (Elgin and Mazhar 2013). Also, the formal sector may avoid costly rules and regulations by outsourcing parts of the production process to the informal sectors. In addition, the informal transportation sector also leads to huge exhaust emissions in this group due to inefficient vehicles in clandestine transportation activities, which often hardly satisfy environmental standards (Zhou 2019).

In contrast, in the first group of countries, which includes most countries with a smaller average size of their informal economies relative to GDP, the adverse effects of the informal economy turn into favorable effects that enhance the environmental quality (Bano et al. 2018). This finding is consistent with Chu (2022). It can be justified by the fact that the composition or technique effect dominates the scale effect in the countries with less participation in the underground economy, which have higher levels of institutional quality and government effectiveness. In low environmentally degraded economies, the stringent environmental regulations and policies cause the informal sectors to protect the environment rather than pollute it. This process relates to less capital-intensive techniques, making the informal sector less motivated to participate in environmental destruction (Elgin and Oztunali 2014).

Moreover, from the left to the right tail of the emissions’ conditional distribution in each group of countries, the shadow economy’s positive or negative effects on CO2 emissions are relatively unchanged in intensity. Meanwhile, the full sample results indicate the overall positive response of CO2 emissions in reaction to the development of the shadow economy, which shows the dominant role of the countries with the greater average size of their informal economies relative to the gross domestic product in explaining CO2 emissions. It means that the scale effect neutralizes the appropriate impacts of the composition effect and technique effect across the full sample. The potential explanations to justify this relationship are as follows: first, technological development and decreasing CO2 emissions associated with underground production need considerable private and public investment. The informal sectors may not have enough financial resources for efficient technological transformation, especially through manufacturing activities requiring long-term and large-scale investment to capture the benefits (Elgin and Mazhar 2013). Second, technology transfer associated with switching from the scale effect toward the composition and technique effect requires the connectedness network to share and adopt knowledge, which entails institutional development and formal procedures (Lassa 2019). Third, the higher levels of informality, which hold for individual participants and enterprises that run outside the institutional framework, make them uninterested in engaging in this formal process to keep away from the government’s regulations. Last, human capital is a major factor affecting the adoption level of environmental innovations (De Marchi and Grandinetti 2013). However, the informal economy does not support and facilitate human capital capacity building regarding research, development, and innovative technologies.

Accordingly, the higher quality of government regulation significantly intensifies the government’s ability to establish procedures for environmental protection (Ortiz et al. 2022). Therefore, switching from underground activities to the formal economy relates to lower environmental degradation, ceteris paribus. This finding may motivate environmental policymakers to focus more on the policies related to the shadow economy.

Besides the common elements examined in the previous literature with similar viewpoints on structural, technological, and economic effects, this paper introduces the shadow economy and Trend as the significant determinants of pollution. Also, we compare how the interaction between these two factors affects CO2 emissions across two groups of countries with different levels of environmental regulations and technologies. To the best of our knowledge, the previous studies (Chu 2022; Adekoya et al. 2022; Alvarado et al. 2021; Deng et al. 2020; Wang et al. 2019; Canh 2019; Shahbaz et al. 2017) focused on the production process as the key factor affecting pollution and on how the structural and technological changes helped to reduce emissions. In this regard, the upward impacts of manufacturing value-added and fossil fuel rents on the emissions in group 1 are just the clear patterns in this study, corroborating our second hypothesis (Fig. 2).

Our study confirms the scale effect in group 2 (Pang et al. 2021; Zhou 2019; Baksi and Bose 2021; Canh et al. 2021; Wang and Wei 2020; Dinda 2004), while the composition and technique effects in group 1 (Chu 2022; Bano et al. 2018). As the other contribution of this paper, the suggested outcome shows that innovation and technological progress contribute to the mitigation of environmental degradation (Sultana et al. 2022; Chu (2022);; Sohail et al. 2021; Ulucak 2021; Hussain & Dogan 2021; Hussain et al. 2020), which confirms our third hypothesis. Particularly, it is driven by the adoption of the escalation of the industrialized energy system, energy efficiency, the policies attracting investment for new and renewable energy sources, and the technological and structural advancement for exhaust gas emissions reduction, which is in line with studies by Su et al. (2022), Zhou (2019), Wang et al. (2016), and Zhang and Lin (2012).

Notably, the average size of technological improvement in both sub-groups does not seem large enough to significantly support the shadow economy’s proper impact and considerably reduce environmental concerns. Therefore, weak institutional and corrupted economies should avoid deepening the shadow economy due to stricter environmental terms and regulations (Rafique et al. 2021; Chaudhuri & Mukhopadhyay 2006; Fleming et al. 2000).

Consequently, structural changes and technological improvement are suggested to dominate the improper impacts of scale effect and capture the benefits of the composition and technique effects by the economies, lowering the CO2 emissions. Also, research and development projects, innovation, and technological improvement can lead to environment-friendly production processes and techniques across urban areas. These proceedings can mitigate the adverse environmental effects of growing urban infrastructure and contribute to fewer CO2 emissions.

Moreover, explicitly and implicitly, policies to switch from a carbon-based fuel portfolio to new and renewable energy technologies will impact emissions reduction. Specifically, the explicit impact relates to the non-carbon-based fuel portfolio, while the implicit effect refers to reduced rent for fossil fuels. Besides, efficient government performance to promote environmental regulations and shrink the size of the informal economy helps the countries increase the environmental benefits achieved by the structural, technological, and resource portfolio policies, especially in developing and underdeveloped economies.

Conclusion and policy implications

The linkage between the shadow economy and carbon dioxide (CO2) emissions was evaluated for 145 countries from 1991 to 2017. In assessing the effect of the shadow economy on CO2 emissions, the estimation methods of Canay’s (2011) panel quantile regression, panel fixed effects, and panel smooth transition regression were used. In addition, to deal with the problem of parameter heterogeneity, we used the procedure of Lin and Ng (2012) to identify groups of countries that share homogeneous parameters.

Identifying groups of countries that share homogeneity of parameters is an advised procedure when we are researching the impact of the shadow economy on the environmental degradation of the panel of countries. Indeed, this research reveals important differences in the impact of variables, otherwise masked if estimation was done for the panel of countries. For example, no environmental Kuznets curve was found for all countries. Nevertheless, we found that this environmental Kuznets curve has statistically significant parameters for the groups of countries with homogeneous parameters. This result supports different turning points for different groups of countries.

The procedure of Lin and Ng (2012) identified two groups sharing parameter homogeneity. Group 1 comprises 42 countries and includes most countries of the European Union plus the UK and the largest developed economies, such as the US and Japan. Group 2 comprises 103 countries. However, the two groups also reveal differences, as the average per capita CO2 emissions of Group 1 countries are higher than those of Group 2. Furthermore, group 1 is richer, has a higher informal economy dimension, a smaller relative average GDP, a more urban population, and a slightly higher manufacturing share of GDP. Nevertheless, the most sizable difference between the groups regards fossil fuel rents for group 1 more than triple that of group 2.

Shadow economy was revealed to contribute to reducing CO2 emissions in group 1 and aggravate it in group 2. Manufacturing value-added as a percentage of GDP was only statistically significant for the countries of group 1. Fossil fuel rents as a percentage of GDP aggravate the CO2 emissions, but more markedly in group 2 countries. Urban population as a percentage of the total population contributes to the aggravation of CO2 emissions in both groups of countries, but much more intensely for group 1. We also found evidence of a tendency to decrease CO2 emissions, reflecting efficiency gains over time.

The evolution over time of the covariates’ slopes implied by the PSTR estimates for the complete sample supports a weakening tendency for the linkage between CO2 emissions and GDP and an impact of fossil fuel rents strengthening over time (hypothesis 4). The same tendencies for GDP and fossil fuel rents were detected for the groups, but the environmental degradation caused by the manufacturing share has disappeared in recent years. The direct mitigating effect of time did not result from changes in other covariates’ slopes entirely fading away. For group 1, an increasing impact of manufacturing and fossil fuel rents on CO2 emissions was found. For group 2, the effect of time on CO2 emissions is more pronounced and shows a decrease in intensity in the middle of the 2000s.

Moreover, the multilateral impacts of the informal economy on the broad concept of environmental quality ask for interdisciplinary research. Furthermore, it is due to a wide range of hidden aspects through underground activities that can significantly intensify undesirable market effects and environmental concerns, e.g., methane emissions and marine pollution, habitat destruction, and local anomalies, which are suggested as the major limitations of this work and hence, considered to study by further research.

We can draw several conclusions from our results that may guide policymakers in their policy design. First, we have clearly shown that the shadow economy distorts the true relationship between carbon dioxide emissions and economic development. Therefore, decision-makers should strive to properly account for the informal activities in the measured gross domestic product to accurately assess their EKC stage and propose appropriate policies to reduce emissions. Second, late-developing countries may benefit from developing innovative and clean technologies. To achieve this goal, they should open their economies to foreign investment and design policies that encourage the adoption of these advanced techniques. Third, we have shown that manufacturing and urbanization contribute to environmental degradation. Countries should counteract these effects by promoting the transition to higher value-added products and services to avoid being caught in the middle-income trap and reduce emissions. They should also control and adequately plan the growth of cities to escape the environmental degradation driven by the urbanization process. Finally, countries that rely heavily on fossil fuel rents should use the resources generated by their sale to transition to a green economy. Fossil fuels are depleting resources, and these rents will not last forever. Therefore, these countries must design policies that foster sustainable development before these rents start declining.

Future research should focus on the accurate measurement of true economic activity and its relationship with environmental degradation. Countries take different approaches regarding including informal activities in the official gross domestic product reports, and the procedures adopted are often unclear, especially in less developed countries. Therefore, we think it is essential to create a comprehensive database that accurately measures the economic development of all countries to avoid distortions in assessing the EKC hypothesis.