Estimating different order polynomial logarithmic environmental Kuznets curves

This paper contributes to the environmental literature by (i) demonstrating that the estimated coefficients and the statistical significance of the non-leading terms in quadratic, cubic, and quartic logarithmic environmental Kuznets curve (EKC) specifications are arbitrary and should therefore not be used to choose the preferred specification and (ii) detailing a proposed general-to-specific type methodology for choosing the appropriate specification when estimating higher-order polynomials such as cubic and quartic logarithmic EKC relationships. Testing for the existence and shape of the well-known EKC phenomenon is a hot topic in the environmental economics literature. The conventional approach widely employs quadratic and cubic specifications, and more recently also the quartic specification, where the variables are in logarithmic form. However, it is important that researchers understand whether the estimated EKC coefficients, turning points, and elasticities are statistically acceptable, economically interpretable, and comparable. In addition, it is vital that researchers have a clear, structured, non-arbitrary methodology for determining the preferred specification and hence the shape of the estimated EKC. We therefore show mathematically and empirically that the estimated non-leading coefficients in quadratic, cubic, and quartic logarithmic EKC specifications are arbitrary, because they depend on the units of measurement chosen for the independent variables (e.g. on a rescaling of the variables such as moving from $m to $bn). Consequently, the practice followed in many previous papers, whereby the estimates of the non-leading terms are used to choose the preferred specification of an estimated EKC relationship, is incorrect and should not be followed, since it could lead to misleading conclusions. Instead, the choice should be based upon the sign and statistical significance of the estimated coefficients of the leading terms, the location of the turning point(s), and the sign and statistical significance of the estimated elasticities. Furthermore, we suggest that researchers should follow a proposed general-to-specific type methodology for choosing the appropriate order of polynomial when attempting to estimate higher-order polynomial logarithmic EKCs.
Supplementary Information: The online version contains supplementary material available at 10.1007/s11356-021-13463-y.

relationship between economic activity and environment. Dinda (2005) presents a feasible theoretical justification for EKC in the framework of an endogenous growth model. Kijima et al. (2010) review different theoretical approaches that explain the EKC hypothesis and emphasize the need for developing economic models from different (theoretical and empirical) points of view.

A1.3 Empirical issues
A growing body of empirical literature has investigated the PIR and reached different and even contradictory conclusions. As an example, Yang et al. (2015a) tested over 140 million models using Chinese data and concluded that the environmental indicator is positively and linearly related to the income variable, rejecting the quadratic case.
However, given that China is a developing economy, the results of Yang et al. (2015a) arguably support the EKC literature, suggesting that economic growth increases environmental pollution during the development stage up to some threshold level of income. In contrast, Arshad et al. countries. In addition, Khanna and Plassmann (2004) Auci and Becchetti (2006), using data for 54 countries, examined the adequacy of the quadratic functional specification with and without other explanatory variables. They concluded that a model that does not include all theoretically justified variables might produce misleading results. Roca et al. (2001), investigating the EKC phenomenon for Spain, reject the EKC, concluding that the PIR depends on many factors and that economic growth by itself cannot solve environmental problems. Furthermore, whether total or per capita income should be used has been discussed as an issue of concern in the literature (see, for example, Selden and Song, 1994; Friedl and Getzner, 2003).

A1.4 Mathematical/statistical issues
In addition to the above-mentioned theoretical and empirical issues, there have also been concerns about the mathematical/statistical grounds on which the EKC hypothesis is based. These encompass the inclusion of a trend in the specification, issues related to the turning point, and the use of level versus logged variables, among others (see Lieb 2003, inter alia). Moreover, Stern et al. (1996) and Stern (2004), inter alia, highlighted the importance of testing the variables and relationships for integration-cointegration properties, since ignoring these properties can produce spurious regression results (Engle and Granger, 1987). Romero-Avila (2008), using data for 86 countries, concluded that per capita world GDP is non-stationary, while CO2 is found to be regime-wise trend stationary, which rules out a potential long-run relationship between the two. This, in turn, questions many EKC studies that have employed Moreover, some studies that investigate the PIR from different aspects, specifications, and econometric/economic points of view address concerns such as using different functional forms, applying non-parametric techniques, and embedding the non-linearity of the PIR (see Galeotti et al., 2006; Liddle and Messinis, 2016; Apergis, 2016; Moosa, 2017; Mikayilov et al., 2018, inter alia). These studies again produced varying results, including a positive and increasing environmental impact for many developed countries.
The reviewed literature shows that there is still some way to go before we have a clearer idea of the PIR and of the theories and techniques needed to reveal it properly. Although there is some debate about the PIR, the response of environmental quality to economic growth, especially in the long term, is most likely to be non-linear. 1 This argument can be rationalized from both theoretical and mathematical/statistical points of view. First, from the theoretical perspective, as mentioned in the related literature (Lieb, 2003; Dinda, 2004, inter alia), there is a demand for environmental quality, whether it is regarded as a normal or a luxury good. At the early stage of a country's or society's development, meeting the first items of the demand pyramid takes priority, so controlling environmental degradation is not a top priority.
Later, once the first necessities are met and society has improved its environmental awareness, a cleaner environment becomes a concern. This is likely to result in a change in the PIR. Therefore, from a theoretical point of view, it is rational to consider a non-linear PIR.
From the mathematical/statistical point of view, a linear relationship (constant response) is a special case of the non-linear curve (varying response), whose existence needs testing (Park and Hahn, 1999; Castle and Hendry, 2019, inter alia). As proposed by Lobachevskian geometry, the constant response (linear relationship) is the result of our inability to capture the big picture (non-linear relationship), which nests the constant as a shorter-period response. To put it differently, as discussed in Juselius (2006), a researcher with the same observations from 1 to 6 will not be able to distinguish the case in Figure A1, constant mean and variance, from the case in Figure A2. Hence, the linear relationship might result from two channels. First, the country is still 'climbing the hill' to a better quality of life, prioritizing income and other targets over environmental quality. This can be interpreted as the part of the non-linear relationship before the peak, although the PIR might not be quadratic. Second, the investigator may be faced with a poor sample, as mentioned above (example from Juselius, 2006).
Considering the above-mentioned points, it seems quite feasible to model the PIR as a non-linear relationship. In doing so, in addition to some less common methods for capturing non-linearities, quadratic (cubic) functional specifications are still widely used to investigate the PIR (as discussed in Lieb, 2003; Dinda, 2004; Kijima et al., 2010, inter alia).
Considering the broad use of quadratic and other higher-order polynomial functional specifications, this study uncovers, to the best of our knowledge, one of the untouched issues related to these specifications in logged variables. Namely, we investigate whether the coefficients of the higher-order polynomial functional specifications are (in)variant to the rescaling of the independent variables. Since the existence and shape of a PIR rely on the sign, size, and statistical significance of the estimated coefficients, we believe that addressing this issue is a crucial contribution to the EKC literature.

Appendix 2: Cubic and Quartic EKC Specifications
This appendix addresses the same issues as the main text but focuses on the cubic and quartic logarithmic EKC specifications, to show that the findings for the quadratic specification apply equally.

A2.1.1 Estimated parameters and statistical significance
Utilising the same scaling as in Eq. (3) suggests that caution is needed when interpreting such estimated coefficients, given that a switch from positive to negative, for example, could result solely from using a rebased activity variable. As in the quadratic case, the standard errors and t-values (and consequently the significance) of the non-leading coefficients also change with the rebased data, while they remain invariant for the leading-term coefficient. To show this, the impact of rescaling on the t-values of the coefficients of the linear, quadratic, and cubic terms is considered. Considering Eqs. (A4)-(A6) in Eqs. (A10)-(A12), together with Eq. (A15), shows that the t-values, and hence the statistical significance, of the coefficients of the non-leading terms are unit dependent, while the significance of the coefficient of the leading term is invariant to the rescaling. This implies that the only necessary condition for a cubic (N-shaped) EKC is that the coefficient of the leading (cubic) term is positive and statistically significant. 3
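The unit dependence described above can be illustrated numerically. The following is a minimal sketch, not the authors' code: the data are simulated, the helper names `ols_with_tvalues` and `cubic_design` are hypothetical, and the rescaling is assumed to be a move from $m to $bn, which shifts the logged activity variable by the constant c = -ln(1000). It fits the same cubic in logs before and after the rescaling and compares coefficients and t-values.

```python
import numpy as np

def ols_with_tvalues(X, e):
    """OLS via a QR decomposition; returns (coefficients, t-values)."""
    Q, R = np.linalg.qr(X)                      # reduced QR: X = Q R
    beta = np.linalg.solve(R, Q.T @ e)
    resid = e - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    R_inv = np.linalg.inv(R)
    # diag of (X'X)^-1 equals the row sums of squares of R^-1
    se = np.sqrt(sigma2 * np.sum(R_inv**2, axis=1))
    return beta, beta / se

def cubic_design(y):
    """Columns 1, y, y^2, y^3 for a logged activity variable y."""
    return np.vander(y, N=4, increasing=True)

# Simulated (hypothetical) data: log income measured in $m and a log indicator.
rng = np.random.default_rng(0)
y_m = rng.uniform(7.5, 10.5, size=200)
y_c = y_m - y_m.mean()
e = 2.0 + 0.5 * y_c - 0.3 * y_c**2 + 0.2 * y_c**3 + rng.normal(0.0, 0.05, size=200)

# Moving from $m to $bn divides income by 1000, so its log shifts by c.
c = -np.log(1000.0)
y_bn = y_m + c

b, t_b = ols_with_tvalues(cubic_design(y_m), e)
b_star, t_star = ols_with_tvalues(cubic_design(y_bn), e)

print("coefficients, $m :", b)
print("coefficients, $bn:", b_star)
print("t-values,     $m :", t_b)
print("t-values,     $bn:", t_star)
```

In this sketch only the last entry of each printed vector (the cubic term) should coincide across the two fits; the intercept and the linear and quadratic terms change with the units, which is the pattern the appendix argues for.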

A2.1.3 Estimated elasticity and statistical significance
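Despite this, the estimated elasticity of the environmental indicator with respect to the activity variable is not unit dependent. Let $e$ denote the logged environmental indicator, $y$ and $y^*$ the logged activity variable before and after rescaling, and $\beta_i$ and $\beta_i^*$ the corresponding coefficients. The elasticity for Eq. (2b) is given by:

$$\eta = \frac{\partial e}{\partial y} = \beta_1 + 2\beta_2 y + 3\beta_3 y^2$$

And the elasticity for Eq. (A1) is given by:

$$\eta^* = \frac{\partial e}{\partial y^*} = \beta_1^* + 2\beta_2^* y^* + 3\beta_3^* y^{*2} \qquad \text{(A22)}$$

However, substituting Eq. (2b) and Eqs. (A3)-(A6) into Eq. (A22) and rearranging gives:

$$\eta^* = \beta_1 + 2\beta_2 y + 3\beta_3 y^2 \qquad \text{(A23)}$$

thus $\eta^* = \eta$, so that the estimated elasticity of the environmental indicator with respect to the activity variable is invariant to the units used for the activity variable. The significance of the elasticity for the cubic case is also unit independent, like the quadratic case, as shown by the proof below.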
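The invariance of the elasticity can be checked with a few lines of arithmetic. The sketch below is illustrative only: the coefficient values are made up, the symbols (`beta` for the coefficients of the cubic in logs, `c` for the log of the rescaling factor) are assumptions, and the mapping in `rescale_cubic` is obtained by expanding the cubic in y = y* - c rather than taken from the paper's Eqs. (A3)-(A6).

```python
import numpy as np

def rescale_cubic(beta, c):
    """Coefficients of the same cubic written in y* = y + c (binomial expansion)."""
    b0, b1, b2, b3 = beta
    return np.array([
        b0 - c * b1 + c**2 * b2 - c**3 * b3,
        b1 - 2 * c * b2 + 3 * c**2 * b3,
        b2 - 3 * c * b3,
        b3,                                  # leading coefficient is unchanged
    ])

def elasticity(beta, y):
    """Derivative of the cubic in logs with respect to the logged activity variable."""
    _, b1, b2, b3 = beta
    return b1 + 2 * b2 * y + 3 * b3 * y**2

# Hypothetical coefficients chosen so that the curve is N-shaped.
beta = np.array([0.0, 24.0, -2.7, 0.1])
c = -np.log(1000.0)                          # assumed rescaling: $m -> $bn
beta_star = rescale_cubic(beta, c)

y_grid = np.linspace(7.0, 11.0, 9)           # log activity in the original units
eta = elasticity(beta, y_grid)
eta_star = elasticity(beta_star, y_grid + c) # same observations, rescaled units

print("non-leading coefficients change:", beta[:3], "->", beta_star[:3])
print("max elasticity difference:", np.max(np.abs(eta - eta_star)))
```

The non-leading coefficients differ between the two parameterisations, while the elasticity evaluated at the same observations agrees up to floating-point rounding.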

A2.1.4 Sufficient condition
Mathematically, the sufficient condition for a cubic (N-shaped) EKC like Eq. (2b) is that the discriminant of the derivative function is positive, that is, $\beta_2^2 - 3\beta_1\beta_3 > 0$ with $\beta_3 > 0$ (and statistically significant), where $\beta_1$, $\beta_2$, and $\beta_3$ are the coefficients of the linear, quadratic, and cubic terms, and that the turning points are within the sample range. In other words, the sufficient condition for an estimated cubic (N-shaped) EKC is that the turning points are within the sample range, with the estimated pairwise elasticities positive and significant on the initial upward-sloping part of the estimated curve, approaching zero and becoming insignificant at the first turning point, and thereafter becoming negative and significant on the downward-sloping part. After the first turning point, the estimated pairwise elasticities continue to be negative and significant but approach zero and become insignificant at the second turning point, thereafter becoming positive and significant again on the next upward-sloping part of the estimated curve.
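The algebraic part of this condition can be written as a short check. This is a sketch under stated assumptions: `b1`, `b2`, `b3` stand for the coefficients of the linear, quadratic, and cubic terms, the function names and the example numbers are hypothetical, and the statistical-significance requirements in the text are not tested here.

```python
import math

def cubic_turning_points(b1, b2, b3):
    """Real roots of b1 + 2*b2*y + 3*b3*y**2 = 0, or None if there are not two."""
    disc = b2**2 - 3 * b1 * b3               # discriminant of the derivative (up to a factor 4)
    if b3 == 0 or disc <= 0:
        return None
    root = math.sqrt(disc)
    first = (-b2 - root) / (3 * b3)
    second = (-b2 + root) / (3 * b3)
    return tuple(sorted((first, second)))

def is_n_shaped(b1, b2, b3, y_min, y_max):
    """Positive leading term and both turning points inside the sample range (in logs)."""
    if b3 <= 0:
        return False
    tps = cubic_turning_points(b1, b2, b3)
    return tps is not None and y_min < tps[0] and tps[1] < y_max

# Hypothetical estimates and a hypothetical sample range for the logged activity variable.
print(cubic_turning_points(24.0, -2.7, 0.1))
print(is_n_shaped(24.0, -2.7, 0.1, y_min=7.0, y_max=11.0))
print(is_n_shaped(24.0, -2.7, 0.1, y_min=8.5, y_max=11.0))
```

The turning points are returned in log units; the corresponding levels of the activity variable are obtained by exponentiating them.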

A2.2.2 Estimated turning points
The formula for the turning points can be derived by equating the derivative of Eq. (2c) to zero, which gives a cubic equation in the logged activity variable. Since the resulting formulas are highly involved, we do not derive the relationship between the two cases analytically. Rather, the effect of rescaling is computed using software and reported in the empirical section.
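As an illustration of how such a software-based check might look, the sketch below solves the derivative of a quartic in logs numerically and repeats the calculation after a rescaling. It is not the authors' procedure: the coefficients are invented, the helper names are hypothetical, and the rescaling is assumed to be a move from $m to $bn, so that the logged activity variable shifts by c = -ln(1000).

```python
from math import comb
import numpy as np

def rescale_coeffs(beta, c):
    """Coefficients (ascending powers) of the same polynomial written in y* = y + c."""
    n = len(beta)
    return np.array([
        sum(beta[j] * comb(j, i) * (-c) ** (j - i) for j in range(i, n))
        for i in range(n)
    ])

def turning_points(beta):
    """Sorted real roots of the derivative of a polynomial given in ascending powers."""
    deriv = [j * beta[j] for j in range(1, len(beta))]
    roots = np.roots(deriv[::-1])            # np.roots expects the highest power first
    real = roots[np.isclose(np.imag(roots), 0.0)]
    return np.sort(np.real(real))

# Hypothetical quartic with turning points at 7.5, 9.0 and 10.5 in log units.
beta = [0.0, 141.75, -24.075, 1.8, -0.05]
c = -np.log(1000.0)
beta_star = rescale_coeffs(beta, c)

tp = turning_points(beta)
tp_star = turning_points(beta_star)

print("turning points (log, $m): ", tp)
print("turning points (log, $bn):", tp_star)
print("same income levels in $m: ", np.exp(tp), np.exp(tp_star) * 1000.0)
```

In this sketch the turning points in logs move by exactly the shift c, so they correspond to the same income levels once both are expressed in the same units.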