1 Introduction

In the context of the COVID-19 pandemic, the inner link between urbanized population, economic development and the atmospheric environment is undergoing profound changes (Wang & Su, 2020), which poses multiple challenges for emerging countries. Since the Reform and Opening-up in the late 1970s, the most remarkable result of China’s transformation from an agricultural country to an industrial country has been that a large number of rural people have migrated to cities and towns and obtained urban household registration. Data from the National Bureau of Statistics of China show that in 1978, China’s urban population was only 172 million, while the rural population was 790 million, with an urbanization rate of 17.9%. By 2018, China’s urbanized population had reached 831 million, with an urbanization rate of 59.6%. Rapid urbanization has provided a tremendous potential market and development impetus for economic growth, but it has also produced a series of by-products that people have to bear, the most prominent of which are environmental degradation and atmospheric pollution, which have received widespread public attention in recent years. According to a State Council report on environmental quality in 2018, only 121 of 338 cities at prefecture level and above in China met the air quality standards, accounting for 35.8%, and days with severe or worse pollution accounted for 2.2% of the year. The average annual concentration of fine particulate matter (PM2.5) was 39 μg/m3, exceeding the standard by 11.4%. The average annual concentration of inhalable particulate matter (PM10) was 71 μg/m3, exceeding the standard by 1.4%. Figure 1 shows that the average PM2.5 concentration in Chinese cities followed an overall rising trend from 1998 to 2016.
To address environmental deterioration, the Chinese government has taken many measures, such as controlling population flows into cities through the household registration system, restricting vehicle traffic in central urban areas, and relocating residents from over-concentrated districts to the suburbs (National Development and Reform Commission, 2022). However, the comprehensive effects of these measures have yet to be evaluated, and determining a reasonable range for the urbanized population is the key to that evaluation.

Fig. 1 The average PM2.5 concentration in cities of China over the period 1998–2016

The role of urbanization was tested within Turkey’s traditional environmental Kuznets curve (EKC), given the country’s economic development and rapid urbanization, and the EKC was found not to be inverted U-shaped in Turkey (Katircioğlu & Katircioğlu, 2018). The autoregressive distributed lag (ARDL) technique has been used to test for cointegration and to obtain short-term and long-term estimates, with a vector error correction model (VECM) used to analyze the direction of causality between time series. Long-term parameter estimates show that a 1% increase in energy intensity, real GDP, industrialization and urbanization raises carbon dioxide emissions by 1.1%, 0.6%, 0.3% and 1.0%, respectively (Liu & Bae, 2018). Urbanization and industrialization have significant impacts on energy consumption and carbon dioxide emissions, but the relationship between them varies at different stages of economic development. Considering the dynamics and heterogeneity of national samples, urbanization has an inverted U-shaped relationship with carbon dioxide emissions, which is consistent with the higher environmental pollution observed in underdeveloped areas. The mechanism through which anthropogenic factors affect the concentration of PM2.5 remains unclear. It is undeniable, however, that if China adheres to its current development model, economic growth, industrialization and urbanization will inevitably lead to an increase in annual PM2.5 concentration (Li et al., 2016).

Urbanization brings environment-related structural changes, which constitute the endogenous driving force of urban economic growth (Adams & Klobodu, 2017; Al-Mulali et al., 2013; Arvin et al., 2015; Chen et al., 2018). The Kyoto Protocol reflects the importance that governments attach to climate change, and it was a policy constraint that had to be considered in the past. Using a variety of methods to estimate panel data within the framework of the STIRPAT model, Bargaoui et al. (2014) found that urbanization and the Kyoto Protocol had significant impacts on emission levels. The elasticity of CO2 emissions with respect to urbanization was positive in the early stage of urbanization and negative in the later stage (Bekhet & Othman, 2017). A cointegrating relationship exists among fossil fuel energy consumption, foreign direct investment, urbanization and CO2 emissions in middle-income countries of the South and Southeast Asian (SSEA) region, although an inverted U-shaped relationship among them has not been confirmed (Behera & Dash, 2017). An analysis of a cross-city panel of 64 cities from four large urban agglomerations in China using the STIRPAT model shows that the proportion of urban population has a positive impact on residents’ carbon dioxide emissions, and that a 75% demarcation point has even been passed in China’s urban agglomerations (Bai et al., 2019). Using a threshold model and panel data for 60 countries from 1971 to 2012, Du and Xia (2018) examined the relationship between urbanization and greenhouse gas (GHG) emissions and found that the relationship between the urbanization rate and GHG emissions is always positive, and that the impact of urbanization on GHG emissions becomes greater once the urbanization rate exceeds 23.59%.

Policies to encourage reductions in vehicle mileage and more efficient modes of transport are often seen as means of reducing GHG emissions (Mishalani et al., 2014). Ouyang and Lin (2017) compared the urbanization stages of China and Japan and analyzed the similarities and differences in the factors influencing carbon dioxide emissions, finding that carbon dioxide emissions in Japan and China show similar rigid growth characteristics in the process of urbanization. Using the ARDL bounds testing approach to test the long-term relationship between variables in the presence of structural breaks, Shahbaz et al. (2014) showed that, under the EKC hypothesis, the relationship between urbanization and carbon dioxide emissions is positive. An econometric analysis of the driving factors of carbon emissions in 32 provinces of China shows that the relationship between the urbanization rate and carbon emissions is a three-stage dynamic relationship. In the provinces with the largest share of service industry and a high urbanization rate, the three-stage curve shows growth, positive decline and negative growth successively (Shi & Li, 2018). After checking the order of integration of the variables with unit root tests, Shahbaz et al. (2015) used the Bayer–Hanck combined cointegration method to test the cointegration relationship between variables and the ARDL bounds testing approach to check the robustness of the long-term relationship in the presence of structural breaks. The study found that the relationship between urbanization and carbon dioxide emissions is U-shaped, that is, urbanization initially reduced carbon dioxide emissions but increased them after a threshold level. Wang et al. (2019) used the geographically weighted regression (GWR) model to analyze the impact of urbanization quality on carbon dioxide emissions, revealing spatial differences across 30 provinces in China.
Another study found an inverted U-shaped relationship between urbanization and carbon dioxide emissions in central and western China, whereas the environmental Kuznets curve relationship is difficult to establish in eastern China, where carbon dioxide emissions increase monotonically with urbanization (Xu et al., 2016).

To summarize, both the theory of ecological modernization and the theory of urban environmental change recognize that urbanization has positive and negative effects on the natural environment, and that the net effect is difficult to determine (Sadorsky, 2014). A large body of literature has contributed to research on urbanization and environmental pollution, and some studies have found and confirmed a nonlinear relationship between urbanization and pollution emissions (Du & Xia, 2018; Li et al., 2021; Rong et al., 2015; Wu, 2011). Although urban areas cover only a small portion of the earth’s surface, and the effects of water, heat and movement in cities extend only a few kilometers downwind, GHG emissions are increasing in the process of urban construction and operation, and urban areas are the main anthropogenic source of these gases (Grimmond, 2007; Zhang et al., 2018). Warm conditions in many cities cause residents to consume more energy and resources to offset this impact, while also making urban residents more vulnerable to heat waves and other extreme conditions. Urbanization is an important cause of pollutant emissions. For example, one study shows that rapid urbanization in Tianjin during the period 1997–2012 resulted in a 74.1% increase in household consumption-related carbon dioxide emissions (Zhu et al., 2017). The energy gap between urban and rural areas constitutes the basic condition for environmental deterioration (Fan et al., 2015; Liu et al., 2011; Zhou et al., 2015). The cumulative emission of atmospheric pollutants during urbanization is the inherent mechanism behind this nonlinear association, which involves a wide range of sustainable urbanization issues, including ecological environment protection, land development, energy use, population growth and migration, housing and policy (Tan et al., 2016).

To this end, this paper focuses on the nonlinear effects of urbanization population on PM2.5 concentration in China, with GDP per capita as the threshold of regime transition. Using panel data for 227 cities at prefecture level and above from 1998 to 2016, the paper makes the following main contributions. First, since government-led urbanization makes the urbanization rates of China’s prefecture-level cities converge, while urban heterogeneity is mainly reflected in population and economic scale, this paper sets the urbanization population rather than the urbanization rate as the core explanatory variable and sets the GDP per capita of the corresponding cities as the threshold variable in the panel threshold model (PTM). The paper can thus examine the heterogeneous effect of the urbanized population on PM2.5 concentration, with the estimated threshold of city size quantified by urban GDP per capita. Although quite a few related studies use the PTM method (Du & Xia, 2018; Li & Lin, 2015; Ouyang et al., 2019; Zi et al., 2016), studies on the environmental performance of urbanized population have not appeared in the existing literature. Second, when considering the nonlinear environmental effect of urbanization population, this paper sets not only its linear term but also its quadratic term as a regime-dependent variable, which is used to explain the inflection point of PM2.5 concentration at different stages of economic development, so that the dynamic differentiation of urban agglomerations can also be distinguished. Third, urban GDP per capita appears not only as the threshold variable but also as one of the control variables in the PTM, which allows the environmental Kuznets hypothesis to be verified.
In addition, subject to the availability of panel data, this paper also considers control variables that affect urban PM2.5 concentration, including government expenditure on research and development (R&D) (Strandholm et al., 2021), population density (Chen et al., 2020; Rahman & Alam, 2021), foreign direct investment stock (Marques & Caetano, 2021) and electricity consumption in cities (Chen et al., 2018; Li & Lin, 2015; Li et al., 2022; Wang et al., 2022a), thus enhancing the explanatory power of the econometric model.

The remainder of the paper is organized as follows. Sect. 2 introduces the PTM extended to the quadratic term of urbanization population, together with the statistics used in the threshold effect test, and describes the sample data for the empirical analysis. Section 3 is devoted to the empirical findings from the single- and double-threshold PTM with the linear and quadratic terms of urbanization population as the regime-dependent variables. The final section presents concluding remarks and policy implications.

2 Methodology and data

2.1 The econometric model

To estimate the nonlinear relationship between the urbanization population and PM2.5 concentration, and to create conditions for finding the appropriate scale of urbanization population that interrupts the growth of PM2.5 concentration, the empirical model adopted in this paper is the fixed-effect panel threshold regression (PTR) model proposed by Hansen (1999), which is the PTM referred to above. The PTR model can be extended according to the number of thresholds. For the single-threshold PTR model, the specification is as follows,

$$ \ln {\text{PM}}25_{{{\text{it}}}} = \alpha_{i} + \beta_{1} \ln {\text{Urbn}}_{{{\text{it}}}} I\left( {\ln {\text{PGDP}}_{{{\text{it}}}} \le \gamma_{1} } \right) + \beta_{2} \ln {\text{Urbn}}_{{{\text{it}}}} I\left( {\ln {\text{PGDP}}_{{{\text{it}}}} > \gamma_{1} } \right) + \varphi X_{{{\text{it}}}} + \varepsilon_{{{\text{it}}}} $$
(1)

where \(i\) and \(t\) represent cross sections and time, respectively. \( \alpha_{i}\) denotes individual effects, and \(\ln {\text{PM25}}\) denotes the logarithmic PM2.5 concentration. \(\ln {\text{Urbn}}\) takes two forms in this paper: the logarithmic urbanization population and the squared term of the logarithmic urbanization population. \(I\left( \cdot \right) \) denotes the indicator function with \(\gamma_{1}\) as the threshold value, and \( \varepsilon_{{{\text{it}}}}\) is the error term. \(X\) represents the set of control variables, including the logarithmic gross domestic product (GDP) per capita (\(\ln {\text{PGDP}}\)), the squared term of the logarithmic GDP per capita (\(\ln {\text{PGDP}}^{2}\)), the logarithmic government public expenditure on R&D (\(\ln {\text{RD}}\)), the logarithmic stock of inward foreign direct investment (\(\ln {\text{FDI}}\)), the logarithmic urban population density (\(\ln {\text{DEN}}\)) and the logarithmic urban electricity consumption (\(\ln {\text{ELE}}\)). GDP per capita and its squared term are incorporated into the panel threshold model not only as the economic driver of pollution emissions but also to verify the environmental Kuznets hypothesis. As mentioned above, a large body of empirical evidence shows an inverted U-shaped relationship between GDP per capita and pollution emissions, so the estimated coefficient of GDP per capita is expected to be positive and that of its quadratic term negative. Following the estimation method for physical capital stock, the stock of foreign direct investment is calculated as follows,

$$ {\text{FDI}}_{{{\text{it}}}} = \left( {1 - \delta } \right){\text{FDI}}_{{{\text{it}} - 1}} + {\text{Inv}}_{{{\text{it}}}} $$
(2)

where \({\text{Inv}}_{{{\text{it}}}}\) refers to the annual flow of foreign direct investment into city \(i\) in year \(t\), and \(\delta\) is the depreciation rate of the stock of foreign direct investment.
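The recursion in Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration only: the function name `fdi_stock`, the depreciation rate of 10% and the zero initial stock are assumptions made for the example, since the text does not report the values actually used.

```python
def fdi_stock(flows, delta, initial_stock=0.0):
    """Return the FDI stock series implied by Eq. (2) for one city.

    flows         -- annual FDI inflows Inv_t, in time order
    delta         -- depreciation rate (assumed value; not given in the text)
    initial_stock -- stock before the first year (assumed; not given in the text)
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    stocks = []
    previous = initial_stock
    for inv in flows:
        # FDI_t = (1 - delta) * FDI_{t-1} + Inv_t
        previous = (1.0 - delta) * previous + inv
        stocks.append(previous)
    return stocks


# Illustrative numbers only: a constant inflow of 100 and 10% depreciation.
example = fdi_stock([100.0, 100.0, 100.0], delta=0.10)
print(example)
```

With a constant inflow the stock grows by a smaller amount each year, because depreciation applies to an ever larger base.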

For the double-threshold PTR model, the specification is as follows,

$$ \begin{aligned} \ln {\text{PM}}25_{{{\text{it}}}} & = \alpha_{i} + \beta_{1} \ln {\text{Urbn}}_{{{\text{it}}}} I\left( {\ln {\text{PGDP}}_{{{\text{it}}}} \le \gamma_{1} } \right) + \beta_{2} \ln {\text{Urbn}}_{{{\text{it}}}} I\left( {\gamma_{1} < \ln {\text{PGDP}}_{{{\text{it}}}} \le \gamma_{2} } \right) \\ & \quad + \beta_{3} \ln {\text{Urbn}}_{{{\text{it}}}} I\left( {\ln {\text{PGDP}}_{{{\text{it}}}} > \gamma_{2} } \right) + \varphi X_{{{\text{it}}}} + \varepsilon_{{{\text{it}}}} . \\ \end{aligned} $$
(3)
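As an illustration of how the indicator functions in Eq. (3) split the regime-dependent variable, the sketch below builds the three regime-specific regressors for a single observation. The function name `regime_terms` and the threshold values 8.8 and 9.2 are hypothetical and chosen only for the example.

```python
def regime_terms(ln_urbn, ln_pgdp, gamma1, gamma2):
    """Split ln_urbn into the three regime-specific regressors of Eq. (3).

    Exactly one of the returned terms equals ln_urbn; the other two are zero.
    """
    if gamma1 >= gamma2:
        raise ValueError("thresholds must satisfy gamma1 < gamma2")
    low = ln_urbn if ln_pgdp <= gamma1 else 0.0            # I(ln PGDP <= gamma1)
    mid = ln_urbn if gamma1 < ln_pgdp <= gamma2 else 0.0   # I(gamma1 < ln PGDP <= gamma2)
    high = ln_urbn if ln_pgdp > gamma2 else 0.0            # I(ln PGDP > gamma2)
    return low, mid, high


# Hypothetical thresholds, for illustration only.
for ln_pgdp in (8.5, 9.0, 9.5):
    print(ln_pgdp, regime_terms(4.0, ln_pgdp, gamma1=8.8, gamma2=9.2))
```

Each regime term then receives its own slope (\(\beta_{1}\), \(\beta_{2}\) or \(\beta_{3}\)), while the controls in \(X\) keep a common coefficient across regimes.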

Following Hansen (1999), the threshold value \(\gamma\) can be estimated by minimizing the sum of squared errors, \(S_{1} \left( \gamma \right) = \hat{e}\left( \gamma \right)^{\prime } \hat{e}\left( \gamma \right)\), where \(\hat{e}\left( \gamma \right)\) is the vector of regression residuals. Thus, the least squares estimate of the threshold value \(\gamma\) is as follows,

$$ \hat{\gamma } = \mathop {\arg \min }\limits_{\gamma } S_{1} \left( \gamma \right) $$
(4)
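The minimization in Eq. (4) is usually carried out as a grid search over the observed values of the threshold variable. The sketch below is a simplified illustration, not the estimation code used in this paper: it assumes a single regime-dependent regressor and no control variables, so the two slopes can be solved in closed form, and the function names, the 5% trimming share and the synthetic data are all assumptions made for the example.

```python
def within(rows):
    """Fixed-effect transformation: subtract each unit's own time mean."""
    out = []
    for r in rows:
        mean = sum(r) / len(r)
        out.append([v - mean for v in r])
    return out


def fit_at(gamma, y, x, q):
    """Return (S1, beta1, beta2) at a candidate threshold, or None if not identified.

    y, x, q are lists of per-unit lists (one inner list per city, ordered by year).
    """
    low = [[xv if qv <= gamma else 0.0 for xv, qv in zip(xr, qr)] for xr, qr in zip(x, q)]
    high = [[xv if qv > gamma else 0.0 for xv, qv in zip(xr, qr)] for xr, qr in zip(x, q)]
    yd = [v for r in within(y) for v in r]
    a = [v for r in within(low) for v in r]
    b = [v for r in within(high) for v in r]
    saa = sum(u * u for u in a)
    sbb = sum(w * w for w in b)
    sab = sum(u * w for u, w in zip(a, b))
    say = sum(u * v for u, v in zip(a, yd))
    sby = sum(w * v for w, v in zip(b, yd))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:  # one regime is empty or the regressors are collinear
        return None
    beta1 = (say * sbb - sby * sab) / det
    beta2 = (sby * saa - say * sab) / det
    s1 = sum((v - beta1 * u - beta2 * w) ** 2 for v, u, w in zip(yd, a, b))
    return s1, beta1, beta2


def estimate_threshold(y, x, q, trim=0.05):
    """Grid search for gamma_hat = argmin S1(gamma) over trimmed observed q values."""
    values = sorted({v for r in q for v in r})
    k = int(len(values) * trim)
    candidates = values[k:len(values) - k] if k > 0 else values
    best = None
    for gamma in candidates:
        fit = fit_at(gamma, y, x, q)
        if fit is not None and (best is None or fit[0] < best[1]):
            best = (gamma, *fit)
    return best  # (gamma_hat, S1, beta1, beta2)


# Synthetic, noise-free data (made up): 5 units, 8 periods,
# true threshold 5.0, slope 1 below it and slope 3 above it.
units, periods = 5, 8
q = [[((7 * (i * periods + t)) % 40) / 4.0 for t in range(periods)] for i in range(units)]
x = [[1.0 + ((3 * (i * periods + t)) % 11) / 2.0 for t in range(periods)] for i in range(units)]
y = [[0.5 * i + (1.0 if qv <= 5.0 else 3.0) * xv for xv, qv in zip(xr, qr)]
     for i, (xr, qr) in enumerate(zip(x, q))]
gamma_hat, s1, beta1, beta2 = estimate_threshold(y, x, q)
print(gamma_hat, beta1, beta2)
```

The indicator interaction is formed before the within transformation, as in Hansen (1999). Trimming the extreme candidate values keeps a minimum number of observations in each regime.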

However, whether the threshold effect is statistically significant must also be considered. Taking the single-threshold model of Eq. (1) as an example, the null hypothesis of no threshold effect is \(H_{0} :\beta_{1} = \beta_{2}\). The likelihood ratio test is used to test \(H_{0}\) with the F-statistic \(F_{1} = \left( {S_{0} - S_{1} \left( {\hat{\gamma }} \right)} \right)/\hat{\sigma }^{2}\), where \(S_{0}\) denotes the sum of squared errors under the null hypothesis, calculated from the residuals of the no-threshold model after the fixed-effect transformation, and \(\hat{\sigma }^{2} \) is the residual variance. Since the asymptotic distribution of the likelihood ratio test under the null is nonstandard, a bootstrap procedure is used to approximate the sampling distribution and to derive an asymptotically valid p-value for the F-statistic under \(H_{0}\). If the p-value is less than the chosen significance level, the null hypothesis of no threshold is rejected. In addition, we construct confidence intervals for the threshold parameter and test whether the estimated threshold value is consistent with the true value. Because of the nuisance parameter problem, the distribution of conventional test statistics is nonstandard. To overcome this problem, Hansen (1999) constructed a "no-rejection region" as an asymptotically valid confidence interval using the likelihood ratio (LR) statistic as follows,

$$ LR_{1}^{\gamma } \left( \gamma \right) = \frac{{S_{1} \left( \gamma \right) - S_{1} \left( {\hat{\gamma }} \right)}}{{\hat{\sigma }^{2} }} $$
(5)
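To show how Eq. (5) yields a confidence interval, the sketch below collects the candidate thresholds whose LR statistic stays below a critical value. The critical value formula \(c\left( \alpha \right) = - 2\ln \left( {1 - \sqrt {1 - \alpha } } \right)\) is not stated in this paper and is quoted here from Hansen (1999). The function names and the toy numbers are assumptions made for illustration.

```python
import math


def lr_critical_value(alpha):
    """Asymptotic critical value c(alpha) = -2 ln(1 - sqrt(1 - alpha))."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return -2.0 * math.log(1.0 - math.sqrt(1.0 - alpha))


def confidence_set(candidates, s1_values, sigma2, alpha=0.05):
    """Return the candidate thresholds with LR(gamma) <= c(alpha).

    candidates -- candidate threshold values gamma
    s1_values  -- S1(gamma) for each candidate, in the same order
    sigma2     -- residual variance estimate at the minimizing threshold
    """
    s1_min = min(s1_values)  # S1 at gamma_hat, so LR(gamma_hat) = 0
    cutoff = lr_critical_value(alpha)
    return [g for g, s in zip(candidates, s1_values) if (s - s1_min) / sigma2 <= cutoff]


# Toy numbers for illustration only.
print(round(lr_critical_value(0.05), 2))
print(confidence_set([1.0, 2.0, 3.0], [10.0, 4.0, 30.0], sigma2=1.0))
```

In the LR plots this corresponds to drawing a horizontal line at the critical value: the interval consists of the threshold values where the LR curve lies below that line.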

The above LR statistics and their confidence intervals only consider the single-threshold condition. In many specific applications, there may be two or more thresholds. In these cases, similar methods proposed by Hansen (1999) can be used to search them out and ensure their robustness.

2.2 Data

There are two main sources of the sample data in this paper. One is the China Urban Statistical Yearbook published by the Department of Urban Social and Economic Survey of the National Bureau of Statistics (https://data.cnki.net/yearBook/single?id=N2022040095), which contains data on urbanized population, total population, urban GDP, inward foreign direct investment, government expenditure on R&D and electricity consumption. The other is the annual world PM2.5 density map released by Columbia University (https://sedac.ciesin.columbia.edu/data/set/aqdh-pm2-5-concentrations-contiguous-us-1-km-2000-2016). The annual average concentration of PM2.5 in China’s prefecture-level cities was derived from the original information provided by satellite simulation and monitoring. Specific estimates of PM2.5 were obtained from global geophysical satellite data using the GWR model.

3 Empirical findings

Before nonlinear parameter estimation, this paper adopts mainstream methods to carry out stationarity and cointegration tests, and the results are exhibited in Table 1 and Table 2, respectively. The unit root test checks whether a series is stationary; a series with a unit root is nonstationary. A stationary series can be obtained from such a series by differencing, which eliminates the unit root. The results show that all variables are integrated of order one, I(1), that is, stationary after first differencing. As for the cointegration test, panel-specific means and panel-specific time trends can be included in the panel cointegration regression model. Most cointegration tests share a common null hypothesis, namely that there is no cointegration. In recent years, panel cointegration techniques have received extensive attention in studying the long-term relationship between integrated variables with both time series and cross-sectional dimensions. One of the most important reasons for this attention is that considering the cross-sectional dimension in addition to the time series dimension may increase robustness. However, many studies fail to reject the null hypothesis of no cointegration. In response, Westerlund developed a panel cointegration test that is based on structural rather than residual dynamics, so no common-factor restriction is imposed. In conclusion, the various tests in Table 2 confirm the existence of cointegration.

Table 1 Panel unit root tests
Table 2 Panel cointegration tests

Following the empirical method described above, we first take the urbanization population as the regime-dependent variable and GDP per capita as the threshold variable to estimate the parameters. As shown in Table 3, we start from the single-threshold model in order to find the appropriate PTR model for the relationship between urbanization population and PM2.5 concentration. The F-statistic and its p-value show that the single-threshold effect is statistically significant, so there is at least one threshold in the relationship between urbanized population and PM2.5 concentration. The logarithmic threshold value (\({\gamma }_{1}\)) is 8.8213, which indicates that the inflection point appears when the GDP per capita of a city in mainland China reaches 6777 Yuan. Figure 2 shows the trend of the LR statistic in the single-threshold model with urbanization population as the regime-dependent variable. The LR statistic approaches the horizontal axis twice, and at the first of these points, where the threshold value is 8.8213, it equals zero. The signs of the coefficients of the linear and quadratic terms of the urbanization population reveal that the relationship between urbanization population and PM2.5 concentration is U-shaped for the major cities in China as a whole. After the urbanization population reaches a certain level, its driving effect on PM2.5 concentration continues to expand, although the level of urbanization population at the inflection point is conditioned by the GDP per capita of the city. When the threshold is applied to the linear term of urbanization population, the population scale reaches the turning point of its relationship with pollution emissions at a lower inflection point only once urban GDP per capita has passed the threshold.

Table 3 The estimated parameters of single threshold
Fig. 2 LR statistics of single threshold with urbanization population as the regime-dependent variable

As for the GDP per capita of cities, it is introduced into the model not only as the threshold variable but also as a control variable to test the environmental Kuznets hypothesis. The results of the single-threshold model show a clear inverted U-shaped relationship between urban GDP per capita and PM2.5 concentration, with the turning point at an urban GDP per capita of 142,007 Yuan, equivalent to US$21,460 at 2018 exchange rates. That is, the driving effect of rising GDP per capita on PM2.5 concentration peaks at this level. Just as the coefficients of urbanization on CO2 emissions first increase and then decrease as the share of industry in GDP rises (Zi et al., 2016), the relationship between PM2.5 concentration and GDP per capita is inverted U-shaped (Chen et al., 2018; Ji et al., 2018; Wang et al., 2022b). By comparison, Zhao et al. (2018) found that PM2.5 concentration peaks when real GDP per capita reaches $5942, which is much lower than the turning point we found. The main reason may be that they use provincial panel data rather than data for prefecture-level cities. Besides, while the estimated coefficients of government R&D expenditure and urban electricity consumption are not in line with expectations, other estimated parameters, such as those of inward FDI and urban population density, show a significant positive correlation with PM2.5 concentration.
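The turning point quoted above follows from setting the derivative of the fitted quadratic in \(\ln {\text{PGDP}}\) to zero while holding the regime fixed. As a sketch, write \(\varphi_{1}\) and \(\varphi_{2}\) for the estimated coefficients on \(\ln {\text{PGDP}}\) and \(\ln {\text{PGDP}}^{2}\). These two symbols are introduced here for illustration only and are not the paper's notation, and the coefficient values themselves are reported in Table 3 and are not reproduced here.

$$ \frac{{\partial \ln {\text{PM25}}}}{{\partial \ln {\text{PGDP}}}} = \varphi_{1} + 2\varphi_{2} \ln {\text{PGDP}} = 0\quad \Rightarrow \quad {\text{PGDP}}^{*} = \exp \left( { - \frac{{\varphi_{1} }}{{2\varphi_{2} }}} \right) $$

With \(\varphi_{1} > 0\) and \(\varphi_{2} < 0\) this stationary point is a maximum, which is the inverted U shape of the environmental Kuznets hypothesis. The two figures quoted above also imply an exchange rate of about 142,007 / 21,460 ≈ 6.62 Yuan per US dollar.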

Taking the possibility of a double-threshold model into account, Table 4 reports the threshold effect test and the estimated parameters of the double-threshold model. As with the single-threshold model, the bootstrap was used for the threshold effect test, with 300 bootstrap replications for both models. Evidently, the F-statistic and p-value of the single-threshold model, rather than those of the double-threshold model, satisfy the requirement of the threshold effect test. However, even in the double-threshold model, which does not pass the threshold test, the urbanized population and PM2.5 concentration still show a robust inverted U-shaped relationship. The two thresholds in the double-threshold model are 8.7571 and 9.2395, which means that regime changes occur when the per capita GDP of cities reaches 6355 Yuan and 10,296 Yuan, respectively. Figure 3 further exhibits the trend of the LR statistics for the first and second thresholds, from which we can find the threshold values at which the LR statistics equal zero. For the relationship between GDP per capita and PM2.5 concentration, the environmental Kuznets hypothesis is also confirmed: urban PM2.5 concentration peaks at an urban GDP per capita of 193,024 Yuan, equivalent to US$29,169 at 2018 exchange rates.
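Because the threshold variable is the logarithm of GDP per capita, each estimated threshold is converted to Yuan by exponentiation. The short check below reproduces the conversions for the three logarithmic thresholds reported so far. The dictionary labels are made up for the example, and small differences from the figures in the text are expected because the logarithmic thresholds are rounded to four decimals.

```python
import math

# Logarithmic thresholds as reported in the text (rounded to four decimals).
log_thresholds = {
    "single threshold": 8.8213,
    "double threshold, first": 8.7571,
    "double threshold, second": 9.2395,
}

# GDP per capita in Yuan is exp(ln PGDP).
yuan = {name: math.exp(value) for name, value in log_thresholds.items()}

for name, value in yuan.items():
    print(f"{name}: about {value:,.0f} Yuan")
```

The results are close to 6777, 6355 and 10,296 Yuan, respectively.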

Table 4 The estimated parameters of double threshold
Fig. 3 LR statistics of double threshold with urbanization population as the regime-dependent variable

In fact, the threshold effect may involve not only the linear term of urbanized population but also the quadratic term. Therefore, the quadratic term of urbanization population is also introduced into the panel threshold model as a regime-dependent variable. The corresponding threshold effect tests and parameter estimates for the single- and double-threshold models are reported in Tables 5 and 6, respectively. For the single-threshold model, the F-statistic has a p-value of 0.0567, which is significant at the 10% level, so the threshold effect test indicates that there is at least one threshold. The linear and quadratic terms of urbanization population share the same threshold value, \({\gamma }_{1}\) = 9.2395, which can be observed in more detail in the plot of the LR statistic in Fig. 4. The LR statistic reaches zero at the threshold of 9.2395, which implies that the linear and quadratic terms of the urbanization population switch regime when the GDP per capita of the city is 10,296 Yuan. On both sides of the threshold, the signs of the estimated parameters of the linear and quadratic terms of urbanization population confirm the U-shaped relationship between urbanization population and PM2.5 concentration. The difference between the two sides is that the threshold effect leads to slightly different locations of the inflection point of the U-shaped curve. In addition, the peak of the inverted U-shaped curve of the Kuznets hypothesis confirmed by GDP per capita here is 120,592 Yuan, equivalent to US$18,223 at 2018 exchange rates, which implies that only cities whose level of economic development has reached the international standard of developed economies appear to have entered the path of overall improvement in environmental pollution as represented by PM2.5 concentration.

Table 5 The estimated parameters of single threshold
Table 6 The estimated parameters of double threshold
Fig. 4 LR statistics of single threshold with both linear and quadratic terms of urbanization population as the regime-dependent variables

Furthermore, when the linear and quadratic terms of urbanization population are both regime-dependent variables, it is also necessary to determine the number of thresholds in order to find the appropriate threshold specification. For the double-threshold PTR model, the threshold effect test rejects both the null hypothesis of no threshold and the null hypothesis of only one threshold, which indicates that the double-threshold model is suitable. Hence, Table 6 further gives the parameter estimates of the double-threshold PTR model with the linear and quadratic terms of urbanization population as regime-dependent variables. The plots of the LR statistics in Fig. 5 show the specific positions of the two thresholds. The LR statistic in the upper panel reaches zero at the threshold of 9.2395, while the LR statistic in the lower panel reaches zero at the threshold of 8.8213. Thus, the two thresholds of the double-threshold PTR model correspond to 6777 Yuan and 10,296 Yuan, respectively. When urban GDP per capita reaches these two values, the impact of urbanized population on PM2.5 concentration undergoes a regime transition. This transition does not change the nonlinear nature of the relationship between the two, but only shifts the inflection point of the U-shaped curve. The relationship between urban GDP per capita and PM2.5 concentration remains robustly inverted U-shaped. According to the estimated coefficients of the linear and quadratic terms of urban GDP per capita in Table 6, when urban GDP per capita reaches US$23,950, PM2.5 concentration begins to enter a downward range.
The economy of most prefecture-level cities in China is still at the developing stage, so that, apart from the centrally administered municipalities, only a few coastal cities have a per capita GDP that reaches this critical point. Many cities face environmental problems such as continuously worsening smog, and the cost of environmental governance is rising, especially in cities at the peak of their energy consumption. The widespread haze problem in local cities poses a serious challenge to urban environmental governance.

Fig. 5 LR statistics of double threshold with both linear and quadratic terms of urbanization population as the regime-dependent variables

4 Concluding remarks

The most prominent finding of this paper is the nonlinear relationship between urbanized population and PM2.5 concentration in cities at prefecture level and above in China. The specific shape of the U-shaped relationship between urban population and PM2.5 concentration is adjusted nonlinearly by urban GDP per capita and can be divided into three regimes. When urban GDP per capita is lower than 6777 Yuan, the urbanized population at the turning point of the U-shaped curve is 7.4 million people. When urban GDP per capita is between 6777 Yuan and 10,296 Yuan, the urbanized population at the turning point is 905,000 people. When urban GDP per capita exceeds 10,296 Yuan, the urbanized population at the turning point is 12.967 million people, a level reached only by Beijing, Tianjin, Shanghai, Chongqing and several other megacities. At the medium level of urban GDP per capita, urban PM2.5 concentration is the most sensitive to the urbanized population: even when the urbanized population is very small, an increase in it significantly raises PM2.5 concentration. Therefore, urbanized population size and income per capita are two factors that shape the heterogeneity of a city’s PM2.5 footprint. China’s urban layout is far from adapted to rapid social and environmental changes, with hundreds of millions of rural people migrating to towns and suburbs as a result of rapid and unexpected urbanization. The increasingly prominent crowding effect of metropolises and mega-agglomerations has prompted the government to propose an urbanization path dominated by small towns. However, the supporting public infrastructure and services are relatively backward. Thus, controlling the size of the urban population and establishing an environmentally friendly urban system are the most important tasks for cities whose population and economic development are uncoordinated.