1 Introduction

With the rapid development of its economy, China's air pollutant emissions have increased dramatically over the past several decades, especially in urban regions [1, 2]. Urban emissions mainly come from power generation, industrial facilities, transportation and residential sources [3]. Since early 2013, the central and local governments have implemented a strict Clean Air Action Plan to ameliorate serious air pollution across China. The Chinese Ministry of Ecology and Environment has tightened emission limits for power plants since then, especially for coal-fired ones.

Meanwhile, environmental, chemical and economic studies have established a strong evidence base for associations between energy production and air pollution: power generation, particularly coal burning, can adversely affect air quality [4]. Emission datasets at the level of individual generating units (power plants) have often been compiled, typically to calculate a generally applicable emission index, in order to identify specific opportunities for reducing air pollutant emissions nationally and globally [5,6,7,8]. However, despite growing concern that power generation might be linked to air pollution at a regional level, we are aware of only one published study that attempts to identify this relationship [9].

In this paper, we use a different identification strategy to explore the effect of electricity generation on air pollution at the city scale. To do so, we combine four highly detailed datasets for four provincial cities in China. We merge electricity production data, meteorological data and economic data with monthly city-level air pollution measurements (the air quality index (AQI) and PM2.5, PM10, SO2, CO, NO2 and O3) for Beijing, Tianjin, Chongqing and Shanghai from December 2013 to December 2019. We estimate an ordinary least squares (OLS) regression on this panel with city–month fixed effects, allowing for industry and city heterogeneity.

We have several important findings. First, a one-unit (100 million kWh) increase in power output is associated with a 0.3-unit increase in air pollution as measured by AQI (units are given in Table 2), nearly all of which is driven by increases in thermal power output. The results are robust to a battery of tests and alternative specifications and are not explained by concurrent weather or economic conditions. Second, we show that PM2.5, PM10, SO2 and NO2 are quantitatively affected by electricity generation: the PM2.5 and PM10 effects are more pronounced for thermal power, while the SO2 and NO2 effects are relatively significant for other power sources. In contrast, changes in power output have no statistically significant impact on CO and O3. Finally, we show that the pollution effects of electricity generation are heterogeneous not only across industries but also across cities.

Our results have meaningful implications for future research and policy. The power industry is the backbone of the industrial world; hence, supplying essential energy while cutting emissions has become a substantial global issue. Our results indicate that considering multiple pollutants jointly, rather than one at a time, is important but understudied in the power industry. Moreover, our understanding of the power industry's emissions should take account of regional-level factors beyond meteorology and the economy, which the current scope of research may not cover.

The remainder of the paper is structured as follows. Section 2 reviews the relevant literature and discusses plausible mechanisms driving our results. Section 3 introduces our study area and outlines the data. Section 4 presents our econometric model and describes the identification assumptions. Section 5 provides summary statistics, unit root tests and regression results. Section 6 discusses comparative impacts and policy implications, as well as limitations and future work. Section 7 concludes.

2 Literature review

Energy consumption and generation are major direct and indirect causes of air pollution, and these links have been established across a broad range of disciplines [10,11,12]. Researchers agree on the link between energy consumption and air pollution across various analytical frameworks [13,14,15]. Wang et al. showed spatial autocorrelation between energy consumption and air pollution in Beijing–Tianjin–Hebei and surrounding areas using a global spatial correlation index and local Moran's I scatter plots [16].

More specifically, the mechanisms that link electricity production to adverse air quality outcomes have been explored using diverse methods, such as an impact pathway approach [17], chemical meteorology [18] and an emission factor approach [19]. These studies address subjects similar to ours through different mechanistic pathways. The one published study we are aware of, Zaman and Abd-el Moemen (2017), combines data on electricity production from renewable sources, permanent cropland, high-technology exports and health expenditures with carbon dioxide (CO2) emissions at a regional (country) level. Its authors run analyses similar to ours, but with different dependent variables, study areas and time periods.

Our study differs from and complements Thanh and Lefevre (2000), Slanina (2004), Sonibare (2010), and Zaman and Abd-el Moemen (2017) in three important ways. First, we construct a monthly, city-level dataset of electricity generation, economy, meteorology and air pollution spanning four of the biggest cities in China from December 2013 to December 2019. Second, we exploit several unique properties of our data by using AQI and the criteria pollutants PM2.5, PM10, SO2, CO, NO2 and O3 as outcomes in separate regressions and comparing the results. Third, we find effects remarkably similar to those of Thanh and Lefevre (2000), Slanina (2004) and Sonibare (2010); together, our studies provide compelling evidence of the pollution impacts of power generation.

3 Data

We merge data on electricity production, air pollution, the economy and weather into a dataset covering China's only four provincial cities, Beijing, Tianjin, Chongqing and Shanghai, from December 2013 to December 2019. We select this study area for the following reasons. These four cities have political and economic advantages in coping with air pollution, owing to their larger scale and higher administrative rank [20]. Within China, air pollution attracted attention earliest, and energy-generation technology has advanced fastest, in these four cities [21]. More importantly, Beijing, Tianjin, Chongqing and Shanghai represent different geographic and climatic conditions in China: typical north-inland, north-coastal, south-inland and south-coastal types, respectively [11]. As a result, they differ markedly in air pollution levels, weather conditions, energy structures and socioeconomic status. These distinct characteristics may result in disparities in the contemporaneous associations between power generation and air pollutants.

The combined dataset covers four provincial cities in China. For reference, the cities in our sample account for 5.25% of China's total population and 11.36% of national GDP. Table 1 gives a general description of the cities.

Table 1 Sample description

Air pollution data. The China National Environmental Monitoring Centre (CNEMC) maintains a database of historical city-level air quality in China, from which we obtain our air pollution data. The data analyzed in this paper comprise monthly city-level concentrations of PM2.5 (µg/m3), PM10 (µg/m3), SO2 (µg/m3), CO (mg/m3), NO2 (µg/m3) and O3 (µg/m3), together with AQI; for every measure, a higher figure indicates more serious air pollution. The CNEMC reports pollutant information by category based on daily reports from the monitoring points. We average the daily mean concentrations within each month of the study period to obtain the monthly mean concentration of each pollutant.
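The daily-to-monthly aggregation described above can be sketched as follows. This is a minimal illustration rather than the authors' actual pipeline: the record layout `(city, date, concentration)` and the function name `monthly_means` are assumptions made for the example, and the numbers are invented.

```python
from collections import defaultdict
from datetime import date

def monthly_means(daily_records):
    """Average daily mean concentrations into city-month means.

    daily_records: iterable of (city, date, concentration) tuples
    (a hypothetical layout; the real CNEMC export may differ).
    Returns {(city, year, month): mean concentration}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for city, day, value in daily_records:
        if value is None:  # skip days without a valid reading
            continue
        key = (city, day.year, day.month)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Toy input with made-up values, used only to show the call
records = [
    ("Beijing", date(2015, 12, 1), 100.0),
    ("Beijing", date(2015, 12, 2), 140.0),
    ("Beijing", date(2016, 1, 1), 80.0),
    ("Shanghai", date(2015, 12, 1), 60.0),
]
means = monthly_means(records)
```

Keying the result by `(city, year, month)` keeps it aligned with the city–month unit of observation used in the rest of the paper.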

Electricity generation data. Our second data source provides information on electricity generation, measured in units of 100 million kWh. The National Bureau of Statistics of China (NBSC) publishes nationwide monthly statistics that include total power output, thermal power output and renewable power output, the last comprising hydroelectric, nuclear, wind and solar power. Unfortunately, the electricity generation data are missing for January and February of each year, probably because of the Spring Festival. We therefore clean the data to ensure a balanced panel.
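The paper does not spell out its cleaning procedure. One plausible way to obtain a balanced panel, sketched below under that caveat, is to keep only the periods observed for every city, which drops the months with no published figure. The function name `balance_panel` and the dictionary layout are hypothetical, and the values are invented.

```python
def balance_panel(rows, cities):
    """Keep only (year, month) periods observed for every city.

    rows: dict mapping (city, year, month) -> value; periods with no
    published figure are simply absent (an assumed layout).
    """
    periods_by_city = {
        c: {(y, m) for (city, y, m) in rows if city == c} for c in cities
    }
    # Periods present in all cities
    common = set.intersection(*periods_by_city.values())
    return {k: v for k, v in rows.items() if (k[1], k[2]) in common}

# Toy input with made-up values (units: 100 million kWh)
cities = ["Beijing", "Tianjin"]
rows = {
    ("Beijing", 2014, 3): 30.1,
    ("Beijing", 2014, 4): 28.7,
    ("Tianjin", 2014, 3): 50.2,
    ("Tianjin", 2014, 4): 47.9,
    ("Tianjin", 2014, 5): 49.0,  # no Beijing counterpart in this toy set
}
balanced = balance_panel(rows, cities)
```

After this step every city has the same set of periods, which is the property the within transformation for fixed effects relies on.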

Economic data. We also use monthly city-level economic data: investment actually completed in fixed assets (accumulated growth rate), denoted IACFA growth; value added of industry (accumulated growth rate), denoted Va.I growth; and the purchasing price index for industrial producers (accumulated growth rate of overall prices for raw materials), denoted PPIIP.

Meteorological data. Lastly, we collect meteorological data from the China Meteorological Data Service Center (CMDC), developed by the China Meteorological Administration. Specifically, we extract surface climate data on monthly minimum temperature (°C), monthly precipitation days (days on which precipitation exceeds zero) and wind velocity (m/s, the average of 2-min measurements) for the four cities.

4 Model and identification

We estimate the following model to identify the effect of electricity generation on air pollution:

$$\text{pollution}_{\text{cm}}^{j} = \lambda_{\text{TPO}}^{j}\,\text{TPO}_{\text{cm}} + \lambda_{\text{THPO}}^{j}\,\text{THPO}_{\text{cm}} + \mathbf{X}_{\text{cm}}\boldsymbol{\beta}^{j} + \varphi_{\text{c}} + \gamma_{\text{m}} + \varepsilon_{\text{cm}}^{j} \tag{1}$$

where \(\text{pollution}_{\text{cm}}^{j}\) is the level of pollution measure j (AQI, PM2.5, PM10, SO2, CO, NO2 or O3) in city c in month m (an observation is a city–month), \(\text{TPO}_{\text{cm}}\) is total power output in city c in month m, \(\text{THPO}_{\text{cm}}\) is thermal power output in city c in month m, \(\mathbf{X}_{\text{cm}}\) is a vector of control variables (temperature, precipitation, wind velocity, investment actually completed in fixed assets, value added of industry and the purchasing price index for industrial producers), \(\varphi_{\text{c}} + \gamma_{\text{m}}\) combines a city fixed effect and a month fixed effect (referred to below as the city–month fixed effect), and \(\varepsilon_{\text{cm}}^{j}\) is a random error term.
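To make the role of the fixed effects concrete, the sketch below applies the standard two-way within (demeaning) transformation to a balanced panel and then computes the OLS slope on a single regressor. It is a simplified illustration, not the authors' estimation code: it omits the control vector, and the function names `two_way_within` and `fe_slope`, the data layout and all numbers are assumptions.

```python
from collections import defaultdict

def two_way_within(panel):
    """Two-way within transform for a balanced panel.

    panel: dict mapping (city, month) -> value.
    Returns value - city mean - month mean + grand mean for each cell.
    """
    by_city, by_month = defaultdict(list), defaultdict(list)
    for (c, m), v in panel.items():
        by_city[c].append(v)
        by_month[m].append(v)
    mean = lambda xs: sum(xs) / len(xs)
    city_mean = {c: mean(vs) for c, vs in by_city.items()}
    month_mean = {m: mean(vs) for m, vs in by_month.items()}
    grand = mean(list(panel.values()))
    return {(c, m): v - city_mean[c] - month_mean[m] + grand
            for (c, m), v in panel.items()}

def fe_slope(x, y):
    """OLS slope of y on one regressor x after removing city and month effects."""
    xt, yt = two_way_within(x), two_way_within(y)
    num = sum(xt[k] * yt[k] for k in xt)
    den = sum(xt[k] ** 2 for k in xt)
    return num / den

# Toy balanced panel (made-up numbers): y = 0.3 * x + city effect + month effect
x = {("A", 1): 1.0, ("A", 2): 2.0, ("A", 3): 4.0,
     ("B", 1): 3.0, ("B", 2): 5.0, ("B", 3): 6.0}
city_fx = {"A": 10.0, "B": 20.0}
month_fx = {1: 0.0, 2: 1.0, 3: -2.0}
y = {(c, m): 0.3 * x[(c, m)] + city_fx[c] + month_fx[m] for (c, m) in x}
slope = fe_slope(x, y)  # recovers 0.3 up to floating-point error
```

Because the toy outcome is built without noise, demeaning removes the city and month terms exactly and the slope equals the coefficient used to generate the data. With the control variables included, the same transformation would be followed by a multivariate regression.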

The explained variables are PM2.5, PM10, SO2, CO, NO2, O3 and AQI. The explanatory variables are total power output and thermal power output. The pollution variables are highly correlated with one another, so each enters a separate regression to avoid multicollinearity and endogeneity problems; for the same reason, total power output and thermal power output enter one at a time.

Electricity generation and air pollution may both be correlated with location-specific and time-varying unobservables. For example, pollution levels and power output may be correlated with city-level covariates such as traffic density, population density and demographics. Failing to control for such covariates will lead to biased estimates of \(\lambda_{{{\text{TPO}}}}^{j}\) and \(\lambda_{{{\text{THPO}}}}^{j}\). For this reason, we first assess endogeneity between electricity production and air pollution by comparing pooled regression results with the results of two fixed-effect regressions. We argue that the variation remaining across cities and months, conditional on weather and economic controls, is random and thus exogenous to air pollution.

Second, air pollution has been shown to respond to industry growth [22] and fixed-asset construction [23], which are commonly correlated with electricity generation. Failing to control adequately for these variables will therefore bias the estimates. To address this concern, we include investment actually completed in fixed assets, value added of industry and the purchasing price index for industrial producers as control variables in each specification.

We also include meteorological controls in our model to study and compare the relationships between the different types of electricity generation and air pollution levels, and we test robustness by adding or removing some of them. Wind, rain and temperature alter air quality through complicated mechanisms that have been discussed in several studies [24,25,26]. Notably, we use minimum temperature to avoid double counting temperature effects, because maximum temperature (and, in turn, average temperature) strongly affects the electricity demanded and generated for air cooling. We also run city-specific regressions to explore the relationship between air pollution and electricity production under different weather conditions.

5 Results

We first present summary statistics for each variable in our model and then the results of the unit root tests. Next, we present a series of specifications to demonstrate the strength and consistency of our primary model, using the individual pollutants and AQI as dependent variables and total power output or thermal power output as the core independent variable (one at a time). We then report the discrepancies between the pooled regression results and two sets of fixed-effect results, with city and with city–month fixed effects. Finally, we compare the coefficient estimates from the population models with those from the city-specific models.

5.1 Summary statistics

Table 2 reports summary statistics for each variable used in estimation. All variables are measured at monthly frequency. For example, the average monthly PM2.5 concentration is 55.68 μg/m3, while the maximum, 152 μg/m3, occurred in Beijing in December 2015.

Table 2 Summary statistics

AQI is a unitless index that summarizes the overall air pollution level in a city. CO concentration is measured in mg/m3, unlike the other pollutants (µg/m3), so the estimated CO coefficients in the following regressions are correspondingly smaller.

5.2 Unit root test

We run autoregressive unit root tests using four methods, namely the Levin–Lin–Chu (LLC) test, the Breitung test, the Hadri Lagrange multiplier (LM) test and the Im–Pesaran–Shin (IPS) test, to avoid the limitations of any single unit root test (Table 3). The null hypothesis of the LLC, Breitung and IPS tests is H0: panels contain unit roots; the Hadri LM test instead takes stationarity of all panels as its null. According to the results, only a few individual variables showed unit roots, and in only one or two tests, which is unlikely to produce non-stationary sequences. We therefore consider spurious regression unlikely, and the regression results below are consistent with this conclusion.
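For orientation, the LLC and IPS tests both build on an augmented Dickey–Fuller type regression estimated for each panel. The form below is the generic textbook version, not an equation taken from this paper; \(y_{\text{cm}}\) stands for any one of the series, \(p_{\text{c}}\) is a lag order and \(\alpha_{\text{c}}\) is a panel-specific intercept, all introduced here for illustration:

$$\Delta y_{\text{cm}} = \rho_{\text{c}}\, y_{\text{c},\text{m}-1} + \sum_{L=1}^{p_{\text{c}}} \theta_{\text{c}L}\, \Delta y_{\text{c},\text{m}-L} + \alpha_{\text{c}} + u_{\text{cm}}$$

A unit root corresponds to \(\rho_{\text{c}} = 0\). The LLC test restricts \(\rho_{\text{c}}\) to a common value across panels, whereas the IPS test allows it to differ by panel and averages the panel-specific t-statistics.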

Table 3 Unit root test

5.3 Population regression results

We run population regressions using a pooled regression model (Table 4) and fixed-effects (FE) models with a city fixed effect (Table 5) and a city–month fixed effect (Table 6). Total power output and thermal power output enter one at a time for comparison, while all the control variables presented in Table 1 are included in every regression. (Coefficients on the control variables are omitted to save space.)

Table 4 Pooled regression results
Table 5 Regression results (city FE)
Table 6 Regression results (city and month FE)

Table 4 displays the pooled regression results from estimating Eq. (1) with AQI and the air pollutants as outcome variables. Column 1 reports the estimates of \(\lambda_{{{\text{TPO}}}}^{j}\) and \(\lambda_{{{\text{THPO}}}}^{j}\) with j specified as AQI, column 2 those with j specified as PM2.5, and so on. Robust standard errors are clustered at the individual level so that inference is consistent under heteroskedasticity and autocorrelation. Table 5 presents the estimates with a city fixed effect and Table 6 those with a city–month fixed effect; in both, each cell reports the same estimator as the corresponding cell of Table 4, again with cluster-robust standard errors. Together, these results indicate a positive relationship between electricity generation and air pollution, with average marginal effects of varying size.

If air pollution (AQI or any individual pollutant) and electricity generation are both positively correlated with omitted unobservables, then the coefficient estimate in a given cell should decline from Table 4 to Table 5 to Table 6. In fact, the estimates do not decline substantially between Tables 4 and 6 when a city or a city–month fixed effect is introduced, indicating that most of the endogeneity is already accounted for by the control variables in our basic regression model.
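The logic of this comparison can be illustrated with a toy example. In the sketch below, one city has both higher power output and a higher unobserved pollution baseline, so the pooled slope is inflated and falls once city means are removed. The data are invented and the helper names `ols_slope` and `demean_by_group` are hypothetical; this is an illustration of the argument, not a reproduction of Tables 4 to 6.

```python
def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's mean (the one-way within transform)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

# Made-up data: city B has higher output and a higher unobserved baseline
city = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
baseline = {"A": 10.0, "B": 30.0}
y = [0.3 * xi + baseline[c] for xi, c in zip(x, city)]

pooled = ols_slope(x, y)  # contaminated by the omitted city baseline
city_fe = ols_slope(demean_by_group(x, city), demean_by_group(y, city))
```

In this constructed case the pooled slope is far above the within-city slope of 0.3. A pooled estimate that barely moves when fixed effects are added, as reported above, points the other way: the omitted city-level factors add little bias beyond what the controls already absorb.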

5.4 City-specific regression results

Our primary estimates suggest that monthly increases in power generation have a positive effect on air pollution in general. In this section, we investigate the effect of changes in power generation on each category of air pollution and explore the mechanisms driving our results in different cities. As before, all the control variables presented in Table 1 are included in every regression. (Coefficients on the control variables are omitted to save space.)

Table 7 displays the results of estimating our primary model (Eq. 1) city by city. Column 1 shows that the positive AQI effect is driven entirely by Tianjin, Chongqing and Shanghai, which suggests cleaner electricity generation in Beijing. Columns 2, 3 and 4 show that changes in electricity generation in Beijing and Tianjin do not have a statistically significant effect on PM2.5, PM10 or SO2, a subset of the criteria air pollutants. The positive CO effect is more significant in Beijing and Tianjin, and the positive NO2 effect is more noticeable in Tianjin and Shanghai. We also find a relationship between increases in power output and declining O3 in Beijing and Shanghai, which contributes to the negative estimates in column 7 of Tables 4, 5 and 6.

Table 7 City-by-city regression results

6 Discussion

6.1 Comparative impacts

We find that air pollution, as measured by AQI and by PM2.5, PM10, SO2, CO and NO2 levels, is positively correlated with power generation. These results corroborate previous findings that the energy sector is one of the main sources of air pollution [27]. The CO coefficients are positive but not significant, suggesting that most incomplete-combustion problems may have been resolved in the power industry. Indeed, researchers have found a close relationship between coal- and oil-based electricity production and CO2 emissions [28].

While the estimates for AQI and the other pollutants are positive, those for O3 are negative in almost all regression models. One possible explanation is that O3 and the other pollutants are driven by different mechanisms. For instance, volatile organic compounds may react with NO, preventing it from breaking down O3, and at the same time induce the formation of O3 [29]. Scientists have also speculated that reducing PM2.5 makes it easier for sunlight to penetrate the air, providing more energy for surface ozone production [30]. In any case, the significantly negative correlation between O3 and electricity production might point to new approaches for O3 reduction.

In the city-specific regressions, AQI is positively correlated with power generation to varying degrees across cities. Electricity generation is generally more polluting in Chongqing, Tianjin and Shanghai. Southern cities such as Chongqing and Shanghai produce more PM2.5, PM10 and SO2 emissions, while northern cities such as Beijing and Tianjin produce more CO emissions during electricity production. Effects on NO2 are more pronounced in coastal cities such as Tianjin and Shanghai, and the negative effect on O3 comes mainly from richer cities such as Beijing and Shanghai.

Differences between total power output and thermal power output are our next concern. In the four cities studied, thermal power accounts for more than 80% of total power output. Sustainable generation, including solar, wind and hydropower, is rare in our sample. While hydropower output reaches about 10–20 units (100 million kWh each, as in our dataset) in Chongqing in some months, sustainable power generation in the other cities is below 1 unit.

The regression results suggest that changes in electricity generation have consistently significant effects on air pollution, particularly for thermal power generation, which likely indicates highly polluting behavior. In fact, we find no relationship between air pollution (particularly PM2.5 and PM10) and other power output (total power output minus thermal power output), as our results tables also hint (the estimated coefficient on thermal power output is much higher than that on total power output for most pollutants); the exceptions are SO2, NO2 and O3. For SO2, one possible cause is the reported finding that the net SO2 emission intensity of thermal power generation has dropped significantly since 2006 [21]. The NO2 results indicate that even renewable generation (hydroelectric, nuclear, wind and solar power) may lead to some air pollution through other channels. This suggests that renewable generation might not be as clean as expected, which is inconsistent with some reported results [31, 32] and needs further consideration. Importantly, the effect of thermal power generation on air pollution is supported by every one of our regressions.
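The quantity called other power output above is a simple difference, and the thermal share follows from the same two series. A minimal sketch, assuming both inputs are in the paper's unit of 100 million kWh; the function name `decompose_output` and the sample values are hypothetical:

```python
def decompose_output(total, thermal):
    """Split total power output into thermal and non-thermal parts.

    Both inputs are in 100 million kWh, as in the paper's dataset.
    Returns (other_output, thermal_share).
    """
    if total <= 0 or thermal < 0 or thermal > total:
        raise ValueError("expected 0 <= thermal <= total and total > 0")
    other = total - thermal
    return other, thermal / total

# Made-up city-month values, for illustration only
other, share = decompose_output(total=50.0, thermal=45.0)
```

When the thermal share is as high as the paper reports (above 80%), the two regressors move closely together, which is consistent with entering them one at a time.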

6.2 Policy implications

Our results also carry policy implications. In particular, the positive correlation between air pollution (AQI and PM2.5, PM10, SO2, CO, NO2) and electricity production suggests that power plants are not as clean as expected after implementation of the 2013 Clean Air Action Plan. Our city-specific results show that the problem is much more serious in Shanghai. Researchers and policy makers can thus expect air pollution to increase even when electricity production rises only slightly. Beijing's power plants are cleaner than those of the other three cities despite its vulnerable climate conditions, which implies that technology rather than geography may matter most for cleaner production in the power industry. Chongqing supports this view: thermal power contributes a smaller share of total power output there than in the other cities, yet air pollution is still highly correlated with electricity production.

Given that O3 is negatively correlated and the other pollutants are positively correlated with power generation at the same time, joint control of air pollutants is also extremely important for clean air actions in the power industry. After years of effort, particulate matter levels have fallen in many Chinese cities, which was the government's initial concern, but rising O3 levels have now become a new problem [33]. Hence, cleaner-production measures in the power industry need to control O3 and the other pollutants simultaneously. Moreover, because pollutants such as CO (the product of incomplete combustion) and CO2 (the product of complete combustion) relate to electricity generation in different and complicated ways, joint control of air pollutant and CO2 emissions needs to be explored [34].

6.3 Robustness, limitations and future work

Model misspecification is a concern given the complex relationships among power generation, climate conditions, economic development and air pollution. We test the robustness of our results by estimating a variety of alternative specifications and including alternative sets of fixed effects, all estimated by ordinary least squares (OLS). In each specification, we focus only on total power output and thermal power output as the core independent variables.

We begin by comparing estimates using AQI with estimates using the six criteria pollutants in each of our regression approaches. The results are displayed in Tables 4, 5 and 6. The coefficients are consistently positive and statistically significant, except for O3, as discussed above. When we restrict our sample to individual cities, the estimates are similar to those from the population regressions. This indicates that the estimation process does not introduce significant additional measurement error. Our results confirm a consistent response of air pollution to changes in electricity production.

Despite our efforts to quantify the polluting impacts of power generation, this study has limitations and uncertainties that need further investigation. Perhaps the most troubling issue with our data is the lack of geographic coverage: 30 of China's 34 province-level administrative regions are absent from our analysis. Because the CNEMC data report air pollution at the city level, it is difficult to generate pollution estimates directly for provinces that contain several cities. Future research might explore alternative methods for matching provincial data on air pollution and electricity production, or collect city-level power generation data, to obtain more widely applicable results.

Uncertainty within our framework falls into two categories. The first is the limited geographic coverage of the dataset, discussed above. Second, although we chose OLS as our basic regression model after comparing several functional forms, we cannot be certain it is optimal without an exhaustive search. More importantly, the theoretical mechanism implied by our functional forms needs to be specified more precisely. We expect future environmental and economic studies to fill this gap.

7 Conclusion

This paper identifies the effect of changes in electricity production on air pollution levels. We draw on four primary monthly, city-level data sources spanning December 2013 to December 2019. First, we use air pollution measures (AQI and PM2.5, PM10, SO2, CO, NO2, O3) from the CNEMC. Second, we obtain monthly power output series from the NBSC. Third, we include monthly wind, temperature and precipitation data. Fourth, we use economic data, also from the NBSC. Our identification strategy employs an OLS model for panel data with different fixed effects, and we perform several tests with alternative dependent and independent variables to ensure that our results are not confounded by variation in the economy or weather.

Our primary findings are that a one-unit increase in power output (all units are given in Table 2) is associated with a 0.3-unit increase in AQI, a 0.2-unit increase in PM2.5, a 0.2-unit increase in PM10, a 0.1-unit increase in SO2 and a 0.14-unit increase in NO2 per city-month, nearly all of which is driven by increases in thermal power output. In contrast, changes in sustainable electricity generation have no statistically significant effect on air pollution, which indicates that thermal power remains a dirty form of production that raises multiple contaminants. We estimate city-specific average marginal effects, which show that the effects differ statistically across cities. We find evidence that our results are robust to several tests and alternative specifications. Overall, our results suggest a positive relationship between electricity production and air pollution, which highlights a city-specific external cost of power generation that is currently absent from policy discussions.