Introduction

Over the past decades, China has witnessed rapid economic development, industrial expansion, and urbanization, which have led to serious air pollution (Huang et al. 2014). Regional air pollution caused by fine particulate matter (PM2.5, with an aerodynamic diameter smaller than 2.5 μm) occurs frequently in several key regions of China, e.g. Beijing-Tianjin-Hebei (BTH) (Gao et al. 2018; Ding et al. 2019), Yangtze River Delta (YRD) (Ming et al. 2017), Sichuan Basin (Tian et al. 2019), and Fenwei Plain (Zhai et al. 2019; Cao and Cui, 2021). A high level of PM2.5 damages human health, impacts regional climate changes, and affects agricultural ecosystems (Feng et al. 2018; Cohen et al. 2017; Zhou et al. 2018). Frequent haze and increasing PM2.5 concentrations have affected sustainable socioeconomic development, which draws public anxiety and extensive concerns. To address this difficult problem, many studies have been conducted by scientists to understand the levels, distribution, and sources of regional pollution (Liang et al. 2019; Ye et al. 2018). In response to extremely severe and persistent haze pollution, many government-backed measures have been taken to reduce haze events and improve air quality.

Recent studies have mostly focused on metropolises and regions in northern China, such as Beijing, Tianjin, Shijiazhuang, and BTH, to clarify the spatial and temporal distribution of PM2.5 (Xu et al., Xu and Zhang, 2020; Yan et al. 2018). In contrast, only a few studies have focused on the Central Plains Urban Agglomeration (CPUA) region, which is an important growth pole of China’s economy and is burdened by serious air pollution problems (Fu et al. 2020). As the largest urban agglomeration in Central China, CPUA includes 30 prefecture-level cities, namely all 18 cities in Henan Province; Changzhi, Jincheng, and Yuncheng in Shanxi Province; Liaocheng and Heze in Shandong Province; Huaibei, Bengbu, Suzhou, Fuyang, and Bozhou in Anhui Province; and Xingtai and Handan in Hebei Province (Liu et al. 2019; Fu et al. 2020). According to previous studies, Zhengzhou (the capital city of Henan Province), Jiaozuo, Pingdingshan, Xinxiang, and Anyang in Henan Province were all heavily polluted by PM2.5, especially in autumn and winter (Feng et al. 2018; Jiang et al. 2018; Liu et al. 2021; Wang et al. 2021; Su et al. 2021). Xingtai and Handan in Hebei Province are two typical industrial cities with massive anthropogenic emissions and high PM2.5 concentrations (Yang et al. 2018; Liu et al. 2020). Yuncheng in Shanxi Province is an energy-intensive city that frequently experiences haze. Fu et al. (2020) reported that in CPUA, the health effect damage of PM2.5 pollution was 11.1 million, and the associated economic loss was 94.5 billion RMB in 2017. However, studies on the spatiotemporal variations in PM2.5 from the perspective of CPUA as a whole are rare. Since 2013, the Chinese government has been firmly committed to reducing PM2.5 concentrations and has taken a series of measures to address air pollution (Chen et al. 2019; Jiang et al. 2020). Under such circumstances, the PM2.5 concentration in CPUA has varied in recent years. Therefore, it is urgent to conduct relevant research to understand the PM2.5 characteristics and sources in CPUA.

The present study aims to investigate the spatial and temporal variations and the potential geographical source of PM2.5 in CPUA. Multiple transdisciplinary methods, including classical statistics, geographical analysis, spatial statistics, and potential source analysis, are employed in this study. Specifically, we (1) systematically demonstrate annual, seasonal, monthly, and diurnal variations in PM2.5, (2) reveal the spatial distribution and variation, and (3) identify the potential geographical source regions and regional transport of PM2.5.

Materials and methods

Study area and data sources

The CPUA (31.4° N ~ 37.8° N, 110.2° E ~ 118.2° E) is located in central and eastern China (Fig. 1a) and includes 30 cities (Fig. 1c). It covers 287,000 km² and has a population of more than 160 million. CPUA is seriously constrained by resources and by the population agglomeration caused by its rapid urbanization, which continues to put pressure on the ecological environment in this area and causes high aerosol loading (Shen et al. 2019).

Fig. 1

a Location of the Central Plains Urban Agglomeration (CPUA) in China, b The mean aerosol optical depth (AOD) in China during 2017. The AOD values were retrieved from the MYD04_L2_C6 product. c Cities of CPUA (AYA: Anyang, BBU: Bengbu, BZH: Bozhou, CZH: Changzhi, FYA: Fuyang, HBE: Huaibei, HBI: Hebi, HDA: Handan, HZE: Heze, JCH: Jincheng, JYU: Jiyuan, JZU: Jiaozuo, KFE: Kaifeng, LCH: Liaocheng, LHE: Luohe, LYA: Luoyang, NYA: Nanyang, PDS: Pingdingshan, PYA: Puyang, SMX: Sanmenxia, SQI: Shangqiu, SZH: Suzhou, XCH: Xuchang, XTA: Xingtai, XXI: Xinxiang, XYA: Xinyang, YCH: Yuncheng, ZKO: Zhoukou, ZMD: Zhumadian, ZZH: Zhengzhou)

A total of approximately 4.3 million data points were collected from January 1, 2017, to December 31, 2020. The data comprise the hourly monitoring values of PM2.5 from 29 of the 30 aforementioned cities. Due to the absence of a national ambient monitoring station in JYU, the PM2.5 concentrations in JYU were represented by the average PM2.5 concentrations of its surrounding cities (JCH, JZU, LYA, SMX, and YCH). All the data were collected from the Data Centre of the PRC Ministry of Ecology and Environment (http://datacenter.mep.gov.cn) and the National Urban Air Quality Real-Time Publishing Platform, China Environmental Monitoring Station (http://106.37.208.233:20035/). At national ambient monitoring stations, the mass concentration of PM2.5 is measured by using the beta absorption method and the micro oscillating balance method. Daily, monthly, seasonal (spring: March to May, summer: June to August, autumn: September to November, winter: January, February, and December), and annual mean PM2.5 concentrations for the cities and the urban agglomeration were obtained as arithmetic means. The study period includes the coronavirus disease 2019 (COVID-19) pandemic, during which a stringent lockdown was imposed in China to reduce the infection rate. The resulting distinct decrease in anthropogenic emissions led to an improvement of air quality in CPUA.
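The averaging scheme described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names and the record layout `(year, month, pm25)` are hypothetical, and winter is assumed to group January, February, and December of the same calendar year, as the season definition in the text suggests.

```python
from collections import defaultdict
from statistics import mean

# Season definition taken from the text. Assumption: winter combines
# January, February, and December of the same calendar year.
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}


def seasonal_means(records):
    """Arithmetic seasonal means of PM2.5.

    records: iterable of (year, month, pm25) tuples, pm25 in ug/m3.
    Returns a dict mapping (year, season) to the mean concentration.
    """
    groups = defaultdict(list)
    for year, month, pm25 in records:
        if pm25 is None:  # skip missing observations
            continue
        groups[(year, SEASON_BY_MONTH[month])].append(pm25)
    return {key: mean(values) for key, values in groups.items()}


def surrogate_mean(neighbour_values):
    """Mean of neighbouring-city values, as used for JYU in the text."""
    return mean(neighbour_values)
```

Daily, monthly, and annual means follow the same pattern with a different grouping key.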

Spatiotemporal variation analysis methods

Empirical orthogonal function analysis

The empirical orthogonal function (EOF), which is a form of principal component analysis, can be used to decompose space–time data into a set of orthogonal standing signals (Xu et al. 2019). EOF detects both the spatial and temporal patterns of variability and measures the contribution of each pattern. A matrix Xmn (m is the number of sites and n is the number of sampling times) can be decomposed by EOF into the product of the orthogonal space matrix V (m × m) and the orthogonal time matrix T (m × n):

$${X}_{mn}=VT=\left(\begin{array}{ccc}{v}_{11}& \cdots & {v}_{1m}\\ \vdots & \ddots & \vdots \\ {v}_{m1}& \cdots & {v}_{mm}\end{array}\right)\left(\begin{array}{ccc}{t}_{11}& \cdots & {t}_{1n}\\ \vdots & \ddots & \vdots \\ {t}_{m1}& \cdots & {t}_{mn}\end{array}\right)$$
(1)
$$X{X}^{T}=VT{T}^{T}{V}^{T}=V\Lambda {V}^{T}$$
(2)

where the superscript T denotes the matrix transpose (not to be confused with the time matrix T), Λ is a diagonal matrix composed of the eigenvalues of \(X{X}^{T}\), and V is the matrix composed of the corresponding eigenvectors.

Thus, the time coefficient can be defined as:

$$T={V}^{T}X$$
(3)

In this research, we use the EOF analysis method to decompose the monthly PM2.5 concentrations of the CPUA.
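As an illustration of Eqs. (1) to (3), the leading EOF mode can be sketched in plain Python. This is not the authors' implementation: it swaps a full eigendecomposition for power iteration, which recovers only the first mode, and it assumes that each site's time mean is removed before decomposition (the text does not state the preprocessing). The function name `leading_eof` is hypothetical.

```python
import math


def leading_eof(X, iterations=500):
    """Leading EOF mode of a site-by-time matrix via power iteration.

    X: list of m rows (sites), each holding n values (sampling times).
    Returns (v, t, explained): the leading spatial pattern (one column of V),
    its time coefficients (one row of T = V^T X), and the fraction of total
    variance explained by this mode.
    """
    m, n = len(X), len(X[0])
    # Assumption: work with anomalies (each site's time mean removed).
    A = [[x - sum(row) / n for x in row] for row in X]
    # C = A A^T, the m x m matrix whose eigenvectors form V (Eq. 2).
    C = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)]
         for i in range(m)]
    # Uniform start vector; it must not be orthogonal to the leading mode.
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(C[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(C[i][j] * v[j] for j in range(m))
                     for i in range(m))
    total = sum(C[i][i] for i in range(m))
    # Time coefficients of the leading mode (Eq. 3).
    t = [sum(v[i] * A[i][k] for i in range(m)) for k in range(n)]
    return v, t, (eigenvalue / total if total else 0.0)
```

For all modes at once, a symmetric eigensolver or a singular value decomposition from a numerical library would normally be used instead.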

Calculation of centroid migration

The centroid migration of PM2.5 concentration in a region, namely the arithmetic mean position of all the points, can reflect the development characteristics on a spatiotemporal scale, which can help policy-makers better understand highly polluted areas (Jiang et al. 2020; Su et al. 2020). The PM2.5 pollution centroid of the CPUA is expressed as follows:

$$X=\frac{{\sum }_{i=1}^{n}{X}_{i}{P}_{i}}{{\sum }_{i=1}^{n}{P}_{i}},\quad Y=\frac{{\sum }_{i=1}^{n}{Y}_{i}{P}_{i}}{{\sum }_{i=1}^{n}{P}_{i}}$$
(4)

where X and Y denote the longitude and latitude of the PM2.5 pollution centroid, respectively; Xi and Yi are the longitude and latitude of the geographical centroid of city i, respectively; Pi is the mass concentration of PM2.5 in city i; and n is the number of cities in the CPUA, i.e. 30.
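Equation (4) is a concentration-weighted mean of the city coordinates. A minimal sketch follows; the function name and the sample coordinates in the usage note are illustrative only and are not values from the study.

```python
def weighted_centroid(lons, lats, pm25):
    """Concentration-weighted centroid following Eq. (4).

    lons, lats: longitude and latitude of each city centroid (degrees).
    pm25: PM2.5 mass concentration of each city, in the same order.
    Returns (X, Y), the longitude and latitude of the pollution centroid.
    """
    total = sum(pm25)
    if total == 0:
        raise ValueError("sum of concentrations must be non-zero")
    x = sum(lon * p for lon, p in zip(lons, pm25)) / total
    y = sum(lat * p for lat, p in zip(lats, pm25)) / total
    return x, y
```

For two hypothetical cities at (110° E, 32° N) and (114° E, 36° N) with concentrations of 50 and 150 μg/m3, the centroid lies at (113° E, 35° N), that is, three quarters of the way towards the more polluted city.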

Global and local spatial autocorrelation analysis

The spatial distribution of PM2.5 involves complex spatiotemporal and geospatial processes. Previous studies indicated that PM2.5 pollution displays some spatial autocorrelation in geographical space (Cheng et al. 2017; Shen et al. 2019; Liu et al. 2017; Ye et al. 2018). In this study, the global Moran’s I was employed to discuss the degree of global autocorrelation of PM2.5 in the CPUA, and the local Moran’s I was used to determine the local spatial autocorrelation and agglomeration patterns. The global Moran’s I is expressed as:

$${I}_{\mathrm{Global}}=\frac{\sum_{i=1}^{n}{\sum }_{j\ne i}^{n}{W}_{ij}\left({x}_{i}-\overline{x}\right)\left({x}_{j}-\overline{x}\right)}{{S}^{2}{\sum }_{i=1}^{n}{\sum }_{j\ne i}^{n}{W}_{ij}}$$
(5)

where n represents the number of cities; xi and xj are the observed PM2.5 concentrations at spatial locations i and j, respectively; \(\overline{x}=\frac{1}{n}\sum_{i=1}^{n}{x}_{i}\); and S denotes the standard deviation of the samples. In the calculation, the rook contiguity matrix was used to determine the spatial weight Wij between cities i and j.

The Z value was used to test the significance of spatial autocorrelation and is calculated as follows:

$$Z=\frac{I-E\left(I\right)}{\sqrt{VAR(I)}}$$
(6)

where E(I) and VAR(I) are the expected value and variance of Moran’s I, respectively.

The range of IGlobal is [-1, 1], and the higher the absolute value is, the stronger the spatial autocorrelation. A positive (negative) value of IGlobal suggests a positive (negative) spatial correlation, while an IGlobal of 0 indicates no spatial autocorrelation.
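Equations (5) and (6) can be sketched as follows. Two points are assumptions rather than details from the text: S² is taken as the population variance (dividing by n), and the Z value is estimated from random permutations instead of the analytical expressions for E(I) and VAR(I), which the text does not spell out. The function names are hypothetical, and `weights` stands for a binary rook contiguity matrix.

```python
import random


def global_morans_i(values, weights):
    """Global Moran's I following Eq. (5).

    values: one PM2.5 value per city.
    weights: n x n spatial weight matrix (1 for neighbours, 0 otherwise).
    """
    n = len(values)
    mean_x = sum(values) / n
    dev = [x - mean_x for x in values]
    s2 = sum(d * d for d in dev) / n  # assumption: population variance
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    numerator = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    w_sum = sum(weights[i][j] for i, j in pairs)
    return numerator / (s2 * w_sum)


def permutation_z(values, weights, permutations=999, seed=0):
    """Z value of Eq. (6), with E(I) and VAR(I) estimated by permutation."""
    rng = random.Random(seed)
    observed = global_morans_i(values, weights)
    shuffled = list(values)
    simulated = []
    for _ in range(permutations):
        rng.shuffle(shuffled)
        simulated.append(global_morans_i(shuffled, weights))
    expected = sum(simulated) / len(simulated)
    variance = sum((s - expected) ** 2 for s in simulated) / len(simulated)
    return (observed - expected) / variance ** 0.5
```

For four hypothetical cities arranged in a chain with values 1, 2, 3, and 4, `global_morans_i` returns 1/3, a positive autocorrelation, because similar values sit next to each other.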

The local Moran’s I can reveal the features of an urban spatial agglomeration within a region and is calculated as:

$${I}_{\mathrm{Local}}=\frac{{x}_{i}-\overline{x}}{{S}^{2}}{\sum }_{j=1, j\ne i}^{n}{W}_{ij}\left({x}_{j}-\overline{x}\right)$$
(7)

where xi, xj, S², \(\overline{x}\), and Wij are as defined above. Based on the calculated ILocal, the spatial association modes can be classified into four types (Su et al. 2020): high-high clustering type (hereinafter HH), low-low clustering type (LL), low-high clustering type (LH), and high-low clustering type (HL). In this study, the HH (LL) type suggests that cities with high (low) PM2.5 concentrations are surrounded by others with high (low) PM2.5 concentrations. The LH (HL) type suggests that cities with low (high) PM2.5 concentrations are surrounded by others with high (low) PM2.5 concentrations. Cities whose ILocal fails the significance test are classified as not significant.
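The local statistic of Eq. (7) and the four association types can be sketched as below. The sketch omits the significance test mentioned in the text, so every city receives a label; the label "undefined" for a zero deviation or zero spatial lag is an addition for completeness. S² is again assumed to be the population variance, and the function name is hypothetical.

```python
def local_morans(values, weights):
    """Local Moran's I (Eq. 7) with a quadrant label for each city.

    Returns a list of (I_local, label) tuples, where label is one of
    "HH", "LL", "LH", "HL", or "undefined". No significance test is applied.
    """
    n = len(values)
    mean_x = sum(values) / n
    dev = [x - mean_x for x in values]
    s2 = sum(d * d for d in dev) / n  # assumption: population variance
    results = []
    for i in range(n):
        # Spatial lag: weighted sum of the neighbours' deviations.
        lag = sum(weights[i][j] * dev[j] for j in range(n) if j != i)
        i_local = dev[i] / s2 * lag
        if dev[i] > 0 and lag > 0:
            label = "HH"
        elif dev[i] < 0 and lag < 0:
            label = "LL"
        elif dev[i] < 0 and lag > 0:
            label = "LH"
        elif dev[i] > 0 and lag < 0:
            label = "HL"
        else:
            label = "undefined"
        results.append((i_local, label))
    return results
```

In practice the labels would be kept only for cities whose local statistic passes a significance test, for example a conditional permutation test.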

Kernel density estimation

Kernel density estimation was employed to estimate the probability density function of PM2.5 concentrations. The kernel density estimator is defined as:

$$f(x)=\frac{1}{nh}\sum_{i=1}^{n}K\left(\frac{{x}_{i}-x}{h}\right)$$
(8)

where n denotes the number of samples, h is the bandwidth, and K is the kernel weighting function. As in previous studies (Jiang et al. 2020), the Epanechnikov kernel and Silverman’s bandwidth were used in the present study.
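A minimal sketch of Eq. (8) with the Epanechnikov kernel is given below. The text does not say which form of Silverman's bandwidth was used; the rule of thumb h = 0.9 · min(σ, IQR/1.34) · n^(-1/5) is assumed here, and the function names are hypothetical.

```python
from statistics import quantiles, stdev


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def silverman_bandwidth(samples):
    """Silverman's rule-of-thumb bandwidth (assumed variant, see lead-in)."""
    n = len(samples)
    q1, _, q3 = quantiles(samples, n=4)
    spreads = [s for s in (stdev(samples), (q3 - q1) / 1.34) if s > 0]
    if not spreads:
        raise ValueError("samples have zero spread")
    return 0.9 * min(spreads) * n ** (-0.2)


def kde(x, samples, h=None):
    """Kernel density estimate f(x) following Eq. (8)."""
    if h is None:
        h = silverman_bandwidth(samples)
    n = len(samples)
    return sum(epanechnikov((xi - x) / h) for xi in samples) / (n * h)
```

Evaluating `kde` on a grid of concentrations for each year's set of city means gives curves of the kind compared across years in the results.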

Pollution transport analysis

Backward trajectory

Backward trajectory analysis can be used to identify the potential transport pathways of air masses (Liu et al. 2021). Using the Hybrid Single Particle Lagrangian Integrated Trajectory (HYSPLIT) model (Stein et al. 2015), 72-h back trajectories starting at an arrival level of 100 m from the different cities were calculated in autumn and winter of 2017 ~ 2020. The backward trajectory model was run every hour of the day. About 17,000 trajectories were obtained. FNL global analysis data were obtained from the National Centre for Environmental Prediction’s Global Data Assimilation System (GDAS) wind field reanalysis (http://www.arl.noaa.gov/) to drive the HYSPLIT model. Then, the backward trajectories having similar geographic origins and histories were classified by k-means clustering (Stunder, 1996).

Potential sources analysis

The potential source contribution function (PSCF) is a conditional probability describing trajectories with pollutant concentrations larger than a given threshold passing through the receptor site (Liu et al. 2019). PSCF is defined as:

$${PSCF}_{ij}=\frac{{m}_{ij}}{{n}_{ij}}$$
(9)

where nij denotes the total number of trajectory endpoints falling in the ijth cell, and mij is the number of trajectory endpoints with pollutant concentrations higher than the threshold criterion in the same cell. The uncertainty of PSCF increases when nij is small. Therefore, the PSCF value is multiplied by a weighting function Wij to reduce the uncertainty (Zhang et al. 2017; Liu et al. 2019). Wij is defined as follows:

$${W}_{ij}=\left\{\begin{array}{ll}1.00,& {n}_{ij}>3{n}_{ave}\\ 0.70,& 1.5{n}_{ave}<{n}_{ij}\le 3{n}_{ave}\\ 0.40,& {n}_{ave}<{n}_{ij}\le 1.5{n}_{ave}\\ 0.17,& {n}_{ij}\le {n}_{ave}\end{array}\right.$$
(10)

where nave denotes the average number of endpoints in each grid cell. In this study, the trajectories associated with hourly PM2.5 concentrations were used for PSCF analysis, with the threshold criterion being set at 75 μg/m3.
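Equations (9) and (10) can be sketched as follows. The input layout is hypothetical: each trajectory endpoint is given as `(lon, lat, pm25)`, where `pm25` is the receptor-site concentration associated with that trajectory. It is also an assumption that n_ave is averaged over cells containing at least one endpoint. The default cell size of 0.2° matches the resolution reported for the PSCF domain in the results.

```python
import math
from collections import Counter


def pscf_weight(n_ij, n_ave):
    """Weighting function W_ij of Eq. (10)."""
    if n_ij > 3 * n_ave:
        return 1.00
    if n_ij > 1.5 * n_ave:
        return 0.70
    if n_ij > n_ave:
        return 0.40
    return 0.17


def weighted_pscf(endpoints, threshold=75.0, cell_size=0.2):
    """Weighted PSCF per grid cell (Eqs. 9 and 10).

    endpoints: iterable of (lon, lat, pm25) tuples (hypothetical layout).
    Returns a dict mapping cell index (ix, iy) to W_ij * m_ij / n_ij.
    """
    n = Counter()  # all endpoints per cell
    m = Counter()  # endpoints of trajectories above the threshold
    for lon, lat, pm25 in endpoints:
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        n[cell] += 1
        if pm25 > threshold:
            m[cell] += 1
    # Assumption: average over cells that contain at least one endpoint.
    n_ave = sum(n.values()) / len(n)
    return {cell: pscf_weight(n[cell], n_ave) * m[cell] / n[cell]
            for cell in n}
```

The down-weighting matters most at the edge of the domain, where a cell crossed by a single polluted trajectory would otherwise receive a PSCF of 1.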

Spatiotemporal variations of PM2.5 pollution

Temporal variations of PM2.5 pollution

Annual variation

Overall, the PM2.5 concentrations of CPUA decreased gradually. From 2017 to 2020, the yearly mean concentrations of PM2.5 were 68.3, 61.5, 58.7, and 51.5 μg/m3, respectively. The annual PM2.5 concentration was reduced by 24.7% over the 4 years. This suggests that the drastic measures aimed at improving air quality in CPUA, e.g. pollution emissions reduction, coal combustion control, and clean energy use, worked well. Notably, the reductions in anthropogenic emissions due to stringent quarantine and lockdown measures during the COVID-19 pandemic in 2020 likely also contributed to the improved air quality in CPUA. Du et al. (2021) revealed that PM2.5 in Zhengzhou decreased by 19% in response to the COVID-19 lockdown. However, the annual mean concentration still significantly exceeded the limit of the China National Ambient Air Quality Standards Grade II (CAAQS Grade II; 35 μg/m3). The estimated kernel density of annual mean PM2.5 concentrations is displayed in Fig. 2. From 2017 to 2020, the peaks of the kernel density curves steepened and gradually moved to the left, demonstrating that concentrations of PM2.5 decreased in most cities of the CPUA. The area under the density curves between 60 and 80 μg/m3 decreased with time, which indicates that the decline in the number of cities with high PM2.5 concentrations contributed to the alleviation of PM2.5 pollution in CPUA.

Fig. 2

Kernel density estimates of annual mean PM2.5 concentrations from 2017 to 2020

Seasonal variation

The seasonal means of PM2.5 concentrations from 2017 to 2020 are shown in Fig. 3. PM2.5 mass concentrations exhibited clear seasonal variation, with the highest concentration in winter (100.8 μg/m3, 4-year average), followed by autumn (54.4 μg/m3) and spring (51.4 μg/m3), and the lowest in summer (33.5 μg/m3). This variation is similar to that in other cities in North China (Shen et al. 2020). In winter, less precipitation leads to weakened wet scavenging. Meanwhile, weak winds and shallow planetary boundary layer heights cause a stable atmospheric structure, which hinders the dilution and diffusion of pollutants (Wang et al. 2019; Fan et al. 2021). In addition, most of the cities in the CPUA burn coal for heating in winter, emitting large amounts of anthropogenic air pollutants (Wang et al. 2007). Under the combined effect of these factors, the CPUA suffers from serious PM2.5 pollution in winter. The largest decrease in PM2.5 concentration appeared in summer, with a reduction of 36%, from 43.1 μg/m3 in 2017 to 27.5 μg/m3 in 2020. In contrast, the winter PM2.5 concentration decreased by only 18.4%, from 110.1 μg/m3 in 2017 to 89.8 μg/m3 in 2020. In the winter of 2019, the PM2.5 concentration even increased. This indicates that PM2.5 reduction measures taken in winter need to be more intensive than those in other seasons.

Fig. 3

Seasonal average values of PM2.5 concentrations from 2017 to 2020

Monthly variation

The monthly variations in PM2.5 concentrations in the CPUA from 2017 to 2020 are illustrated in Fig. 4. According to the daily PM2.5 concentrations, the classification categories were designated as follows: excellent (PM2.5 ≤ 35 μg/m3), fine (35 < PM2.5 ≤ 75 μg/m3), slight pollution (75 < PM2.5 ≤ 115 μg/m3), moderate pollution (115 < PM2.5 ≤ 150 μg/m3), heavy pollution (150 < PM2.5 ≤ 250 μg/m3), and severe pollution (PM2.5 > 250 μg/m3). The monthly average PM2.5 concentrations exhibited a U-shaped trend, with the highest value in January (119.8 μg/m3) and the lowest value in August (30.7 μg/m3). In June, July, August, and September, the daily PM2.5 concentrations were all under 75 μg/m3. In January, February, November, and December, the frequencies of days with heavy pollution were 29.8%, 12.4%, 3.3%, and 9.3%, respectively.
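The six categories above map directly onto threshold checks on the daily mean concentration. The sketch below is illustrative: the function names are hypothetical, and the frequency helper assumes that "heavy pollution" refers to the 150 to 250 μg/m3 category only, not to heavy or worse.

```python
def pm25_category(daily_mean):
    """Classify a daily mean PM2.5 concentration (ug/m3) as in the text."""
    if daily_mean <= 35:
        return "excellent"
    if daily_mean <= 75:
        return "fine"
    if daily_mean <= 115:
        return "slight pollution"
    if daily_mean <= 150:
        return "moderate pollution"
    if daily_mean <= 250:
        return "heavy pollution"
    return "severe pollution"


def category_frequency(daily_means, category="heavy pollution"):
    """Share of days (in percent) that fall into the given category."""
    days = list(daily_means)
    hits = sum(1 for value in days if pm25_category(value) == category)
    return 100.0 * hits / len(days)
```

Applying `category_frequency` to the daily means of one calendar month gives frequencies of the kind reported for January, February, November, and December.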

Fig. 4

Monthly average values of PM2.5 concentrations from 2017 to 2020

The EOF method was employed to decompose the monthly mean PM2.5 concentrations of the cities in the CPUA. The first EOF mode accounted for 92.1% of the total variance, which suggests that a single mode captures most of the variability in PM2.5. In this study, we paid more attention to the time coefficients decomposed by EOF, which reflect the variation trend of monthly PM2.5 concentrations in CPUA. The decomposed time coefficients were standardized to have zero mean and unit variance (Jiang et al. 2020). As shown in Fig. 5, the time coefficients in each of the 4 years displayed a U-shaped curve, which is similar to the variation trend of PM2.5 mass concentrations. In winter, most of the time coefficients fall outside the range [-1, 1], indicating extreme events, namely high PM2.5 pollution. In February, there were clearly fewer extreme events in 2020 than in other years, which could be due to the lockdown imposed in response to the outbreak of COVID-19 (Silver et al. 2020; Kwak et al. 2021).

Fig. 5

Time coefficients of the monthly average PM2.5 concentrations from 2017 to 2020

Diurnal variation

As displayed in Fig. 6, the diurnal variations in PM2.5 in the four seasons had low concentrations during the daytime and high concentrations during the nighttime, which was related to the diurnal changes in the boundary layer (Liu et al. 2021). The PM2.5 concentration had a peak at 9:00 in spring, 8:00 in summer, 10:00 in autumn, and 11:00 in winter, corresponding to anthropogenic emissions in rush hours (Wang et al. 2016). From 00:00 to 6:00, the PM2.5 concentration decreased slowly in winter but increased in other seasons. PM2.5 exhibited the lowest concentrations at 16:00 ~ 17:00, which could be related to the highest planetary boundary layer height (PBLH) occurring at this time.

Fig. 6

Diurnal average values of PM2.5 concentrations from 2017 to 2020

Spatial distributions of PM2.5 pollution

Spatial variation of PM2.5

Figure 7 displays the distribution of PM2.5 in CPUA. It was clear that the spatial distribution of PM2.5 was heterogeneous. In 2017, the annual mean PM2.5 concentrations of Handan, Anyang, Xingtai, and Jiaozuo exceeded 75 μg/m3. These highly polluted cities are all characterized by industry. In addition, Luoyang, Suzhou, Liaocheng, Yuncheng, Zhengzhou, Heze, and Puyang also have higher PM2.5 concentrations. PM2.5 concentrations presented a gradual decline over time, presenting a convergence trend. From 2017 to 2020, cities with PM2.5 concentrations above 60 μg/m3 accounted for 89.7%, 62.1%, 44.8%, and 34.5% of all cities in the CPUA, respectively. The three cities with the highest reduction in PM2.5 concentration were Suzhou (37.2%), Xingtai (34.2%), and Handan (33.6%) from 2017 to 2020. In 2020, cities in the southeastern CPUA, e.g. Xinyang, had low PM2.5 concentration levels.

Fig. 7

Spatial distributions of yearly average values of PM2.5 concentrations from 2017 to 2020

Centroid migration route

The centroids of the annual mean PM2.5 concentration from 2017 to 2020 were located near the border between western Kaifeng and eastern Zhengzhou (Fig. S1). The PM2.5 pollution centroids were located southeast of the geographical centroid of the CPUA, at a distance of approximately 10 km. The centroid of PM2.5 concentration moved westward from 2017 to 2019 (Fig. 8), which indicates that the western CPUA had more serious PM2.5 pollution. The PM2.5 pollution centroids in 2018 ~ 2020 were all south of the 2017 centroid, which may be related to the sharp reduction in PM2.5 in Xingtai and Handan.

Fig. 8

Centroid migration route of PM2.5 from 2017 to 2020

Spatial autocorrelation of PM2.5 concentrations

To further discuss the spatial characteristics of PM2.5 spatial correlation, the global Moran’s I was calculated. Moran’s I statistics from 2017 to 2020 are 0.34, 0.20, 0.28, and 0.26, respectively, passing the significance test, which indicates that PM2.5 concentrations have a positive spatial autocorrelation in CPUA. This suggests that PM2.5 in a city can be affected by its neighbouring cities. Previous studies also indicate that regional transport of PM2.5 plays an important role in regional haze episodes (Hu et al. 2021; Wang et al. 2014). Hence, regional joint cooperation across cities is necessary to improve air quality. In general, the global Moran’s I decreased over time, which indicated the weakening spatial autocorrelation of PM2.5. In summer, the global Moran’s I of PM2.5 concentrations was higher than that in other seasons (Fig. S2). We deduced that in summer, intensive atmospheric activity is conducive to the diffusion and mixing of anthropogenic pollutants, which leads to a relatively homogeneous distribution of PM2.5 concentrations. In winter, the lowest global Moran’s I suggested that PM2.5 heterogeneity increased, which was related to the stable atmosphere and local emission sources. As shown in Fig. S3, the global Moran’s I increased gradually during 02:00 ~ 9:00 and increased sharply during 9:00 ~ 13:00 due to the enhanced solar radiation. Then, the global Moran’s I decreased continuously, implying that the interplay of PM2.5 pollution between cities in the CPUA decreased.

Local spatial autocorrelation analysis was employed to identify the distribution and agglomeration patterns of PM2.5 pollution in each city of the CPUA. As displayed in Fig. 9, the agglomeration patterns displayed varying but similar spatial distributions over time. Overall, the most polluted cities were distributed in northern CPUA, which are all industrial cities with intensive anthropogenic emissions, forming PM2.5 pollution clusters. Cities with good air quality were located in the southern CPUA, which has abundant precipitation and small anthropogenic emissions, forming a low-low (LL) agglomeration type of PM2.5 concentration. The area with high-high (HH) agglomeration of PM2.5 pollution tended to migrate southward. In 4 years, Handan and Xinyang were the high-high (HH) and low-low (LL) PM2.5 concentration centres, respectively. In summer, the proportion of LL and HH clusters among CPUA was the largest in the four seasons (Fig. S4). In winter, this proportion was the smallest, which indicated a decreasing spatial autocorrelation among CPUA.

Fig. 9

Spatial agglomeration of PM2.5 concentrations in the Central Plains Urban Agglomeration from 2017 to 2020

Regional transportation of PM2.5

Backward trajectory analysis

To investigate the PM2.5 transport affecting the CPUA, three typical cities, namely Xingtai (the most PM2.5-polluted city), Zhengzhou (the capital of Henan Province and the largest city in CPUA), and Xinyang (the least PM2.5-polluted city), were selected to calculate the backward trajectories in autumn and winter for each year. As displayed in Fig. 10a, all trajectories of Xingtai were classified into four clusters, C1 (28.3%), C2 (30.3%), C3 (24.4%), and C4 (17.1%), corresponding to PM2.5 concentrations of 110.5, 107.8, 79.7, and 81.6 μg/m3, respectively. C1 started in Inner Mongolia and passed through northern Shanxi Province. C2, a short-distance transport pathway, crossed northwestern Shandong Province and southern Hebei Province. This suggests that the northern CPUA could be influenced by PM2.5 transport from northwestern Shandong, southern Hebei, and northern Shanxi provinces. The trajectories of Zhengzhou were also classified into four clusters (Fig. 10b), among which C2 (26.9%), C3 (23.4%), and C4 (14.2%) came from the northwest of Zhengzhou, and C1 (35.5%) originated from the northeast of Zhengzhou. C2, passing through southwestern Shanxi and northwestern Henan provinces, was the most polluted, with a PM2.5 concentration of 115.6 μg/m3. The PM2.5 concentrations corresponding to C1, C3, and C4 were 88.0, 78.6, and 52.7 μg/m3, respectively. The PM2.5 concentrations of the four trajectory clusters for Xinyang were lower than those of Xingtai and Zhengzhou (Fig. 10c). C3 (32.9%), the most polluted cluster with a PM2.5 concentration of 78.3 μg/m3, started in southern Henan Province and moved a short distance before arriving at Xinyang. C2 (31.0%), which crossed northern Anhui Province, was the second most polluted cluster (76.8 μg/m3). This indicates that Xinyang could be influenced by southern Henan and northern Anhui provinces. In the three typical cities of the CPUA (Xingtai, Zhengzhou, and Xinyang), all short-distance trajectories corresponded to high PM2.5 concentrations, which suggests that internal transport played a key role in PM2.5 pollution over the CPUA. Meanwhile, cross-regional transport of PM2.5 was important as well. Air masses that started from the northwest and travelled long distances all corresponded to low PM2.5 concentrations.

Fig. 10

Back trajectory clusters and the mean mass concentration of PM2.5 under the corresponding cluster in a Xingtai, b Zhengzhou, and c Xinyang

Potential sources

The PSCF model was used to identify the potential source areas for PM2.5 (> 75 μg/m3) in Xingtai, Zhengzhou, and Xinyang. The study domain for the three cities was 85 ~ 120° E, 26 ~ 52° N, with a spatial resolution of 0.2° × 0.2°. As shown in Fig. 11a, cells with high weighted PSCF (WPSCF) values were mainly located in Lvliang, Changzhi, southeastern Shandong Province, and northern Anhui Province, which suggests that those regions were strong potential source areas influencing the PM2.5 concentration of Xingtai. For Zhengzhou (Fig. 11b), the strong potential source areas mainly included southeastern Henan Province, northern Anhui Province, and northeastern Hubei Province. In Xinyang (Fig. 11c), there were fewer grid cells with WPSCF values above 0.5 than in the other two cities due to the good air quality in Xinyang. The grid cells with WPSCF values greater than 0.4 were mainly located at Changzhi, Anyang, Xinxiang, and Linyi. In summary, the potential source analyses for the three cities all indicated that both PM2.5 transport within the CPUA and cross-boundary transport were important, emphasizing the necessity of joint efforts among cities and regions.

Fig. 11

Results of PSCF analysis for PM2.5 concentration above 75 μg/m3 in a Xingtai, b Zhengzhou, and c Xinyang

Conclusions

In this study, multiple transdisciplinary methods, including geographical analysis, spatial statistics, and potential source analysis, were employed to investigate the spatial and temporal variations and the potential geographical sources of PM2.5 in the CPUA. During 2017 ~ 2020, the annual mean concentrations of PM2.5 were 68.3, 61.5, 58.7, and 51.5 μg/m3, respectively. The kernel density estimation results suggested that the decline in the number of cities with high PM2.5 concentrations contributed to the alleviation of PM2.5 pollution in the CPUA. PM2.5 exhibited the highest concentration in winter (100.8 μg/m3, 4-year average) and the lowest concentration in summer (33.5 μg/m3). From 2017 to 2020, the PM2.5 concentration decreased by 36% in summer and 18.4% in winter. Monthly PM2.5 concentrations showed a U-shaped trend, with the highest value appearing in January (119.8 μg/m3) and the lowest in August (30.7 μg/m3). The time series of PM2.5 concentrations were decomposed by EOF, and the results indicated that high PM2.5 pollution mainly occurred in winter. The diurnal variations in PM2.5 differed slightly among seasons due to anthropogenic emissions and PBLH variation. The spatial distribution of PM2.5 in CPUA was heterogeneous. The centroid of PM2.5 concentration was located in western Kaifeng and moved towards the west over time. The spatial autocorrelation analysis revealed that PM2.5 concentrations exhibited a positive spatial autocorrelation in CPUA. The spatial autocorrelation was the strongest in summer and the weakest in winter. In the diurnal variation, the global Moran’s I increased during 02:00 ~ 13:00 and then decreased. The most polluted cities were distributed in the northern CPUA, forming a high-high agglomeration, and the cities located in the southern CPUA formed a low-low agglomeration. Handan and Xinyang were the centres of high-high and low-low agglomeration, respectively. 
The HYSPLIT model and PSCF were used to discuss the regional transport of PM2.5 in the CPUA. The results suggested that internal transport played a key role in PM2.5 pollution over the CPUA and that cross-boundary transport of PM2.5 into the CPUA was also important. Our findings suggest that drastic measures, including pollution emissions reduction, coal combustion control, and vehicle restriction, are needed in winter. Joint efforts across cities and regions are needed to further improve the air quality of CPUA.