Introduction

Water environment degradation is a major issue in watershed management and poses a severe threat to surface water security. Natural and anthropogenic activities deteriorate surface water quality (Sundaray et al. 2006), e.g., rock–water interactions (Li et al. 2013), nutrient losses, municipal and industrial wastewater emissions, and excessive sluice and dam construction (Zhang et al. 2010). Huaihe River water quality has degraded owing to rapid industrialization, urban sprawl and agricultural intensification (Di-hu 2005; Hu et al. 2009; Li and Pu 2003; Li et al. 2003; Xiao et al. 2011). In 2012, 31% of the water quality monitoring sites on the seven major rivers of China were reported to have inferior (grade IV) water quality. Among the degraded monitoring sites, 50% of the stations located on the Huaihe River were severely polluted. Water quality of the Huaihe River has been degrading over the last decades, and major areas across the province do not meet the national surface water quality standards. Therefore, identifying nonpoint source (NPS) pollution can reveal potential causes of water quality degradation arising from natural and anthropogenic interventions, which will help in sustainable water resource management.

Rivers are the main pathway that carries off point source and NPS pollution. NPS pollution mainly reaches rivers through surface runoff from various kinds of land use (Whitehead et al. 2009). Overall, different types of land use, including urban, industrial, forest, agricultural and grass areas, have shown relationships with NPS pollution. For example, agricultural land use is positively correlated with NH3-N, while forest and grass areas are positively associated with dissolved oxygen (DO) (Huang et al. 2013). Urban land use has a positive relationship with chemical oxygen demand (COD) and pH (Huang et al. 2013; Pratt and Chang 2012). Industrial land use has been linked with COD (Zhao et al. 2015). Generally, urban and industrial land uses have negative impacts on surface water quality, while forest and grass land uses have positive impacts.

Multivariate statistical techniques, including factor analysis (FA), principal component analysis (PCA), cluster analysis (CA) and discriminant analysis (DA), are extensively used to analyze complex multidimensional water quality datasets, to identify NPS pollution in surface waters and to better understand the ecological and water quality status of the studied systems (Alberto et al. 2001; Ali et al. 2016; Banerjee et al. 2016; Bilgin and Konanç 2016; Khound and Bhattacharyya 2017; Kuang et al. 2016; Singh et al. 2004; Wang et al. 2017; Yin et al. 2005). Multivariate statistical approaches have been applied to assess spatiotemporal patterns associated with anthropogenic and natural factors (Helena et al. 2000; Singh et al. 2004). PCA is widely used to identify spatiotemporal factors influencing stream health (Bengraïne and Marhaba 2003; Parinet et al. 2004).

The literature shows that the positive matrix factorization (PMF) model has been successfully applied to source apportionment of ambient particulate matter (Henry and Christensen 2010; Hopke 2010). Besides ambient air pollution, this technique has also been used to investigate the contribution of various sources to wet pollutant deposition, using TOC, NH4-N, NO3-N, Mg, K and Ca as source tracers (Anttila et al. 1995). Recently, its application has been extended to pollutant source apportionment in the aqueous environment using source markers such as TP, NH3, heavy metals and BOD5 (biochemical oxygen demand) from combined sewer overflows (Li et al. 2015; Soonthornnonda and Christensen 2008). FA and PCA are widely used for surface water quality source apportionment (Huang et al. 2011; Singh et al. 2004; Zhou et al. 2007). However, source apportionment of surface water quality using PMF on the water quality data matrix of monitoring stations is very rare, which motivated the current study.

PMF is an emerging technique with several advantages over other multivariate techniques such as PCA, FA and UNMIX. These advantages include the use of uncertainty estimates, which allows each data point in the matrix to be treated individually, and the integration of nonnegativity constraints on the resulting factors in the computational process. PMF has been demonstrated to be superior to eigenvector-based FA in many environmental applications. The estimation of uncertainties in PMF modeling allows better treatment of values below the detection limit and missing values (Hoinaski et al. 2013), which environmental data matrices usually contain (Carrer and Leardi 2006). Henry and Christensen (2010) introduced PMF in the aqueous environment to down-weight uncertain values in the dataset and obtain reasonable results.

The Huaihe River, the 7th largest river of China, poses serious threats to the national security and socioeconomic development of the country owing to frequent water pollution incidents. Identifying the spatiotemporal pattern of NPS pollution is an effective approach for water quality evaluation. The main aims of this research were to identify and apportion spatiotemporal NPS pollution originating from different land uses and to describe the distribution patterns of pollutants in the Huaihe River. A combined approach was proposed: the data were subjected to hierarchical cluster analysis (HCA) followed by modeling with PMF to identify potential NPS pollution. This research will provide a technical foundation for water quality management based on the identified NPS pollution in the Huaihe River basin, to bring sustainability to the region, and a scientific basis for implementing future water pollution control plans.

Materials and methods

Huaihe River basin and its pollution control project

The Huaihe River basin (117°36′E–118°57′E, 29°21′N–30°13′N; Fig. 1) is crowded in terms of both water projects (1 dam/50 km2) and population (614 persons/km2) and is a highly contaminated basin in China. The river passes through five major provinces, i.e., Hubei, Henan, Anhui, Shandong and Jiangsu. It is a very important river owing to its large catchment area of 270,000 km2. The basin receives 894 mm of annual precipitation, almost 70% of which falls during the rainy season from June to October. Precipitation varies significantly in space, ranging from 1400 mm in the southern mountain region to less than 700 mm in the northern region near the Yellow River. In total, 51% of annual runoff is stored by dams and sluices (Liu et al. 2017). Both point source and NPS pollution degrade surface water quality. More than 50% of the water quality monitoring stations are below Grade III of the national standards. Inferior surface water quality damages the water environment and ecology (Zhang et al. 2010; Xia et al. 2018). In 1994, the Huaihe River pollution control project was initiated with the aim of controlling Huaihe River pollution by the end of the last century. Some improvement was achieved by closing industries and enhancing sewage treatment facilities. In 2004, sudden chemical spill incidents dramatically undermined the effects of pollution control (Wang et al. 2017). According to the regulations on the prevention and control of water pollution in the Huai River basin of Anhui Province, passed in November 2018, the local government is making efforts to raise the proportion of monitoring sections with good water quality to more than 57.5% by 2020.

Fig. 1

Spatial distributions of water quality monitoring stations of Huaihe River basin

Data treatment and multivariate statistical methods

Online monitoring stations have been installed nationwide by the Chinese government since 2008. In this work, two years (2011–2012) of weekly monitoring data were collected from 27 water quality monitoring stations located along the main stretch of the Huaihe River. The data included four water quality parameters, namely COD, DO, pH and NH3-N. Water quality in 2011–2012 was poorer than in 2018, which makes the pollution apportionment more sensitive. Table 1 shows the basic statistics of the water quality time series.

Table 1 Mean and standard deviation of water quality parameters at 27 monitoring stations of Huaihe River during 2011–2012

In this study, a combined HCA-PMF approach was used to apportion NPS pollution (Simeonov et al. 2003). First, the data were standardized by z-scale transformation (normalized to zero mean and unit variance) to avoid misclassification owing to variables measured on different scales (Liu et al. 2003). Standardization puts the values of different water quality variables on the same scale, which allows values of different types of variables to be compared. Standardized values were obtained by subtracting the mean from the observed values and dividing by the standard deviation. Next, HCA-PMF was used to classify the water quality monitoring stations and to identify NPS pollution. Thereafter, multiple correspondence analysis (MCA) was used to visualize the spatial distribution patterns of the monitoring stations. The individual approaches are explained below.
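As a minimal sketch of the z-scale transformation described above, the snippet below standardizes a small made-up matrix whose columns stand for COD, DO, pH and NH3-N. The numbers and the function name `zscore` are illustrative assumptions rather than data or code from this study, and the use of the sample standard deviation (`ddof=1`) is likewise an assumed choice.

```python
import numpy as np

def zscore(x: np.ndarray) -> np.ndarray:
    """Column-wise z-scale transformation: (x - mean) / standard deviation."""
    mean = x.mean(axis=0)
    std = x.std(axis=0, ddof=1)  # sample standard deviation (assumed choice)
    return (x - mean) / std

# Hypothetical observations; columns: COD, DO, pH, NH3-N (not study data)
raw = np.array([
    [18.0, 7.9, 7.6, 0.45],
    [25.0, 6.1, 7.9, 1.20],
    [31.0, 5.2, 8.1, 2.10],
    [15.0, 8.4, 7.4, 0.30],
])
standardized = zscore(raw)
```

After the transformation each column has zero mean and unit variance, so parameters measured in different units contribute on the same scale to the distance calculations that follow.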

Hierarchical cluster analysis (HCA)

HCA is a statistical technique that groups objects based on similar water quality characteristics (Chen et al. 2015; Li et al. 2009; Singh et al. 2004). HCA is based on sequential grouping, which starts from the most similar pair of objects and builds higher-level groups step by step. The process of forming and linking groups is repeated until a single group comprising all samples is obtained. The result can be displayed as a dendrogram, which gives a graphical summary of the clustering process.

HCA was performed using Ward’s method with Euclidean distance as the measure of similarity. HCA clustered similar sampling stations located along the river stretch (spatial variability) based on the four water quality parameters. HCA resulted in a dendrogram in which the (rescaled) linkage distance is represented on the y-axis (Huang et al. 2014).
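The clustering step can be sketched with SciPy as follows. This is only an illustration under stated assumptions: the 27 × 4 matrix `stations` is synthetic (built around four made-up centres so that the example has visible structure), and cutting the tree with `fcluster` is one possible way to obtain four groups, not necessarily the procedure used in this study.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical standardized station means: 27 stations x 4 parameters.
# Four synthetic centres are used only so the example has visible structure.
centres = np.array([[-1.5, 1.5, 0.0, -1.0],
                    [1.5, -1.5, 0.5, 1.0],
                    [0.0, 0.0, -1.5, 2.0],
                    [2.0, 1.0, 1.5, -2.0]])
membership = np.arange(27) % 4
stations = centres[membership] + 0.1 * rng.standard_normal((27, 4))

# Ward's method on Euclidean distances; each row of `tree` records one merge
tree = linkage(stations, method="ward", metric="euclidean")

# Cut the tree into four groups, matching the number of groups used here
groups = fcluster(tree, t=4, criterion="maxclust")
```

Passing `tree` to `scipy.cluster.hierarchy.dendrogram` would draw the corresponding dendrogram, with the linkage distance on the y-axis.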

Positive matrix factorization (PMF)

PMF is widely used for source apportionment because it is one of the most important receptor models (Al-Dabbous and Kumar 2015; Hajigholizadeh 2016; Li et al. 2015; Mohammed et al. 2016). PMF is a multivariate receptor model that decomposes a data matrix X of dimensions n by m, in which n is the number of samples and m is the number of chemical species, into two matrices, factor contributions (G) and factor profiles (F), plus a residual matrix (E) (see Eq. 1). Two input files were supplied to the PMF model: a file containing the concentrations of the four examined water quality parameters and a file containing the uncertainty values calculated as per Eq. 2. The optimum number of factors was obtained by performing several runs of the model and then selecting the run/solution with the lowest value of Q (robust), where Q indicates how well the model fits the data (Bzdusek et al. 2006). Q is defined in Eq. 3. The main task of factor analysis by PMF is to minimize the objective function Q with respect to G and F under the constraint that all, or at least some, of the elements of G and F are nonnegative (Paatero 1997).

$$X = GF + E$$
(1)

The uncertainties were calculated using Eq. 2.

$${\text{Uncertainty}} = \frac{5}{6} \times {\text{MDL}}$$
(2)

where MDL is the method detection limit of each chemical species included as input to the model (Norris et al. 2014).

$$Q = \sum\limits_{{i = 1}}^{n} {\sum\limits_{{j = 1}}^{m} {(e_{{ij}} /s_{{ij}} )^{2} } }$$
(3)

where Q is the sum, over all samples i and species j, of the squared ratio of the residual eij (the difference between the original data matrix X and the PMF output GF) to the computed uncertainty sij.
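To make Eqs. 1 to 3 concrete, the following sketch evaluates Q for a small hand-made example. The matrices `g`, `f` and `x`, the detection limits in `mdl` and the helper name `q_value` are all hypothetical; they are chosen only so that the arithmetic can be followed by hand.

```python
import numpy as np

def q_value(x, g, f, s):
    """Eq. 3: sum of squared residuals e_ij = x_ij - (GF)_ij scaled by s_ij."""
    e = x - g @ f
    return float(np.sum((e / s) ** 2))

# Hypothetical sizes: n = 3 samples, m = 4 species, 2 factors
g = np.array([[1.0, 0.5], [0.2, 2.0], [1.5, 1.0]])           # contributions G
f = np.array([[4.0, 1.0, 0.5, 0.1], [0.5, 2.0, 1.0, 0.8]])    # profiles F
x = g @ f + 0.05                     # data with a constant residual of 0.05 (Eq. 1)

# Eq. 2: uncertainty = 5/6 * MDL, using made-up detection limits per species
mdl = np.array([0.6, 0.12, 0.06, 0.3])
s = np.broadcast_to(5.0 / 6.0 * mdl, x.shape)

q = q_value(x, g, f, s)
```

With these made-up numbers the uncertainties are 0.5, 0.1, 0.05 and 0.25, so each sample contributes 0.01 + 0.25 + 1 + 0.04 = 1.3 and Q = 3 × 1.3 = 3.9. Species with small uncertainties dominate Q, which is how PMF down-weights less reliable values.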

The study was carried out using EPA PMF 5, which is based on Paatero’s PMF model. The optimum number of factors was selected using the value of Q, which indicates the model’s goodness of fit. To approach a global minimum, the seed value was changed from 1 to 20 across model runs (Bzdusek et al. 2006).
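The idea of repeating the factorization from different seeds and keeping the run with the lowest Q can be illustrated as below. This is not the solver inside EPA PMF, which is not reproduced here; as a stand-in, the sketch uses a weighted nonnegative matrix factorization with multiplicative updates, which minimizes the same Q (Eq. 3) under nonnegativity constraints. The data, the function name `weighted_nmf` and all parameter values are assumptions made for illustration.

```python
import numpy as np

def weighted_nmf(x, s, n_factors, seed, n_iter=500, eps=1e-12):
    """Minimise Q = sum(((X - GF) / S)**2) with G, F >= 0 (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    n, m = x.shape
    g = rng.uniform(0.1, 1.0, size=(n, n_factors))
    f = rng.uniform(0.1, 1.0, size=(n_factors, m))
    w = 1.0 / s**2                      # weights derived from the uncertainties
    for _ in range(n_iter):
        g *= ((w * x) @ f.T) / ((w * (g @ f)) @ f.T + eps)
        f *= (g.T @ (w * x)) / (g.T @ (w * (g @ f)) + eps)
    q = float(np.sum(((x - g @ f) / s) ** 2))
    return g, f, q

# Synthetic nonnegative data (not study data): 40 samples, 4 species, 2 factors
rng = np.random.default_rng(42)
x = rng.uniform(0, 2, (40, 2)) @ rng.uniform(0, 3, (2, 4)) + 0.01
s = np.full(x.shape, 0.1)

# One run per seed from 1 to 20; keep the run with the lowest Q
runs = {seed: weighted_nmf(x, s, n_factors=2, seed=seed) for seed in range(1, 21)}
best_seed = min(runs, key=lambda k: runs[k][2])
best_g, best_f, best_q = runs[best_seed]
```

Because the objective is not convex, different seeds can end in different local minima; comparing Q across runs is a simple guard against reporting a poor solution.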

Multiple correspondence analysis (MCA)

Correspondence analysis is a statistical visualization technique that illustrates the association between the members of two sets of data. MCA is an extension of correspondence analysis to multiway tables and allows relationships among more than two variables to be established. The main purpose of MCA is to find categories and to separate them: variables of the same category are plotted close to one another, while variables of different categories are plotted far apart (Ambarita et al. 2016; Pacheco 1998). Here, we performed MCA to categorize the 27 water quality monitoring stations located along the river stretch and to visualize their spatial distribution based on the four water quality parameters (Fig. 3).
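A rough sketch of how MCA coordinates can be obtained is given below, treating MCA as a correspondence analysis of an indicator (one-hot) matrix. Several points are assumptions: the station means are random placeholders, and the coding of each continuous parameter into low, medium and high terciles is an illustrative choice, since the coding used in this study is not stated.

```python
import numpy as np

rng = np.random.default_rng(1)
means = rng.normal(size=(27, 4))          # hypothetical station means, 4 parameters

# Assumed coding: split each parameter into terciles (0 = low, 1 = medium, 2 = high)
cuts = np.quantile(means, [1 / 3, 2 / 3], axis=0)
codes = (means > cuts[0]).astype(int) + (means > cuts[1]).astype(int)

# Indicator (one-hot) matrix: 27 stations x (4 parameters * 3 categories)
z = np.concatenate([np.eye(3)[codes[:, j]] for j in range(4)], axis=1)

# Correspondence analysis of the indicator matrix
p = z / z.sum()
r = p.sum(axis=1)                          # row (station) masses
c = p.sum(axis=0)                          # column (category) masses
resid = (p - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals
u, sv, vt = np.linalg.svd(resid, full_matrices=False)

# Principal coordinates of the stations on the first two dimensions
station_xy = (u[:, :2] * sv[:2]) / np.sqrt(r)[:, None]
```

Plotting the two columns of `station_xy` against each other gives a map in which stations with similar category profiles fall close together and stations with unusual profiles stand apart.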

Results

The spatiotemporal variation of the water quality data matrix was evaluated using several multivariate statistical techniques together with PMF modeling. To assess spatial similarity, HCA was applied to the 27 water quality monitoring sites. HCA classified the monitoring stations into four groups with similar water quality characteristics based on the four water quality parameters analyzed in this study, as shown by the dendrogram in Fig. 2. HCA helps to reduce the number of monitoring stations with only a minor loss of information (Simeonov et al. 2003).

Fig. 2

Dendrogram of 27 water quality monitoring stations using Ward method based on water quality parameters of the Huaihe River basin

MCA was carried out to determine the spatial distribution patterns of the water quality monitoring stations in the Huaihe River basin. Three stations (3, 9 and 21) are located away from the rest of the sites, whereas the remaining stations lie close to each other, as shown in Fig. 3. This suggests that most features of these three stations differ from those of the remaining stations owing to unique pollutant emission sources. Stations with similar characteristics lie close to each other because they are exposed to approximately the same NPS pollution (Ambarita et al. 2016; Zhao et al. 2015).

Fig. 3

MCA categorizes water quality monitoring stations at Huaihe River basin based on mean values of water quality parameters

PMF analysis was carried out to apportion NPS pollution originating from different land uses. The number of NPS pollution factors was chosen following the principle described in the Materials and methods section. The model was run 20 times, each time with a different initial seed value. Four NPS pollution factors were identified for each group and for each season (winter, summer (wet), spring and autumn (wet)).

Four pollution sources are identified for the winter season. All parameters contribute to Factor 1 (pH, 35.5%; DO, 61.6%; COD, 21.5%; and NH3-N, 5.8%) and Factor 3 (pH, 43.2%; DO, 34.4%; COD, 29.9%; and NH3-N, 12.4%). Factor 1 and Factor 3 are identified as diffused land use with multiple NPS pollution. This may be due to the closing of gates in the dry season to store water for local supply, which retains all kinds of pollutants produced from agricultural, industrial and urban land uses (World Bank 1997). Factor 2 (pH, 3.3%; COD, 8.8%; and NH3-N, 75.5%) is characterized by a uniquely high loading of NH3-N and negligible loadings of the remaining three water quality parameters. Hence, this factor is identified as agricultural land use. The Huaihe River basin is the main grain-producing area of China, and the high NH3-N loading may be due to excessive application of fertilizers and pesticides for crop production in the river valley (Zhong 2006). Factor 4 (pH, 18%; DO, 4%; COD, 39.8%; and NH3-N, 6.3%) is dominated by pH and COD, as can be seen from Fig. 4, and is therefore suggested to be related to urban land use (Huang et al. 2013; Pratt and Chang 2012; Xiao et al. 2016; Zhao et al. 2015).
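The percentages quoted for each factor appear to describe how each parameter is distributed across the four factors, since the winter values add up to about 100% per parameter. The short check below tabulates the winter loadings as quoted above and sums them by parameter. The DO loading of Factor 2 is not reported in the text, so it is marked as missing; the remainder computed for it is an inference from the reported numbers, not a reported result.

```python
import numpy as np

params = ["pH", "DO", "COD", "NH3-N"]
# Winter loadings (%) as quoted in the text; rows = Factors 1-4.
# DO for Factor 2 is not reported, so it is marked as missing (NaN).
winter = np.array([
    [35.5, 61.6, 21.5, 5.8],
    [3.3, np.nan, 8.8, 75.5],
    [43.2, 34.4, 29.9, 12.4],
    [18.0, 4.0, 39.8, 6.3],
])

reported_sum = np.nansum(winter, axis=0)   # per-parameter total over factors
implied_missing = 100.0 - reported_sum     # share left for unreported entries

for name, total, rest in zip(params, reported_sum, implied_missing):
    print(f"{name}: reported total {total:.1f}%, remainder {rest:.1f}%")
```

Each parameter total comes to about 100%, which suggests that the unreported DO share of Factor 2 is close to zero, consistent with its description as an NH3-N-dominated factor.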

Fig. 4

Temporal factor loadings obtained from PMF analysis of water quality parameters of Huaihe River basin. Each factor is given with its corresponding NPS. (For winter season: Factor 1 and Factor 3 = diffused land use, Factor 2 = agricultural land use and Factor 4 = urban land use. For summer season: Factor 1 = agricultural land use, Factor 2 = industrial land use, Factor 3 and Factor 4 = diffused land use. For spring season: Factor 1 and Factor 4 = diffused land use, Factor 2 = agricultural land use and Factor 3 = urban land use. For autumn season: Factor 1 and Factor 4 = diffused land use, Factor 2 = industrial land use and Factor 3 = agricultural land use.)

Four pollution sources are identified for the summer season. Factor 1 (pH, 4.6%; DO, 0.4%; COD, 7.7%; and NH3-N, 80.3%) is dominated almost exclusively by NH3-N, as evident from the source profile shown in Fig. 4. Hence, this factor is most likely linked to agricultural land use. The profiles of agricultural land use in summer and winter are very similar. Rains in summer sweep fertilizers and pesticides from fields to the river. Factor 2 (pH, 1.6%; COD, 31.6%; and NH3-N, 1.5%) is dominated by COD, with negligible loadings of the remaining water quality parameters, and is therefore classified as industrial land use. The high loading of COD may be due to routine industrial activities accompanied by seasonal agricultural product processing enterprises at the village and township level, whose outdated technology and lack of pollution treatment facilities have aggravated the situation (Zhong 2006). All parameters contribute to Factor 3 (pH, 43.2%; DO, 64.9%; COD, 21.7%; and NH3-N, 7%) and Factor 4 (pH, 50.6%; DO, 34.7%; COD, 38.9%; and NH3-N, 11.2%). Factor 3 and Factor 4 are identified as diffused land use with multiple NPS pollution. This may be attributed to high water consumption for bathing as temperatures rise and to heavy rains in those areas. Storm water sweeps all kinds of contaminants from different land uses into the river (Huang et al. 2013; Pratt and Chang 2012; Xiao et al. 2016; Zhao et al. 2015).

Four pollutant sources are identified for the spring season. All parameters contribute to Factor 1 (pH, 17.8%; DO, 39.9%; and COD, 9.6%) and Factor 4 (pH, 59.2%; DO, 54.5%; COD, 39.9%; and NH3-N, 16.6%). Factor 1 and Factor 4 are identified as diffused land use with multiple NPS pollution owing to intense anthropogenic activities under favorable working temperatures. Moreover, water storage for local use in the dry season has aggravated the situation by retaining all kinds of pollutants, including agricultural, municipal and industrial ones (World Bank 1997). Factor 2 (pH, 1.3%; DO, 0.1%; COD, 1.6%; and NH3-N, 70.7%) is classified as agricultural land use based on the uniquely elevated level of NH3-N and negligible loadings of the remaining water quality parameters. This loading indicates moderately lower agricultural activity in spring than in winter and summer, and may be due to the application of insecticides and fertilizers for crop growth (Zhong 2006). Factor 3 (pH, 21.7%; DO, 5.5%; COD, 48.9%; and NH3-N, 12.6%) is dominated by pH and COD, as evident from the profile illustrated in Fig. 4, and is therefore identified as urban land use (Huang et al. 2013; Pratt and Chang 2012; Xiao et al. 2016; Zhao et al. 2015).

Four pollution sources are identified for the autumn season. All parameters contribute to Factor 1 and Factor 4, as apparent from the profiles shown in Fig. 4. Factor 1 (pH, 58%; DO, 78.3%; COD, 35.8%; and NH3-N, 10%) and Factor 4 (pH, 31.8%; DO, 19%; COD, 24.7%; and NH3-N, 3.4%) are identified as diffused land use with multiple NPS pollution owing to rains. Storm water sweeps all kinds of contaminants from different land uses into the river. Factor 2 (pH, 5.7%; DO, 2.7%; COD, 29.2%; and NH3-N, 4.3%) is characterized by a high loading of COD and negligible loadings of the remaining three water quality parameters, and is possibly an industrial land use. Factor 3 (pH, 4.5%; COD, 10.3%; and NH3-N, 82.3%) is identified as agricultural land use, with a high loading of NH3-N and negligible loadings of the remaining water quality parameters. It reflects agricultural activities accompanied by leaf fall during the autumn season (Huang et al. 2013; Pratt and Chang 2012; Xiao et al. 2016; Zhao et al. 2015).

Fig. 5

Spatial factor loadings obtained from PMF analysis of water quality parameters of Huaihe River basin. (For group#01: Factor 1 and Factor 4 = diffused land use, Factor 2 = agricultural land use and Factor 3 = urban land use. For group#02: Factor 1 = urban land use, Factor 2 = agricultural land use, Factor 3 and Factor 4 = diffused land use. For group#03: Factor 1 and Factor 3 = diffused land use, Factor 2 = agricultural land use and Factor 4 = urban land use. For group#04: Factor 1 = agricultural land use, Factor 2 = urban land use, Factor 3 = industrial land use and Factor 4 = diffused land use.)

Spatial PMF analysis was carried out using the same data matrix of four water quality parameters, with the stations clustered into four classes by HCA. Four NPS pollution factors were identified per group. For group#01, Factor 1 (pH, 20.5%; DO, 35%; COD, 28.7%; and NH3-N, 8.9%) and Factor 4 (pH, 38.1%; DO, 45.7%; COD, 16.2%; and NH3-N, 9.1%) are identified as diffused land use because all parameters contribute. This group comprises various kinds of land use, including lakes (Taodian and Kao Tang), forest land (Laoxin River), agricultural land, towns and villages. Factor 2 (pH, 6.9%; DO, 7%; COD, 7%; and NH3-N, 76.1%) is identified as agricultural land use, as indicated by the high NH3-N loading attributable to fertilizer application. Agricultural activity is high in the Hung Tse Lake area and the Wang jiaba dam area owing to abundant water for irrigation. Factor 3 (pH, 34.4%; DO, 12.3%; COD, 48.2%; and NH3-N, 5.9%) is identified as urban land use with high chemical flux, as indicated by the high contribution of pH along with COD (Fig. 5). Pizhou, Fengyang, Shangqiu and Xinyi are densely populated cities with multiple NPS pollution (Huang et al. 2013; Pratt and Chang 2012; Xiao et al. 2016; Zhao et al. 2015).

Fig. 6

Temporal variations of water quality parameters for Huaihe River basin

For group#02, four NPS pollution factors are identified. Factor 1 (pH, 10.3%; DO, 2.1%; COD, 33.8%; and NH3-N, 4.6%) is most likely an urban land use. Dagoucun is a densely populated urban area with multiple NPS pollution. Factor 2 (pH, 10.7%; DO, 6.3%; COD, 7.9%; and NH3-N, 76.4%) is identified as agricultural land use owing to the high NH3-N loading produced by fertilizer application over a widespread growing area. This is supported by the fact that agricultural activity is high in the Luoma lake area and the Heilongtan reservoir area owing to the easy availability of water for crops. Corn and peanut are the most popular crops of this area. Factor 3 (pH, 42%; DO, 67.5%; COD, 30.7%; and NH3-N, 6.7%) and Factor 4 (pH, 37%; DO, 24.1%; COD, 27.6%; and NH3-N, 12.3%) are identified as diffused land use. This group comprises various kinds of land use, including mountainous area (Yunlong), forest (Yunlong), water reservoir (Huang chao guan), corn fields and villages (Huang et al. 2013; Pratt and Chang 2012; Xiao et al. 2016; Zhao et al. 2015).

Four possible pollution sources are recognized for group#03. Factor 1 (pH, 56.9%; DO, 79.2%; COD, 41.5%; and NH3-N, 10.1%) and Factor 3 (pH, 9.1%; DO, 10%; COD, 22.5%; and NH3-N, 4.3%) are identified as diffused land use with multiple emission sources. This group comprises various kinds of land use, including agricultural land, towns (Tashan, Chen Lou, etc.) and grassland. Factor 2 (pH, 3.6%; DO, 4.2%; COD, 7.9%; and NH3-N, 73.7%) is identified as agricultural land use owing to the high loading of NH3-N. Along with grain crops, rape flowers (Linquan and Fujing area) are grown in this area. Factor 4 (pH, 30.4%; DO, 6.5%; COD, 28.1%; and NH3-N, 11.8%) is suggested to be an urban land use, as indicated by the high contribution of pH along with COD. Nieloucun, Pengxinzhuang and Chenglizhuang are densely populated urban areas producing various kinds of pollutants (Huang et al. 2013; Pratt and Chang 2012; Xiao et al. 2016; Zhao et al. 2015).

Four sources of pollution are suggested for group#04. Factor 1 (pH, 16.1%; DO, 4.2%; COD, 17%; and NH3-N, 82.5%), with the highest loading of NH3-N, is probably an agricultural land use. This is supported by the fact that this area is known for corn and soya bean cultivation. Agricultural activity is higher in the Hua Jia Lake area owing to the abundance of water for irrigation. Factor 2 (pH, 38.5%; DO, 13.3%; COD, 31.5%; and NH3-N, 9.3%) is identified as urban land use. Huaibei City is a densely populated area with multiple NPS pollution. Factor 3 (pH, 2.7%; DO, 1.8%; COD, 39.9%; and NH3-N, 1%) is identified as industrial land use. An industrial estate is located in this area, with a large number of industries including steel factories and their warehouses, chemical factories, a power plant and mechanical factories. A coal mine (Yonggu Mine) has also aggravated the situation. Factor 4 (pH, 42.7%; DO, 80.7%; COD, 11.6%; and NH3-N, 7.2%) is identified as diffused land use. This group comprises various kinds of land use, including corn fields, aspen forests, pigeon farms and towns (Xincun, Shenqiu, etc.) (Huang et al. 2013; Pratt and Chang 2012; Xiao et al. 2016; Zhao et al. 2015).

It is safe to conclude from the above discussion that urban, industrial and agricultural land uses are the main contributors to Huaihe River water quality impairment (Xiao et al. 2016). Anthropogenic activities play a vital role in river water quality impairment at the urban communal level (Yin et al. 2005).

The box and whisker plot is a convenient graphical tool for assessing patterns of spatiotemporal variation in water quality, as shown in Figs. 6 and 7. DO concentration was higher in autumn, spring and winter than in summer owing to lower temperatures. pH, COD and NH3-N are higher in winter and spring than in autumn and summer, when heavy rains dilute the river water (Singh et al. 2004). Compared with the other three water quality parameters, pH shows only a minor upward spatial gradient from group#01 to group#04 (Fig. 7), owing to pollution loads entering some regions from several urban sources (Singh et al. 2004). A spatial pattern was also observed in DO: relative to group#01, group#02 shows higher DO, while group#03 and group#04 show lower DO, reflecting their contamination level (Fig. 7). Heavy loads of pollutants enter from urban, industrial and agricultural land uses in the regions contained in group#03 and group#04. These regions show a downward trend in DO owing to oxygen consumption during organic matter decay. Similarly, an upward spatial gradient was observed in both COD and NH3-N from group#01 to group#04, as shown in Fig. 7. This shows that these regions are vulnerable to intense toxicity from multiple NPS pollution originating from urban, industrial and agricultural land uses.
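For readers less familiar with box and whisker plots, the sketch below computes the quantities such a plot displays. The DO readings are invented for illustration, the helper name `box_stats` is hypothetical, and the 1.5 × IQR whisker rule is the common convention, assumed here because the rule used for Figs. 6 and 7 is not stated.

```python
import numpy as np

def box_stats(values):
    """Quartiles and whisker limits (1.5 * IQR rule) used to draw a box plot."""
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    inside = values[(values >= q1 - 1.5 * iqr) & (values <= q3 + 1.5 * iqr)]
    return {"q1": q1, "median": median, "q3": q3,
            "whisker_low": inside.min(), "whisker_high": inside.max()}

# Hypothetical weekly DO readings (mg/L) for two seasons; not study data
do_by_season = {
    "summer": np.array([5.1, 5.6, 4.9, 6.0, 5.3, 5.8, 2.0]),
    "winter": np.array([9.2, 8.7, 9.9, 9.4, 8.9, 9.6, 9.1]),
}
summary = {season: box_stats(v) for season, v in do_by_season.items()}
```

In this made-up example the summer reading of 2.0 mg/L falls outside the whiskers and would be drawn as an outlier, while the higher winter median mirrors the kind of seasonal contrast discussed above.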

Fig. 7

Groupwise spatial variations of water quality for Huaihe River basin

Discussions

Contribution of land use to spatial water quality

The literature shows that land uses are associated with point source and NPS pollution (industrial and domestic sewage, livestock wastewater, irrigation tail water discharges, storm runoff), which degrade surface water quality (Huang et al. 2016). NPS pollution is prominent in the rainy season, while point source pollution is dominant in the dry season. Urban land continuously deteriorates river water quality throughout the year, with the strongest impacts in the dry season (winter and spring) owing to lower dilution (Bu et al. 2014; Yu et al. 2016). This is exemplified by the high loading of COD across different seasons.

High loading of nitrogen parameters in surface water is attributed to overfertilization and to erosion caused by surface runoff and irrigation tail water discharges (King et al. 2005; Tong and Chen 2002; Unwin et al. 2010). The elevated concentration of NH3-N in surface waters may be a consequence of irrigation tail water discharges under the favorable environmental conditions of the region (Afed et al. 2017). Agricultural land continuously impairs surface water quality throughout the year, with the strongest impacts in the dry season (winter and spring) owing to lower dilution (Liu et al. 2017). Industrial land use is positively associated with COD. Water in industrial areas is highly contaminated compared with urban and suburban areas (Zhao et al. 2015). The lack of treatment facilities in industrial and urban areas badly deteriorates surface water quality (Ding et al. 2015; Ho and Hui 2001; Sun et al. 2013).

Contribution of land use to seasonal contamination risk

Seasonal variation of surface water quality is associated with land use composition. The results of the current study suggest that water quality exhibits seasonality that depends on the land use composition of a particular site. The rainy season alters surface water quality through instream flushing and dilution (Park et al. 2011), resulting in seasonal variations of point source and NPS pollution (Ye et al. 2014). NH3-N and COD loadings are higher in the rainless season owing to the lower dilution effect. NH3-N concentration is higher in spring, which may be due to intense agricultural activities in the region (Liu et al. 2017). The study area is characterized by a double cropping system, i.e., wheat and maize. Fertilizer application is common in March and April for winter wheat. Irrigation tailwater discharges from the conventional irrigation systems in the study area deteriorate surface water quality via nutrient loss with soil erosion (Yu et al. 2016). Furthermore, the seasonal first flush of rainfall is the second potential cause of water quality degradation (Liu et al. 2016). Late spring rainfall, after a long dry season, drains fertilizers and pesticides into nearby watercourses, which degrades surface water quality (Liu et al. 2017).

Local management implication for Huaihe River

NPS pollution strongly depends on land use, which varies from one monitoring site to another. Here, NPS pollutants are identified via multiple water quality variables. Therefore, it is necessary to implement comprehensive best management practices (BMPs) at various land use levels to address multiple concerns of water quality deterioration. Seasonal variation of NPS pollution explains the seasonal behavior of pollutant emissions. NH3-N and COD concentrations are higher in the dry season in agricultural regions and urbanized areas. Agricultural and urban areas pose a high risk of contamination during spring. Knowledge of the seasonal variability of NPS pollution can help to control seasonal contamination risk via BMPs (Liu et al. 2017). It is of utmost importance to control urban runoff in order to meet the national discharge standards, while sluice regulation, precision farming, terraced fields and buffer zones will help to improve water quality (Xia et al. 2018).

Limitations

To a certain extent, the apportionment of NPS pollution based on land use was hampered by the following limitations. First, the identification of NPS pollution was difficult because only a limited number of water quality parameters were available. Second, field benchmark NPS pollution emission profiles of different sources at different land use levels were unavailable. Such field emission profiles serve as benchmarks against which PMF results can be validated.

Conclusions

The underlying information was extracted from the complex multidimensional water quality data matrix of the Huaihe River basin via multivariate statistical techniques and PMF analysis. HCA clustered the twenty-seven water quality monitoring stations into four groups with similar water quality characteristics based on four water quality parameters. MCA showed that some water quality monitoring stations (3, 9 and 21) are located away from the rest, which suggests that they have different water quality characteristics owing to unique pollutant emission sources. Box and whisker plots suggested that temporal trends are possibly influenced by temperature and rainfall, while spatial trends are linked with NPS pollution from different land uses, i.e., agricultural, urban and industrial land uses. PMF identified four factors for each group and each season based on land use, which gives a clear picture of NPS pollution originating from different land uses. Each factor identified by PMF analysis reflects the severity of anthropogenic activities at a different land use level. In addition, NPS pollution varies with season, which indicates possible linkages with natural processes such as the hydrological regime. The seasonal contamination patterns will be beneficial in controlling seasonal pollution risk. Generally, Huaihe River water quality was mainly impaired by land use variation and by flows regulated by sluices and dams. In highly regulated rivers, scientific regulation of dams and sluices may help to alleviate water pollution problems. The proposed pollution apportionment approach supports the management and planning of the ongoing Huaihe River water pollution control project.