1 Introduction

Surface water resources act as sinks for pollutants arising from both human activities and natural processes [1, 2], with the former identified as the primary driver of pollution [3, 4]. This is evident when untreated wastewater is discharged directly into receiving surface water bodies, including rivers, canals, lakes, and ponds [5, 6]. As a result, the ecological impacts of organic pollution on fluvial components tend to increase from upstream to downstream [7], with higher levels of organic contaminants and ecological impacts observed during the wet season [8]. Due to the compounding effects of climate change, socioeconomic development, changing land-use practices, industrialisation, and increased water usage by upstream countries, the freshwater sources in the Vietnamese Mekong Delta (VMD) are declining in both quantity and quality. Therefore, a comprehensive monitoring and management scheme is imperative to track changes to these vital resources and to help safeguard them. Monitoring data on surface water quality help managers understand the current state of water quality and serve as the foundation for identifying the factors contributing to pollution and for proposing mitigation measures for water resource management [9,10,11].

Point sources, especially wastewater treatment discharges, are often indicated as primary causes for a decline in water quality. However, when wastewater is discharged into receiving water sources, it is often difficult to determine the exact sources of pollution. The extent of pollutant dispersion into the water bodies relies on factors such as the mechanism of dispersal, distance (from source to sink), temporal aspects, self-cleaning mechanisms, and the flow regime of the river [12,13,14,15]. Therefore, it is crucial to employ appropriate analytical approaches, giving careful consideration to relevant physical, chemical, and biological processes. This is pivotal to identify the primary sources of pollution that impact surface water quality, using data acquired from surface water quality monitoring networks.

Many studies, both globally and within Vietnam, have been undertaken to evaluate surface water quality. These studies often used the water quality index (WQI) combined with multivariate statistical analysis [16,17,18], or used other evaluation indices such as the comprehensive pollution index (CPI), organic pollution index (OPI), eutrophication index (EI), total pollution index (TPI), and metal pollution index (MPI) [4, 19, 20]. Cluster analysis (CA), principal component analysis (PCA), and multivariate regression statistics are also employed to assess the dynamics of surface water quality in spatial and temporal dimensions. These methods help identify correlations between different water quality indicators [21,22,23,24,25,26]. Other studies have used mathematical models to evaluate waste sources affecting the surface water quality of river basins [27,28,29].

Proper identification of discharge sources impacting surface water quality helps state management agencies implement effective management plans and solutions [21, 27]. According to Bostanmaneshrad et al. and Mijares et al. [21, 30], determining the pollution load is crucial for assessing the impact of waste sources on the pollution status of the receiving water body. However, in the VMD, the identification of pollutant loads to surface water resources still has numerous limitations. Studies frequently assess water quality using monitoring data and then discuss the pollutants involved with reference to related studies; however, the specific waste sources associated with these evaluations often remain unclear. Therefore, this research aims to enhance the understanding of waste source identification impacting water quality in Vinh Long Province by (1) calculating the pollution load from discharge sources, which encompass both point and non-point sources, and (2) employing multivariate statistical analysis to assess more precisely the influence of wastewater sources on surface water quality.

2 Materials and methods

2.1 Study area

Vinh Long Province (Fig. 1) is home to approximately 1 million residents and encompasses a natural area of approximately 150 thousand hectares, divided into six districts and a city. The land surface elevation is close to sea level, with lower elevations gradually rising slightly towards the river banks, particularly along the main Tien and Hau rivers [31]. The Hau River is characterized by an annual average flow of 13,800 m3/s, showing seasonal variations ranging from 5020 to 24,000 m3/s. The Tien River experiences a flow ranging from 2000 to 41,504 m3/s, with an average of 11,500 m3/s. Both rivers are characterized by their adjacent fertile floodplains. Because of its positioning between the two largest waterways in the VMD, the Tien River and the Hau River, inland water from over 90 river and canal systems provides most of the province’s water resources [31]. The abundant average yearly rainfall combined with a reasonably even distribution of surface water sources has created the perfect environment for both production and domestic uses. The Tien, Hau, Co Chien, and Mang Thit rivers all provide water to the region's water supply treatment plants [31]. Vinh Long Province also shares surface water resources with neighboring provinces, namely Tien Giang, Ben Tre, Tra Vinh, Hau Giang, Soc Trang, Dong Thap, and Can Tho.

Fig. 1 Surface water bodies and wastewater monitoring fixed points in Vinh Long Province (a), with a zoom-in at Vinh Long City showing a greater density of monitoring stations (b)

Over the period 2017–2021, approximately 280 production units, primarily industrial zones, food processing facilities, seafood processing plants, slaughterhouses, livestock operations, healthcare facilities, and craft villages, continued to discharge wastewater into the receiving river systems. The proliferation of these diverse production facilities, associated with both agricultural and domestic activities, has substantially heightened the risk of surface water pollution. Wastewater from these facilities, categorized as point sources (PS) and non-point sources (NPS), is typically released directly into rivers and canals, leading to the degradation of surface water quality. This impact is especially pronounced within the in-field channels. The surface water resources within the study area are confronted with ongoing growth in the agricultural, industrial, and manufacturing sectors, upon which the current economy and future socioeconomic development are structured.

2.2 Water sample collection and analysis

Wastewater from domestic sources, cage aquaculture, and storm-water runoff was categorized as NPS, while wastewater from livestock, slaughterhouses, craft villages, pond aquaculture, industrial zones, healthcare facilities, and food and seafood processing plants was considered PS. The estimation of pollutant load is based on both the quantity and concentration of waste produced.

Wastewater samples were collected at the discharge of each wastewater treatment system (PS), while domestic wastewater was collected from two sewer locations in residential and urban areas. Sampling was conducted every ten days from June to October 2021 for the calculation of the pollution load. Sample collection followed TCVN 5999:1995 of the Ministry of Science and Technology of Vietnam [32] and ISO 5667-10:1992 of the International Organization for Standardization [33], and sample preservation followed TCVN 6663-3:2016 of the Ministry of Science and Technology of Vietnam [34], before analysis in the laboratory. BOD5, COD, TP, and TN in wastewater were analysed using SMEWW 5210B:2017, SMEWW 5220C:2017, SMEWW 4500-P.B&E:2017, and TCVN 6638:2000, respectively, as issued by the American Public Health Association, American Water Works Association, and Water Environment Federation, and by the Ministry of Science and Technology of Vietnam [35, 36].

Surface water monitoring data from sixty fixed points were systematically collected by the Department of Natural Resources and Environment of Vinh Long Province to assess surface water quality during three distinct periods, March (representing the annual dry season), June (representing the transitional season), and September (representing the wet season), within the timeframe of 2017–2021. The sample locations were selected to provide a comprehensive representation of surface water conditions across the province. These fixed points were distributed to cover over 30 rivers of different channel widths, ensuring balanced coverage of the areas influenced by wastewater discharge from both PS and NPS.

Thirteen parameters were employed to assess surface water quality. All water parameters were assessed by laboratories recognized under ISO/IEC 17025:2017 and designated with the code Vilas 515. The analysis and evaluation procedures adhered to the QCVN standard (National technical regulation on surface water quality) for surface water quality specified in column A2 (QCVN 08:2015).

2.3 Data processing

This study used spatial and attribute data from 60 surface water monitoring stations in Vinh Long Province for the period 2017–2021. In addition, the study used data on station location characteristics, the loading from wastewater facilities during 2017–2021, and land use for 2020. Wastewater was also sampled at waste sources over 4 months (on average once every 10 days) during the dry season to calculate the pollution load of wastewater sources. Descriptive statistics and the Kolmogorov–Smirnov Z test, used to check the normality of the data, were computed using SPSS 20 software. The WQI was applied to evaluate surface water quality in Vinh Long during the period 2017–2021 based on the Vietnamese standard QCVN 08:2015. Next, cluster analysis (CA) was used to group monitoring locations based on the water quality index, and principal component analysis (PCA) was conducted to determine the main factors affecting surface water quality. Finally, the pollution load of waste sources (both point and non-point sources) was calculated to explain and clarify the impact of wastewater sources on surface water quality in Vinh Long Province.

2.3.1 Water quality index (WQI)

According to the regulations of the Vietnam Environment Administration (2019), surface water quality for different uses was assessed using the WQI, which was calculated and evaluated for the two seasons over the years at different positions using Formula (1). The WQI results show the level of water quality and the corresponding acceptable uses for a specific water body. The WQI ranges from 0 to 100, dividing water quality into five levels. Level 1 (blue, 100 ≥ WQI ≥ 91) is good water quality that can be used for water supply. Level 2 (green, 90 ≥ WQI ≥ 76) is also acceptable for domestic water supply, but suitable water treatment measures are required. Level 3 (yellow, 75 ≥ WQI ≥ 51) is acceptable for irrigation and other similar purposes. Level 4 (orange, 50 ≥ WQI ≥ 26) is suitable for transport and equivalent purposes, while Level 5 (red, 25 ≥ WQI ≥ 0) is considered heavily polluted water for which proper treatment measures are urgently needed.

$$WQI = \frac{{WQI_{pH} }}{100}\left[ {\frac{1}{2}\mathop \sum \limits_{a = 1}^{2} WQI_{a} .WQI_{b} } \right]^{1/2}$$
(1)

where WQIa is the calculated WQI value for the parameters DO, COD, BOD5, N–NH4+, N–NO3−, and P–PO43−; WQIb is the calculated WQI value for coliform and E. coli; WQIpH is the calculated WQI value for pH (pH in the range of 6–8.5).
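The five WQI levels described above can be sketched as a small lookup. This is only an illustration of the stated bands; the function name `classify_wqi` and the choice to classify on the nearest integer (the bands are given for whole numbers) are assumptions, not part of the cited regulation.

```python
def classify_wqi(wqi):
    """Return (level, colour, indicative use) for a WQI score between 0 and 100."""
    if not 0 <= wqi <= 100:
        raise ValueError("WQI must lie between 0 and 100")
    score = round(wqi)  # assumption: bands are applied to the nearest integer
    # (lower bound of band, level, colour, indicative use), highest band first
    bands = [
        (91, 1, "blue", "water supply"),
        (76, 2, "green", "domestic supply after suitable treatment"),
        (51, 3, "yellow", "irrigation and similar purposes"),
        (26, 4, "orange", "transport and equivalent purposes"),
        (0, 5, "red", "heavily polluted; treatment urgently needed"),
    ]
    for lower, level, colour, use in bands:
        if score >= lower:
            return level, colour, use


print(classify_wqi(82))  # (2, 'green', 'domestic supply after suitable treatment')
```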

2.3.2 Multivariate statistical methods

2.3.2.1 Cluster analysis (CA)

Cluster analysis (CA) was used to classify water bodies in the study area and to group study sites. Euclidean distance was used as the similarity measure to assess the impact of waste, from either PS or NPS, on water quality. The CA results were reported as linkage distances, with clustering considered significant when (Dlink/Dmax) × 100 equalled 60 [37]. Sampling stations were treated as dependent variables, while water quality parameters were independent variables. Fixed points with similar water quality were grouped based on linkage distance [38] and presented in a hierarchical structure.
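The two quantities named in this subsection, the Euclidean distance between stations and the rescaled distance (Dlink/Dmax) × 100, can be sketched as follows. The station vectors and linkage distances are hypothetical values chosen for illustration, and the helper names are assumptions; the clustering itself was run in statistical software, not with this code.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length parameter vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rescaled_linkage(link_distances):
    """Express each linkage distance as (Dlink / Dmax) x 100."""
    d_max = max(link_distances)
    return [100.0 * d / d_max for d in link_distances]


# Hypothetical standardized values of three parameters at two stations.
station_a = [0.0, 3.0, 0.0]
station_b = [4.0, 0.0, 0.0]
print(euclidean(station_a, station_b))     # 5.0
print(rescaled_linkage([2.0, 5.0, 10.0]))  # [20.0, 50.0, 100.0]
```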

2.3.2.2 Principal component analysis (PCA)

Principal component analysis (PCA) was conducted to identify the principal components (PCs) responsible for fluctuations in water quality and to reveal their sources in the study area. By evaluating the correlation among water quality indicators, each PC was regarded as a potential pollution source [37, 39]. The eigenvalue measured the significance of each PC, with eigenvalues exceeding 1 indicating the most important components [38, 40]. The loading of each parameter on a PC supported the identification of pollution sources and was categorized into three levels according to its absolute value: high (> 0.75), moderate (0.75–0.50), and weak (0.50–0.30) [40]. These analyses were conducted using Statgraphics Centurion version XVI software (Statgraphics Technologies Inc., The Plains, VA, USA). The principal component (PC) is expressed as Formula (2):

$$Z_{ij} = pc_{i1} x_{1j} + pc_{i2} x_{2j} + \cdots + pc_{im} x_{mj}$$
(2)

where Z is the component score; pc is the component loading; x is the measured value of the variable; i is the component number; j is the sample number; and m is the total number of variables.
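Formula (2) and the three loading bands can be illustrated with a short sketch. The function names and the label "negligible" for loadings below 0.30 are assumptions introduced here; the input numbers are hypothetical.

```python
def component_score(loadings, values):
    """Z_ij = sum over k of pc_ik * x_kj for one component i and one sample j."""
    if len(loadings) != len(values):
        raise ValueError("loadings and values must have the same length")
    return sum(pc * x for pc, x in zip(loadings, values))


def loading_strength(loading):
    """Classify a loading by absolute value using the bands quoted in the text."""
    a = abs(loading)
    if a > 0.75:
        return "high"
    if a >= 0.50:
        return "moderate"
    if a >= 0.30:
        return "weak"
    return "negligible"  # assumption: label for values below the weakest band


# Hypothetical loadings of one PC on two standardized variables.
print(component_score([0.5, -0.5], [2.0, 4.0]))  # -1.0
print(loading_strength(-0.6))                    # moderate
```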

2.3.2.3 Calculating pollution load from non-point sources

The pollution load from non-point sources was calculated for runoff, domestic wastewater, and cage aquaculture based on the COD, BOD5, TN, and TP parameters regulated by the national technical regulations (QCVN) on wastewater: QCVN-40 for industrial production, QCVN-14 for domestic wastewater, QCVN-62 for livestock, QCVN-28 for healthcare, and QCVN-11 for seafood processing. The load of emission sources was calculated as follows:

2.3.2.4 Pollution load from rainwater runoff

The pollutant load from rainwater runoff was calculated from land use data (agricultural land, specialized land, residential land, and vacant land), the average number of rainy days per year in the area, and the unit pollution load of runoff water from each land use type. Formula (3) [41] and Table 1 were used to compute the pollutant load for COD, BOD5, TN, and TP by combining the area of each land use type with the average annual number of rainy days for the non-point sources in the study area.

$$L_{Nn} = n \times A \times L_{iNn} \times 10^{ - 3}$$
(3)

where LNn is the pollutant load of each land use purpose (tons/year); n is the number of rainy days in the year (days); A is the land use area for each purpose (km2); LiNn is the unit pollutant load (kg/km2/rainy day); and 10^-3 is the unit conversion coefficient.
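As a minimal sketch of Formula (3), the function below multiplies the number of rainy days, the land use area, and the unit load, then converts kilograms to tons. The numbers in the example are hypothetical and are not taken from Table 1; the function name is an assumption.

```python
def runoff_load(rainy_days, area_km2, unit_load_kg_per_km2_day):
    """L_Nn = n * A * L_iNn * 1e-3, returned in tons/year."""
    return rainy_days * area_km2 * unit_load_kg_per_km2_day * 1e-3


# Hypothetical inputs: 100 rainy days, 50 km2 of one land use type,
# and a unit load of 2 kg/km2/rainy day for one parameter.
load = runoff_load(100, 50.0, 2.0)
print(f"{load:.1f} tons/year")
```

Summing the result over all land use types gives the total runoff load for a parameter.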

Table 1 Pollutant load from rainwater runoff
2.3.2.5 Pollution load from domestic wastewater

The pollutant load from domestic wastewater was calculated based on the wastewater flow and pollutant concentration indicators [44] by using the following Formula (4).

$$\begin{gathered} L_{i} = C_{i} \times Q_{waste} \times 10^{ - 6} \hfill \\ {\text{and}}\;Q_{waste} = K_{supply} \times N \hfill \\ \end{gathered}$$
(4)

where Li is the pollutant load of parameter i (tons/year); Ci is the average concentration of parameter i (mg/L), based on the results in Table 2; Qwaste is the wastewater amount (L/day); Ksupply is the water supply coefficient (120 L/person/day, following the TCXDVN 33:2006 standard); N is the total resident population of Vinh Long Province in 2021, following the statistical yearbook of Vinh Long Province, 2021; and 10^-6 is the unit conversion coefficient.
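A unit check helps when applying Formula (4). With Ci in mg/L and Qwaste in L/day, the product Ci × Qwaste × 10^-6 is in kg/day. The sketch below therefore makes the conversion to tons/year explicit (× 365 days, × 10^-3 tons/kg); that extra step is an assumption about the intended units, not something stated in the text. The concentration and the rounded population of one million are illustrative only.

```python
def domestic_flow(k_supply_l_per_person_day, population):
    """Q_waste = K_supply * N, in L/day."""
    return k_supply_l_per_person_day * population


def domestic_load_tons_per_year(conc_mg_per_l, q_waste_l_per_day):
    """Annual load from a concentration (mg/L) and a daily flow (L/day)."""
    kg_per_day = conc_mg_per_l * q_waste_l_per_day * 1e-6  # mg/day -> kg/day
    # Assumption: annualise and convert kg to tons to reach tons/year.
    return kg_per_day * 365 * 1e-3


q = domestic_flow(120, 1_000_000)  # 120 L/person/day, about 1 million residents
print(q)                           # 120000000
print(domestic_load_tons_per_year(50.0, q))  # hypothetical 50 mg/L
```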

Table 2 Pollutant concentration of domestic wastewater
2.3.2.6 Pollution load from cage aquaculture

The pollutant load from cage aquaculture was calculated based on previous research [45], as shown in Formula (5). In Vinh Long Province, 1714 aquaculture cages were in operation, distributed along the Tien River. According to the Department of Agriculture and Rural Development of Vinh Long Province (2021), the province's aquaculture yield reached 19,888 tons/year in 2021 [46].

$$L_{ts} = SL \times L_{iTs } \times 10^{ - 3}$$
(5)

where Lts is the waste load from aquaculture (tons/year); SL is the aquaculture yield (tons/year); and LiTs is the emission coefficient of parameter i (kg/ton/year), shown in Table 3.
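Formula (5) can be sketched in the same way. The yield of 19,888 tons/year is the figure quoted above; the emission coefficient of 10 kg/ton is a hypothetical placeholder, not a Table 3 value.

```python
def aquaculture_load(yield_tons_per_year, coeff_kg_per_ton):
    """L_ts = SL * L_iTs * 1e-3, returned in tons/year."""
    return yield_tons_per_year * coeff_kg_per_ton * 1e-3


# Reported 2021 yield with a hypothetical emission coefficient.
load = aquaculture_load(19_888, 10.0)
print(f"{load:.2f} tons/year")
```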

Table 3 Discharge coefficients from aquaculture
2.3.2.7 Calculating pollution load from point sources

Pollution loads from point sources were computed following Formula (6) for livestock, slaughterhouses, craft villages, pond aquaculture, industrial parks, healthcare, seafood processing, and food processing, using the same pollutant concentration indicators as for domestic wastewater and cage aquaculture [49]. The input values for Formula (6) are given in Table 4: wastewater flow was obtained from the Department of Natural Resources and Environment in 2021, and the pollutant concentration of each waste source was measured in samples collected at discharge points.

$$L_{i} = C_{i} \times Q_{waste } \times 10^{ - 6} \times 365$$
(6)

where Li is the pollutant load calculated for parameter i (tons/year); Ci is the average concentration of parameter i (mg/L); Qwaste is the wastewater amount (m3/day); 365 is the number of days per year; and 10^-6 is the unit conversion coefficient.
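Formula (6) is dimensionally direct: mg/L equals g/m3, so Ci × Qwaste is in g/day, 10^-6 converts grams to tons, and 365 annualises. The sketch below applies it to two hypothetical sources and sums the result; the source names, concentrations, and flows are illustrative and are not Table 4 values.

```python
def point_source_load(conc_mg_per_l, flow_m3_per_day):
    """L_i = C_i * Q_waste * 1e-6 * 365, returned in tons/year."""
    return conc_mg_per_l * flow_m3_per_day * 1e-6 * 365


# Hypothetical (concentration in mg/L, flow in m3/day) per source type.
sources = {
    "seafood processing": (80.0, 1200.0),
    "livestock": (150.0, 300.0),
}
loads = {name: point_source_load(c, q) for name, (c, q) in sources.items()}
total = sum(loads.values())
for name, value in loads.items():
    print(f"{name}: {value:.2f} tons/year")
print(f"total: {total:.2f} tons/year")
```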

Table 4 Pollution concentration and wastewater flow from point sources

3 Results and discussion

3.1 Surface water quality characteristics in Vinh Long Province

The physicochemical and microbiological characteristics of surface water in the three periods, March (dry season), June (transitional season), and September (wet season), from 2017 to 2021 in Vinh Long are presented in Fig. 2. The water temperature in March was higher than in June and September. The pH varied between 7.22 ± 0.08 and 7.77 ± 0.07, peaking during the dry season at an average of 7.49 ± 0.14. The pH remained stable and within the limits of QCVN 08:2015 (column A2). The average EC value was 57 ± 22.6 mS/m, ranging from 32.8 ± 6.4 to 101.8 ± 40.9 mS/m. EC in March (101.8 ± 40.9 mS/m) was higher than in June (37.1 ± 20.7 mS/m) and September (32.8 ± 6.4 mS/m). This shows that EC tends to decrease as rainfall increases in June and September (Fig. 2n).

Fig. 2 Discriminant water quality parameters in the three assessment periods: March, June, and September during the study period (2017–2021)

The TSS concentration ranged from 28.5 ± 3.87 to 56.1 ± 13.9 mg/L, with an average of 41.2 ± 6.96 mg/L. At most monitoring fixed points, the concentration fell within the acceptable range of EPA criteria but exceeded QCVN 08:2015 (30 mg/L) at some sites. The number of monitoring locations exceeding QCVN 08:2015 was 60 in September, 53 in June, and 51 in March. The turbidity ranged from 40.58 ± 14.73 NTU to 84.74 ± 43.42 NTU, with an average of 59.6 ± 8.09 NTU. TSS in September (56.1 ± 13.9 mg/L) was higher than in June (28.5 ± 3.9 mg/L) and March (38.9 ± 3.1 mg/L), indicating that TSS was highest during the wet season.

The dissolved oxygen (DO) value in the study area ranged from 3.88 ± 0.18 to 5.85 ± 0.06 mg/L, with an average of 4.64 ± 0.45 mg/L. In March, the DO values at all monitoring fixed points were under the threshold of QCVN 08:2015. However, DO values at 10 and 17 monitoring fixed points exceeded QCVN 08:2015 in June and September, respectively.

BOD5 ranged from 4.48 ± 0.34 to 7.93 ± 0.28 mg/L, with an average of 5.93 ± 0.75 mg/L, exceeding QCVN 08:2015 at 33 locations, including the main tributaries of the Hau River (NM42, NM55) and the inland rivers (NM28, NM58).

The COD concentration ranged from 8.94 ± 0.39 to 15.14 ± 0.7 mg/L, with an average of 11.49 ± 1.44 mg/L. Most of the sampling fixed points had COD values that met QCVN 08:2015 in March, except for NM19, NM58, and NM59 on the inland rivers. In September, the monitoring fixed points NM01, NM05, NM09, NM32, NM36, NM42, NM48, and NM50 along tributaries and main rivers exceeded this standard.

The concentration of N–NH4+ ranged from 0.24 ± 0.07 to 0.53 ± 0.24 mg/L, with an average of 0.37 ± 0.07 mg/L. At 29 monitoring fixed points, N–NH4+ exceeded QCVN 08:2015 in both March and June, mainly in the inland rivers (NM28, NM02, NM03–NM08, NM09) and the tributary rivers (NM05, NM20–NM22, NM37). In September, N–NH4+ exceeded QCVN 08:2015 at 59 fixed points (all except NM23). N–NO3− ranged from 0.18 ± 0.02 to 0.59 ± 0.1 mg/L, with an average of 0.35 ± 0.1 mg/L, and met QCVN 08:2015 at all monitoring fixed points. In contrast, P–PO43− ranged from 0.38 ± 0.04 to 1.11 ± 0.2 mg/L, with an average of 0.68 ± 0.15 mg/L, and exceeded QCVN 08:2015 at all monitoring fixed points.

E. coli density ranged from 56.8 ± 8.4 to 489 ± 33.6 MPN/100 mL, with an average of 191 ± 109 MPN/100 mL. Similarly, a high density of coliform was found, ranging from 3166 ± 80.83 to 42,220 ± 26,806 MPN/100 mL, with an average of 10,517 ± 8771 MPN/100 mL. In March, June, and September, all monitoring stations except NM18 and NM40 exceeded the QCVN 08:2015 standard for E. coli. In September, all monitoring fixed points also exceeded the standard for total coliform. Sampling fixed points with E. coli and coliform densities above the QCVN 08:2015 standard were dispersed across all river systems in Vinh Long Province. E. coli and coliform densities were, respectively, 9.55 and 8.44 times greater than the permissible limits (50 MPN/100 mL for E. coli and 5000 MPN/100 mL for coliform).

The water temperature in March was higher than in September, which is attributed to seasonal differences. The average temperature at the sampling sites fluctuated widely during the period 2017–2021 but stayed within the preferred range for the growth of aquatic organisms [51]. The pH was neutral, although some locations on the main rivers reached slightly alkaline conditions in March. This could increase the toxicity of ammonia in water, thus affecting aquatic organisms [51]. The pH is close to that in neighboring provinces, such as Can Tho (7.17–7.53), Dong Thap (7.00–7.52), and Tien Giang (7.2–7.8) [52,53,54]. It is also consistent with a previous study in Vinh Long in 2019 [31]. The findings indicate that pH and water temperature were consistent with tropical conditions and were not noticeably affected by natural factors [40]. Both met the acceptable quality for domestic water use and were suitable for the growth of aquatic organisms [38].

The EC in the study area was higher than in Tien Giang Province (40.59–99.28 mS/m) [37], which was associated with the influence of geographic location and flow rate along the same Tien River. Vinh Long Province is near an estuary, where the EC changes depending on ions in the ocean water. The EC value found in the study area was still within the threshold of 15–50 mS/m, which is suitable for fish and macroinvertebrates in freshwater environments [37]. High EC values in surface water are most common in residential districts, industrial zones, and central markets, where a substantial proportion of wastewater contains organic matter, resulting in greater mineral segregation [31]. TSS and turbidity in September were higher than in March and June, which could be attributed to rainwater runoff and soil disturbance during agricultural production and cropping seasons. High turbidity and TSS can indicate that large amounts of sediment are being introduced, which can physically bury macroinvertebrates and affect fish.

The DO concentration was similar to that reported for the same study area in 2019 [31]. Vinh Long Province's DO was greater than that of other nearby provinces, including Tien Giang (3.2–4.0 mg/L) [54], Hau Giang (3.2–5.2 mg/L) [55], and Dong Thap (4.73–5.55 mg/L) [53]. The greater water surface area of Vinh Long's river system makes it easier for oxygen from the atmosphere to diffuse into the surface water [31]. DO increased because rain runoff from agriculture and aquaculture, along with discharged residential waste, mixed strongly with the river flow at the research locations. However, BOD5 and COD measurements indicate that organic matter contamination is an important factor at the sampling fixed points along the Hau River.

Nutrient concentrations in the study area were higher than those in neighboring provinces: N–NH4+ and P–PO43− concentrations were 0.27 mg/L and 0.23 mg/L, respectively, in Hau Giang [55], 0.37 mg/L and 0.23 mg/L in Dong Thap [53], and 0.4 mg/L and 0.1 mg/L in Tien Giang [54]. However, the nutrient values are consistent with a previous finding [31] for Vinh Long Province, in which P–PO43− and N–NO3− varied from 0.36–1.24 mg/L and 0.16–0.89 mg/L, respectively. The surface water quality in Vinh Long revealed signs of nutrient pollution, but it was not severe [31]. According to Ustaoğlu and Tepe (2019), P–PO43− in surface water ranges from 0.05 to 0.3 mg/L, and concentrations above this range lead to eutrophication [56]. Large-scale agricultural operations, together with the high volume of water required for irrigation, leave fertilizer and organic chemical residues, which result in high nutrient concentrations in wastewater outflow [31]. As a result, these problems are primarily associated with pollutant loads from agricultural activities in Vinh Long Province. The concentration of N–NO3− (< 0.7 mg/L) fell within the threshold range of 0.2–10 mg/L, which is suitable for aquatic life [42], causing no eutrophication and not affecting human health. However, if solutions are not found to reduce the concentration of P–PO43− released from agricultural facilities, this could become a problem for the aquatic ecosystem in the long term.

Coliform and E. coli are indicators of pollution from human or animal feces, as noted in previous findings [31] in Vinh Long. Because of the relatively high density of E. coli and coliforms, the surface water in Vinh Long was unsuitable for consumption. These findings are consistent with the results of previous studies in the VMD, including Dong Thap, Tien Giang, and Hau Giang [53,54,55]. A previous study found that the density of E. coli and coliforms is high in the dry season [31]. In contrast, total coliform and E. coli increased during the rainy season in the study area. This is associated with the transfer of pollutants from terrestrial to aquatic environments by rainwater runoff, which carries human and animal waste to the rivers [57].

Water quality in the dry season is better than in the rainy season. This is likely because, during the rainy season, rainwater overflow carries pollutants, particularly waste, from agricultural production activities. In addition, water quality is also affected by runoff from neighboring areas into the Tien and Hau river systems.

3.2 Water quality index (WQI)

Four out of 60 monitoring fixed points (NM04, NM41, NM42, NM54) along the main rivers exhibited WQI values ranging from 76–90 (Fig. 3). Water quality at these monitoring fixed points was influenced by waste sources from aquaculture, domestic life, and agricultural production. Although the large, continuous flow of these rivers makes the water suitable for domestic water supply, appropriate treatment measures are still necessary.

Fig. 3 Spatial distribution of Water Quality Index (WQI) classification at monitoring locations in Vinh Long Province (a), with a zoom-in at Vinh Long City showing the dense monitoring locations (b)

There were 34 out of 60 monitoring fixed points along the main and tributary rivers, with WQI values ranging from 51–75. This categorization denotes a medium water quality suitable for domestic, irrigation, and other comparable purposes. These rivers were influenced by factors such as domestic wastewater, aquaculture, production, business activities, and runoff from agricultural practices. The combination of these factors, coupled with low flow rates, contributes to suboptimal surface water quality.

There were 22 out of 60 monitoring fixed points with WQI values ranging from 26–50. These fixed points were along inland rivers or urban rivers, exhibiting poor water quality suitable only for navigation or similar purposes. The water quality in these rivers and canals was found to be directly influenced by domestic waste, production and industrial activities, as well as agricultural practices.

This study reveals significant variations in the surface water quality of rivers and canals in Vinh Long Province, both spatially and temporally. WQI values in the study area's rivers and canals ranged from 65 to 91, surpassing those observed in other localities within the VMD. For comparison, the WQI values in Bac Lieu Province during 2014–2019 ranged from 51 to 62 [58], and in Dong Thap Province in 2019, they varied from 25 to 64 [53].

3.3 Clustering surface water quality

The CA results (Fig. 4) reveal 3 groups formed by clustering at a rescaled distance (Dlink/Dmax) × 100 of 95. Group 1 comprised 29 monitoring stations (NM01–NM04, NM06–NM16, NM23–NM26, NM30, NM34, NM35, NM39–NM41, NM47–NM49, NM56). These locations are primarily affected by domestic wastewater from urban and residential areas and by aquaculture activities. The wastewater from these sources has led to TSS, BOD5, N–NH4+, COD, and P–PO43− exceeding QCVN 08:2015. Group 2 comprised 3 monitoring sites (NM27, NM28, NM29) mainly impacted by the same waste sources as Group 1, along with runoff water carrying pollutants, leading to levels of TSS, BOD5, P–PO43−, coliform, and E. coli that exceeded QCVN 08:2015. Group 3 included 28 monitoring stations (NM05, NM17–NM22, NM31–NM33, NM36–NM38, NM42–NM46, NM50–NM55, NM57–NM60), which were influenced by agricultural activities (fruit tree and rice cultivation) and domestic wastewater, resulting in TSS, BOD5, N–NH4+, P–PO43−, coliform, and E. coli exceeding QCVN 08:2015.

Fig. 4 Clustering surface water quality for the water bodies of Vinh Long Province (No 1 = NM1…No 60 = NM60)

The division of the 60 sites into groups for 2017–2021 highlights the complex spatial variation of water quality in the study area (Table 5). Group 1 included monitoring stations on large rivers and around residential areas that received pollutants from both domestic and aquaculture wastewater. It was considered the least polluted: all parameters exceeded QCVN 08:2015 only slightly, and the DO value was relatively high. The average DO concentration in the study area was higher than in neighboring provinces [31]. Meanwhile, Group 2 was mainly affected by rainwater runoff, which led to high values of TSS, BOD5, P–PO43−, coliform, and E. coli that exceeded QCVN 08:2015 and caused organic pollution in this area. Group 3 experienced notable impacts from agricultural activities, including fruit and rice cultivation, as well as domestic wastewater. This resulted in elevated levels of TSS, BOD5, N–NH4+, P–PO43−, coliform, and E. coli, surpassing QCVN 08:2015. In short, differences in pollutant sources produced differences in water quality among the three groups. The pollutant sources are determined by the river/canal level, the distance between production facilities and rivers, and the land use plan. Within each group, the water quality reflected the same impact sources, with similar parameters exceeding QCVN 08:2015. This suggests that the sampling data set could be simplified and the number of monitoring periods reduced. Similarly, CA techniques have been successfully applied to water quality monitoring design for other water bodies in the VMD [40].

Table 5 Comparing the parameters of water quality in each cluster to QCVN 08:2015 (are these average values and note that the mean may not be the best statistic for log-normal or non-normal data such as E. coli and coliform)
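The note in the Table 5 caption about log-normal data can be illustrated with a short sketch. It compares the arithmetic mean with the geometric mean and the median for a small set of bacterial counts. The counts and their unit (MPN/100 mL) are hypothetical and are not taken from the monitoring data.

```python
from statistics import geometric_mean, mean, median

# Hypothetical E. coli counts (MPN/100 mL) at the stations of one cluster.
ecoli_counts = [100, 1000, 10000]

arithmetic = mean(ecoli_counts)           # pulled upward by the largest count
geometric = geometric_mean(ecoli_counts)  # average taken on the log scale
middle = median(ecoli_counts)             # robust to extreme values
```

For counts spanning several orders of magnitude, the arithmetic mean (3700 here) is dominated by the highest value, whereas the geometric mean and the median (both about 1000 here) stay closer to a typical station.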

Based on the spatial distribution of each group (Fig. 5), Group 1 includes locations within urban areas, densely populated areas, and aquaculture areas, while Group 2 includes agricultural and residential areas. Group 3 differs from the other two groups in that it is concentrated in specialized agricultural areas (such as fruit tree, rice, and vegetable areas) and residential areas.

Fig. 5 Spatial distribution of surface water quality by similarity group in Vinh Long Province (a), with a zoom-in on Vinh Long City showing the densest concentration of Group 1 sites (b)

3.4 Key variables influencing surface water quality

Four principal components (PCs) with eigenvalues greater than one were derived from the principal component analysis (PCA) and accounted for most of the variation in surface water quality (Fig. 6). Together, these components explain 69% of the total variance and correspond to the various pollution sources that affect surface water quality in the study area (Table 6).

Fig. 6 Eigenvalues of the principal components

Table 6 Potential sources of pollution and key water quality parameters affecting surface water quality

The loadings in Table 6 show that PC1 explained 26.8% of the variation in water quality; BOD5, COD, N–NH4+, and N–NO3− had positive and relatively high loadings on this component, indicating organic and nutrient pollution. PC2 explained 18.9% of the variation, with a positive loading for EC and negative loadings for pH, temperature, and DO, all at moderate correlation. PC3 was reasonably strongly correlated with total coliform and E. coli and explained 13.2% of the variation. Lastly, PC4 was comparatively strongly correlated with turbidity and accounted for 10.1% of the variation. Aside from that, the indicators on the remaining PCs show only moderate correlations.
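The retention rule (eigenvalue greater than one) and the cumulative variance quoted above can be sketched as follows. The eigenvalues in the example are hypothetical, since the actual values appear only in Fig. 6. The sketch also assumes a PCA on standardized variables, for which the eigenvalues sum to the number of variables.

```python
def kaiser_retained(eigenvalues):
    """Return (component number, eigenvalue, percent of variance) for eigenvalues > 1.

    Percent of variance is the eigenvalue divided by the sum of all eigenvalues.
    """
    total = sum(eigenvalues)
    return [(k + 1, ev, 100 * ev / total)
            for k, ev in enumerate(eigenvalues) if ev > 1]


# Hypothetical eigenvalues for five standardized variables (they sum to 5).
eigenvalues = [2.1, 1.3, 0.9, 0.5, 0.2]
retained = kaiser_retained(eigenvalues)

# Cumulative variance of the four PCs reported in the text.
reported_percent = [26.8, 18.9, 13.2, 10.1]
cumulative = round(sum(reported_percent), 1)
```

With the hypothetical eigenvalues, two components are retained. The four percentages reported in the text sum to 69.0%, matching the total stated for the retained PCs.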

PC1 represents the significant influence of human activities on surface water, including PS and NPS pollution. PS pollution originates from improper domestic waste disposal, while NPS pollution comes from agricultural runoff carrying fertilizers and pesticides. The high TSS and COD levels are caused by contaminants entrained in residential wastewater and runoff. In addition, fertilizers used in agricultural production and household laundry detergents are sources of nitrogen and phosphorus [59]. PC2 reflects natural processes and surface runoff from acid sulfate soil areas, which leach iron oxides; the positive load factor of temperature may reflect seasonal differences. PC3 also represents runoff from agricultural and domestic activities, which carries human and animal excreta.

Seasonal variations in the weather, or industrial activities that raise the temperature of surface water, could explain the positive load factor of temperature [31]. The rise in turbidity in the study area, captured by PC4 in Table 6, is attributed to increased agricultural activity that causes erosion along water bodies. Each PC represents a form of pollution, and parameters with moderate or higher correlation should be included in the water quality monitoring program [60]. The PCs therefore distinguish between the biochemical and physical factors of surface water pollution. Depending on the direction of these principal components, such pollution is likely to affect chemical reactions, reaction rates, aquatic life, and the water’s suitability for other purposes [61].

3.5 Assessing point and non-point sources of pollution

3.5.1 Pollutant loads from point sources

The pollution load from various wastewater sources and their corresponding observation locations is presented in Table 7. The industrial sector contributes about 26.1 tons/year of pollutants to Vinh Long’s rivers and canals, with COD the largest component at 12.7 tons, followed by TN at 9.0 tons; other parameters account for smaller proportions. Slaughterhouses are estimated to release a pollutant load of 23.5 tons/year. Pond aquaculture has the highest pollution load among point sources, at 2,614 tons/year, with COD the predominant pollutant. The food processing industry releases approximately 7.3 tons annually, with rice vermicelli effluent containing COD and BOD5. In the craft village, small-scale activity by 33 households produces a substantial effluent discharge (99 m3/day), with BOD5 and COD making up most of the almost 1.9 tons of pollutants generated each year.
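An annual load of this kind is commonly estimated as effluent discharge multiplied by pollutant concentration. The sketch below assumes that formulation, which may differ from the exact method used for Table 7. The discharge of 99 m3/day is the craft village figure from the text, while the BOD5 concentration of 30 mg/L is a hypothetical value chosen only to show the unit conversion.

```python
def annual_load_tons(discharge_m3_per_day, concentration_mg_per_l, days=365):
    """Annual pollutant load in metric tons from discharge and concentration."""
    # mg/L equals g/m3, so discharge * concentration gives g/day.
    grams_per_year = discharge_m3_per_day * concentration_mg_per_l * days
    return grams_per_year / 1_000_000  # g -> metric tons


# 99 m3/day is from the text; 30 mg/L BOD5 is a hypothetical concentration.
craft_village_bod5 = annual_load_tons(99, 30.0)
```

Under these assumed inputs the load is about 1.08 tons of BOD5 per year. Actual loads depend on the measured concentration of each parameter and on the number of operating days.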

Table 7 Pollution load from point sources in Vinh Long Province in 2021

The pollution load from seafood processing wastewater, which contains high levels of COD and TN, amounted to approximately 36 tons per year (Table 7). The 16 hospitals in Vinh Long produce about 44.9 tons of pollutant load in wastewater annually, with COD contributing the most in this sector. The pollutant load of the livestock sector is around 24.4 tons/year, calculated for 320,363 head of cattle and 10,867,820 head of poultry. This animal waste contributes organic matter and nutrients, leading to high concentrations of COD, BOD5, and TN in receiving waters.

3.5.2 Pollutant loads from non-point sources

The pollution load of non-point sources is shown in Table 8. The load from runoff was the highest, totaling around 19,337.7 tons per year. Agricultural land accounted for the largest share at 15,925 tons, followed by special-use land (2,016 tons), residential land (1,194 tons), and vacant land (202.7 tons). Within the agricultural sector, TN contributed the largest pollution load, followed by BOD5. The estimated pollutant load from domestic wastewater, based on BOD5, COD, TN, and TP, amounted to 6,383 tons per year (Table 8). The pollutant load from cage aquaculture, involving 1,714 cages in Vinh Long, was determined from the same four parameters and amounted to 518 tons per year. Nutrients, including nitrogen and phosphorus compounds, derive from fertilizer decay, which shows how fertilizers contribute to pollution in the agricultural sector [31]. In addition, BOD5 and nitrogen compounds are generated by the decay of leftover food, showing how population pressure and domestic activities contribute to organic pollution in the river system. Direct release of untreated effluents into the river system is common practice. In cage aquaculture systems, feed is a significant source of the biochemical (BOD5, COD) and nutrient (TN, TP) load, and the amount of waste produced depends on the feed’s nutrient composition, production method, ratio of feed size to fish size, amount of feed produced per unit time, feeding method, and storage time [62].
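The totals quoted in Sections 3.5.1 and 3.5.2 can be cross-checked with a few lines of arithmetic. The sketch below uses only the per-sector figures stated in the text (tons per year); the dictionary keys are labels chosen for this illustration.

```python
# Runoff load by land-use type (tons/year), as stated in the text.
runoff = {
    "agricultural land": 15925.0,
    "special-use land": 2016.0,
    "residential land": 1194.0,
    "vacant land": 202.7,
}

# Non-point sources (tons/year).
non_point = {
    "runoff": sum(runoff.values()),
    "domestic wastewater": 6383.0,
    "cage aquaculture": 518.0,
}

# Point sources (tons/year), from Section 3.5.1.
point = {
    "industry": 26.1,
    "slaughterhouses": 23.5,
    "pond aquaculture": 2614.0,
    "food processing": 7.3,
    "craft village": 1.9,
    "seafood processing": 36.0,
    "hospitals": 44.9,
    "livestock": 24.4,
}

non_point_total = sum(non_point.values())
point_total = sum(point.values())
non_point_share = 100 * non_point_total / (non_point_total + point_total)
```

With these figures, the four runoff components sum to 19,337.7 tons per year and the non-point share comes to roughly 90% of the total load. Small differences from tabulated shares can arise because several of the sector figures quoted in the text are rounded.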

Table 8 Pollution load from non-point sources in Vinh Long Province in 2021

4 Conclusions

From 2017 to 2021, the surface water quality in Vinh Long Province exhibited spatial and temporal variations. Several water quality parameters exceeded the Vietnamese standard (QCVN 08:2015), indicating contamination with organic substances. Specifically, 33 fixed monitoring points in September showed BOD5 values ranging from 6.09 to 10.20 mg/L, exceeding the A2 threshold (> 6 mg/L) of QCVN 08:2015. Moreover, on September 8 the fixed monitoring points recorded COD values ranging from 15.68 to 17.60 mg/L, surpassing the A2 threshold (> 15 mg/L). Nutrients also contributed to the pollution, with 29 fixed monitoring points in March and June, and 59 in September, showing N–NH4+ values ranging from 0.34 to 0.80 mg/L, exceeding the A2 threshold (> 0.3 mg/L). Additionally, all fixed monitoring points exhibited P–PO43− values ranging from 0.38 ± 0.04 mg/L to 1.11 ± 0.2 mg/L, surpassing the A2 threshold (> 0.2 mg/L). The pollution was particularly severe in many small rivers and infield areas, which have the least capacity to dilute pollution. The calculated water quality index showed that surface water in most rivers was suitable for irrigation and similar activities, whilst water from a few rivers remained suitable for domestic purposes after remedial measures.
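The A2 comparisons above amount to a simple threshold check, sketched below. The four limits are those quoted in this paragraph; the parameter key names and the sample values are hypothetical and do not correspond to any particular station.

```python
# A2 limits (mg/L) as quoted in the text for QCVN 08:2015.
A2_LIMITS_MG_PER_L = {"BOD5": 6.0, "COD": 15.0, "N-NH4+": 0.3, "P-PO4(3-)": 0.2}


def exceedances(sample):
    """Return {parameter: ratio to the A2 limit} for parameters above their limit."""
    return {
        name: value / A2_LIMITS_MG_PER_L[name]
        for name, value in sample.items()
        if name in A2_LIMITS_MG_PER_L and value > A2_LIMITS_MG_PER_L[name]
    }


# Hypothetical sample for a single station (mg/L).
sample = {"BOD5": 6.09, "COD": 14.2, "N-NH4+": 0.34, "P-PO4(3-)": 0.38}
flagged = exceedances(sample)
```

In this hypothetical sample, BOD5, N–NH4+, and P–PO43− are flagged, while COD stays below its limit. Reporting the ratio to the limit, instead of a yes/no flag, also shows how far each parameter is from compliance.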

The fixed surface water quality monitoring points could be divided into three groups based on their water quality attributes via CA. Group 1 contained mostly densely populated areas and aquaculture areas, while Group 2 contained mainly agricultural production areas and residential areas. Group 3, in contrast, covered two types of area, namely specialized agricultural areas (such as fruit tree, rice, and vegetable areas) and residential areas.

The PCA results showed that water quality in rivers and canals in Vinh Long Province is affected by multiple sources of pollution, including (i) microbiological pollution, (ii) specific conductance and nutrients, (iii) total suspended solids and turbidity, and (iv) organic matter. Overall, the pollution load from non-point sources accounted for a large proportion (90.5% of the total load) compared to the load from point sources (9.5% of the total load). In the non-point sources group, the load from runoff accounts for the largest share (about 19,338 tons, of which 15,925 tons come from agricultural land), followed by domestic wastewater (6,383 tons) and cage aquaculture (518 tons). In the point sources group, pond aquaculture accounts for the largest share, and the load from the remaining point sources is deemed insignificant by comparison. Overall, the water quality in many surface water bodies across Vinh Long Province is subpar, primarily because of significant pollution from non-point sources, and effective management of this type of pollution is essential to ensuring the safe use of water for different purposes. However, all sources of pollution, including the variety of factors associated with point sources, need to be considered in management plans for the study watersheds in the coming years. In particular, the fixed sampling points can be grouped by station or watershed region, and a representative sampling station can be selected for each group based on the observed pollution factors, seasonality, and the duration and extent of pollutant dispersion.

(Another way to express this is that both PS and NPS pollution needs attention. If solutions to both were equally achievable, then NPS should be focused on more heavily than PS at first. However, all sources of pollution need attention and a variety of factors beyond this study must be considered in order to devise management plans for these important watersheds.)

Possible additional research: Consider using the groups of sampling stations/watershed regions and selecting for each a representative sampling station. Then look at the sampling station’s data in each case and see if it makes sense. Also, this could be important for designating where to place sampling stations for longer term water quality monitoring. What I see in Fig. 2 is that when water flow slows, temperature increases, EC increases due to less dilution, and DO goes down due to less turnover in the water. When water temperature decreases (due to rainfall and higher water flow), turbidity and TSS go up (runoff), DO increases (water flow), BOD and COD increase (waste input), NH4 increases (waste and runoff), and E. coli and total coliform increase (waste increases and runoff too). How long do these increased pollutants have this effect? How do different sized storms affect the magnitude of the pollution? All of this makes sense in most rivers. Taking a look at the data at the individual station level helps to verify the conclusions raised by the compiled data analyses. This is beyond the scope of this paper but may provide good continued research.