1 Introduction

Water is the source of all life and an indispensable substance for living organisms (Jiang et al. 2020). Groundwater is an important source of water for agricultural production and domestic uses, among others (Adimalla and Wu 2019), especially in areas lacking surface water. Groundwater is well known to be a vulnerable natural resource, and the mounting pressure on it may result in water shortages coupled with chemical contamination (Nurtazin et al. 2020; Chen et al. 2021). Once groundwater is contaminated, the structure and functioning of the aquatic ecosystem are severely disrupted, and reverting to the original state takes a very long time. Moreover, the consumption of groundwater with excess amounts of contaminants may endanger human health. Trace elements are viewed as priority contaminants, because their presence in excess concentrations may threaten human health and disrupt ecological security (Hossain and Patra 2020). Therefore, a better understanding of the degree of water contamination and of the potential health risk via exposure to these trace elements is necessary.

Health risk assessment is a method that links health to the degree of water pollution (Kaur et al. 2019). It can be used to quantitatively describe health risks; however, most studies have evaluated health risks by using the US-EPA model (Bodrud-Doza et al. 2019; Cai et al. 2019; Xiao et al. 2019; Zhang et al. 2018), which relies on a single point value of each trace element to estimate the risk. The results obtained using this method may overestimate or underestimate the true risks, and may foster uncertainty in the model results. Because of the complexities and uncertainties of the water environment, the process and results of health risk assessment are inherently fuzzy. Therefore, many scholars have introduced uncertainty theory into groundwater health risk assessments to address the uncertainty in their model results, and the probabilistic approach to evaluating health risks has recently become popular (Yang et al. 2018; Qiu and Gui 2019; Zhang et al. 2017). Aided by Monte Carlo simulations, a probabilistic risk analysis uses the probability distributions of the input variables to evaluate the probability distribution of the output risk. Hence, this approach, which assesses the range of possible risk values and assigns a specific probability to each, is likely to yield more accurate results from the health risk assessment model (Saha et al. 2017).

In the present work, the study area is a major mining area in the Huaibei coal basin. It is rich in coal, with 19 active coal mines, some of which have a long mining history. Investigations in recent years suggest that certain coal mines are facing water shortages and chemical contamination. The fly ash and coal cinder from coal-mining activities are laden with toxic trace elements, which easily enter groundwater via runoff and infiltration (as well as other forces) during coal washing and coal cinder stacking, ultimately polluting both surface water and shallow groundwater (Wang et al. 2019; Long and Luo 2020; Hu et al. 2021). Moreover, excessive exploitation of mid-layer groundwater causes a decline in water levels as well as an increase in the water level difference between the shallow and mid-layer groundwater bodies. Consequently, leakage recharge from shallow groundwater (SG) to mid-layer groundwater (MG) occurs where the first aquifuge is thin, and toxic trace elements produced by surface coal mining activities then enter MG, exacerbating its pollution. Taking that into consideration, this study systematically collected SG and MG samples and measured eight trace elements in them. The present research sought to investigate the degree of SG and MG contamination and to identify which pollutants were affecting the groundwater quality. In addition, this study introduced Monte Carlo simulations into the US-EPA health risk assessment model, so that its results could attain greater accuracy. The findings of this research are expected to provide a scientific basis for strengthening the environmental protection of drinking water.

2 Material and methods

2.1 Study area and its geological background

The Sulin mining area is located in northern Anhui Province, China, lying between 33° 20′–33° 42′ N and 116° 15′–117° 12′ E (Fig. 1). The study area has a relatively simple terrain, and is home to 19 coal mines. The region has a predominantly semi-humid monsoon climate. The mean temperature and rainfall are 14.6 °C and 867 mm, respectively. The Huaihe River and Tuohe River, which are medium and small seasonal rivers, are natural surface water bodies that flow through the study area (Yu et al. 2020a, b).

Fig. 1 Geographical map of the study area in China (a, b) and water sampling locations (c)

The study area is covered by a thick, loose layer of Cenozoic sediments. According to the regional stratigraphic distribution, lithologic characteristics, and the spatial distribution of the aquifers, the regional aquifer groups can be divided into those of loose rock pore, conglomerate karst, clastic rock and igneous rock fracture, and limestone karst fissure. The loose rock pore aquifer groups mainly consist of Quaternary and Neogene sand and gravel layers, which are mixed with a clay layer. The Cenozoic group contains four aquifers, namely, the first, second, third, and fourth aquifers (Qiu et al. 2019). The SG is derived from the first aquifer, which has a mean thickness of 30 m, while MG comes from the second and third aquifers. Collectively, they are the main sources of industrial, agricultural, and domestic water in the mining area.

2.2 Field sampling

A total of 52 fresh groundwater samples (including 34 SG and 18 MG samples) were taken in 2.5-L prewashed sterile polyethylene bottles from across the entire coal-mining district in July and August 2017. The SG samples were drawn from domestic wells (depth to water level less than 30 m) using a systematic random sampling technique, while the MG samples were drawn from water source wells (depth to water level of 80–130 m). The specific sampling locations appear in Fig. 1. To minimize contamination, each well was pumped for 10–15 min before sampling, and the bottles were then rinsed with sample water before being filled. A portable GPS device was used to record the coordinates of each sampling site. During their collection, all the water samples were acidified to pH < 2 using technical grade concentrated nitric acid. The sampling bottles were sealed, labeled, and taken to the laboratory for further analysis.

2.3 Chemical analysis

Trace element concentrations were measured by inductively coupled plasma-mass spectrometry (ICP-MS) (Agilent 7500cx). The concentrations of eight trace elements (Cr, Mn, Ni, Cu, Zn, Sr, Cd, and Pb) were higher than the detection limit, while the contents of other trace elements (e.g., As and Hg) were lower than the detection limit. During sample testing, instrument calibration and related checks (for example, procedural blanks, reagent blanks, and detection limits) were designed to ensure robust quality control. Recoveries of the measured trace elements were 95.2%–102.5% for spiked blanks, and 97.5%–101.5% for spiked samples with ± 5% standard deviation. All the samples were tested three times, and the averaged values were regarded as the final experimental data for each target trace element. The relative percentage differences of the experimental data remained below 5%.

2.4 Human health risk assessment methods

2.4.1 Exposure assessment

The chemical constituents in contaminated groundwater may pose a detrimental health impact to humans via ingestion, inhalation and dermal absorption (Al osman et al. 2019). For trace elements in a groundwater environment, all other exposure routes can be ignored when compared with oral ingestion. The average daily dose (ADD) (μg/(kg d)) through oral ingestion for adults was calculated using Eq. (1).

$${\text{ADD}} = \frac{{C_{\text{water}} \times {\text{IR}} \times {\text{EF}} \times {\text{ED}}}}{{{\text{BW}} \times {\text{AT}}}}$$
(1)

Its model parameters are given in Table 1.

Table 1 Model parameters and their values
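As a minimal illustration of Eq. (1), the following Python sketch computes the ADD for a single concentration. The parameter meanings follow the usual US-EPA convention (IR: ingestion rate, EF: exposure frequency, ED: exposure duration, BW: body weight, AT: averaging time); the function name and all numerical values are hypothetical placeholders and are not the values of Table 1.

```python
def average_daily_dose(c_water, ir, ef, ed, bw, at):
    """Average daily dose in ug/(kg d), following Eq. (1).

    c_water: concentration in water, ug/L
    ir: ingestion rate, L/d
    ef: exposure frequency, d/yr
    ed: exposure duration, yr
    bw: body weight, kg
    at: averaging time, d
    """
    return (c_water * ir * ef * ed) / (bw * at)


# Hypothetical adult parameters (placeholders, NOT the Table 1 values).
params = dict(ir=2.0, ef=365.0, ed=30.0, bw=60.0, at=365.0 * 30.0)

# Hypothetical concentration of 1500 ug/L for one trace element.
add_example = average_daily_dose(c_water=1500.0, **params)
print(f"ADD = {add_example:.1f} ug/(kg d)")
```

With these placeholder values, EF × ED equals AT, so the ADD reduces to C_water × IR / BW.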

2.4.2 Non-carcinogenic risk assessment

The non-cancer risks in a specific pathway posed by each detected trace element can be conveyed as a hazard quotient (HQ). The HQ was calculated as:

$$\text{HQ} = \frac{{\text{ADD}}}{{\text{RfD}}}$$
(2)

where, RfD is the reference dose (μg/(kg d)). For each trace element, its RfD values (Table 2) were gathered from the US-EPA’s integrated risk information system (IRIS) database.

Table 2 Values of RfD and SF

To evaluate the total non-cancer hazards from oral ingestion, we introduced the hazard index (HI), which is the sum of the HQs of all eight detected trace elements.

$$\text{HI} = \sum\limits_{{i = 1}}^{n} {\text{HQ}_{i} }$$
(3)

An HI of < 1 indicates an acceptable level of risk.
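Equations (2) and (3) can be sketched in a few lines of Python. The element names are those measured in this study, but the ADD and RfD numbers below are illustrative placeholders only; they are not the Table 2 values or measured results.

```python
def hazard_quotient(add, rfd):
    """HQ = ADD / RfD (Eq. 2); both arguments in ug/(kg d)."""
    return add / rfd


def hazard_index(adds, rfds):
    """HI = sum of HQ_i over all elements (Eq. 3)."""
    return sum(hazard_quotient(adds[name], rfds[name]) for name in adds)


# Placeholder doses and reference doses in ug/(kg d), for illustration only.
adds = {"Sr": 30.0, "Mn": 2.8, "Cr": 0.03}
rfds = {"Sr": 600.0, "Mn": 140.0, "Cr": 3.0}

hi = hazard_index(adds, rfds)
print(f"HI = {hi:.3f}, acceptable: {hi < 1.0}")
```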

2.4.3 Carcinogenic risk assessment

Carcinogenic risks (CR) caused by carcinogens via the oral ingestion exposure pathway were quantified using Eq. (4).

$$\text{CR} = \text{ADD} \times \text{SF}$$
(4)

where, SF denotes the carcinogenic slope factor of a given pollutant, expressed in (mg/(kg d))⁻¹. The values used for SF are in Table 2. If several carcinogens are present in the groundwater environment at the same time, the carcinogenic risks caused by all carcinogens are summed.

According to IRIS, both Cr and Cd are regarded as carcinogens, and all detected trace elements have non-carcinogenic risks. A carcinogenic risk value below 1 × 10⁻⁶ is considered acceptable.
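The carcinogenic risk calculation of Eq. (4) and the summation over carcinogens can be sketched as follows. The sketch assumes that ADD is given in μg/(kg d) and that SF is expressed per mg/(kg d), so a factor of 1000 is applied to make CR dimensionless; this unit handling is an assumption of the example. All numbers are hypothetical placeholders rather than the Table 2 values.

```python
UG_PER_MG = 1000.0      # unit conversion, assuming ADD is in ug/(kg d)
ACCEPTABLE_CR = 1e-6    # acceptable level stated in the text


def carcinogenic_risk(add_ug, sf_per_mg):
    """CR = ADD x SF (Eq. 4), with ADD converted from ug to mg."""
    return (add_ug / UG_PER_MG) * sf_per_mg


def total_carcinogenic_risk(adds_ug, sfs_per_mg):
    """Sum the risks of all carcinogens present at the same time."""
    return sum(carcinogenic_risk(adds_ug[k], sfs_per_mg[k]) for k in adds_ug)


# Placeholder doses (ug/(kg d)) and slope factors ((mg/(kg d))^-1).
adds_ug = {"Cr": 0.001, "Cd": 0.0001}
sfs_per_mg = {"Cr": 0.5, "Cd": 6.1}

total_cr = total_carcinogenic_risk(adds_ug, sfs_per_mg)
print(f"Total CR = {total_cr:.2e}, acceptable: {total_cr < ACCEPTABLE_CR}")
```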

2.5 Monte Carlo simulation

The traditional deterministic risk assessment method only involves the application of a simple formula, without considering the random variation of concentrations (Qiu and Gui 2019). This study introduced a Monte Carlo simulation, which is well suited to modeling the probability of different outcomes that are otherwise difficult to predict because of the influence of random variables. This method was applied in the human health risk assessment to better understand the impact of risk and to reduce the randomness and uncertainty in prediction. Furthermore, a follow-up sensitivity analysis was carried out to identify the most influential variables (Hossain and Patra 2020). The Monte Carlo simulation draws a series of random values for each input variable and runs the model for 10,000 iterations to obtain the probability distributions at different confidence levels (1%–99%); it was implemented in Crystal Ball software. The fitted probability distributions and Anderson–Darling (AD) statistic values of the detected trace elements in SG and MG are listed in Table 3.

Table 3 Fitted distribution of elements concentration for groundwater and goodness of fit (GoF) test outcome
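The simulation procedure described above can be approximated outside Crystal Ball with a short Python sketch. It assumes, purely for illustration, that a concentration follows a lognormal distribution; the distribution parameters, exposure parameters, and RfD are hypothetical and do not reproduce Table 1, Table 2, or Table 3. Only the number of iterations (10,000) is taken from the text.

```python
import random
import statistics


def simulate_hq(mu_log, sigma_log, rfd, n_iter=10_000, seed=42,
                ir=2.0, ef=365.0, ed=30.0, bw=60.0, at=10_950.0):
    """Monte Carlo sketch: sample a concentration, propagate it to HQ.

    mu_log, sigma_log: parameters of the assumed lognormal concentration
    distribution (ug/L on the natural-log scale). All defaults are
    placeholders, not values from the study.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible run
    hqs = []
    for _ in range(n_iter):
        conc = rng.lognormvariate(mu_log, sigma_log)
        add = conc * ir * ef * ed / (bw * at)   # Eq. (1)
        hqs.append(add / rfd)                   # Eq. (2)
    return hqs


hq_samples = simulate_hq(mu_log=7.0, sigma_log=0.5, rfd=600.0)

# statistics.quantiles with n=100 returns the 99 percentile cut points.
cuts = statistics.quantiles(hq_samples, n=100)
p5, p95 = cuts[4], cuts[94]
mean_hq = statistics.fmean(hq_samples)
print(f"5th: {p5:.4f}  mean: {mean_hq:.4f}  95th: {p95:.4f}")
```

Reporting the 5th percentile, mean, and 95th percentile summarizes the output distribution without relying on the simulated minimum and maximum, which are sensitive to outliers.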

3 Results and discussion

3.1 Pollution of groundwater by trace elements

The results of the trace elements’ analysis are shown in Table 4. The highest contents found were those of Sr and Mn, while the lowest were those of Cd and Pb. According to the standard for groundwater quality of China (GB/T 14848–2017), and drinking water standards of the World Health Organization (WHO) (WHO 2011), the concentrations of Sr and Mn found here were higher than their permissible concentration, while the contents of other trace elements occurred at relatively low levels. Therefore, it can be inferred that the groundwater was polluted with Sr and Mn to different degrees.

Table 4 Composition of trace elements in the two groundwater types

According to the statistics, 28.13% of SG samples had an Mn content above the standard values set by China and WHO, whereas 61.11% of the MG samples had an Mn content exceeding the standard values. Related research on shallow groundwater in the study area reported that 41.18% of the samples exceeded the standard values for Mn (Qiu et al. 2018). The high concentration of Mn is likely due to the regional stratigraphic lithology. The loose layer of the Sulin coal-mining district is mainly composed of clay, silt–clay, and silt-loam. In these types of soil, the content of clay minerals (such as montmorillonite and illite) is relatively high, which enables them to adsorb Mn. Water–rock interactions can augment the Mn content of groundwater (Zhang et al. 2012). Table 4 also shows that the Sr content of both SG and MG samples was above the minimum reporting level (MRL) of 0.3 μg/L. Nearly half (46.88%) of the SG samples had an Sr content above the health reference level (HRL) of 1500 μg/L (USEPA 2015), whereas 22.22% of the MG samples had Sr contents that surpassed the HRL. These findings are in line with a previous study (Fang et al. 2020). There are two sources of Sr: one is the primitive Sr that has existed since the formation of the earth, and the other is the ⁸⁷Sr of radioactive origin. The loose layer in the Sulin mining area is composed of weathered products of silicate, carbonate, and sulfate rocks, all of which are enriched with Sr, and this Sr is able to enter SG and MG via water–rock interactions (Gui and Chen 2016).
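The exceedance percentages quoted above are simple counts of samples above a guideline value. A minimal sketch, using the HRL of 1500 μg/L from the text and a hypothetical list of Sr concentrations (not the measured data of Table 4), is:

```python
SR_HRL_UG_L = 1500.0  # health reference level for Sr cited in the text


def percent_exceeding(concentrations, limit):
    """Percentage of samples whose concentration is above the limit."""
    n_over = sum(1 for c in concentrations if c > limit)
    return 100.0 * n_over / len(concentrations)


# Hypothetical Sr concentrations in ug/L, for illustration only.
sr_samples = [900.0, 1600.0, 2100.0, 1200.0]
print(f"{percent_exceeding(sr_samples, SR_HRL_UG_L):.2f}% above the HRL")
```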

3.2 Human health risk assessment

The health risks caused by each analyzed trace element were calculated using Eqs. (1)–(4). In the probabilistic risk analysis model, outliers could drive deviations in the maximum and minimum model values; therefore, it has been suggested that the 5th percentile (5%) and 95th percentile (95%) be viewed as the low- and high-end estimates, respectively, for the probabilistic risk calculations (Saha et al. 2017).

3.2.1 Hazard quotient (HQ)

The HQs of all the analyzed trace elements for adults through ingestion exposure were calculated using Monte Carlo simulations. At 95% confidence level, probabilistic estimates of HQ for SG and MG via ingestion by an adult are graphically displayed in Figs. 2 and 3, respectively.

Fig. 2 Probabilistic estimates of HQ for SG

Fig. 3 Probabilistic estimates of HQ for MG

These figures show the estimated 5%, mean, and 95% values of the HQs related to each analyzed trace element through oral ingestion of SG and MG, respectively. The mean health risk values of the eight trace elements in SG were ranked as follows: Sr > Mn > Cd > Cr > Ni > Pb > Cu > Zn; those in MG were ranked thus: Sr > Mn > Cr > Ni > Zn > Cd > Cu > Pb. In both cases, the highest risk values posed by a trace element were observed for Sr. As noted for the regional stratigraphy, the loose layer of this area is composed of weathered products of silicate, carbonate, and sulfate rocks that are enriched with Sr (Gui and Chen 2016), which resulted in the high pollution risk level caused by Sr.

Generally, an HQ > 1 would indicate that an adverse effect caused by that trace element on human health is likely to occur. For both SG and MG, the mean and 95th percentile HQ values of the eight trace elements were below the benchmark value of 1, as shown in Figs. 2 and 3. Therefore, non-carcinogenic health risks due to Cr, Mn, Ni, Cu, Zn, Sr, Cd, and Pb exposure were characterized as insignificant.

3.2.2 Hazard index (HI)

Probabilistic estimates of HI for the detected trace elements in SG and MG through ingestion exposure appear in Fig. 4. For both SG and MG ingestion, the mean and 95th percentile values of the HIs were also well below the safety level of 1, suggesting no appreciable non-carcinogenic health risk to the local populace from groundwater ingestion in the Sulin coal-mining district. For the population exposed to SG and MG, Sr contributed the most risk; hence, this trace element warrants special attention and further investigation.

Fig. 4 Probabilistic estimates of HI: a HI for SG, b HI for MG

3.2.3 Carcinogenic risk

The carcinogenic risks to adults from ingestion exposure to SG and MG were calculated for Cr and Cd using Monte Carlo simulations. At the 95% confidence level, the probability distributions of the health risk values caused by carcinogens are graphically displayed in Figs. 5 and 6. These results suggest that, in the study area, people who consumed SG contaminated with Cr and Cd had carcinogenic risk values varying from 7.70 × 10⁻⁸ to 4.78 × 10⁻⁷ and from 7.61 × 10⁻⁸ to 1.20 × 10⁻⁷, respectively; for MG, the corresponding values varied from 1.63 × 10⁻⁷ to 1.89 × 10⁻⁶ and from 2.90 × 10⁻⁸ to 1.27 × 10⁻⁷, respectively. The potential carcinogenic risk values for Cr and Cd from SG were below 1 × 10⁻⁶, whereas for MG the upper end of the Cr range exceeded the safety level of 1 × 10⁻⁶. Therefore, this study identified Cr as a priority pollutant. The study area lies within a coal mining area, and previous studies have shown that such high Cr contents are mainly caused by diffuse coal dust infiltration during coal mining (Yang and Xiong 2018). Therefore, the high pollution risk level caused by Cr might originate from long-term coal mining activities. This interpretation agrees with the investigations of Fang et al. (2020) and Lin et al. (2017).

Fig. 5 Probabilistic estimates of cancer risk for SG through ingestion by adults

Fig. 6 Probabilistic estimates of cancer risk for MG through ingestion by adults

Considering the two trace elements Cr and Cd, the total cancer risks through ingestion exposure to SG and MG were estimated and are displayed in Figs. 7 and 8. Probabilistic estimates of the total cancer risk values for SG ranged from 9.90 × 10⁻⁸ to 5.42 × 10⁻⁷, with a mean value of 2.29 × 10⁻⁷, while for MG they varied from 2.09 × 10⁻⁷ to 1.93 × 10⁻⁶, with a mean value of 8.41 × 10⁻⁷. Although the 95th percentile total cancer risk value for MG exceeded 1 × 10⁻⁶, the probabilistic mean total cancer risk was below the limit of 1 × 10⁻⁶. According to the derived cumulative probability distribution curves (Figs. 7b, 8b), the probabilistic simulation showed that 0% of the total cancer risk values for SG and 29.39% of the total cancer risk values for MG were higher than 1 × 10⁻⁶; this implied that MG might have a low level of contamination and that the local population can safely drink that groundwater. For the population exposed to SG and MG, Cr contributed more to the total cancer risk than did Cd; therefore, more attention should be paid to chromium pollution in the local groundwater.
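The share of simulated values above the acceptable level can be read from the cumulative probability curve or computed directly from the Monte Carlo output. The sketch below illustrates the calculation with hypothetical lognormal risk distributions for Cr and Cd; the distribution parameters are placeholders and are not fitted to the data of this study, so the printed percentage does not reproduce the reported values.

```python
import math
import random

ACCEPTABLE_CR = 1e-6  # acceptable level stated in the text


def simulate_total_cr(n_iter=10_000, seed=1):
    """Sum of two simulated carcinogenic risks per iteration.

    The lognormal parameters are illustrative assumptions only.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_iter):
        cr_chromium = rng.lognormvariate(math.log(6e-7), 0.7)
        cr_cadmium = rng.lognormvariate(math.log(7e-8), 0.4)
        totals.append(cr_chromium + cr_cadmium)
    return totals


def exceedance_probability(samples, threshold):
    """Fraction of samples above the threshold (1 minus the empirical CDF)."""
    return sum(1 for s in samples if s > threshold) / len(samples)


total_cr_samples = simulate_total_cr()
p_exceed = exceedance_probability(total_cr_samples, ACCEPTABLE_CR)
print(f"{100 * p_exceed:.2f}% of simulated total CR values exceed 1e-6")
```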

Fig. 7 Probability distribution: a Probability distribution of total cancer risks for SG. b Cumulative probability distribution of total cancer risks for SG

Fig. 8 Probability distributions: a Probability distribution of total cancer risks for MG. b Cumulative probability distribution of total cancer risks for MG

3.2.4 Sensitivity analysis

With the help of Oracle Crystal Ball software, a sensitivity analysis (SA) was used to determine the most significant input variables that influenced the results of our risk estimation (Li et al. 2018). Figure 9 displays the SA results for non-carcinogenic health risks, for which Sr and Mn were evidently the two most influential variables for both SG and MG. The SA results for carcinogenic health risks appear in Fig. 10, which shows that Cr was the main variable affecting the total cancer risk. Although Monte Carlo simulation can quantify the uncertainty of human health risk, not all uncertainties in health risk assessment could be accounted for. Accordingly, future research should focus on reducing the uncertainty of the risk estimates.
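One common way to carry out such a sensitivity analysis is to compute the Spearman rank correlation between each simulated input and the simulated output, and then normalize the squared coefficients into contributions to variance. The sketch below illustrates that approach; the text does not state that this is the exact computation performed by the software, and the input distributions are hypothetical placeholders. The simple ranking function assumes no tied values, which holds for continuous random draws.

```python
import math
import random


def ranks(values):
    """Rank positions (1 = smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for tie-free data of equal length."""
    rx, ry = ranks(x), ranks(y)
    mean_rank = (len(x) + 1) / 2
    cov = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(rx, ry))
    var = sum((a - mean_rank) ** 2 for a in rx)  # same for rx and ry
    return cov / var


rng = random.Random(7)
n_iter = 2000
# Hypothetical HQ distributions for three elements (placeholders).
inputs = {
    "Sr": [rng.lognormvariate(math.log(0.05), 0.6) for _ in range(n_iter)],
    "Mn": [rng.lognormvariate(math.log(0.02), 0.6) for _ in range(n_iter)],
    "Zn": [rng.lognormvariate(math.log(0.001), 0.6) for _ in range(n_iter)],
}
hi_samples = [sum(vals) for vals in zip(*inputs.values())]  # HI per iteration

r_squared = {k: spearman(v, hi_samples) ** 2 for k, v in inputs.items()}
total = sum(r_squared.values())
contribution = {k: v / total for k, v in r_squared.items()}
for name, share in sorted(contribution.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {100 * share:.1f}% of variance")
```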

Fig. 9 The SA results of total non-carcinogenic risks: a SG, b MG

Fig. 10 The SA results of total carcinogenic risks: a SG, b MG

4 Conclusions

This research collected 52 groundwater samples and investigated the concentrations of eight trace elements in them, along with the associated health risks. The results revealed that 28.13% of SG samples and 61.11% of MG samples had Mn contents above the standard guideline values prescribed by WHO and China, while 46.88% of SG samples and 22.22% of MG samples had an Sr content higher than the HRL. The results of the health risk assessment suggested that the mean and 95th percentile values of HQ and HI associated with each target trace element in SG and MG, through ingestion exposure for adults, were all within the limit of 1. Moreover, the probabilistically estimated 95th percentile value of total cancer risk for MG exceeded 1 × 10⁻⁶, but the mean total cancer risk values for SG and MG were below the benchmark. We may conclude that the non-carcinogenic and carcinogenic risks from SG and MG are not serious in the studied region. The sensitivity analysis results indicated that the Cr and Sr contents were the most influential factors affecting the outcome of the probabilistic risk estimation model.

Although Monte Carlo simulations can quantify the uncertainty of human health risk, other uncertainties persist in health risk assessment. Therefore, a more accurate and precise evaluation ought to be carried out by considering additional factors. Despite those uncertainties, the research findings here are useful for formulating a management plan for proper utilization of groundwater resource in the studied area. Moreover, the methods and techniques used in this study are helpful for exploring the risk management of groundwater toxicity for human health in other regions.