Introduction

Trace element contamination has received considerable attention because of its negative impact on human health and the environment (Adriano 2001). Both natural and anthropogenic inputs enrich the soil with trace elements. Pollution occurs when the quantity of an element exceeds its background concentration (Kabata-Pendias 2010). Elements that come from anthropogenic sources are generally more bioavailable than pedogenic and lithogenic ones (Kabata-Pendias 1993; Kobierski and Dabkowska-Naskret 2012). Pedogenic metals are of lithogenic and anthropogenic origin, but their distribution in soil profiles changes due to mineral transformation and other pedogenic processes (Kobierski and Dabkowska-Naskret 2012). Understanding the sources and distribution of pollutants is among the most critical concerns for environmental management and decision-making (Sun et al. 2013). Many researchers have recorded soil pollution with trace elements in many parts of Egypt: Aswan (Darwish and Pöllmann 2015), Assiut (Asmoay 2017), Helwan (Said 2015), Kafr El-Sheikh (Naggar et al. 2014), and Sohag (Salman 2013; Salman et al. 2017). Salman et al. (2017) pointed out the accumulation of trace elements in the food chain (Egyptian clover) in Sohag. The presence of such metals in the food chain can have an adverse impact on human beings (Karim et al. 2015).

Many researchers have used statistical analyses as powerful tools in geo-environmental studies (Zhang et al. 2009; Sundaray et al. 2011; Ming-Kai et al. 2013; Kelepertzis 2014; Simu et al. 2016; Guo et al. 2017). Statistical analysis is useful for assessing the possible sources of pollutants because it allows cause-and-effect relationships to be considered and highlights exceedances. Contamination indices also help in understanding ecological status. The contamination factor (CF) is employed to evaluate the level of soil contamination and to distinguish anthropogenic inputs from natural ones. In the present study, we applied both statistical analysis and the contamination factor to distinguish anthropogenic sources from natural ones and to evaluate the level of soil contamination.

Materials and methods

Gerga district is one of the most important agricultural areas in Sohag governorate, Egypt. It extends between longitudes 31° 46′–31° 55′ E and latitudes 26° 12′–26° 22′ N (Fig. 1). Geologically, it consists of Quaternary deposits and floodplain sediments of the River Nile (Said 1990). The district contains one of the biggest sugar factories in Egypt; this industry has been reported as a source of soil pollution with different chemicals (Zaki et al. 2015). Due to soil degradation since the construction of the High Dam in 1968, more fertilizer has been applied to restore soil fertility. P and N fertilizers are the main agrochemicals used in the Gerga area. Roadways cross the cultivated lands of the study area. Traffic emissions, together with the application of fertilizers, lead to the accumulation of trace elements, which threatens agricultural soils and, in turn, human health.

Fig. 1 Map of the study area showing soil sampling sites

Sixteen agricultural soil samples (0–30-cm depth) were collected randomly where sites were accessible (Fig. 1). ArcGIS 10.2 (Desktop 2014) was used to prepare the sample map. Total trace elements were determined by digestion with a 3:1 HCl:HNO3 mixture followed by analysis with an atomic absorption spectrophotometer (Buck Scientific 205AA). Soil pH was measured in a 1:1 soil-to-water suspension using a HANNA (HI93300) combined electrode. Calcium carbonate percentage (CaCO3%) and phosphorus (P) were estimated by titrimetric and colorimetric methods, respectively. Soil organic matter percentage (SOM %) was determined according to the modified Walkley and Black method (USDA 2004).

The contamination level was assessed using the contamination factor (CF) proposed by Hakanson (1980), calculated as follows:

$$ \mathrm{CF}={C}_{\mathrm{s}}/{C}_{\mathrm{b}} $$

where Cs is the concentration of the metal in the study samples and Cb is the baseline concentration. The baseline concentrations reported by Turekian and Wedepohl (1961) were used as Cb in this study (Mn = 850 ppm, Co = 19 ppm, Ni = 68 ppm, and Pb = 20 ppm). Hakanson (1980) classified the contamination factor as follows: CF < 1, low; 1 ≤ CF < 3, moderate; 3 ≤ CF < 6, considerable; and CF ≥ 6, high contamination.
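The CF calculation and its classification can be illustrated with a minimal Python sketch. The function names are hypothetical and the assignment of boundary values (CF exactly 1, 3, or 6) to the higher class is an assumption; the baseline values are those quoted above, and the input concentrations are the element values reported in the Results (Co 12.6, Ni 46.8, Pb 11.9, and Mn 990.6 ppm).

```python
# Baseline (Cb) values in ppm, as quoted in the text from Turekian and Wedepohl (1961).
BASELINE_PPM = {"Mn": 850.0, "Co": 19.0, "Ni": 68.0, "Pb": 20.0}


def contamination_factor(element, concentration_ppm):
    """Return CF = Cs / Cb for one element."""
    return concentration_ppm / BASELINE_PPM[element]


def hakanson_class(cf):
    """Map a CF value to a contamination class.

    Assumption: a value lying exactly on a boundary goes to the higher class.
    """
    if cf < 1:
        return "low"
    if cf < 3:
        return "moderate"
    if cf < 6:
        return "considerable"
    return "high"


# Illustrative input: element values reported in the Results section (ppm).
reported_ppm = {"Co": 12.6, "Ni": 46.8, "Pb": 11.9, "Mn": 990.6}

results = {el: contamination_factor(el, c) for el, c in reported_ppm.items()}
for el, cf in results.items():
    print(f"{el}: CF = {cf:.2f} ({hakanson_class(cf)})")
```

With these inputs, Co, Ni, and Pb fall in the low class and Mn falls in the moderate class, in line with the classification reported in the Results.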

Although the number of samples is relatively low, cluster analysis (CA) and principal component analysis (PCA) were conducted to take into account the complicated environmental situation of the study area, where several sources of contamination and several processes influence the occurrence of trace elements in the investigated soil. The minimum sample size recommended for conducting principal component analysis is debated in the literature. Generally, as the sample size increases, sampling error is reduced. A survey of the literature on the minimum sample size used in principal component studies shows a wide range, from 2 or fewer to 20 times the number of variables (Lingard and Rowlinson 2006). However, it is fair to say that no absolute rules can exist (Lingard and Rowlinson 2006). MacCallum et al. (1999) report that when data are strong, the impact of sample size is greatly reduced. Strong data are data in which item communalities are consistently high, factors exhibit high loadings (≥ 0.8) on a substantial number of items (at least three or four), and the number of factors is small. According to Guadagnoli and Velicer (1988), if components possess four or more variables with loadings above 0.60, the pattern may be interpreted whatever the sample size used. In short, with sufficiently high loadings, the pattern can be interpreted regardless of sample size. The present data possess seven variables with loadings of 0.712. The statistical analysis was performed using SPSS 16.0 software. Descriptive statistical analysis (minimum, maximum, mean, and coefficient of variation) of the soil physicochemical characteristics and element contents was performed as a first step towards an initial understanding of their distribution. Multivariate statistics, including principal component analysis (PCA) and cluster analysis (CA), were then applied in an attempt to identify the common sources of the trace elements in the studied soil.
PCA was implemented with varimax rotation, which concentrates the variables into fewer high-loading components and facilitates their interpretation (Chen et al. 2008). The Kaiser-Meyer-Olkin measure of sampling adequacy (KMO MSA) for the set of variables included in the analysis was 0.712, which exceeds the minimum requirement of 0.50 for overall MSA, and the significance of Bartlett’s test of sphericity (0.00) was below the level of significance (Tabachnick and Fidell 2007). CA was developed based on the Ward method using squared Euclidean distances on z-transformed data as a measure of similarity between samples based on their element content (Co, Ni, Pb, Mn, and P). The clustering results are presented as a hierarchical dendrogram.
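The clustering step (z-transformation followed by Ward agglomeration on squared Euclidean distances) can be sketched in plain Python as follows. This is an illustrative, naive implementation and not the SPSS procedure used in the study; the function names are hypothetical, the population standard deviation is assumed for the z-transformation, and the sample values are invented for demonstration only.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) standard deviation.

    Assumes no column is constant (a constant column would divide by zero).
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def centroid(points):
    """Return the coordinate-wise mean of a list of points."""
    return [mean(dim) for dim in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(rows, n_clusters):
    """Merge samples with Ward's criterion until n_clusters remain.

    Returns a sorted list of clusters, each a sorted list of row indices.
    """
    z = zscore_columns(rows)
    clusters = [[i] for i in range(len(z))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(
                    [z[k] for k in clusters[i]],
                    [z[k] for k in clusters[j]],
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return sorted(clusters)


# Hypothetical values (NOT data from this study): columns are Co, Ni, Pb in ppm.
samples = [
    [25.0, 70.0, 22.0],
    [27.0, 75.0, 25.0],
    [24.0, 68.0, 21.0],
    [6.0, 30.0, 5.0],
    [5.0, 28.0, 6.0],
    [7.0, 33.0, 4.0],
]
groups = ward_clusters(samples, n_clusters=2)
print(groups)
```

Standardizing first keeps elements with large absolute concentrations (such as Mn) from dominating the distance measure, which is presumably why the z-transformation precedes the clustering.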

Results

Table 1 presents the statistical summary of the analytical data. The mean sand, silt, clay, pH, CaCO3, OM, P, Co, Ni, Pb, and Mn values were 68.8%, 12.4%, 18.9%, 8.5, 2.8%, 2.1%, 0.3%, 12.6 ppm, 46.8 ppm, 11.9 ppm, and 990.6 ppm, respectively. P, Co, Ni, and Pb showed marked fluctuation (C.V = 92.7%, 82.9%, 52.8%, and 92.2%, respectively), whereas Mn exhibited a uniform distribution pattern (C.V = 15.4%). The calculated CF values for Co, Ni, Pb, and Mn were around 0.7, 0.7, 0.6, and 1.2, respectively (Table 2). Overall, the studied samples showed low contamination with Co, Ni, and Pb and moderate contamination with Mn according to CF.
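For readers unfamiliar with the coefficient of variation, the short sketch below shows how it separates a uniform distribution from a strongly fluctuating one. It assumes the sample standard deviation (the text does not state which form was used), and the concentrations are invented for illustration only.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return C.V in percent: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical concentrations in ppm (NOT data from this study).
uniform_like = [950.0, 1000.0, 1050.0, 980.0, 1020.0]
patchy_like = [2.0, 30.0, 5.0, 18.0, 4.0]

cv_uniform = coefficient_of_variation(uniform_like)
cv_patchy = coefficient_of_variation(patchy_like)
print(f"uniform-like: {cv_uniform:.1f}%  patchy-like: {cv_patchy:.1f}%")
```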

Table 1 Descriptive statistical analysis of the studied soil data
Table 2 Contamination factor (CF) of the studied elements (italic numbers refer to the moderate contaminated samples)

Since the studied soil was found to be relatively enriched in some trace elements compared to the values reported by Turekian and Wedepohl (1961), PCA was performed to identify natural and anthropogenic sources. Two principal components were extracted from the investigated soil data, explaining 75.94% of the variance (Table 3). The first component (PC1) explains 48.74% of the variance, the largest share in the dataset, and includes the elements with high coefficients of variation (Co, Ni, Pb, and P). The second component (PC2) is responsible for 27.16% of the total variance and shows significant positive loadings for Mn, carbonate, and pH.

Table 3 Principal component loadings of soil data including variance % and cumulative %

Discussion

The texture of the studied soil was found to be muddy sand. Carbonate and sand content increased westwards, near the limestone plateau. The soil contains low OM% as a result of the aridity of the study area and agricultural practices (seasonal tilling) that oxidize the organic matter. The calculated contamination factor (CF) indicated that the investigated soil ranged from uncontaminated to moderately contaminated with the studied metals (Table 2). The moderately contaminated sites (2, 3, 4, 13, and 16) were the most affected by anthropogenic inputs. Such an anthropogenic fraction can cause serious environmental hazards. At present, the low-contamination sites are not a serious environmental concern. Mn is derived mainly from the parent rocks of the Ethiopian plateau (Omer 1996).

The statistical analyses indicated the presence of two sources of metals in the studied soils: natural and anthropogenic. The high variance of the elements loaded in PC1, together with the presence of Pb as an anthropogenic marker, points to an anthropogenic source for PC1. Pb is usually regarded as a marker element of traffic activities; it originates from leaded gasoline (Elnazer et al. 2015; Wang et al. 2017). Co comes from tires, and Ni results from brake wear, engine oil leakage, tire wear, and road abrasion (Winther and Slentø 2010). In addition, the high positive loadings of Co, Ni, and Pb together with P imply a role for fertilization in the distribution of these elements in the soil. The field investigation indicated the uncontrolled application of NP fertilizers, herbicides, and pesticides in the study area. These substances are considered the principal source of trace elements worldwide. The P fertilizers applied in the study area contain about 74.9 and 15.4 ppm of Pb and Co, respectively (Salman et al. 2017). On the other hand, the loading of Mn (with its uniform distribution pattern), CaCO3, and pH on PC2 indicates that this component is of natural origin and influenced by pedogenic factors. Organic matter content and texture had no significant influence on trace element distribution.

The hierarchical cluster analysis divided the studied samples into two groups (A and B) based on the levels and sources of contamination (Fig. 2). Group A includes samples 2, 3, 4, 7, 8, 9, 13, and 16 (Fig. 2). These samples had relatively higher contamination levels with the studied elements (Table 2), with P content averaging 0.32%. Group B includes the remaining samples, which have lower trace element levels and a lower P content averaging 0.19%. The link between P content and element levels indicates the influence of fertilization. The role of traffic activities in the pollution of soil in the study area is evidenced by the occurrence of contaminated sites adjacent to the roadsides (group A samples). Sample no. 4 in group A is located near a residential area and is affected by domestic activities rather than road traffic emissions (Fig. 1). This is supported by the position of sample no. 4 in the dendrogram, far from the remaining samples controlled by traffic emissions (Fig. 2). The hierarchical cluster analysis confirms the previously suggested anthropogenic factor (PC1).

Fig. 2 Dendrogram providing a graphic summary of the clustering processes

Conclusion

As in all Egyptian cities, roadways pass through the farmland of the Gerga area, Upper Egypt, and large amounts of agrochemicals are applied to restore soil fertility. Such activities have led to the accumulation of trace elements and threaten the agricultural soils. The studied soil varied from uncontaminated to moderately contaminated with trace elements. Soils with low contamination are not currently a serious environmental concern. Moderate contamination indicates the presence of a small non-residual fraction that tends to become bioavailable. Mn was statistically shown to be of natural origin and affected by pedogenic processes, as indicated by its homogeneous distribution and high loading with pH. On the other hand, traffic emissions and phosphate fertilizers were identified as important sources of Pb, Co, and Ni in the soil. This was evidenced by the occurrence of contaminated sites adjacent to the roadsides and by their relatively higher phosphorus content. Hence, periodic environmental monitoring is recommended; planting of crops sensitive to Pb, Co, and Ni must be avoided; and the fertilization rate should be minimized.