Introduction

In order to offset difficult economic conditions, forests and their understories were cleared and used as fuelwood in rural areas of China (Wang et al. 2012). Long-term acquirements for fuelwood in underdeveloped areas have resulted in vegetation degradation, as well as other environmental issues, such as greenhouse effect and soil erosion (Heltberg et al. 2000; Foell et al. 2011; Wang et al. 2012). In addition, anthropogenic activities in these areas have caused the degradation of terrestrial ecosystems (Bai et al. 2014). In most cases, these anthropogenic activities have profound effects on soil properties, especially on soil organic carbon (SOC) and total nitrogen (TN) (Xin et al. 2016; Zhang et al. 2016; Yuan et al. 2017; Wang et al. 2018).

Soil organic carbon is considered as an important indicator to assess soil fertility and quality, because of its significant influences on soil physicochemical properties (Sullivan et al. 2005; Wang et al. 2015). Furthermore, SOC plays a major role in the global carbon cycle (Mirzaee et al. 2016; Xin et al. 2016; Chu et al. 2018). As a key determinant of soil fertility and quality, TN also plays a vital part in generating and maintaining soil productivity in agricultural ecosystems (Wang et al. 2015; Zhang et al. 2016; Ma et al. 2018). Therefore, knowledge of the spatial variability in SOC and TN is indispensable to evaluate ecosystem productivity (Córdova et al. 2012).

Studies on spatial variability of SOC and TN have been one of the hotspot in soil ecology in the past decades (Bonmati et al. 1991; Nyamadzawo et al. 2008; Liu et al. 2014; Wang et al. 2015; Xiong et al. 2015). Different interpolation techniques, such as ordinary kriging, cokriging, and inverse distance weighting have been used for predicting the spatial distribution of soil properties (Guan et al. 2017; Ghorbanzadeh et al. 2019). Compared to other interpolation methods (e.g., cokriging), ordinary kriging is easy to conduct without additional variables, which lead to the reduction of work and does not decrease the accuracy (Tang et al. 2016). Additionally, ordinary kriging presents advantages in mapping the spatial distribution of soil properties compared to other interpolation techniques, as it is the most effective linear estimator which can provide the best and linear unbiased estimates (Yang et al. 2016). Therefore, ordinary kriging has been widely applied into the spatial variability of SOC and TN. For example, Elbasiouny et al. (2014) reported the spatial variability of SOC and TN pools in the northern of Nile delta using ordinary kriging. Wang et al. (2015) characterized the spatial patterns of SOC and TN within 0–80 cm for minesoils in the Loess Plateau area with ordinary kriging. Yang et al. (2016) analyzed the spatial distribution of SOC and TN for farmland ecosystem in northern China using ordinary kriging.

In China, a growing number of studies have been conducted in different areas on the spatial variability of soil properties, including Loess Plateau (Wang et al. 2015), Tibetan Plateau (Dai et al. 2014), and the Three Gorges Reservoir area (Teng et al. 2017). Nevertheless, few studies were carried out in red soil region, particularly in red soil region of southern China (Mirzaee et al. 2016). Hetian Town, one of the most degraded areas in this region, has experienced severe soil erosion from intensive and extensive human activities (e.g., deforestation and farming). To mitigate soil and water loss, the local government has implemented a series of ecological restoration measures (e.g., closed forest measures), leading to a significant change in soil nutrients. Nevertheless, there is a general lack of the knowledge on the variations of spatial pattern for SOC and TN in response to ecological restoration. Additionally, our understanding on the variations of spatial pattern for soil properties with the change of sampling size is limited. Thus, the aims of this study were to: (1) investigate SOC and TN status at three soil depths (i.e., 0–20, 20–40, and 40–60 cm) in the hilly red soil of southern China; (2) build the spatial distribution maps of SOC and TN by means of geostatistical methods based on 59 analyzed soil sample; and (3) determine if this samples number is reasonable to monitoring points for evaluating the effects of ecological restoration measures.

Materials and methods

Study site

The study area is situated in the Hetian Town, Fujian Province, China (25°33′–25°48′N,116°18′–116°31′E), and the total land area reaches approximately 296 km2 (Fig. 1). It is well-known for severe soil erosion and affluent rare earth ore. This town belongs to a typical subtropical monsoon climate, with a mean frost-free period, mean annual temperature and mean annual precipitation of 260 d, 19 °C and 1621 mm (Yao et al. 2019; Yu et al. 2019). The main soil type of this town is acid red soil, derived from crystalline granite. The dominant land use is woodland which mainly consist of Pinus massoniana forests. The P. massoniana forests are distributed mainly in poor soil, and most of which are pure forests. Historically, this town was covered by a lush forest but years of deforestation and clearings resulted in severe vegetation degradation and soil erosion. The dominant erosion type of this study site is water erosion. Ecological restoration measures, such as the arbor-bush-herb mixed plantation and closed forest measures, have been widely implemented in this town since the 1980s. Although these measures have brought significant effect on the reduction of soil erosion, the ecological environment condition in the town is still pessimistic.

Fig. 1
figure 1

Geographic locations of the study area and sample plots

Soil sampling and analysis

Soil was sampled from 59 sampling plots in January 2015 (Fig. 1). The sampling plots were established with a size of 666.7 m2 (i.e., 25.82 m × 25.82 m). GPS was used to determine the coordinate positions of the plots. In each plot, a soil pit was excavated and soil were sampled from three soil layers: topsoil (0–20 cm), eluviation layer (20–40 cm), and subsoil (40–60 cm). For each soil depth, approximately 1 kg weight was collected for soil chemical analysis. In the laboratory, plant residues (e.g., visible root and leaf litter) and rocks were removed, and then all soil samples were air-dried and passed through a sieve (0.15 mm) prior to measuring the SOC and TN contents (Yao et al. 2019). SOC was determined using the K2Cr2O7–H2SO4 wet oxidation approach (Tang et al. 2017). TN was measured using a CN elemental analyzer (Yu et al. 2019).

Data analysis

Descriptive statistics

Outliers of the whole dataset were removed before descriptive statistics. In this study, the outliers were defined as extreme values which were caused by errors in soil sampling and analysis. Extreme values were considered as any data value is not within the three standard deviation of the mean (\(\bar{x} \pm 3s\), where \(\bar{x}\) and s are the average value and the standard deviation value for each soil variable, respectively) (Rosemary et al. 2017). After detecting and removing the outliers, the mean, median, minimum, maximum, standard deviation (SD), coefficient of variation (CV), skewness and kurtosis for each soil variable were obtained from the rest entire dataset and within each soil depth (Yao et al. 2019). At the same time, we used one-sample Kolmogorov–Smirnov (K–S) method to test whether the soil variables values satisfy the normal distribution. All statistical analyses were conducted on the SPSS version 21.0.

Geostatistical analyses

Geo-statistical analysis uses the semivariogram to generate the spatial prediction of each soil variable (Zhang et al. 2016). The semivariogram (γ(h)) of a soil variable for the lag distance h was calculated using Eq. 1 below:

$$\gamma \left( h \right) = \frac{1}{2N\left( h \right)}\sum\limits_{i = 1}^{N\left( h \right)} {\left[ {Z\left( {x_{i} } \right) - Z\left( {x_{i} + h} \right)} \right]^{2} }$$
(1)

where Z(xi) and Z(xi + h) are the measured values for a soil variable at location xi and xi + h, N(h) represents the number of data pairs separated by the lag h.

To explore the spatial structure of SOC and TN, different models (i.e., spherical, exponential model, and Gaussian model) was used. The best-fitted model for a soil variable was selected based on the maximum model efficiency and the minimum residuals. Thereafter, the optimal parameters of the range (a), the nugget (C0) and the sill (C0 + C) were derived from the best-fitted models (Yao et al. 2019), and then ordinary kriging interpolation approach was applied to generate the distribution maps of SOC and TN. The predicted value \(\hat{Z}\left( x \right)\) at unsampled locations was calculated as below (Yao et al. 2019):

$$\hat{Z}\left( x \right) = \sum\limits_{j = 1}^{m} {\lambda_{j} } Z\left( {x_{i} } \right)$$
(2)

where m is the number of the adjacent observations used for the prediction and λj is the weight. In this study, the semivariogram analysis and continuous distribution of SOC and TN contents were conducted on the GS+ 9.0 and ArcGIS 9.3, respectively.

Model validation

The interpolation quality of SOC and TN was evaluated by the leave-one-out cross-validation (Guan et al. 2017). In this study, ordinary kriging model validation was carried out using the determination coefficient (R2), mean error (ME), root mean square error (RMSE), and mean standardized error (MSE) (Zhang et al. 2016; Yao et al. 2019).

Determination of reasonable sampling number

To determine the reasonable sampling points, CV was applied. Because of the high spatial variability of soil properties in the regional scale, a large number of sampling points were needed to obtain reliable and accurate estimation (Rodeghiero and Cescatti 2008). The coefficient of variation, as a parameter, was used to estimate the number of observation points (m), which can be estimated using the following model (Rodeghiero and Cescatti 2008):

$$m = \frac{{CV^{2} t_{\alpha }^{2} }}{{E^{2} }}$$
(3)

where m denotes the reasonable sampling number, CV represents the coefficient of variation for a soil variable, tα represents the Student’s t statistic at the α confidence level, E is the specified error limit, which represents the sample mean in an allowable error. Further details about the specified error limit can be found in Rodeghiero and Cescatti (2008).

Results and discussion

Brief results of SOC and TN contents

Among those samples, only two samples showed a very high SOC (49.79 g kg−1) or TN (4.35 g kg−1) value at 20–40 cm and were considered as outliers. The mean SOC contents were 19.31 g kg−1 at 0–20 cm, 17.47 g kg−1 at 20–40 cm and 16.10 g kg−1 at 40–60 cm (Table 1), indicating that the mean SOC decreased with increased soil depths, which was in agreement with results obtained in other studies (Yang et al. 2016; Guan et al. 2017). Nevertheless, the SOC contents were lower than those in secondary forest (20.50–60.36 g kg−1), irrespective of soil depth (Guan et al. 2015). The dissimilarity was mainly due to the sparse vegetations and the presence of soil erosion in our study, which produced less SOC than secondary forest mostly covered with lush vegetations. Additionally, the mean SOC contents were approximate to the median values in different soil depths, with a lower mean values compared with the median values, demonstrating a positive skewness can be found in the data (Mirzaee et al. 2016). A similarity between mean value and median value for SOC in different soil depths has been reported in previous study (Wang et al. 2015). One-sample Kolmogorov–Smirnov test manifested that all of the values of SOC were normal distribution (p > 0.05) (Table 1).

Table 1 Brief results of SOC and TN (n = 57)

Similarly, the mean TN in different soil depths were characterized by 0–20 > 20–40 > 40–60 cm, ranging from 0.28 to 2.71 g kg−1. In addition, the mean TN contents were smaller than the median TN contents in all depths except the depth of 40–60 cm. One-sample Kolmogorov–Smirnov test manifested that all TN contents were normal distribution (p > 0.05) (Table 1).

The CV was used for assessing the degree of dispersion of the variables (Yao et al. 2019). If the CV was less than 10%, the variable has low variability, if the CV was between 10% and 90%, the variable was defined as moderate variability, if the CV was greater than 90%, and the variable was defined as high variability (Fang et al. 2012; Yao et al. 2019). Our study indicated the CVs varied from 52.87% to 55.87% for SOC and 49.02% to 50.00% for TN in different soil depths (Table 1), which generally greater than most of results in other studies (She and Shao 2009; Wang et al. 2015). Therefore, all soil properties at different depths showed a moderate variability.

Spatial variability in SOC and TN contents

Semivariograms were extensively adopted to determine the spatial variability in soil properties (Goovaerts 1999). Similar to many related findings (Rossi et al. 2009; Wang et al. 2015), Gaussian and exponential models can accurately reflect the spatial variability in SOC across different soil depths; however, only Gaussian models showed superiority for all depths of TN (Fig. 2 and Table 2). Among the variables analysis, the R2 values of the optimal models for all depths of SOC were below 0.9, whereas, their values for all depths of TN were over 0.9. It indicated that TN distribution was more stable than SOC (Table 2), and the human factors might be the possible reason for this phenomenon (Huang et al. 2007).

Fig. 2
figure 2

Theoretical semivariogram models for SOC and TN contents. The parameters are given in Table 2

Table 2 The optimal parameters of semivariogram analysis for SOC and TN contents

Nugget values was commonly regarded as the degree of spatial variability result from random factors (e.g., sampling error) (Goovaerts 1999; Yao et al. 2019). In the present study, the nugget values for SOC contents were higher than the values of TN, mainly because of the larger magnitude of SOC contents. Moreover, the nugget values for SOC and TN declined with soil depth increased (Table 2). The sill values reflect the total spatial variation result from structural factors (e.g., soil and vegetation types) (Fang et al. 2016; Yao et al. 2019). Similar to the nugget values, the sill values of SOC and TN decreased with the depths increased (Table 2). Furthermore, the sill values of SOC and TN were positive for different soil depths, implying that there were varieties of positive substrate effects result from sampling error (Yao et al. 2019).

The nugget-to-sill ratio (C0/(C0 + C)) was used for determining the degree of spatial dependence for soil nutrients (Guan et al. 2017). Based on the criterion defined by Cambardella et al. (1994), when the C0/(C0 + C) is lower than 25%, the variable was regarded as having spatial dependence; moderate for C0/(C0 + C) between 25 and 75% and weak for C0/(C0 + C) larger than 75%. A weak spatial dependence for a soil variable may be primarily due to extrinsic sources of variability (i.e., soil factors imposed on a site, such as fertilizer application, tillage and management practice), whereas a strong spatial dependence for a soil variable is mainly affected by the intrinsic sources of variability (i.e., natural variations in soils, such as soil texture, soil types and soil parent material) (Fu et al. 2013; Guan et al. 2017). The C0/(C0 + C) values of SOC of all depths were between 25 and 75% (Table 2), suggesting the spatial dependence was moderate. This was mainly attributed to the complex topography and local ecological restoration measures. The study region was characterized by the complex physical geography, a typical valley basin, which might directly impact soil nutrient contents (Bai et al. 2014). Contrary to that, the C0/(C0 + C) values of TN of all depths were lower than 25% (Table 2), suggesting a strong spatial dependence.

The ranges were calculated to represent the minimum distance of spatial autocorrelation for soil variables (Blanchet et al. 2017; Yao et al. 2019). If the distance between the observations fall within the range, the observations can be defined as spatial autocorrelation and the spatial dependence for the distance greater than the range. (Webster and Oliver. 1992). The largest range was observed for SOC at 40–60 cm (19,750 m), whereas the lowest range was found for TN at 0–20 cm (8860 m) (Table 2). Similar to many previous studies, the spatial range values were much larger than the sampling distance (Fig. 1) in the present study, indicating the sampling strategy was adequate to estimate the spatial variability of soil properties (Zanini and Bonifacio 1992; Yao et al. 2019). According to the study by Rosemary et al. (2017), a sampling distance < 4430 m in this study can be applied for studying the spatial variability in soil properties. For the sake of comparing the spatial variability in soil variables among different studies, additional studies are required to identify the effects of topographic and environmental factors (e.g., slope, elevation, aspect, and stand management) on the spatial variability.

Cross-validation of ordinary kriging

Figure 3 showed the fitting models of SOC and TN were closer to the 1:1 line. However, the predicted values of SOC (TN) for each depth seemed to underestimate the measured values which were more than 20 g kg−1 (1 g kg−1) and to overestimate the measured values which were less than 20 g kg−1 (1 g kg−1). This conclusion was in agreement with results obtained in other studies owing to the nature of the algorithms, which aimed to achieve unbiased predictions of mean values (Liu et al. 2013; Guan et al. 2017; Wang et al. 2018; Yao et al. 2019). The slopes of the model were close to each other, irrespective of soil depth (Fig. 3).

Fig. 3
figure 3

Measured versus predicted SOC and TN at different soil depths, using ordinary kriging model (dashed line is a 1:1 line)

The MEs of SOC and TN for each depth were all close to zero (Table 3), illustrating that ordinary kriging was relatively unbiased in interpolating the SOC and TN. The MEs were all negative, which indicated that ordinary kriging underestimated the SOC and TN contents. The RMSE of SOC and TN declined as the soil depth increased (Table 3). The MSEs of SOC and TN was low and close to zero, indicating the ordinary kriging method in our study was reliable.

Table 3 Evaluation indices of ordinary kriging method

Although the determination coefficient values for SOC and TN were relatively low (Fig. 3), they were similar to many other findings (Liu et al. 2013; Yao et al. 2019). This phenomenon is likely attributed to the primary dataset without probabilistic design (Veronesi et al. 2014; Guan et al. 2017; Yao et al. 2019). In the present study, the sampling points were mainly determined by random designs, making the model have a poor performance. Thus, a dataset of probabilistic design is proposed for interpolating the soil properties in the future study. Moreover, machine learning methods are recommended for improving the accuracy of spatial interpolation for soil properties.

Spatial distribution of SOC and TN contents

According to the optimal parameters derived from the best-fitted model, we used ordinary kriging method to estimate the spatial patterns of SOC and TN contents (Fig. 4). In order to better compare the spatial distributions, maps for each depth of SOC and TN contents were plotted on the same scale.

Fig. 4
figure 4

Spatial patterns of SOC and TN contents, using ordinary kriging model (g kg−1)

The SOC content changed markedly from 0–20 to 40–60 cm (Fig. 4), it also had the largest change at 0–20 cm, which was in accordance with the study by Wang et al. (2015). The low SOC content was majorly distributed in the mid-south of our study site because of the lower coverage of vegetations and intensive human activities. In contrast, high SOC content areas mainly appeared in northern parts of the study site where the land use types were mixed forests within high vegetation cover (e.g., broadleaved trees mixed with Pinus massoniana, Cunninghamia lanceolata mixed with Pinus massoniana). Therefore, the SOC content was characterized by a center area with low content surrounded by bands with high content that steadily and continuously increased outwards towards the region’s peripheries for all three depths.

The distribution of the TN content had something in common with that of the SOC content at all three depths, which was changed markedly within each depth. The low TN content was mainly located in the middle areas because of long-lasting conflicts between land protection and human activities, which resulting in serious forest degradation and soil erosion. Thus, the spatial pattern of soil properties were found to be higher in the periphery and lower in the middle. Although afforestation and other ecological restoration measures were implemented in the study area in the past 40 years, especially in the middle, the pattern of soil properties will not be changed in the next short period of time. However, this pattern may be expanded under rapid urbanization process.

Reasonable sampling number to monitor SOC and TN contents

We used variation coefficient approach to estimate reasonable number of sampling points for SOC and TN (Table 4). The mean numbers of sampling points needed to estimate the mean TN contents, within specified error limits equal to 10% and 20% at the 95% confidence level, were 25 and 99, respectively. The optimal sampling points of each depth for SOC contents were more than 100 within specified error limits which equal to 10% at the 95% confidence level, and that ranged between 29 and 32 within specified error limits equal to 20% in our study (Table 4). According to the study by Adachi et al. (2005), the optimal sampling points to monitor soil properties ranged from 18 to 75 at the same specified error and confidence level in a forest area in Malaysia. Han et al. (2016) reported that the reasonable sampling points were 17 or less within an allowable error of 10% and the same confidence level in California forest soils. These results suggest that sampling points were higher than the previous studies due to the high CV of soil properties in this study (Table 1). Moreover, it also demonstrated that more sampling points are needed to calculate the variability of soil properties in tropical Asia than in temperate Asia (Adachi et al. 2005; Han et al. 2016). This information is important for monitoring SOC if we are to accurately estimate soil carbon stock for an area of interest in hilly red soil region.

Table 4 Reasonable sampling number points required to estimate the SOC and TN contents within different specified error limits at the 95% confidence level

Several studies have reported that the size of sampling points will affect the kriging interpolation accuracy (Liu et al. 2010; Wang et al. 2015). In general, the more the number of sampling points, the more accurate the interpolation. However, the measured points were hard to derive in the field due to the restricted human and material resources. Therefore, the rational sampling strategy is essential if we are to require reliable results with minimum sampling costs and maximum prediction accuracy of soil properties. The present study did not supply enough data for this aim due to the data was collected only once in the study area. Thus, further investigations are required for us to monitor the spatial variability of soil properties in the hilly red soil after soil erosion and afforestation.

Conclusion

Our study provides spatial distribution maps of SOC and TN with 59 soil samples in Hetian Town, China. We found that SOC was regulated by both extrinsic factors (e.g., fertilizer application, tillage, management practice) and intrinsic factors (e.g., soil texture, soil types, soil parent material), while TN content was mainly influenced by intrinsic factors. Nevertheless, more studies are needed to identify the effects of these variables on the spatial variability in soil nutrients.

We also found that the spatial pattern of SOC and TN were characterized by higher in the periphery and lower in the middle. Therefore, more ecological restoration measures are recommended for further increase of nutrients in the middle. Additionally, we found that the variation coefficient is a reliable approach to obtain the reasonable sampling number for monitoring SOC and TN. However, there is an urgent need to more studies are needed on the dynamic recovery of SOC and TN because of the increasing interest of spatial pattern of SOC and TN in red soil region of southern China as a result of soil erosion and land degradation.