Abstract
The spatial analysis of soil properties by means of quantitative methods is useful for making predictions at sampled and unsampled locations. Two important issues are addressed here: the option of using complex, nonlinear models instead of (often very simple) linear approaches, and the opportunity to build spatial inference tools that use horizons as the basic soil components. The objective is to perform and validate the spatial analysis of clay content in order to understand whether nonlinear methods can handle soil horizons, and to measure quantitatively how much they outperform simpler methods. This is addressed in a case study in which relatively few records are available to calibrate (train) such complex models. We built three models based on artificial neural networks: single artificial neural networks, median neural networks, and bootstrap aggregating neural networks with genetic algorithms and principal component regression (BAGAP). We performed a validation procedure at three levels of soil horizon aggregation (i.e. topsoil, profile and horizon pedological supports). The results show that neurocomputing performs best at every level of pedological support, even when we use an ensemble of neural nets (i.e. BAGAP), which is very data intensive. BAGAP has the lowest RMSE at every level of pedological support, with \(\hbox {RMSE}_\mathrm{BAGAP}^{Topsoil} = 7.2\,\%\), \(\hbox {RMSE}_\mathrm{BAGAP}^{Profile} = 7.8\,\%\) and \(\hbox {RMSE}_\mathrm{BAGAP}^{Horizon} = 8.8\,\%\). We analysed the neural network parameters in depth and report them in the “Appendix”, in order to provide the best-tuned neural-based model for making suitable spatial predictions.
References
Basile A, Ciollaro G, Coppola A (2003) Hysteresis in soil water characteristics as a key to interpreting comparisons of laboratory and field measured hydraulic properties. Water Resour Res. doi:10.1029/2003WR002432
Bishop T, McBratney A, Laslett G (1999) Modelling soil attribute depth functions with equal-area quadratic smoothing splines. Geoderma 91(1–2):27–45
Blalock H (1985) Causal models in the social sciences, 2nd edn. Aldine Publishing Company, New York
Bonfante A, Basile A, Langella G, Manna P, Terribile F (2011) A physically oriented approach to analysis and mapping of terroirs. Geoderma 167–168:103–117. doi:10.1016/j.geoderma.2011.08.004
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Chen C, Hu K, Li W, Li Z, Li B (2014) Three-dimensional mapping of clay content in alluvial soils using hygroscopic water content. Environ Earth Sci 73(8):4339–4346. doi:10.1007/s12665-014-3720-9
Cockx L, Van Meirvenne M, Vitharana U, Verbeke L, Simpson D, Saey T, Van Coillie F (2009) Extracting topsoil information from EM38DD sensor data using a neural network approach. Soil Sci Soc Am J 73(6):2051–2058
Coppola A, Comegna A, Dragonetti G, Gerke HH, Basile A (2015) Simulated preferential water flow and solute transport in shrinking soils. Vadose Zone J. doi:10.2136/vzj2015.02.0021
Gee GW, Bauder JW (1986) Methods of soil analysis, part 1. Physical and mineralogical methods. Soil science society of America book series. Soil Science Society of America, Madison, WI
Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, New York
Goovaerts P (2011) A coherent geostatistical approach for combining choropleth map and field data in the spatial interpolation of soil properties. Eur J Soil Sci 62(3):371–380. doi:10.1111/j.1365-2389.2011.01368.x
Grunwald S (2009) Multi-criteria characterization of recent digital soil mapping and modeling approaches. Geoderma 152(3–4):195–207
Haykin S (1999) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall, Upper Saddle River, NJ, USA
He Y, Chen D, Li B, Huang Y, Hu K, Li Y, Willett I (2009) Sequential indicator simulation and indicator kriging estimation of 3-dimensional soil textures. Aust J Soil Res 47(6):622–631
Iamarino M, Terribile F (2008) The importance of andic soils in mountain ecosystems: a pedological investigation in Italy. Eur J Soil Sci 59(6):1284–1292
Jaccard J, Wan C (1996) LISREL approaches to interaction effects in multiple regression, 1st edn. Sage Publications, Thousand Oaks
Kempen B, Brus DJ, Stoorvogel JJ (2011) Three-dimensional mapping of soil organic matter content using soil type-specific depth functions. Geoderma 162(1–2):107–123. doi:10.1016/j.geoderma.2011.01.010
Kroes J, Wesseling J, Van Dam J (2000) Integrated modelling of the soil-water-atmosphere-plant system using the model SWAP 2.0 an overview of theory and an application. Hydrol Process 14(11–12):1993–2002
Langella G, Basile A, Bonfante A, Terribile F (2010) High-resolution space-time rainfall analysis using integrated ANN inference systems. J Hydrol 387(3–4):328–342
Lark R, Bishop T (2007) Cokriging particle size fractions of the soil. Eur J Soil Sci 58(3):763–774
Liu Z, Martina M, Todini E (2005) Flood forecasting using a fully distributed model: application of the TOPKAPI model to the Upper Xixian catchment. Hydrol Earth Syst Sci 9(4, SI):347–364
Malone B, McBratney A, Minasny B, Laslett G (2009) Mapping continuous depth functions of soil carbon storage and available water capacity. Geoderma 154(1–2):138–152
McBratney A, Mendonca Santos M, Minasny B (2003) On digital soil mapping. Geoderma 117:3–52
Metherell AK, Harding LA, Cole CV, Parton WJ (1993) Century soil organic matter model environment. Technical Report No. 4, Great Plains System Research Unit, USDA-ARS, Fort Collins, Colorado, USA
Minasny B, McBratney A (2007) Spatial prediction of soil properties using EBLUP with the Matern covariance function. Geoderma 140(4, SI):324–336, Pedometrics Meeting 2005 Naples, FL, 12–14 Sept 2005
Minasny B, McBratney A, Mendonca-Santos M, Odeh I, Guyon B (2006) Prediction and digital mapping of soil carbon storage in the Lower Namoi Valley. Aust J Soil Res 44(3):233–244. doi:10.1071/SR05136
Mishra U, Lal R, Slater B, Calhoun F, Liu D, Van Meirvenne M (2009) Predicting soil organic carbon stock using profile depth distribution functions and ordinary kriging. Soil Sci Soc Am J 73(2):614–621. doi:10.2136/sssaj2007.0410
Muñoz-Carpena R, Parsons JE (2011) VFSMOD-W vegetative filter strips modelling system. Version 6.x edn
Newhall F, Berdanier C (1996) Calculation of soil moisture regimes from the climatic record. Soil survey investigations report, National Soil Survey Center, Natural Resources Conservation Service, U.S. Dept. of Agriculture, Washington, DC
Odeh I, McBratney A (2000) Using AVHRR images for spatial prediction of clay content in the lower Namoi Valley of eastern Australia. Geoderma 97(3–4):237–254
Odgers NP, McBratney AB, Minasny B (2011) Bottom-up digital soil mapping. I. Soil layer classes. Geoderma 163(1–2):38–44. doi:10.1016/j.geoderma.2011.03.014
Park S, Vlek P (2002) Environmental correlation of three-dimensional soil spatial variability: a comparison of three adaptive techniques. Geoderma 109(1–2):117–140
Selige T, Boehner J, Schmidhalter U (2006) High resolution topsoil mapping using hyperspectral image and field data in multivariate regression modeling procedures. Geoderma 136(1–2):235–244
Stöckle CO, Nelson R (2005) Cropsyst for windows vers. 3.04.08. Department of Biological Systems, Washington State University
Tayman J, Swanson D (1999) On the validity of MAPE as a measure of population forecast accuracy. Popul Res Policy Rev 18(4):299–322
Terribile F, di Gennaro A, Coraggio S, de Mascellis R, Ferruzzi T, Laruccia N, Magliulo P, Rivieccio R, Sarnataro M (2009) Raccolta di 10 carte pedologiche della regione campania (1:50,000). Technical report, Assessorato all’Agricoltura, Settore Sirca, Regione Campania
Terribile F, Agrillo A, Bonfante A, Buscemi G, Colandrea M, D’Antonio A, De Mascellis R, De Michele C, Langella G, Manna P, Marotta L, Mileti FA, Minieri L, Orefice N, Valentini S, Vingiani S, Basile A (2015) A web-based spatial decision supporting system for land management and soil conservation. Solid Earth 6(3):903–928. doi:10.5194/se-6-903-2015
Wackernagel H (2003) Multivariate geostatistics: an introduction with applications. Springer, New York
Waiser T, Morgan C, Brown D, Hallmark C (2007) In situ characterization of soil clay content with visible near-infrared diffuse reflectance spectroscopy. Soil Sci Soc Am J 71(2):389–396
Weller U, Zipprich M, Sommer M, Castell W, Wehrhan M (2007) Mapping clay content across boundaries at the landscape scale with electromagnetic induction. Soil Sci Soc Am J 71(6):1740–1747
Wilson JP, Gallant JC (2000) Terrain analysis: principle and application. Wiley, New York
Xu W, Tran TT, Stanford U, Srivastava RM, Journel AG (1992) Integrating seismic data in reservoir modeling: the collocated cokriging alternative. In: Proceedings of 67th annual technical conference of the society of petroleum engineers, no. 24742 in SPE, pp 833–842
Zhang J (1999) Developing robust non-linear models through bootstrap aggregated neural networks. Neurocomputing 25:93–113
Zhou Z, Wu J, Tang W (2002) Ensembling neural networks: many could be better than all. Artif Intell 137(1–2):239–263
Appendices
Appendix: Definition of specific terms
Activation function: a function that converts the neuron’s weighted input to its output.
Artificial neural network (ANN): a network of connected artificial neurons resembling the arrangement of biological neurons in the animal brain. Neurons are distributed in (input, hidden and output) layers, which define the structure of the neural network.
Bagging: bootstrap aggregating is a technique used to improve the accuracy of a regression (or classification) task. It combines models trained on randomly generated sets (sampling with replacement).
Early stopping: a method used during the training phase to improve the generalization ability of the neural network. Two data subsets are required during the training phase: the training set, which is used to update the network weights (model calibration), and the validation set, which is used to stop the training phase if the validation error increases a specified number of times.
Initialization (of ANN): each weight and bias of every network neuron is set to a random initial value used to start the training phase.
Overfitting: a statistical model overfits when it learns noise and random error rather than the underlying function to be approximated. An overfitted model cannot reproduce signals other than the training ones without a considerable loss of accuracy.
Principal component regression (PCR): a statistical technique in which a subset of the loadings of the principal component analysis is used as regressors of the dependent variable.
Training phase: in this work, training follows a supervised learning paradigm based on the widely used back-propagation algorithm. Weights and biases are updated to fit the input/output couples.
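As an illustration of the early stopping entry above, the following minimal sketch scans a sequence of monitored errors and reports when training would stop. The function name, the `max_fails` parameter and its default value are assumptions made for the example; the text does not state the number of tolerated increases.

```python
def early_stopping_epoch(monitored_errors, max_fails=6):
    """Return (stop_epoch, best_epoch) for a sequence of monitored errors.

    Training stops once the monitored error has failed to improve for
    `max_fails` consecutive epochs (hypothetical default); otherwise it
    runs until the last epoch.
    """
    best_error = float("inf")
    best_epoch = -1
    fails = 0
    for epoch, error in enumerate(monitored_errors):
        if error < best_error:
            best_error, best_epoch, fails = error, epoch, 0
        else:
            fails += 1
            if fails >= max_fails:
                return epoch, best_epoch
    return len(monitored_errors) - 1, best_epoch


# Illustrative error curve: it improves for three epochs, then degrades.
stop_epoch, best_epoch = early_stopping_epoch(
    [5.0, 4.0, 3.0, 3.5, 3.6, 3.7, 2.0], max_fails=3
)
```

In practice, the weights stored at `best_epoch` would be the ones retained after stopping.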
Artificial neural networks (ANN)
Neural networks are highly complex and nonlinear systems which are capable of learning (adaptivity) and representing (generalization) real-world problems (Haykin 1999). Parallel-distributed, fully connected artificial neurons are their elementary structural constituents. An artificial neuron (or perceptron) is an information-processing unit which consists of a few basic elements: the input signals \(\left( x_{j}\right)\) scaled by synaptic weights \(\left( w_{kj}\right)\), the neuronal bias \(\left( b_k\right)\), the induced local field \(\left( I_k\right)\), which is calculated as the weighted sum of the inputs plus the bias, and the activation function \(\left( \phi \left( I_k\right) \right)\) with which the k-th neuron generates an equalized output (\(y_k\), Fig. 9).
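The computation carried out by a single neuron can be sketched as follows, assuming a hyperbolic tangent activation. The numerical values of the inputs, weights and bias are purely illustrative.

```python
import numpy as np


def neuron_output(x, w, b):
    """Output y_k = phi(I_k), with I_k = sum_j w_kj * x_j + b_k and phi = tanh."""
    induced_local_field = np.dot(w, x) + b
    return np.tanh(induced_local_field)


x = np.array([0.5, -1.0, 2.0])   # illustrative input signals x_j
w = np.array([0.2, 0.4, -0.1])   # illustrative synaptic weights w_kj
b = 0.1                          # illustrative bias b_k
y = neuron_output(x, w, b)       # induced local field is -0.4 here
```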
A feedforward neural network (FFNN) is a multilayer perceptron made of a set of neurons organized in an input layer, zero or more hidden layers, and an output layer. Signals propagate layer by layer in one direction at a time. The supervised learning process is based on the popular two-pass error back-propagation algorithm: in the forward pass, the input vector propagates through the network and generates outputs while the synaptic weights are kept fixed; in the backward pass, the error signal (given by the difference between the desired target and the network output) back-propagates to adjust the synaptic weights. We used the fast-converging Levenberg-Marquardt algorithm to carry out the training phase, with or without the early stopping criterion, i.e. with or without the use of an independent testing subset during the training phase. Every neuron is equipped with the following hyperbolic tangent sigmoid transfer function:
$$\begin{aligned} \phi \left( I_k\right) = \tanh \left( I_k\right) = \frac{2}{1 + e^{-2 I_k}} - 1 \end{aligned}$$ (7)
In this work, we used fully connected multilayer FFNNs to address the input/output mapping. During the training phase, we used input/target couples from the training subset to adjust the weights and biases (free parameters) so that the network outputs best fit the target signals (learning ability by means of neural plasticity). Then, we carried out the simulation phase to measure (through a validation procedure) the generalization capability of the trained network on the validation input/target couples which were left out during the onefold splitting procedure described in the “Study region and soil data” section.
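A minimal sketch of the simulation phase is given below: inputs are propagated through a fully connected FFNN with hyperbolic tangent neurons, and the RMSE is computed against left-out targets. The weights here are random rather than trained, the layer sizes and data are synthetic, and the targets are assumed to be scaled to the output range of the activation function; none of these values come from the study.

```python
import numpy as np


def forward(x, layers):
    """Propagate inputs through a fully connected FFNN, one layer at a time."""
    signal = x
    for weights, bias in layers:
        signal = np.tanh(signal @ weights.T + bias)
    return signal


def rmse(target, output):
    """Root mean square error between targets and network outputs."""
    return float(np.sqrt(np.mean((target - output) ** 2)))


rng = np.random.default_rng(0)
sizes = [6, 6, 6, 1]  # hypothetical: 6 inputs, then layers of 6, 6 and 1 neurons
layers = [
    (rng.normal(size=(n_out, n_in)), rng.normal(size=n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

x_va = rng.normal(size=(20, 6))           # synthetic validation inputs
t_va = rng.uniform(-1, 1, size=(20, 1))   # synthetic targets, assumed scaled
y_va = forward(x_va, layers)
validation_rmse = rmse(t_va, y_va)
```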
In the “Appendix” (mANN), we discuss different aspects of the training phase during a sensitivity analysis on bootstrap artificial neural nets. We selected the trained neural network (the median net) from the best-performing setting to represent the single neural network model in order to explain the clay spatial variability. The procedure employed to develop the BAGAP model is proposed in the “Appendix”, where groups of neural nets are used to validate whether an aggregated response performs better than the response by a single neural network or by linear models.
Sensitivity analysis and the median ANN (mANN)
We carried out a sensitivity analysis by varying three main constitutive elements in order to measure the supervised learning ability under distinctive ANN conditions. The complete procedure is explained in more detail in the following steps:
1. Target data subsetting (for any level of pedological support)
Data were first stratified using the onefold random splitting procedure described in the “Study region and soil data” section. Hence, 20 % of the clay data composed the validation (Va) subset, which was made of the same signals within each level of pedological support. We used it to measure the generalization capability of the trained ANNs and to compare neurocomputing with other prediction models.
The remaining 80 % of the data were randomly split into training (Tr) and testing (Te) subsets during bootstrapping. The training phase of any ANN takes place on Tr. When required, a random 12.5 % Te subset is removed from the training data to test the generalization capability using the early stopping (ES) method (step 3). Table 2 summarizes the target data dimensionality according to the use (ES = 1) or non-use (ES = 0) of the ES criterion.
2. Input terms gathering
Digital terrain analysis of the topographic surface and remote sensing are the sources of spatially continuous input covariates. The best trade-off between the ANN size (i.e. the number of inputs and hence the number of free parameters) and the sample size is investigated for all levels of pedological support. The \(p_{io}^{b}\) inputs used in the sensitivity analysis are identified by the i subscript in Table 3, where i denotes one of three possible sets of inputs; o indicates one of three target subsets {Tr, Va and Te}, as defined in step 1; and b is the ANN bootstrap resample presented afterwards (see also Eq. 9).
3. Bootstrapping (for any setting)
Preliminary trials suggested that the size of the input layer (=n) should equal the number of input terms, and that the input and hidden layers should have the same size as a good trade-off between fitting and generalization. We carried out a sensitivity analysis by bootstrapping ANNs for each neural network setting to be investigated. The list of feasible settings is given in Table 4 for three varying network conditions: (1) the number and type of input variables (\(p_{io}^{b}\)), (2) the topology (the number of hidden layers ranges from zero to one) and (3) the use (ES = 1) or non-use (ES = 0) of ES. To study the effect of each setting on the learning and generalization capabilities, we trained one thousand bootstrapped ANNs for each setting type. The sensitivity sheet mapped out in Table 4 was repeated for each level of pedological support.
4. Single neural network architectures
As an example, setting #8 in Table 4 uses the \(p_{3o}^{b}\) input of Table 3 at Horizon, which is made of DEM, SPI, NDVI-S5, NDVI-D16, ASP, \(N_\mathrm{LAY}\). The corresponding 6:6:1 network topology is depicted in Fig. 10, and its mathematical expression is:
$$\begin{aligned} y= \phi \left( ob + \sum _{r = 1}^{6}ow_r\cdot \phi \left( hb_r + \sum _{s = 1}^{6}hw_{rs}\cdot \phi \left( ib_s + \sum _{v = 1}^{6}iw_{sv}\cdot x_v \right) \right) \right) \end{aligned}$$ (8)
where iw, hw, ow are the input, hidden and output weights; ib, hb, ob are the input, hidden and output biases; \(\phi\) is the activation function (Eq. 7); x, y are the input and output signals; and v, s, r are the subscripts typifying network layers.
5. Median artificial neural net (mANN)
The objective of the sensitivity analysis was to analyse the effects on generalization of the network conditions mapped out in the sensitivity sheet (step 3), i.e. to determine the feasibility of obtaining a well-trained single ANN to be employed in real-world simulations. We assumed that the median net of the setting with the lowest median RMSE represents a good trade-off between learning and generalization. We therefore selected mANN as the ANN whose RMSE is nearest to the median value within the setting having the lowest median, and used it as the reference single-ANN modelling class for any level of pedological support. To compare the models, we performed simulations with mANNs on Va. Two corollaries of the assumption behind mANNs are: (1) an ANN whose RMSE is below that of mANN is over-trained and characterized by a lesser generalization capability, and (2) an ANN whose RMSE is above that of mANN is poorly trained and is unreliable for the simulation.
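The mANN selection rule of step 5 can be sketched as follows. The setting labels and RMSE values are invented for the example, and the text does not specify on which subset these RMSEs are computed, so the function simply takes them as given.

```python
import numpy as np


def select_mann(rmse_by_setting):
    """Pick the setting with the lowest median RMSE, then the net nearest that median.

    `rmse_by_setting` maps a setting label to the RMSEs of its bootstrapped ANNs.
    Returns (setting label, index of the selected net, RMSE of the selected net).
    """
    medians = {s: float(np.median(v)) for s, v in rmse_by_setting.items()}
    best_setting = min(medians, key=medians.get)
    values = np.asarray(rmse_by_setting[best_setting], dtype=float)
    index = int(np.argmin(np.abs(values - medians[best_setting])))
    return best_setting, index, float(values[index])


# Hypothetical RMSEs for two settings with a handful of nets each.
example = {
    "setting_a": [9.0, 10.0, 12.0, 15.0],
    "setting_b": [8.0, 8.5, 9.5, 11.0, 14.0],
}
setting, index, value = select_mann(example)
```

In this toy case `setting_b` has the lower median (9.5), so the net at index 2 is returned as the median net.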
Bagging neural networks (BAGAP)
The training phase aims to adjust weights and biases in order to give a single neural network the generalization capability (the ability to correctly simulate real-world signals). The generalization capability can be significantly improved using groups of neural networks, where several ANNs are trained and their outputs are combined into a single response. A collective approach has two major components: a method for training individual ANNs and a method for combining (a selection of) ANNs.
In the particular context of our study domain, single ANNs were not highly suited for simulating clay values at each level of pedological support because only a small learning set was available. This was the premise for the sensitivity analysis procedure of the previous section, which made it possible to set up a prototype fine-tuned ANN (the mANN). This was the method used to train individual ANNs. We built the combined ANN (i.e. a more complicated model made of several ANNs) using a bagging procedure (Breiman 1996).
Bagging aims to improve prediction accuracy by combining multiple models. In order to calibrate multiple repeated neural network models, bootstrap samples of the learning set
$$\begin{aligned} {\mathbf {L}}_{ios}^{b} = \left\{ {\mathbf {p}}_{io}^{b},\, {\mathbf {T}}_{so}^{b} \right\} \end{aligned}$$ (9)
with
$$\begin{aligned} i \in \{1, 2, 3\}, \quad o \in \{\mathrm{Tr}, \mathrm{Va}, \mathrm{Te}\}, \quad s \in \{\mathrm{Topsoil}, \mathrm{Profile}, \mathrm{Horizon}\} \end{aligned}$$
were drawn at random with replacement to amplify real-world signals. The i subscript denotes one of three possible sets of predictors (Table 3); o indicates one of three target subsets (step 1); s accounts for the three levels of pedological support; \({\mathbf {p}}_{io}^{b}\) are the input terms of the b-th resample; \({\mathbf {T}}_{so}^{b}\) are the target data of the b-th resample. Each bootstrap replicate \({\mathbf {L}}_{ios}^{b}\), with its unique set of input/target pairs, was used to calibrate a different neural network component \({{\mathbf {N}}}{{\mathbf {N}}}_{is}^{b}\) during the sensitivity analysis for a fixed level of pedological support. Next, we used a technique for building robust nonlinear models by aggregating multiple, randomly selected neural network components \(\hbox {NN}_{is}^{\cdot } = \left\{ \left( \hbox {NN}_{is}^{1}\right) , \left( \hbox {NN}_{is}^{2}\right) ,\ldots , \left( \hbox {NN}_{is}^{100}\right) \right\}\) (Langella et al. 2010).
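One possible way of drawing a single bootstrap resample of the learning set is sketched below. It assumes that, when early stopping is used, the testing (Te) records are set aside first and the training (Tr) records are then drawn with replacement from the remainder; the text does not spell out this order, and the function name, the record count and the seed are hypothetical.

```python
import numpy as np


def bootstrap_resample(n_learning, rng, test_fraction=0.0):
    """Return (Tr indices, Te indices) for one bootstrap resample.

    If test_fraction > 0, that share of the learning records is first set
    aside at random as the Te subset; Tr is then sampled with replacement
    from the remaining records (an assumed order of operations).
    """
    shuffled = rng.permutation(n_learning)
    n_te = int(round(test_fraction * n_learning))
    te, pool = shuffled[:n_te], shuffled[n_te:]
    tr = rng.choice(pool, size=pool.size, replace=True)
    return tr, te


rng = np.random.default_rng(42)
# Hypothetical learning set of 80 records.
tr_es, te_es = bootstrap_resample(80, rng, test_fraction=0.125)  # ES = 1
tr_no, te_no = bootstrap_resample(80, rng)                       # ES = 0
```

Because sampling is done with replacement, some records appear several times in Tr and others not at all, which is what makes each resample, and hence each trained component, different.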
The proposed algorithm, called BAGAP, structures and calibrates groups of ANNs and provides a way to choose which ANNs should be aggregated (by means of genetic algorithms) and how they should be aggregated (using PCR). We performed this using the following steps, which are a natural continuation of the steps described in the previous section:
6. We randomly selected a set of 100 ANNs from within the sensitivity sheet setting (step 3) characterized by the lowest RMSE median (the same setting to which mANN belongs). They represented the mixture of experts used as input for the BAGAP training algorithm. For instance, 100 \(NN_{3,hor}^{\cdot }\) were selected at random for Horizon support from setting #9 (Fig. 11, left).
7. We set up an optimization problem, solved with genetic algorithms (GAs), to find the best candidate ANN population. At each GA epoch, we obtained a new 1-bit population, which fulfilled the “which” task. This means that the population of selected neural nets was identified from within the 100 \(NN_{3,hor}^{\cdot }\) at each GA epoch (e.g. at the first GA epoch 57 \(NN_{3,hor}^{\cdot }\) were selected as the current best population for Horizon, and so forth at each GA epoch).
8. The stack of selected \(NN_{is}^{\cdot }\) components at the current GA epoch underwent aggregation through PCR weights calculated using training data (Zhang 1999). The BAGAP training internally requires a cross-validation procedure to avoid overfitting. As a consequence, we selected the first few eigenvectors which minimized the error function on the testing subset to ensure the BAGAP generalization capability, which solved the “how” task.
9. BAGAP yields a combined aggregated response at each epoch. When a stopping criterion was met, the 1-bit population and the PCR loadings of the current epoch were stored to represent the BAGAP model for the level of pedological support at hand. It was assumed that this particular PCR aggregation of the GA-selected ANN population is characterized by the best combined fitness.
10. At the end of the calibration process, we simulated BAGAP using Va for model comparison.
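Steps 6 to 10 can be illustrated with the following simplified sketch, which is not the authors' implementation. A binary (1-bit) mask selects component ANNs, the selected outputs are aggregated by PCR with weights fitted on training data, and the number of retained eigenvectors is the one that minimizes the RMSE on the testing subset. The GA operators (truncation selection, uniform crossover, bit-flip mutation, elitism), the fixed number of epochs used as stopping criterion, all parameter values and the synthetic data are assumptions made for the example.

```python
import numpy as np


def pcr_fit(P_tr, t_tr, P_te, t_te):
    """PCR-aggregate component outputs; keep the number of eigenvectors
    that minimizes the RMSE on the testing subset."""
    mean_p, mean_t = P_tr.mean(axis=0), t_tr.mean()
    _, _, vt = np.linalg.svd(P_tr - mean_p, full_matrices=False)
    best = (np.inf, None)
    for k in range(1, vt.shape[0] + 1):
        loadings = vt[:k].T                       # first k eigenvectors
        scores = (P_tr - mean_p) @ loadings
        coef, *_ = np.linalg.lstsq(scores, t_tr - mean_t, rcond=None)
        weights = loadings @ coef                 # one aggregation weight per ANN
        pred = (P_te - mean_p) @ weights + mean_t
        err = float(np.sqrt(np.mean((t_te - pred) ** 2)))
        if err < best[0]:
            best = (err, (weights, mean_p, mean_t))
    return best


def bagap(P_tr, t_tr, P_te, t_te, pop_size=20, epochs=30, p_mut=0.02, seed=0):
    """Search binary masks over component ANNs with a simple GA."""
    rng = np.random.default_rng(seed)
    n_nets = P_tr.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_nets)).astype(bool)

    def fitness(mask):
        if not mask.any():                        # empty ensembles are invalid
            return np.inf, None
        return pcr_fit(P_tr[:, mask], t_tr, P_te[:, mask], t_te)

    best_err, best_mask, best_model = np.inf, None, None
    for _ in range(epochs):                       # fixed-epoch stopping criterion
        results = [fitness(m) for m in pop]
        errs = np.array([r[0] for r in results])
        i = int(np.argmin(errs))
        if errs[i] < best_err:
            best_err, best_mask, best_model = float(errs[i]), pop[i].copy(), results[i][1]
        parents = pop[np.argsort(errs)[: pop_size // 2]]   # truncation selection
        a = parents[rng.integers(0, len(parents), pop_size)]
        b = parents[rng.integers(0, len(parents), pop_size)]
        pop = np.where(rng.random((pop_size, n_nets)) < 0.5, a, b)  # uniform crossover
        pop ^= rng.random((pop_size, n_nets)) < p_mut               # bit-flip mutation
        if best_mask is not None:
            pop[0] = best_mask                    # elitism: keep the best mask
    return best_err, best_mask, best_model


# Synthetic stand-in for component outputs: 15 noisy predictors of a target.
rng = np.random.default_rng(1)
t = rng.uniform(10, 50, size=120)
P = t[:, None] + rng.normal(0, 8, size=(120, 15))
err, mask, model = bagap(P[:80], t[:80], P[80:], t[80:])
```

Because the testing subset is used both to choose the number of eigenvectors and to rank the masks, the returned error is optimistic; an independent validation subset, such as Va in step 10, is needed for a fair comparison with other models.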
The GASEN mixture model presented by Zhou et al. (2002) provided the core framework for our BAGAP. As in GASEN, we employed a GA to evolve randomly initialized weights. However, unlike GASEN, which cuts floating-point weights at an arbitrary threshold after optimizing the search for the minimum of the error function, BAGAP uses binary (1-bit) weights during the GA search in order to select only the population of candidate ANNs. The composition of the combined response is weighted by PCR loadings, and thus BAGAP has a novel structural component compared to GASEN.
Cite this article
Langella, G., Basile, A., Bonfante, A. et al. Spatial analysis of clay content in soils using neurocomputing and pedological support: a case study of Valle Telesina (South Italy). Environ Earth Sci 75, 1357 (2016). https://doi.org/10.1007/s12665-016-6163-7
DOI: https://doi.org/10.1007/s12665-016-6163-7