Abstract
Performing comprehensive laboratory test programs to estimate rockfill strength for rockfill dam projects is a lengthy and onerous task because of the large sample sizes involved. Accordingly, it has become common practice to carry out limited experimental investigations and extrapolate the results to the conditions expected in actual embankments. A number of investigators have established a function of the type τ = ασ^β, where τ and σ are the shear and normal stresses, respectively, and the constants α and β, which result from a fitting procedure, have no physical meaning. Results of laboratory tests on a variety of rockfills have shown that, in addition to the effective confining stress, the relative density, uniformity coefficient, maximum particle size and particle-breaking load influence rockfill strength. These parameters must therefore be included in any function for computing rockfill strength. Other parameters, whose influence is only partially understood, are not included here. Given the nonlinear, multidimensional nature of the problem, a neural network procedure is developed in this paper. This approach takes into account the influence of each of the parameters mentioned above. The network used in this article was selected after comparing the results obtained with a variety of algorithms. After several attempts, the Cascade Correlation Network (CCN) was found to yield the most accurate strength predictions.
References
Alberro J, Gaziev E (2000) Resistencia y compresibilidad de los enrocamientos. Internal Report. Institute of Engineering, UNAM. México
Barton N, Kjaernsli B (1981) Shear strength of rockfill. J Geotech Eng Div Am Soc Civ Engrs 107(GT7):873–891
Breiman L, Friedman J, Olshen R, Stone C (1995) Classification and regression trees, Wadsworth International Group, Belmont CA (1986), cited in Gallant SI, Neural network learning and expert systems, The MIT Press, Third printing
Charles JA, Watts KS (1980) The influence of confining pressure on the shear strength of compacted rockfill. Géotechnique 30(4):353–367
De Mello VFB (1977) Reflexions on design decisions of practical significance to embankment dams. Géotechnique 27(3):279–355
Fahlman SE (1988) An empirical study of learning speed in back-propagation networks. CMU Technical Report CMU-CS-88-162
Fahlman SE, Lebière C (1991) The cascade-correlation learning architecture, CMU-CS-90-100, School of Computer Science. Carnegie Mellon University, Pittsburgh, PA
Franklin JA (1971) Triaxial strength of rock materials. Rock Mech 3(2):86–98
Frean M (1990) The upstart algorithm: a method for constructing and training feed forward neural networks. Neural Comput 2:198–209
Gallant SI (1986) Three constructive algorithms for network learning, Proceedings, eighth annual conference of the cognitive science society, Amherst, MA, August 15–17, pp 652–660
Gallant SI (1990) Perceptron-based learning algorithm. IEEE Trans Neural Netw 1(2):179–192
García SR, Romo MP, Taboada-Urtuzuástegui V, Mendoza M (2000) Sand behavior modeling using static and dynamic artificial neural networks. Serie de Investigación y Desarrollo, Institute of Engineering, UNAM, SID/631, México
García SR, Romo MR, Botero E (2007) A neurofuzzy system to analyze liquefaction-induced lateral spread. Soil Dyn Earthquake Eng 28(3):169–180
Gaziev E (2001) Out energy evaluation for brittle materials. Int J Solids Struct 38(42–43):7681–7690
Ghaboussi J, Sidarta DE (1998) New nested adaptive neural networks (NANN) for constitutive modeling. Comp Geotech 22(1):29–71
Ghaboussi J, Garret JH, Wu X (1991) Knowledge-based modeling of material behavior with neural networks. J Eng Mech Div Am Soc Civ Engrs 117(EM1):133–157
Indraratna B, Wijewardena LSS, Balasubramanian AS (1993) Large-scale triaxial testing of greywacke rockfill. Géotechnique 43(1):37–51
Marachi ND, Chan CK, Seed HB (1972) Evaluation of properties of rock materials. J Soil Mech Fdns Div Am Soc Civ Engrs 98(SM1):95–114
Marsal RJ, Ramírez de Arellano L (1967) Performance of El Infiernillo dam, 1963–1966. J Soil Mech Fdns Div Am Soc Civ Engrs 93(SM4):265–297
Mézard M, Nadal JP (1989) Learning feed forward layered networks: the tiling algorithm. J Phys A: Math Gen 22(22):2191–2203
Rahman MS (2002) Fuzzy neural network for liquefaction prediction. Soil Dyn Earthquake Eng 22(8):685–694
Romo MP (1999) Earthquake geotechnical engineering and artificial neural networks: 4th Arthur Casagrande Lecture, Proceedings of XI Pan American Conference on Soil Mechanics and Geotechnical Engineering, Special Volume.
Romo MP, García SR, Mendoza M, Taboada-Urtuzuástegui V (2001) Recurrent and constructive-algorithm networks for sand behavior modeling. Int J Geomech 1(4):371–387
Rumelhart DE, Hinton GE, Williams RJ (1986) In: Rumelhart DE, McClelland JL (eds) Learning internal representations by error propagation, parallel distributed processing: explorations in the microstructure of cognition, vol 1. MIT Press, Cambridge, pp 318–362
Specht D (1991) A general regression neural network. IEEE Trans Neural Netw 2(6):568–576
Theodoridis S, Koutroumbas K (1999) Pattern recognition. Academic Press, Elsevier Science, USA
Vega Pinto AA (1983) Previsao do comportamento structural de barragens de enrocamento. Laboratorio Nacional de Engenharia Civil, Lisboa
Vouille G, Laurent D (1969) Étude de la courbe intrinsèque de quelques granites. Revue de l’Industrie Minérale, Paris, Numero special, 25–28
Yoshida K, Hayashi Y, Imura AA (1989) Neural network expert system for diagnosing hepatobiliary disorders, MEDINFO ’89. Proceedings of the sixth conference on medical informatics, Beijing, October 16–20, cited in Gallant SI, Neural network learning and expert systems, The MIT Press, Third printing, pp 116–120
Acknowledgments
The authors are grateful to CONACyT for the support provided throughout the grant 33032 LI. Also they acknowledge the skillful editing by Eng. Mercedes Ortega, Arturo Paz and Roberto Soto.
Appendices
Appendix A
Appendix B
2.1 Neural Networks Background
The background on ANNs needed to follow the procedures used here is summarized below. A neural network is a computational model composed of a large number of simple processing units (neurones) that are massively interconnected, operate in parallel, process information in a distributed fashion and learn from experience (training examples). Feed forward networks and constructive algorithms are two major classes of network models. Feed forward networks, such as the popular multilayer perceptron (Rumelhart et al. 1986), are commonly used as representative models trained with a learning algorithm on a set of sampled input-output data.
The specialized notation for architecture definition used in this study (m × h × o) is interpreted as follows: m is the number of input cells, h is the number of processing units in the hidden layer and o represents the number of output cells.
2.1.1 Feed Forward Networks (FFN)
FFNs consist of one input layer, one or more hidden layers and one output layer. The input units are merely distribution cells, which pass all of the measured variables to all of the processing neurones in the second (hidden) layer. Each of these neurones computes an activation that is transmitted to other processing units. The input and output of a network computation are represented by the activation levels of designated input and output units, respectively. The many connections between these units vary in how efficiently they transmit this activation signal. What the network computes depends strongly on how the units are interconnected and on the strengths of the connections between them.
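As a minimal sketch of such a forward computation (the tanh hidden activation, linear output and random weights are illustrative assumptions of this sketch, not the paper's choices):

```python
import numpy as np

def forward(x, weights, biases):
    # Propagate the input through each layer: a weighted sum followed
    # by a tanh activation on hidden layers and a linear output layer.
    a = np.asarray(x, dtype=float)
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = W @ a + b
        a = np.tanh(z) if i < len(weights) - 1 else z
    return a

# A 2 x 3 x 1 network in the (m x h x o) notation used in this study
rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 2)), rng.standard_normal((1, 3))]
biases = [np.zeros(3), np.zeros(1)]
output = forward([0.5, -0.2], weights, biases)
```

The result is a single activation value, as for a network predicting one output variable from two inputs.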
2.1.2 Input Functions
Dot Product Input Function (DPF). The Dot Product is a weighted sum of the inputs plus a bias value. Intuitively, this scales each input according to its relative influence in increasing the net input to the node.
L1 Distance Input Function (L1). This function calculates the distance between two vectors. Thus the processing element automatically obtains the length to the input example.
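The two input functions can be sketched as follows (the function names and example values are illustrative, not part of the original text):

```python
import numpy as np

def dot_product_input(x, w, bias):
    # DPF: a weighted sum of the inputs plus a bias value
    return float(np.dot(w, x) + bias)

def l1_distance_input(x, w):
    # L1: city-block distance between the input vector and
    # the processing element's weight vector
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(w))))

print(dot_product_input([1.0, 2.0], [0.5, -1.0], 0.1))  # -1.4
print(l1_distance_input([1.0, 2.0], [0.0, 0.0]))        # 3.0
```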
2.1.3 Learning Rules
There exist a number of methods to carry out supervised learning.
Quick Propagation (QP). QP is a supervised learning algorithm which provides several useful heuristic procedures for minimizing the time required to find a good set of weights. These heuristic procedures automatically regulate the step size and detect conditions that accelerate learning. QP evaluates the trend of the weight updates over time so that the step size can be optimized (Fahlman 1988). In this paper, DPF is always used in conjunction with QP.
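A scalar sketch of the quickprop update (after Fahlman 1988) illustrates the step-size heuristic; the growth limit mu = 1.75 is Fahlman's suggested default, while the fall-back to plain gradient descent on the first step is an assumption of this sketch:

```python
def quickprop_step(grad, prev_grad, prev_step, lr=0.1, mu=1.75):
    # grad, prev_grad: dE/dw now and at the previous update;
    # prev_step: previous weight change; mu caps the step growth.
    if prev_step == 0.0:
        return -lr * grad                     # first step: gradient descent
    denom = prev_grad - grad
    if denom == 0.0:
        return mu * prev_step                 # flat secant: grow the last step
    step = prev_step * grad / denom           # jump toward the parabola minimum
    if abs(step) > mu * abs(prev_step):       # limit the growth factor
        step = mu * abs(prev_step) * (1.0 if step > 0 else -1.0)
    return step

# Minimizing E(w) = w**2 (dE/dw = 2w) starting from w = 1.0
w, prev_g, prev_s = 1.0, 0.0, 0.0
for _ in range(10):
    g = 2.0 * w
    s = quickprop_step(g, prev_g, prev_s)
    w, prev_g, prev_s = w + s, g, s
```

For this quadratic error surface the parabola jump lands on the minimum within a few updates.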
General Regression (GR). Originally developed in the probabilistic literature, this algorithm was modified by Donald Specht (1991) to approximate functions. It is based on the principle that the contribution one exemplar makes to the projection of another decreases exponentially with the distance between them. This algorithm has the advantage that its training is fast and that it can handle both linear and non linear data; however, it has trouble with irrelevant inputs (i.e. it suffers from the curse of dimensionality) and requires that all the training samples be stored for future use (i.e. prediction). The GR algorithm only works with the L1 Distance input function for calculating neurone activations.
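A minimal sketch of the GR prediction (after Specht 1991): each stored exemplar contributes with a weight that decays exponentially with its distance from the query. The L1 distance is used here to match the input function above, and the bandwidth sigma is an assumed smoothing parameter:

```python
import numpy as np

def grnn_predict(x, X_train, y_train, sigma=0.5):
    # Distance from the query to every stored exemplar (city-block)
    d = np.sum(np.abs(X_train - x), axis=1)
    # Exponentially decaying contribution of each exemplar
    w = np.exp(-d / sigma)
    # Prediction: the contribution-weighted average of stored outputs
    return float(np.sum(w * y_train) / np.sum(w))

# Three stored exemplars of a nonlinear relation
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 4.0])
pred = grnn_predict(np.array([1.0]), X, y)
```

Note that all three exemplars must be kept in memory at prediction time, which is the storage cost mentioned above.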
2.1.4 Constructive Algorithms
There are several constructive algorithms such as the tower and pyramid algorithm (Gallant 1986, 1990), the cascade-correlation algorithm (Fahlman and Lebière, 1991), the tiling algorithm (Mézard and Nadal 1989), and the upstart algorithm (Frean 1990). This study employs the cascade-correlation algorithm to establish a correlation between σ1 at failure and (σ3, Dr, Cue, dmax, Pa). In the case of cascade architectures the hidden units are added to the network one at a time and do not change after they have been adjoined. After each processing unit is added, the magnitude of the correlation between the new unit’s output and the residual error signal is maximized (Fahlman and Lebière 1991). The residual error is the net difference between the network output and the target (desired) output.
The Cascade-Correlation Algorithm (CCA) starts with n input branching nodes and j output neurones. There is no middle (hidden) layer at first, as shown in Fig. B-1. The weights at the output neurones are trained over all classes in the single layer perceptron mode (Rumelhart et al. 1986). Then a hidden neurone is added to the net and is trained using any well-known algorithm for single-layer networks. The input weights are frozen and all the output weights are trained again. The process is repeated until the residual error is acceptably small. Once this is achieved, a new hidden unit is created beginning with a candidate unit that receives trainable input connections from all of the network’s external inputs and from all pre-existing hidden units. Then, the correlation between the activation of the candidate unit and the residual error of the network is maximized by training all the links leading to the candidate unit. Learning is stopped when the correlation no longer improves. Finally the candidate unit with maximum correlation is selected, its incoming weights are frozen and it is adjoined to the net. The candidate unit is transformed into a hidden unit by generating links between the selected candidate unit and all the output units. Since the weights leading to the new hidden unit are frozen, a new permanent feature detector is obtained.
The algorithm is repeated until the overall error of the network falls below a chosen threshold value. The process described above is shown in Fig. B-1 for the case of two added hidden units. There, the vertical lines adjoin all incoming activations. Frozen connections are indicated with open boxes and the solid boxes represent the connections that are trained repeatedly. It is to be noted that training is done in stages, where the early stages are very quick because of the small size of the hidden layer. This type of network is very powerful for analyzing nonlinear problems. However, as the number of hidden neurones in the single hidden layer increases, the operational mode becomes progressively slower.
Thus, it is recommended to keep the network as small as possible. Constructive algorithms should, in principle, fit the training data better as new neurones are added. Generalization improves, but only up to the point where the data start being overfitted. The generalization problem may be dealt with in a fashion analogous to increasing the number of cells in the distributed method (Yoshida et al. 1989), applying techniques such as cross validation (Breiman et al. 1995) to determine how many cells to add. Such pruning (or destructive) algorithms can be very important for fitting data.
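The construction loop of Sect. 2.1.4 can be sketched in simplified form. The linear least-squares output training, tanh candidate units and the random search for the maximum-correlation candidate are simplifications assumed for this sketch; the original algorithm trains a pool of candidates by gradient ascent on the correlation:

```python
import numpy as np

def cascade_correlation_sketch(X, y, max_hidden=5, tol=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    A = np.hstack([X, np.ones((len(X), 1))])           # external inputs + bias
    hidden = []                                        # frozen feature detectors
    while True:
        # Retrain all output weights over the current columns of A
        w_out, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ w_out                          # residual error signal
        mse = float(np.mean(resid ** 2))
        if mse < tol or len(hidden) == max_hidden:
            return hidden, w_out, mse
        # Keep the candidate whose output best correlates with the
        # residual error (random search stands in for gradient ascent)
        best_c, best_w = -1.0, None
        for _ in range(50):
            w = rng.standard_normal(A.shape[1])
            out = np.tanh(A @ w)
            c = abs(np.corrcoef(out, resid)[0, 1])
            if c > best_c:
                best_c, best_w = c, w
        hidden.append(best_w)                          # freeze incoming weights
        # The new unit feeds every later candidate and the output layer
        A = np.hstack([A, np.tanh(A @ best_w)[:, None]])

# Fit y = x**2, which a purely linear map cannot represent
X = np.linspace(-1.0, 1.0, 40).reshape(-1, 1)
y = X[:, 0] ** 2
hidden, w_out, mse = cascade_correlation_sketch(X, y)
```

Because each round retrains the output weights over a superset of the previous columns, the residual error can only decrease as units are adjoined, mirroring the behaviour described above.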
García, S.R., Romo, M.P. Rockfill Strength Evaluation Using Cascade Correlation Networks. Geotech Geol Eng 27, 289–304 (2009). https://doi.org/10.1007/s10706-008-9229-9