Abstract
This paper develops a mathematically rigorous framework for minimising the cross-entropy function in an error backpropagation setting. In doing so, we derive the backpropagation formulae for evaluating the partial derivatives in a computationally efficient way. Various techniques for optimising the multiple-class cross-entropy error function to train single hidden layer neural network classifiers with softmax output transfer functions are investigated on a real-world multispectral pixel-by-pixel classification problem that is of fundamental importance in remote sensing. These techniques include epoch-based and batch versions of gradient descent, PR-conjugate gradient, and BFGS quasi-Newton error backpropagation. The method of choice depends on the nature of the learning task and on whether one wants to optimise learning for speed or for classification performance. In the comparison, gradient descent error backpropagation provided the best and most stable out-of-sample performance across batch and epoch-based modes of operation. If the goal is to maximise learning speed and a sacrifice in classification accuracy is acceptable, then PR-conjugate gradient error backpropagation tends to be superior. If the training set is very large, stochastic epoch-based versions of local optimisers should be chosen, with a larger rather than a smaller epoch size, to avoid unacceptable instabilities in the classification results.
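As an illustration of the kind of computation the abstract refers to, the following is a minimal sketch of the multiple-class cross-entropy error and its backpropagated gradients for a single hidden layer network with softmax outputs. It is not taken from the chapter: the logistic hidden transfer function, the one-hot target coding, and all names (`forward`, `cross_entropy`, `gradients`, `W1`, `b1`, `W2`, `b2`) are assumptions made for the example.

```python
import numpy as np

def softmax(a):
    # Subtract the row-wise maximum for numerical stability.
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, W1, b1, W2, b2):
    # Hidden layer: logistic transfer function (an assumption of this sketch).
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    # Output layer: softmax, so each row of Y sums to one.
    Y = softmax(H @ W2 + b2)
    return H, Y

def cross_entropy(Y, T):
    # Multiple-class cross-entropy summed over patterns; T is one-hot coded.
    return -np.sum(T * np.log(np.clip(Y, 1e-12, 1.0)))

def gradients(X, T, W1, b1, W2, b2):
    H, Y = forward(X, W1, b1, W2, b2)
    # With softmax outputs and cross-entropy error, the output-layer
    # error signal reduces to (Y - T).
    d_out = Y - T
    gW2 = H.T @ d_out
    gb2 = d_out.sum(axis=0)
    # Propagate the error signal back through the logistic hidden units.
    d_hid = (d_out @ W2.T) * H * (1.0 - H)
    gW1 = X.T @ d_hid
    gb1 = d_hid.sum(axis=0)
    return gW1, gb1, gW2, gb2
```

A batch gradient descent step would subtract a learning rate times each of these gradients from the corresponding parameter. The conjugate gradient and quasi-Newton variants named in the abstract use the same gradients but choose the search direction differently.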
Copyright information
© 2006 Springer Berlin · Heidelberg
About this chapter
Cite this chapter
Staufer, P. (2006). Optimisation in an Error Backpropagation Neural Network Environment with a Performance Test on a Spectral Pattern Classification Problem. In: Spatial Analysis and GeoComputation. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-35730-0_10
DOI: https://doi.org/10.1007/3-540-35730-0_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35729-2
Online ISBN: 978-3-540-35730-8
eBook Packages: Business and Economics; Economics and Finance (R0)