Optimisation in an Error Backpropagation Neural Network Environment with a Performance Test on a Spectral Pattern Classification Problem

Chapter in: Spatial Analysis and GeoComputation

Abstract

This paper attempts to develop a mathematically rigorous framework for minimising the cross-entropy function in an error backpropagation environment. In doing so, we derive the backpropagation formulae for evaluating the partial derivatives in a computationally efficient way. Various techniques for optimising the multiple-class cross-entropy error function to train single hidden layer neural network classifiers with softmax output transfer functions are investigated on a real-world multispectral pixel-by-pixel classification problem that is of fundamental importance in remote sensing. These techniques include epoch-based and batch versions of gradient descent, PR-conjugate gradient, and BFGS quasi-Newton error backpropagation. The method of choice depends on the nature of the learning task and on whether one wants to optimise learning for speed or for classification performance. Of the techniques compared, gradient descent error backpropagation provided the best and most stable out-of-sample performance across batch and epoch-based modes of operation. If the goal is to maximise learning speed and some loss of classification accuracy is acceptable, then PR-conjugate gradient error backpropagation tends to be superior. If the training set is very large, stochastic epoch-based versions of local optimisers should be chosen, with a larger rather than a smaller epoch size, to avoid unacceptable instabilities in the classification results.
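The kind of derivative computation the abstract refers to can be illustrated with a minimal sketch. It is not the chapter's own derivation: the logistic hidden transfer function, the omission of bias terms, the network size, and all names (`forward`, `backprop`, `numeric_grad`) are assumptions made here for illustration. The sketch computes the partial derivatives of the multiple-class cross-entropy for one pattern in a single hidden layer network with softmax outputs, and checks them against central differences.

```python
import math
import random

def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(x, W1, W2):
    """Single hidden layer: logistic hidden units (assumed), softmax outputs."""
    h = [1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(row, x))))
         for row in W1]
    y = softmax([sum(w * hj for w, hj in zip(row, h)) for row in W2])
    return h, y

def cross_entropy(x, t, W1, W2):
    """Multiple-class cross-entropy for one pattern with 1-of-C target t."""
    _, y = forward(x, W1, W2)
    return -sum(tc * math.log(yc) for tc, yc in zip(t, y))

def backprop(x, t, W1, W2):
    """Partial derivatives of the cross-entropy w.r.t. W1 and W2."""
    h, y = forward(x, W1, W2)
    # Softmax combined with cross-entropy gives the output delta y_c - t_c.
    d_out = [yc - tc for yc, tc in zip(y, t)]
    # Hidden delta: back-propagated output deltas times the logistic derivative.
    d_hid = [hj * (1.0 - hj) * sum(d_out[c] * W2[c][j] for c in range(len(W2)))
             for j, hj in enumerate(h)]
    gW1 = [[dj * xi for xi in x] for dj in d_hid]
    gW2 = [[dc * hj for hj in h] for dc in d_out]
    return gW1, gW2

def numeric_grad(W, r, c, loss, eps=1e-6):
    """Central-difference estimate of d(loss)/dW[r][c]; W is restored after."""
    old = W[r][c]
    W[r][c] = old + eps
    up = loss()
    W[r][c] = old - eps
    down = loss()
    W[r][c] = old
    return (up - down) / (2.0 * eps)

# Hypothetical toy problem: 3 inputs, 2 hidden units, 4 classes.
random.seed(0)
x = [0.2, -0.5, 0.9]
t = [0.0, 1.0, 0.0, 0.0]
W1 = [[random.uniform(-0.5, 0.5) for _ in x] for _ in range(2)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in t]

gW1, gW2 = backprop(x, t, W1, W2)
loss = lambda: cross_entropy(x, t, W1, W2)
max_err = max(
    abs(g[r][c] - numeric_grad(W, r, c, loss))
    for W, g in ((W1, gW1), (W2, gW2))
    for r in range(len(W)) for c in range(len(W[r]))
)
print("largest gap between backprop and finite differences:", max_err)
```

Summing such per-pattern gradients over the whole training set gives a batch gradient, and summing over a subset gives an epoch-based one. An optimiser such as gradient descent, conjugate gradient, or BFGS would then consume that gradient.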

Copyright information

© 2006 Springer Berlin · Heidelberg

About this chapter

Cite this chapter

Staufer, P. (2006). Optimisation in an Error Backpropagation Neural Network Environment with a Performance Test on a Spectral Pattern Classification Problem. In: Spatial Analysis and GeoComputation. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-35730-0_10
