
Multistage Newton’s Approach for Training Radial Basis Function Neural Networks

  • Original Research
  • Published in SN Computer Science

Abstract

A systematic four-step batch approach is presented for the second-order training of radial basis function (RBF) neural networks for estimation. First, it is shown that second-order training works best when applied separately to several disjoint parameter subsets. Newton’s method is used to find distance measure weights, leading to a kind of embedded feature selection. Next, separate Newton’s algorithms are developed for the RBF spread parameters, center vectors, and output weights. The final algorithm’s training error per iteration and per multiply is compared to that of other algorithms, showing that its convergence speed is reasonable. For several widely available datasets, it is shown that the tenfold testing errors of the final algorithm are smaller than those for recursive least squares, the error correction algorithm, support vector regression, and Levenberg–Marquardt.
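The stagewise idea described above, applying Newton's method to one disjoint parameter subset at a time, can be illustrated for the simplest stage: the output weights. With the centers and spreads held fixed, the squared-error subproblem is quadratic, so a single regularized Newton step reduces to a linear solve. The sketch below is a minimal illustration under those assumptions, not the authors' four-stage algorithm, and the function names (rbf_activations, newton_output_weights) are hypothetical.

```python
# Minimal sketch of a block-wise Newton update for an RBF network:
# solve for the output weights with centers and spreads held fixed.
# Illustrative only; not the paper's exact multistage algorithm.
import numpy as np

def rbf_activations(X, centers, spreads):
    """Gaussian RBF activations for inputs X (N x d), centers (K x d), spreads (K,)."""
    # Squared distances from each input to each center, shape (N, K)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * spreads ** 2))

def newton_output_weights(X, y, centers, spreads, ridge=1e-8):
    """One Newton step for the output weights (exact for this quadratic subproblem)."""
    Phi = rbf_activations(X, centers, spreads)           # N x K design matrix
    H = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])       # Hessian of the quadratic, regularized
    g = Phi.T @ y                                        # right-hand side (gradient term)
    return np.linalg.solve(H, g)                         # w = H^{-1} g

# Example usage with synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
centers = X[rng.choice(200, size=10, replace=False)]     # centers picked from the data
spreads = np.full(10, 1.0)
w = newton_output_weights(X, y, centers, spreads)
y_hat = rbf_activations(X, centers, spreads) @ w
```

In a full multistage scheme, analogous Newton updates for the distance measure weights, spreads, and center vectors would be interleaved with this linear solve; those stages are genuinely nonlinear and require their own Hessian calculations.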





Author information


Corresponding author

Correspondence to Chinmay Rane.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Tyagi, K., Rane, C., Irie, B. et al. Multistage Newton’s Approach for Training Radial Basis Function Neural Networks. SN COMPUT. SCI. 2, 366 (2021). https://doi.org/10.1007/s42979-021-00757-8


