
Computational Statistics, Volume 28, Issue 5, pp 1881–1914

Performance of derivative free search ANN training algorithm with time series and classification problems

  • Shamsuddin Ahmed
Original Paper

Abstract

A Manhattan search algorithm for minimizing an artificial neural network (ANN) error function is outlined in this paper. From its current position in Cartesian coordinates, a search vector probes orthogonal directions to locate a lower function value. When consecutive probes succeed in reducing the error, the algorithm computes an optimized step length for rapid convergence; this longer step identifies a favorable descent direction. The method suits complex error surfaces where derivative information is difficult to obtain, and nearly flat surfaces where the rate of change in the function value is almost negligible and most derivative-based training algorithms struggle. Because the algorithm avoids derivative information entirely, it is an attractive alternative when derivative-based algorithms face difficulty with complex ridges and flat valleys. If the search becomes trapped in a local minimum, the search vector steps out of it by exploring descent directions in the neighborhood. The algorithm thus differs from first- and second-order derivative-based training methods. To measure its performance, an electric energy generation model for the Fiji Islands is estimated and the “L-T” letter recognition problem is solved. Bootstrap analysis shows that the algorithm’s predictive and classification abilities are high. The algorithm is reliable when the solution to a problem is unknown, and it can therefore be used to establish benchmark solutions.
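
The paper’s full text is not reproduced here, so the following is only a minimal sketch of the kind of derivative-free, orthogonal (Manhattan-style) search the abstract describes: probe each coordinate direction, accept any move that lowers the error, expand the step after a successful pass, and contract it near a flat region or minimum. The function name `manhattan_search`, the expansion/contraction factors, and the toy quadratic error used in the demo are illustrative assumptions, not the author’s implementation.

```python
import numpy as np

def manhattan_search(f, w0, step=0.5, expand=2.0, shrink=0.5,
                     min_step=1e-8, max_iter=10_000):
    """Sketch of a derivative-free orthogonal (Manhattan-style) search.

    Probes each coordinate of the weight vector in the +/- step
    directions. After a pass with at least one successful descent
    move, the step length is expanded (a longer move along the
    favorable direction); after a failed pass it is contracted,
    which also lets the search re-explore the neighborhood when
    it is stuck near a local minimum or a flat region.
    """
    w = np.asarray(w0, dtype=float)
    best = f(w)
    for _ in range(max_iter):
        improved = False
        for i in range(w.size):            # orthogonal probes
            for delta in (step, -step):
                trial = w.copy()
                trial[i] += delta
                val = f(trial)
                if val < best:             # accept descent move
                    w, best, improved = trial, val, True
                    break
        if improved:
            step *= expand                 # optimized longer step on success
        else:
            step *= shrink                 # contract near a flat surface
            if step < min_step:
                break                      # step exhausted: stop
    return w, best

if __name__ == "__main__":
    # Toy quadratic error surface standing in for an ANN error function;
    # in training, f would wrap a forward pass over the data set and
    # return the error for the weight vector w.
    f = lambda w: (w[0] - 1.0) ** 2 + 10.0 * (w[1] + 2.0) ** 2
    w_opt, err = manhattan_search(f, np.zeros(2))
    print(w_opt, err)   # approaches (1, -2) with error near 0
```

Because the sketch evaluates only function values, it needs no gradient of the error surface, which is the property the abstract emphasizes for ridged or nearly flat error landscapes.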

Keywords

Optimization · Neural network · Manhattan search · Derivative-free · Exploratory search · Convergence · Bootstrap · Benchmark


Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Graduate School of Business, The University of the South Pacific, Suva, Fiji
