Comparison of new activation functions in neural network for forecasting financial time series

Original Article · Neural Computing and Applications

Abstract

In artificial neural networks (ANNs), the activation functions most used in practice are the logistic sigmoid and the hyperbolic tangent. Activation functions are widely reported to play an important role in the convergence of learning algorithms. In this paper, we evaluate the use of different activation functions and propose three new simple ones, complementary log-log, probit and log-log, as activation functions intended to improve the performance of neural networks. Financial time series were used to evaluate the performance of ANN models built with these new activation functions and to compare it with the performance obtained with activation functions already established in the literature. This evaluation is performed with two learning algorithms: conjugate gradient backpropagation with Fletcher–Reeves updates and Levenberg–Marquardt.
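
For orientation, the three proposed activations coincide with inverse link functions from the generalized linear models literature (Nelder and Wedderburn 1972; McCullagh and Nelder 1989). The sketch below writes them in their standard textbook forms alongside the logistic baseline; it illustrates the usual definitions, not the paper's own code, and the authors' exact parameterisation may differ.

```python
import numpy as np
from scipy.stats import norm

def cloglog(x):
    """Complementary log-log: inverse of log(-log(1 - p)); asymmetric around 0."""
    return 1.0 - np.exp(-np.exp(x))

def loglog(x):
    """Log-log: inverse of -log(-log(p)); the mirror image of cloglog."""
    return np.exp(-np.exp(-x))

def probit(x):
    """Probit: the standard normal CDF, a symmetric, slightly steeper sigmoid."""
    return norm.cdf(x)

def logistic(x):
    """Logistic sigmoid, the usual baseline activation."""
    return 1.0 / (1.0 + np.exp(-x))
```

All four map the real line onto (0, 1), but cloglog and log-log approach their asymptotes at different rates on each side of zero; this asymmetry is what distinguishes them from the symmetric logistic and probit shapes. On the training side, the Fletcher–Reeves variant of conjugate gradient differs from steepest descent only in how the search direction is assembled from successive gradients. A minimal sketch of that standard update follows (again, not the paper's implementation; the argument names are illustrative):

```python
def fletcher_reeves_direction(grad_new, grad_old, dir_old):
    """One Fletcher-Reeves update: beta_k = ||g_k||^2 / ||g_{k-1}||^2,
    then d_k = -g_k + beta_k * d_{k-1}."""
    beta = (grad_new @ grad_new) / (grad_old @ grad_old)
    return -grad_new + beta * dir_old  # new conjugate search direction
```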

References

  1. Box GEP, Jenkins GM (1976) Time series analysis, forecasting and control. Holden Day, San Francisco

  2. Engle RF (1982) Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50:987–1008

  3. Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signals Syst 2:303–314

  4. White H (1990) Connectionist nonparametric regression: multilayer feedforward networks can learn arbitrary mappings. Neural Net 3:535–550

  5. Gallant AR, White H (1992) On learning the derivatives of an unknown mapping with multilayer feedforward networks. Neural Net 5:129–138

  6. Huang W, Lai K, Nakamori Y, Wang S, Yu L (2007) Neural networks in finance and economics forecasting. Int J Inf Technol Decision Making 6(1):113–140

  7. Bollerslev T (1990) Modeling the coherence in short-run nominal exchange rates: a multivariate generalized ARCH model. Rev Econ Statist 72:498–505

  8. Chandra P, Singh Y (2004) An activation function adapting training algorithm for sigmoidal feedforward networks. Neurocomputing 61:429–437

  9. Duch W, Jankowski N (1999) Survey of neural transfer functions. Neural Comput Surv 2:163–212

  10. Duch W, Jankowski N (2001) Transfer functions: hidden possibilities for better neural networks. In 9th European symposium on artificial neural networks, pp 81–94

  11. Singh Y, Chandra P (2003) A class +1 sigmoidal activation functions for FFANNs. J Econ Dyn Control 28(1):183–187

  12. Pao YH (1989) Adaptive pattern recognition and neural networks, 2nd edn. Addison-Wesley, New York

  13. Hartman E, Keeler JD, Kowalski JM (1990) Layered neural networks with Gaussian hidden units as universal approximators. Neural Comput 2(2):210–215

  14. Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Net 4(2):251–257

  15. Hornik K (1993) Some new results on neural network approximation. Neural Net 6(9):1069–1072

  16. Leshno M, Lin VY, Pinkus A, Schocken S (1993) Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Net 6(6):861–867

  17. Leung H, Haykin S (1993) Rational function neural network. Neural Comput 5(6):928–938

  18. Giraud B, Lapedes A, Liu LC, Lemm J (1995) Lorentzian neural nets. Neural Net 8(5):757–767

  19. Skoundrianos EN, Tzafestas SG (2004) Modelling and FDI of dynamic discrete time systems using a MLP with a new sigmoidal activation function. J Intell Robotics Syst 41(1):19–36

  20. Ma L, Khorasani K (2005) Constructive feedforward neural networks using Hermite polynomial activation functions. IEEE Trans Neural Net 16(4):821–833

  21. Wen C, Ma X (2005) A max-piecewise-linear neural network for function approximation. Neurocomputing 71:843–852

  22. Efe MO (2008) Novel neuronal activation functions for feedforward neural networks. Neural Process Lett 28:63–79

  23. Gomes GSS, Ludermir TB (2008) Complementary log-log and probit: activation functions implemented in artificial neural networks. In: 8th International conference on hybrid intelligent systems. IEEE Computer Society, pp 939–942

  24. Fletcher R, Reeves CM (1964) Function minimization by conjugate gradients. Comput J 7:149–154

  25. Hagan MT, Demuth HB, Beale MH (1996) Neural network design. PWS Publishing, Boston

  26. Hagan MT, Menhaj M (1994) Training feedforward networks with the Marquardt algorithm. IEEE Trans Neural Net 5(6):989–993

  27. Marquardt D (1963) An algorithm for least-squares estimation of nonlinear parameters. SIAM J Appl Math 11:431–441

  28. Cybenko G (1988) Continuous valued neural networks with two hidden layers are sufficient. Technical report, Department of Computer Science, Tufts University, Medford, MA

  29. Hara K, Nakayama K (1994) Comparison of activation functions in multilayer neural network for pattern classification. In: International conference on neural networks, vol 5. IEEE World Congress on Computational Intelligence, pp 2997–3002

  30. Nelder JA, Wedderburn RWM (1972) Generalized linear models. J R Statist Soc A 135:370–384

  31. McCullagh P, Nelder JA (1989) Generalized linear models, 2nd edn. Chapman and Hall, London

  32. Bliss CI (1935) The calculation of the dosage-mortality curve. Ann Appl Biol 22:134–167

  33. Collett D (1994) Modelling survival data in medical research. Chapman and Hall, London

  34. Dobson AJ (2002) An introduction to generalized linear models, 2nd edn. Chapman and Hall, New York

  35. Haykin S (2001) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall, New Jersey

Acknowledgments

The authors would like to thank CNPq and FACEPE for their financial support. They also gratefully acknowledge the anonymous referees, whose comments improved the clarity of this work.

Author information

Corresponding author

Correspondence to Teresa B. Ludermir.

About this article

Cite this article

da S. Gomes, G.S., Ludermir, T.B. & Lima, L.M.M.R. Comparison of new activation functions in neural network for forecasting financial time series. Neural Comput & Applic 20, 417–439 (2011). https://doi.org/10.1007/s00521-010-0407-3
