Abstract
In artificial neural networks (ANNs), the activation functions most used in practice are the logistic sigmoid and the hyperbolic tangent. The choice of activation function plays an important role in the convergence of learning algorithms. In this paper, we evaluate the use of different activation functions and propose three simple new ones, complementary log-log, probit and log-log, as activation functions intended to improve the performance of neural networks. Financial time series were used to evaluate the performance of ANN models with these new activation functions and to compare it against activation functions existing in the literature. This evaluation is performed through two learning algorithms: conjugate gradient backpropagation with Fletcher–Reeves updates and Levenberg–Marquardt.
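The three proposed functions take their names from the corresponding link functions in generalized linear models. As a minimal sketch, assuming the paper uses the standard GLM inverse-link forms (the exact parameterization in the paper may differ), they can be written as bounded, monotone maps from the real line to (0, 1):

```python
import math

def cloglog(x: float) -> float:
    """Complementary log-log activation: 1 - exp(-exp(x))."""
    return 1.0 - math.exp(-math.exp(x))

def loglog(x: float) -> float:
    """Log-log activation: exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

def probit(x: float) -> float:
    """Probit activation: standard normal CDF, computed via erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
```

Unlike the logistic sigmoid, cloglog and loglog are asymmetric around zero (e.g. cloglog(0) = 1 - e^{-1} ≈ 0.632), which is one property that distinguishes them as candidate activations.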
Acknowledgments
The authors would like to thank CNPq and FACEPE for their financial support. The authors gratefully acknowledge the anonymous referees, whose comments improved the clarity of this work.
da S. Gomes, G.S., Ludermir, T.B. & Lima, L.M.M.R. Comparison of new activation functions in neural network for forecasting financial time series. Neural Comput & Applic 20, 417–439 (2011). https://doi.org/10.1007/s00521-010-0407-3