Learning Activation Functions by Means of Kernel Based Neural Networks

  • Giuseppe Marra (corresponding author)
  • Dario Zanca
  • Alessandro Betti
  • Marco Gori
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11946)


The neuron activation function plays a fundamental role in the complexity of learning. In particular, it is widely known that in recurrent networks the learning of long-term dependencies is problematic due to vanishing (or exploding) gradients, and that this problem is directly related to the structure of the employed activation function. In this paper, we study the problem of learning neuron-specific activation functions through kernel-based neural networks (KBNN) and we make the following contributions. First, we give a representation theorem which indicates that the best activation function is a kernel expansion over the training set, which is then approximated with an opportune set of points modeling 1-D clusters. Second, we extend the idea to recurrent networks, where the expressiveness of KBNN can be a determinant factor in capturing long-term dependencies. We provide results on some key experiments which clearly show the effectiveness of KBNN when compared with RNN and LSTM cells.
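The kernel-expansion idea sketched in the abstract can be illustrated with a minimal NumPy implementation. This is an assumption-laden sketch, not the paper's exact construction: it models each neuron's activation as a weighted sum of Gaussian kernels evaluated against a small, fixed dictionary of 1-D points (standing in for the "opportune set of points modeling 1-D clusters"), with the mixing coefficients as the learnable, neuron-specific parameters. The class name, dictionary placement, and kernel bandwidth are all illustrative choices.

```python
import numpy as np

class KernelActivation:
    """Per-neuron activation modeled as a kernel expansion
    f(s) = sum_i alpha_i * K(s, c_i) with a Gaussian kernel K.
    Hypothetical sketch: dictionary points c_i are fixed on a grid,
    and only the mixing coefficients alpha are meant to be learned."""

    def __init__(self, n_neurons, n_points=10, gamma=1.0, rng=None):
        rng = np.random.default_rng(rng)
        # Fixed 1-D dictionary points, shared across neurons (assumption)
        self.centers = np.linspace(-3.0, 3.0, n_points)          # (D,)
        # One coefficient vector per neuron -> neuron-specific shape
        self.alpha = 0.1 * rng.standard_normal((n_neurons, n_points))
        self.gamma = gamma

    def __call__(self, s):
        # s: (batch, n_neurons) matrix of pre-activations
        # Gaussian kernel between each pre-activation and each center
        diff = s[..., None] - self.centers                       # (batch, N, D)
        K = np.exp(-self.gamma * diff ** 2)
        # Weighted kernel expansion, evaluated per neuron
        return np.einsum('bnd,nd->bn', K, self.alpha)

# Usage: apply the learned activation to a batch of pre-activations
act = KernelActivation(n_neurons=4, n_points=8, rng=0)
out = act(np.zeros((2, 4)))
print(out.shape)  # (2, 4)
```

Because each neuron owns its own coefficient vector, the shape of its nonlinearity can differ from neuron to neuron, which is the expressiveness the paper leverages in the recurrent setting.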



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Giuseppe Marra (1, 2), corresponding author
  • Dario Zanca (2)
  • Alessandro Betti (1, 2)
  • Marco Gori (2)
  1. DINFO, University of Firenze, Florence, Italy
  2. DIISM, University of Siena, Siena, Italy
