Applied Intelligence

Volume 48, Issue 7, pp 1707–1720

A novel softplus linear unit for deep convolutional neural networks

  • Huizhen Zhao
  • Fuxian Liu
  • Longyue Li
  • Chang Luo


Abstract

Recent improvements in the performance of deep neural networks are partly due to the introduction of rectified linear units (ReLUs). A ReLU activation function outputs zero for all negative inputs, which causes some neurons to die and shifts the mean of the outputs away from zero; this bias shift induces oscillations and impedes learning. Following the principle that zero-mean activations improve learning, a softplus linear unit (SLU) is proposed as an adaptive activation function that can speed up learning and improve performance in deep convolutional neural networks. First, to reduce the bias shift, negative inputs are processed with the softplus function, and a general form of the SLU function is proposed. Second, the parameters of the positive component are fixed to control vanishing gradients. Third, update rules for the parameters of the negative component are established to meet the requirements of backpropagation. Finally, deep auto-encoder networks were designed and evaluated on the MNIST dataset for unsupervised learning, and deep convolutional neural networks were designed and evaluated on the CIFAR-10 dataset for supervised learning. The experiments show that SLU-based networks converge faster and achieve better image-classification performance than networks with rectified activation functions.
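The abstract fixes the positive branch to a unit slope and learns a softplus-based negative branch. The exact parameterization is not given here, so the following NumPy sketch is only an illustration: `alpha` and `beta` are assumed stand-ins for the learnable negative-branch parameters, and the shift by log 2 is an assumption made to keep the function continuous at zero.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def slu(x, alpha=1.0, beta=1.0):
    """Sketch of a softplus linear unit (SLU).

    Positive inputs pass through with a fixed unit slope (to control
    vanishing gradients); negative inputs go through a scaled softplus
    whose parameters (alpha, beta here) would be updated by
    backpropagation. Subtracting log(2) makes the two branches meet
    at x = 0 and lets negative inputs produce nonzero outputs,
    pushing mean activations toward zero.
    """
    return np.where(x >= 0.0,
                    x,
                    alpha * (softplus(beta * x) - np.log(2.0)))
```

As with ELU, the negative branch saturates at a finite negative value (here −alpha·log 2) rather than clipping to zero, which is what reduces the bias shift relative to ReLU.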


Keywords: Deep learning · Deep convolutional neural network · Softplus function · Rectified linear unit



Acknowledgments

This work was supported by grants from Air Force Engineering University. The authors would like to thank all of the team members of the D605 Laboratory.



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • Huizhen Zhao (1), corresponding author
  • Fuxian Liu (1)
  • Longyue Li (1)
  • Chang Luo (1)

  1. Air and Missile Defense College, Air Force Engineering University (AFEU), Xi’an, People’s Republic of China
