Bi-branch deconvolution-based convolutional neural network for image classification

Abstract

With the rise of deep neural networks, convolutional neural networks (CNNs) have shown superior performance on many computer vision recognition tasks. Convolution is one of the most efficient ways to extract the detailed features of an image, while deconvolution is mostly used for semantic segmentation and saliency detection to obtain the contour information of the image and is rarely used for image classification. In this paper, we propose a novel network named the bi-branch deconvolution-based convolutional neural network (BB-deconvNet), which is constructed mainly by stacking a simple proposed module named Zoom. The Zoom module has two branches that extract multi-scale features from the same feature map. In particular, deconvolution is introduced into one of the branches, where zooming the learned feature maps provides features distinct from those of regular convolution. To verify the effectiveness of the proposed network, we conduct several experiments on three object classification benchmarks (CIFAR-10, CIFAR-100, and SVHN). BB-deconvNet shows encouraging performance compared with other state-of-the-art deep CNNs.
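
The abstract does not state the layer settings of the Zoom module, so the following is only an illustrative sketch of how a deconvolution (transposed convolution) branch can enlarge a feature map while a regular convolution branch keeps its size. The output-size formulas are the standard ones for convolution and transposed convolution. The function names and the kernel, stride, and padding values are assumptions chosen for illustration and are not taken from the paper.

```python
def conv_out(n, k, s=1, p=0):
    """Spatial size after a regular convolution.

    n: input size, k: kernel size, s: stride, p: zero padding.
    """
    return (n + 2 * p - k) // s + 1


def deconv_out(n, k, s=1, p=0, output_padding=0):
    """Spatial size after a transposed convolution (deconvolution)."""
    return (n - 1) * s - 2 * p + k + output_padding


def zoom_branch_sizes(n):
    """Trace feature-map sizes through two hypothetical branches.

    All hyperparameters below are assumed for illustration only.
    """
    # Branch A: a size-preserving 3x3 convolution.
    conv_branch = conv_out(n, k=3, s=1, p=1)
    # Branch B: a 2x2, stride-2 transposed convolution doubles the map ...
    zoomed = deconv_out(n, k=2, s=2, p=0)
    # ... and a 2x2, stride-2 convolution brings it back to the input size,
    # so the two branches could be merged at the same resolution.
    restored = conv_out(zoomed, k=2, s=2, p=0)
    return conv_branch, zoomed, restored


if __name__ == "__main__":
    # CIFAR-10, CIFAR-100, and SVHN images are 32x32 pixels.
    print(zoom_branch_sizes(32))
```

Under these assumed settings, both branches end at the same spatial size, while the second branch has seen the feature map at twice the resolution in between. That is one possible reading of "multi-scale features from the same feature map"; the configuration the authors actually use may differ.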

Acknowledgments

This work is supported by the Natural Science Foundation of China (Grants 61572214 and U1536203) and the Independent Innovation Research Fund sponsored by Huazhong University of Science and Technology (Project No. 2016YXMS089).

Author information

Corresponding authors

Correspondence to Jingjuan Guo or Tianjiang Wang.

About this article

Cite this article

Guo, J., Yuan, C., Zhao, Z. et al. Bi-branch deconvolution-based convolutional neural network for image classification. Multimed Tools Appl 77, 30233–30250 (2018). https://doi.org/10.1007/s11042-018-6130-2

Keywords

  • Image classification
  • Bi-branch convolutional neural network
  • Deconvolution
  • Multi-scale