
Multimedia Tools and Applications, Volume 77, Issue 23, pp 30233–30250

Bi-branch deconvolution-based convolutional neural network for image classification

  • Jingjuan Guo
  • Caihong Yuan
  • Zhiqiang Zhao
  • Ping Feng
  • Tianjiang Wang
  • Fang Liu

Abstract

With the rise of deep learning, convolutional neural networks have shown superior performance on many computer vision recognition tasks. Convolution is one of the most effective operations for extracting the detailed features of an image, whereas deconvolution is mostly used in semantic segmentation and saliency detection to obtain the contour information of an image, and is rarely used for image classification. In this paper, we propose a novel network named the bi-branch deconvolution-based convolutional neural network (BB-deconvNet), which is constructed mainly by stacking a simple proposed module named Zoom. The Zoom module has two branches that extract multi-scale features from the same feature map. In particular, deconvolution is employed in one of the branches, which can provide features distinct from those of regular convolution by zooming the learned feature maps. To verify the effectiveness of the proposed network, we conduct experiments on three object classification benchmarks (CIFAR-10, CIFAR-100, SVHN). The BB-deconvNet shows encouraging performance compared with other state-of-the-art deep CNNs.
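The two-branch idea in the abstract can be illustrated with a minimal NumPy sketch. Note that the concrete kernel sizes, strides, and the choice to merge the branches by channel-wise stacking are assumptions for illustration; the abstract does not specify these details of the actual Zoom module.

```python
import numpy as np

def conv3x3_same(x, k):
    # 3x3 convolution with zero padding 1: output has the same spatial size
    H, W = x.shape
    xp = np.pad(x, 1)
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * k)
    return out

def conv2x2_stride2(x, k):
    # 2x2 convolution with stride 2: halves each spatial dimension
    H, W = x.shape
    out = np.zeros((H // 2, W // 2))
    for i in range(H // 2):
        for j in range(W // 2):
            out[i, j] = np.sum(x[2 * i:2 * i + 2, 2 * j:2 * j + 2] * k)
    return out

def deconv2x2_stride2(x, k):
    # 2x2 transposed convolution ("deconvolution") with stride 2:
    # each input value spreads a scaled kernel over a 2x2 output block
    H, W = x.shape
    out = np.zeros((2 * H, 2 * W))
    for i in range(H):
        for j in range(W):
            out[2 * i:2 * i + 2, 2 * j:2 * j + 2] += x[i, j] * k
    return out

def zoom_module(x, k_conv, k_down, k_up):
    # Branch 1: regular 3x3 conv keeps resolution, extracting detail features
    branch1 = conv3x3_same(x, k_conv)
    # Branch 2: downsample, then deconvolve back to the original resolution,
    # yielding coarser-scale ("zoomed") features of the same spatial size
    branch2 = deconv2x2_stride2(conv2x2_stride2(x, k_down), k_up)
    # Merge the two branches by stacking along a channel axis
    return np.stack([branch1, branch2])

x = np.random.rand(8, 8)
out = zoom_module(x, np.ones((3, 3)) / 9, np.ones((2, 2)) / 4, np.ones((2, 2)))
print(out.shape)  # (2, 8, 8): both branches preserve the spatial size
```

Because the deconvolution branch first reduces and then restores resolution, its output encodes a coarser view of the input than the detail-preserving convolution branch, so the stacked result mixes two scales of the same feature map.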

Keywords

Image classification, Bi-branch convolutional neural network, Deconvolution, Multi-scale

Notes

Acknowledgments

This work was supported by the Natural Science Foundation of China (Grants 61572214 and U1536203) and the Independent Innovation Research Fund of Huazhong University of Science and Technology (Project No. 2016YXMS089).


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
  2. School of Information Science and Technology, Jiujiang University, Jiujiang, China
  3. School of Computer and Information Engineering, Henan University, Kaifeng, China
