Improved inception-residual convolutional neural network for object recognition

Abstract

Machine learning and computer vision have driven many of the greatest advances in the modeling of Deep Convolutional Neural Networks (DCNNs). Most current research focuses on improving recognition accuracy with better DCNN models and learning approaches. The recurrent convolutional approach has seen little use beyond a few DCNN architectures. On the other hand, Inception-v4 and Residual networks have quickly become popular in the computer vision community. In this paper, we introduce a new DCNN model called the Inception Recurrent Residual Convolutional Neural Network (IRRCNN), which utilizes the power of the Recurrent Convolutional Neural Network (RCNN), the Inception network, and the Residual network. This approach improves the recognition accuracy of the Inception-residual network with the same number of network parameters. In addition, the proposed architecture generalizes the Inception network, the RCNN, and the Residual network with significantly improved training accuracy. We have empirically evaluated the performance of the IRRCNN model on different benchmarks including CIFAR-10, CIFAR-100, TinyImageNet-200, and CU3D-100. The experimental results show higher recognition accuracy than most of the popular DCNN models, including the RCNN. We have also investigated the performance of the IRRCNN approach against its Equivalent Inception Network (EIN) and Equivalent Inception Residual Network (EIRN) counterparts on the CIFAR-100 dataset. We report improvements in classification accuracy of around 4.53%, 4.49%, and 3.56% over the RCNN, EIN, and EIRN, respectively, on the CIFAR-100 dataset. Furthermore, experiments were conducted on the TinyImageNet-200 and CU3D-100 datasets, where the IRRCNN provides better testing accuracy than the Inception Recurrent CNN, the EIN, the EIRN, Inception-v3, and Wide Residual Networks.
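
As a rough illustration of how the three ideas named in the abstract can be combined, the following NumPy sketch builds a toy one-dimensional block: two parallel recurrent convolutional branches with different kernel sizes (the Inception idea), concatenated, projected back to the input width, and added to the input (the Residual idea). It is not the authors' implementation. The function names, the 1D setting, the kernel sizes, the number of recurrent steps, and the random weights are all assumptions made for illustration only.

```python
import numpy as np


def conv1d_same(x, w):
    """'Same'-padded 1D cross-correlation.

    x: (in_channels, length)
    w: (out_channels, in_channels, kernel_size), kernel_size must be odd.
    """
    k = w.shape[2]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    # Sliding windows over the padded input: (in_channels, length, kernel_size)
    windows = np.stack([xp[:, i:i + x.shape[1]] for i in range(k)], axis=2)
    return np.einsum("oik,ilk->ol", w, windows)


def relu(x):
    return np.maximum(x, 0.0)


def recurrent_conv(u, w_f, w_r, steps):
    """Recurrent convolutional layer (illustrative form).

    The feed-forward response to the input u is held fixed, and the state
    is refined `steps` times by adding a convolution of the previous state.
    """
    ff = conv1d_same(u, w_f)
    x = relu(ff)
    for _ in range(steps):
        x = relu(ff + conv1d_same(x, w_r))
    return x


def irr_block(u, params, steps=2):
    """Toy inception-recurrent-residual block (hypothetical layout)."""
    b1 = recurrent_conv(u, params["f1"], params["r1"], steps)  # kernel size 1
    b3 = recurrent_conv(u, params["f3"], params["r3"], steps)  # kernel size 3
    merged = np.concatenate([b1, b3], axis=0)         # Inception-style concat
    projected = conv1d_same(merged, params["proj"])   # back to input channels
    return relu(u + projected)                        # residual connection


# Usage with assumed sizes: 4 input channels, 3 channels per branch, length 8.
rng = np.random.default_rng(0)
C, B, L = 4, 3, 8
params = {
    "f1": rng.normal(scale=0.1, size=(B, C, 1)),
    "r1": rng.normal(scale=0.1, size=(B, B, 1)),
    "f3": rng.normal(scale=0.1, size=(B, C, 3)),
    "r3": rng.normal(scale=0.1, size=(B, B, 3)),
    "proj": rng.normal(scale=0.1, size=(C, 2 * B, 1)),
}
u = rng.normal(size=(C, L))
out = irr_block(u, params)
print(out.shape)  # (4, 8)
```

Because the block in this sketch returns an output with the same shape as its input, several such blocks can be stacked without extra reshaping, which is what the residual addition requires.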

Author information

Correspondence to Tarek M. Taha.

About this article

Cite this article

Alom, M.Z., Hasan, M., Yakopcic, C. et al. Improved inception-residual convolutional neural network for object recognition. Neural Comput & Applic 32, 279–293 (2020). https://doi.org/10.1007/s00521-018-3627-6

Keywords

  • DCNN
  • RCNN
  • Inception network
  • Residual network
  • Deep learning