
IX-ResNet: fragmented multi-scale feature fusion for image classification

Published in: Multimedia Tools and Applications

Abstract

With the continuous in-depth study of convolutional neural networks in computer vision, improving the performance of network structures has remained a central focus of research. Recent works have shown that multi-scale feature concatenation, shortcut connections and grouped convolution can effectively train deeper networks and improve their accuracy and effectiveness. In this paper, we present a novel feature transformation strategy, fragmented multi-scale feature fusion, and propose an efficient modularized image classification network, IX-ResNet, based on it. IX-ResNet consists of many large isomorphic modules stacked in the form of a residual network, while each large module is composed of many small heterogeneous modules. The performance of IX-ResNet is verified on the CIFAR-10, CIFAR-100 and ImageNet-1K datasets; the results indicate that the IX-ResNet model using the fragmented multi-scale feature fusion strategy can further improve accuracy compared to the original grouped-convolution network ResNeXt with the same or even fewer parameters.
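The abstract combines three ingredients: splitting features into channel groups (grouped convolution), filtering each group at a different scale before concatenating (multi-scale feature fusion), and wrapping the result in a shortcut connection. The paper's actual IX-ResNet block design is not given in this preview, so the following is only a minimal, hypothetical NumPy sketch of that combination; the fragment count, kernel sizes and random depthwise filters are illustrative assumptions, not the authors' architecture.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive single-channel 2D convolution with 'same' zero padding.
    x: (H, W); kernel: (k, k) with odd k."""
    k = kernel.shape[0]
    p = k // 2
    xp = np.pad(x, p)
    H, W = x.shape
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * kernel)
    return out

def fragmented_multiscale_block(x, kernel_sizes=(1, 3, 5, 7), rng=None):
    """Toy fragmented multi-scale fusion block (illustrative, not the
    paper's design): split channels into fragments, filter each fragment
    with a different kernel size, concatenate, and add a shortcut."""
    rng = np.random.default_rng(0) if rng is None else rng
    C = x.shape[0]
    g = len(kernel_sizes)
    assert C % g == 0, "channels must divide evenly into fragments"
    fragments = np.split(x, g, axis=0)          # grouped (fragmented) features
    outs = []
    for frag, k in zip(fragments, kernel_sizes):
        kernel = rng.normal(size=(k, k)) / (k * k)  # random depthwise filter
        outs.append(np.stack([conv2d_same(ch, kernel) for ch in frag]))
    fused = np.concatenate(outs, axis=0)        # multi-scale concatenation
    return x + fused                            # residual shortcut

x = np.random.default_rng(1).normal(size=(8, 16, 16))
y = fragmented_multiscale_block(x)
print(y.shape)  # (8, 16, 16) -- the block is shape-preserving
```

Because each fragment sees a different receptive field, the concatenated output mixes scales at constant cost, while the identity shortcut keeps the block safely stackable, which is what lets such modules be piled into a deep residual network.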



Acknowledgements

This research was supported by the Shaanxi Province Technical Innovation Foundation (grant No. 2020CGXNG-012).

Author information

Corresponding author

Correspondence to Tao Xue.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article


Cite this article

Xue, T., Hong, Y. IX-ResNet: fragmented multi-scale feature fusion for image classification. Multimed Tools Appl 80, 27855–27865 (2021). https://doi.org/10.1007/s11042-021-10893-1

