
Clothing attribute recognition algorithm based on improved YOLOv4-Tiny

  • Original Paper
Signal, Image and Video Processing

Abstract

To address the low accuracy of clothing attribute recognition caused by factors such as scale variation, occlusion, and objects extending beyond the image boundary, this paper proposes a novel clothing attribute recognition algorithm based on an improved YOLOv4-Tiny. With YOLOv4-Tiny as the base model, the multi-scale feature extraction module Res2Net is first adopted to optimize the backbone network, enlarging the receptive field of each layer and extracting richer fine-grained multi-scale clothing features. Next, the three feature layers output by the feature extraction network are up-sampled, and high-level semantic features are fused with shallow features to obtain rich shallow fine-grained information. Finally, the K-Means clustering algorithm is employed to optimize the anchor box parameters, yielding anchor boxes better matched to clothing objects and improving the fit between clothing attribute characteristics and the network. Experimental results demonstrate that the proposed method outperforms the original YOLOv4-Tiny network in accuracy, speed, and model size, making it better suited for deployment on resource-limited embedded devices.
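The abstract does not detail the K-Means anchor optimization. The sketch below shows the standard approach used in the YOLO family: clustering ground-truth (width, height) box sizes with 1 − IoU as the distance metric so that anchors match the dataset's object shapes. The function names and the quantile-based initialization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """Pairwise IoU between (w, h) sizes, assuming a shared top-left corner."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100):
    """Cluster ground-truth (w, h) sizes with 1 - IoU distance.

    Returns k anchor sizes sorted by area (small to large).
    """
    boxes = np.asarray(boxes, dtype=float)
    # Deterministic init: pick boxes at evenly spaced area quantiles.
    order = np.argsort(boxes.prod(axis=1))
    anchors = boxes[order[np.linspace(0, len(boxes) - 1, k).astype(int)]].copy()
    for _ in range(iters):
        # Assign each box to the anchor with the highest IoU (lowest 1 - IoU).
        assign = iou_wh(boxes, anchors).argmax(axis=1)
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors.prod(axis=1))]
```

For YOLOv4-Tiny, which predicts at two scales with three anchors each, one would typically run this with k = 6 and split the sorted anchors between the two detection heads.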



Availability of data and materials

Upon request.


Acknowledgements

The authors would like to thank Jing Feng for fruitful discussions on this work.

Funding

None.

Author information

Authors and Affiliations

Authors

Contributions

MG and WH designed and performed the research; JL analyzed the data; all authors contributed to the writing and revisions.

Corresponding author

Correspondence to Meihua Gu.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gu, M., Hua, W. & Liu, J. Clothing attribute recognition algorithm based on improved YOLOv4-Tiny. SIViP 17, 3555–3563 (2023). https://doi.org/10.1007/s11760-023-02580-5

