Abstract
To address the low accuracy of clothing attribute recognition caused by factors such as scale variation, occlusion, and objects extending beyond the image boundary, this paper proposes a clothing attribute recognition algorithm based on an improved YOLOv4-Tiny. With YOLOv4-Tiny as the base model, the multi-scale feature extraction module Res2Net is first adopted to optimize the backbone network, enlarging the receptive field of each layer and extracting richer fine-grained multi-scale clothing features. Then, the three feature layers output by the feature extraction network are up-sampled, and high-level semantic features are fused with shallow features to obtain rich shallow fine-grained feature information. Finally, the K-Means clustering algorithm is employed to optimize the anchor box parameters, yielding anchor boxes that better match clothing objects and improving the fit between clothing attribute characteristics and the network. Experimental results demonstrate that the proposed method outperforms the original YOLOv4-Tiny network in accuracy, speed, and number of model parameters, making it better suited for deployment on resource-limited embedded devices.
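The anchor-optimization step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it clusters ground-truth box sizes (width, height) with K-Means using 1 − IoU as the distance, the metric commonly used for anchor selection in the YOLO family. The function names (`iou_wh`, `kmeans_anchors`) and the fixed iteration count are assumptions for the sketch.

```python
import random

def iou_wh(box, cluster):
    """IoU between two (w, h) pairs, assuming aligned top-left corners."""
    inter = min(box[0], cluster[0]) * min(box[1], cluster[1])
    union = box[0] * box[1] + cluster[0] * cluster[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) box sizes with 1 - IoU as the distance metric."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(iters):
        # Assign each box to the cluster with the highest IoU
        # (equivalently, the smallest 1 - IoU distance).
        groups = [[] for _ in range(k)]
        for b in boxes:
            idx = max(range(k), key=lambda i: iou_wh(b, centers[i]))
            groups[idx].append(b)
        # Recompute each center as the per-dimension mean of its group.
        new_centers = []
        for i, g in enumerate(groups):
            if g:
                new_centers.append((sum(b[0] for b in g) / len(g),
                                    sum(b[1] for b in g) / len(g)))
            else:
                new_centers.append(centers[i])  # keep empty clusters fixed
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return sorted(centers)
```

In practice the resulting anchors are sorted by area and assigned to the detection heads from small to large, so that each output scale receives anchors matching the clothing object sizes it is responsible for.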
Availability of data and materials
Upon request.
Acknowledgements
The authors would like to thank Jing Feng for fruitful discussions on this work.
Funding
None.
Author information
Contributions
MG and WH designed and performed the research; JL analyzed the data; all authors contributed to the writing and revisions.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
About this article
Cite this article
Gu, M., Hua, W. & Liu, J. Clothing attribute recognition algorithm based on improved YOLOv4-Tiny. SIViP 17, 3555–3563 (2023). https://doi.org/10.1007/s11760-023-02580-5