Fashion sub-categories and attributes prediction model using deep learning

  • Original article
The Visual Computer

Abstract

Classifying fashion clothing and items is challenging because category/sub-category classification and attribute prediction for numerous fashion items must be incorporated into a compact multitask learning infrastructure. The main aim of this research is to improve the categorization of fashion items and the prediction of their attributes from extracted visual features. We propose a novel fashion sub-categories and attributes prediction (FS\(_{\mathrm{C}}\)AP) model using deep learning techniques. In the proposed model, YOLO and DeepSORT architectures are used for person detection and tracking, a Faster-RCNN architecture is used for sub-category classification, and a Custom-EfficientNet-B3 architecture is designed for attribute prediction. Twenty-four distinct modules are designed to increase the attribute classification accuracy for detected fashion items against each sub-category. The performance of the proposed model is evaluated on a customized, fully annotated FashionItem dataset. The experimental results show that the proposed model outperforms recent baseline methods in fashion sub-category and attribute prediction.
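
The staged design described in the abstract (person detection and tracking, then sub-category classification, then a dedicated attribute module per sub-category) can be illustrated with a minimal routing sketch. This is not the authors' implementation: the function names, the box convention, and the sub-category labels ("shirt", "jeans") are hypothetical, and the three stages are passed in as plain callables standing in for YOLO/DeepSORT, Faster-RCNN, and the Custom-EfficientNet-B3 modules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2); hypothetical convention


@dataclass
class ItemPrediction:
    track_id: int          # identity of the tracked person
    box: Box               # location of the fashion item
    sub_category: str      # label from the sub-category stage
    attributes: List[str]  # labels from the matching attribute module


def run_pipeline(
    frame,
    detect_and_track: Callable[[object], List[Tuple[int, Box]]],
    classify_items: Callable[[object, Box], List[Tuple[Box, str]]],
    attribute_modules: Dict[str, Callable[[object, Box], List[str]]],
) -> List[ItemPrediction]:
    """Route each detected item to the attribute module of its sub-category."""
    results = []
    # Stage 1: person detection and tracking (stand-in for YOLO + DeepSORT).
    for track_id, person_box in detect_and_track(frame):
        # Stage 2: sub-category classification (stand-in for Faster-RCNN).
        for item_box, sub_category in classify_items(frame, person_box):
            # Stage 3: one attribute module per sub-category; items without
            # a registered module get an empty attribute list.
            module = attribute_modules.get(sub_category)
            attributes = module(frame, item_box) if module else []
            results.append(
                ItemPrediction(track_id, item_box, sub_category, attributes)
            )
    return results


# Stub stages with fixed outputs, used only to show the control flow.
def fake_detect_and_track(frame):
    return [(1, (0, 0, 100, 200))]


def fake_classify_items(frame, person_box):
    return [((10, 20, 90, 110), "shirt"), ((10, 110, 90, 200), "jeans")]


fake_modules = {"shirt": lambda frame, box: ["long sleeve", "collar"]}

predictions = run_pipeline(
    None, fake_detect_and_track, fake_classify_items, fake_modules
)
for p in predictions:
    print(p.track_id, p.sub_category, p.attributes)
```

In this sketch the dictionary lookup plays the role of selecting one of the twenty-four attribute modules; keeping the modules separate means each one only has to distinguish the attributes that are meaningful for its own sub-category.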

Author information

Corresponding author

Correspondence to Changbo Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest. The authors also declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Amin, M.S., Wang, C. & Jabeen, S. Fashion sub-categories and attributes prediction model using deep learning. Vis Comput 39, 3851–3864 (2023). https://doi.org/10.1007/s00371-022-02520-3

  • DOI: https://doi.org/10.1007/s00371-022-02520-3
