Abstract
Classifying fashion clothing and items is challenging because category/sub-category classification and attribute prediction for numerous fashion items must be incorporated into a compact multitask learning infrastructure. The main motive of this research is to improve the categorization of fashion items and the prediction of their attributes from extracted visual features. We propose a novel fashion sub-categories and attributes prediction (FS\(_{\mathrm{C}}\)AP) model using deep learning techniques. In the proposed model, YOLO and DeepSORT architectures are used for person detection and tracking, a Faster-RCNN architecture is used for sub-category classification, and a Custom-EfficientNet-B3 architecture is designed for attribute prediction. Twenty-four distinct modules are designed to increase the attribute classification accuracy for detected fashion items against each sub-category. The performance of the proposed model is evaluated on a customized, fully annotated FashionItem dataset. The experimental results show that the proposed model outperforms recent baseline methods in fashion sub-category and attribute prediction.
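The staged design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three stage functions are hypothetical stand-ins that return fixed values in place of the trained YOLO/DeepSORT, Faster-RCNN, and Custom-EfficientNet-B3 networks, and the sub-category names, attribute names, and box convention are invented for illustration. The sketch also assumes one attribute module per sub-category, which the abstract suggests but does not state outright.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical box convention: (x1, y1, x2, y2) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class ItemPrediction:
    track_id: int               # identity assigned by the tracking stage
    sub_category: str           # label from the sub-category stage
    box: Box                    # location of the fashion item
    attributes: Dict[str, str]  # output of the matching attribute module


def detect_and_track_persons(frame) -> List[Tuple[int, Box]]:
    """Stand-in for the YOLO + DeepSORT stage (fixed dummy output)."""
    return [(1, (0, 0, 100, 200))]


def classify_sub_categories(frame, person_box: Box) -> List[Tuple[str, Box]]:
    """Stand-in for the Faster-RCNN stage (fixed dummy output)."""
    return [("t-shirt", (10, 20, 90, 110)), ("jeans", (15, 110, 85, 195))]


# Assumed layout: one attribute module per sub-category. The abstract
# mentions twenty-four modules; only two placeholder modules are shown.
AttributeModule = Callable[[object, Box], Dict[str, str]]

ATTRIBUTE_MODULES: Dict[str, AttributeModule] = {
    "t-shirt": lambda frame, box: {"sleeve": "short", "neckline": "round"},
    "jeans": lambda frame, box: {"fit": "slim", "length": "full"},
}


def predict(frame) -> List[ItemPrediction]:
    """Run the three stages in sequence and route each item to its module."""
    results: List[ItemPrediction] = []
    for track_id, person_box in detect_and_track_persons(frame):
        for sub_category, item_box in classify_sub_categories(frame, person_box):
            module = ATTRIBUTE_MODULES.get(sub_category)
            # Items with no registered module get an empty attribute set.
            attributes = module(frame, item_box) if module else {}
            results.append(
                ItemPrediction(track_id, sub_category, item_box, attributes)
            )
    return results


predictions = predict(frame=None)  # the stand-ins ignore the frame
for p in predictions:
    print(p.track_id, p.sub_category, p.attributes)
```

Routing each detected item through a dictionary keyed by sub-category keeps the attribute heads independent, so a module for one sub-category can be retrained or replaced without touching the others.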
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest. The authors also declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Amin, M.S., Wang, C. & Jabeen, S. Fashion sub-categories and attributes prediction model using deep learning. Vis Comput 39, 3851–3864 (2023). https://doi.org/10.1007/s00371-022-02520-3
DOI: https://doi.org/10.1007/s00371-022-02520-3