Fully Convolutional Network with Superpixel Parsing for Fashion Web Image Segmentation

  • Lixuan Yang
  • Helena Rodriguez
  • Michel Crucianu
  • Marin Ferecatu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10132)


In this paper we introduce a new method for extracting deformable clothing items from still images by extending the output of a Fully Convolutional Neural Network (FCN) to infer context from local units (superpixels). To achieve this, we optimize, over the space of all possible pixel labellings, an energy function that combines the large-scale structure of the image with the local low-level visual descriptions of superpixels. To assess our method, we compare it to the unmodified FCN used as a baseline, as well as to the well-known Paper Doll and Co-parsing methods for fashion images.
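The abstract does not give the form of the energy function, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a unary term obtained by averaging FCN class probabilities inside each superpixel, a Potts pairwise term weighted by the similarity of a hypothetical per-superpixel descriptor (here the mean intensity), and iterated conditional modes (ICM) as a simple stand-in minimizer. All function names, the weight `lam` and the bandwidth `sigma` are assumptions made for the example.

```python
import numpy as np

def superpixel_unaries(probs, segments, n_seg):
    """Unary cost: -log of the mean FCN class probability inside each superpixel."""
    k = probs.shape[-1]
    sums = np.zeros((n_seg, k))
    np.add.at(sums, segments.ravel(), probs.reshape(-1, k))
    counts = np.bincount(segments.ravel(), minlength=n_seg)[:, None]
    return -np.log(sums / np.maximum(counts, 1) + 1e-9)

def adjacent_pairs(segments):
    """Sorted list of (a, b), a < b, for superpixels sharing a horizontal or vertical border."""
    pairs = set()
    for a, b in ((segments[:, :-1], segments[:, 1:]), (segments[:-1], segments[1:])):
        mask = a != b
        pairs.update((int(min(u, v)), int(max(u, v))) for u, v in zip(a[mask], b[mask]))
    return sorted(pairs)

def energy(labels, unary, edges, weights, lam):
    """E(l) = sum_i U_i(l_i) + lam * sum_(i,j) w_ij * [l_i != l_j]."""
    data = unary[np.arange(len(labels)), labels].sum()
    smooth = sum(w for (i, j), w in zip(edges, weights) if labels[i] != labels[j])
    return data + lam * smooth

def icm(unary, edges, weights, lam=1.0, n_iter=10):
    """Greedy coordinate-wise minimization of the energy above (local optimum only)."""
    labels = unary.argmin(axis=1)
    neighbours = [[] for _ in range(len(labels))]
    for (i, j), w in zip(edges, weights):
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))
    for _ in range(n_iter):
        changed = False
        for i in range(len(labels)):
            cost = unary[i].copy()
            for j, w in neighbours[i]:
                cost += lam * w              # every label pays the Potts penalty...
                cost[labels[j]] -= lam * w   # ...except the neighbour's current label
            best = int(cost.argmin())
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:
            break
    return labels

# Toy data (made up): three superpixels in a row, two classes.
segments = np.array([[0, 0, 1, 1, 2, 2]] * 2)
p0 = np.tile(np.array([0.9, 0.9, 0.45, 0.45, 0.1, 0.1]), (2, 1))
probs = np.stack([p0, 1.0 - p0], axis=-1)          # stand-in for FCN softmax output
image = np.tile(np.array([0.2, 0.2, 0.25, 0.25, 0.9, 0.9]), (2, 1))

# Hypothetical low-level descriptor: mean intensity of each superpixel.
desc = np.bincount(segments.ravel(), weights=image.ravel()) / np.bincount(segments.ravel())
sigma, lam = 0.2, 1.0
edges = adjacent_pairs(segments)
weights = [float(np.exp(-(desc[i] - desc[j]) ** 2 / sigma ** 2)) for i, j in edges]

unary = superpixel_unaries(probs, segments, 3)
init = unary.argmin(axis=1)                         # FCN-only decision per superpixel
labels = icm(unary, edges, weights, lam=lam)        # decision after energy minimization
pixel_labels = labels[segments]                     # map superpixel labels back to pixels
```

In this toy setting the middle superpixel, which the stand-in FCN scores only weakly, is relabelled to agree with the visually similar neighbour, while the dissimilar neighbour exerts almost no pull. A stronger optimizer such as graph cuts or mean-field inference could replace ICM without changing the structure of the sketch.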


Clothing extraction, Semantic segmentation, FCN, Superpixel parsing



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Lixuan Yang (1, 2)
  • Helena Rodriguez (2)
  • Michel Crucianu (1)
  • Marin Ferecatu (1)
  1. Conservatoire National des Arts et Metiers, Paris, France
  2. Shopedia SAS, Paris, France
