Spatio-Temporal Wardrobe Generation of Actors’ Clothing in Video Content

  • Florian Vandecasteele
  • Jeroen Vervaeke
  • Baptist Vandersmissen
  • Michel De Wachter
  • Steven Verstockt (email author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9733)


In this paper, we propose a methodology for spatio-temporal wardrobe generation for video content. The main goal is to suggest relevant matches between the clothes worn by actors and product images from a set of e-commerce clothing sites. Fine-grained spatial metadata is generated semi-automatically for each video sequence using shot detection, keyframe detection, feature matching, and filtering based on clothing type classification. The result of this annotation process is a spatio-temporal database that links each video to the clothing worn by its actors. This database can be queried in various ways, depending on the intended target application.
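To make the idea of a queryable spatio-temporal database more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not describe a schema, so every name here (`ClothingAnnotation`, its fields, `query`, the sample records and the `shop.example` URLs) is a hypothetical placeholder. It assumes each annotation ties an actor's clothing item to a time interval (the shot) and a bounding box in the keyframe.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ClothingAnnotation:
    """One spatio-temporal annotation. All field names are assumptions."""
    video_id: str
    actor: str
    clothing_type: str               # e.g. "dress", "jacket"
    start_s: float                   # shot start, in seconds
    end_s: float                     # shot end, in seconds
    bbox: Tuple[int, int, int, int]  # x, y, width, height in the keyframe
    product_url: str                 # matched e-commerce item (placeholder)


def query(
    annotations: List[ClothingAnnotation],
    video_id: str,
    at_s: Optional[float] = None,
    actor: Optional[str] = None,
    clothing_type: Optional[str] = None,
) -> List[ClothingAnnotation]:
    """Return annotations of one video, optionally filtered by a
    timestamp, an actor, and/or a clothing type."""
    results = []
    for a in annotations:
        if a.video_id != video_id:
            continue
        if at_s is not None and not (a.start_s <= at_s <= a.end_s):
            continue
        if actor is not None and a.actor != actor:
            continue
        if clothing_type is not None and a.clothing_type != clothing_type:
            continue
        results.append(a)
    return results


# Hypothetical sample data, for illustration only.
WARDROBE = [
    ClothingAnnotation("ep01", "actor_a", "dress", 12.0, 18.5,
                       (40, 60, 120, 300), "https://shop.example/item/1"),
    ClothingAnnotation("ep01", "actor_b", "jacket", 12.0, 18.5,
                       (300, 50, 140, 220), "https://shop.example/item/2"),
    ClothingAnnotation("ep01", "actor_a", "coat", 40.0, 52.0,
                       (80, 40, 150, 320), "https://shop.example/item/3"),
]

# "What is on screen at 15 s?" versus "What does actor_a wear in this video?"
on_screen = query(WARDROBE, "ep01", at_s=15.0)
actor_wardrobe = query(WARDROBE, "ep01", actor="actor_a")
```

The two example queries show how the same records could serve different target applications: a time-based lookup for a second-screen shopping view, and an actor-based lookup for a per-character wardrobe overview.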


Video summarization, Shot detection, Clothing annotation, Metadata enrichment, Deep learning



SpotShop is a research project facilitated by iMinds and funded by the Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT).



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Florian Vandecasteele (1)
  • Jeroen Vervaeke (1)
  • Baptist Vandersmissen (1)
  • Michel De Wachter (2)
  • Steven Verstockt (1, email author)

  1. ELIS Department - Data Science Lab, Ghent University, iMinds, Ghent, Belgium
  2. Appinness, Aalst, Belgium
