Sitcom-Stars Oriented Video Advertising via Clothing Retrieval

  • Haijun Zhang
  • Yuzhu Ji (corresponding author)
  • Wang Huang
  • Linlin Liu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10828)


This paper introduces DeepLink, a novel learning-based framework for video content-based advertising that links sitcom stars to online stores through clothing retrieval, using state-of-the-art deep convolutional neural networks (CNNs). Concretely, several deep CNN models make up the sub-modules of DeepLink: human-body detection, human-pose selection, face verification, clothing detection, and retrieval from an advertisement (ad) pool built from clothing images collected from real-world online stores. For clothing detection and retrieval from ad images, we first transfer state-of-the-art deep CNN models to our data domain and then train the corresponding models on our large-scale clothing datasets. Extensive experimental results demonstrate the feasibility and efficacy of the proposed clothing-based video-advertising system.
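The final retrieval stage can be illustrated with a minimal sketch. It assumes that each clothing region detected in a video frame, and each image in the ad pool, has already been mapped to a fixed-length feature vector by a CNN. The abstract does not state which similarity measure DeepLink uses, so cosine similarity is an assumption here, and the names `cosine_similarity`, `retrieve_ads`, and the toy ad identifiers are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_ads(
    query: Sequence[float],
    ad_pool: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank ads by similarity to the query and return the top_k (id, score) pairs."""
    scored = [(ad_id, cosine_similarity(query, feat)) for ad_id, feat in ad_pool.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional features standing in for CNN embeddings (assumption:
# real embeddings would have hundreds or thousands of dimensions).
ad_pool = {
    "ad_red_dress": [0.9, 0.1, 0.0],
    "ad_blue_jeans": [0.0, 0.2, 0.9],
    "ad_red_coat": [0.8, 0.3, 0.1],
}
query_feature = [1.0, 0.2, 0.0]  # hypothetical feature of a detected garment
top_matches = retrieve_ads(query_feature, ad_pool, top_k=2)
```

With these toy vectors, `top_matches` ranks `ad_red_dress` first and `ad_red_coat` second. A linear scan like this is only practical for small pools; a large ad pool would typically call for an approximate or hashing-based index.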


Video advertising, Deep learning, Object detection, Face verification, Image retrieval, Clothing detection



This work was supported in part by the Natural Science Foundation of China under Grant 61572156 and in part by the Shenzhen Science and Technology Program under Grant JCYJ20170413105929681.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Haijun Zhang (1)
  • Yuzhu Ji (1, corresponding author)
  • Wang Huang (1)
  • Linlin Liu (1)
  1. Department of Computer Science, Shenzhen Graduate School, Harbin Institute of Technology, Shenzhen, China
