Who’s that Actor? Automatic Labelling of Actors in TV Series Starting from IMDB Images

  • Rahaf Aljundi
  • Punarjay Chakravarty
  • Tinne Tuytelaars
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10113)


In this work, we aim to automatically label actors in a TV series. Rather than relying on transcripts and subtitles, as has been demonstrated in the past, we show how to achieve this goal starting from a set of example images of each of the main actors involved, collected from the Internet Movie Database (IMDB). The problem then becomes one of domain adaptation: actors’ IMDB photos are typically taken at awards ceremonies and are quite different from their appearances in TV series. Within each series, too, actor appearance changes considerably due to makeup, lighting, ageing, etc. To bridge this gap, we propose a graph-matching-based self-labelling algorithm, which we coin HSL (Hungarian Self Labeling). Further, we propose a new metric to be used in this context, as well as an extension that is more robust to outliers, where prototypical faces for each of the actors are selected based on a hierarchical clustering procedure. We conduct experiments with 15 episodes from 3 different TV series and demonstrate automatic annotation with an accuracy of 90% or higher.
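The idea of self-labelling through an assignment problem can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the details of HSL, so the loop below is only one plausible reading, under the assumptions that labelled faces are matched one-to-one to unlabelled faces by a minimum-cost assignment (Hungarian method), that matched faces inherit the label and join the labelled pool, and that the process repeats until every face is labelled. Plain Euclidean distance on toy 2-D points stands in for the metric proposed in the paper, and the names `min_cost_assignment` and `hungarian_self_label` are hypothetical.

```python
import math


def min_cost_assignment(cost):
    """Hungarian method (potentials variant) on a rectangular cost matrix.

    Returns sorted (row, col) pairs with minimum total cost; each index on
    the smaller side of the matrix is matched exactly once.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    if n_rows > n_cols:
        # The core loop needs rows <= cols, so solve the transposed problem.
        transposed = [list(col) for col in zip(*cost)]
        return sorted((r, c) for c, r in min_cost_assignment(transposed))
    inf = float("inf")
    u = [0.0] * (n_rows + 1)      # row potentials (1-based)
    v = [0.0] * (n_cols + 1)      # column potentials (1-based)
    match = [0] * (n_cols + 1)    # match[j] = row assigned to column j, 0 if free
    way = [0] * (n_cols + 1)      # back-pointers along the augmenting path
    for i in range(1, n_rows + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (n_cols + 1)
        used = [False] * (n_cols + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, n_cols + 1):
                if used[j]:
                    continue
                cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                if cur < minv[j]:
                    minv[j], way[j] = cur, j0
                if minv[j] < delta:
                    delta, j1 = minv[j], j
            for j in range(n_cols + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        while j0 != 0:            # flip the matching along the augmenting path
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    return sorted((match[j] - 1, j - 1) for j in range(1, n_cols + 1) if match[j])


def hungarian_self_label(source, target):
    """Assumed self-labelling loop (illustrative, not the paper's exact HSL).

    source: non-empty list of (feature, label); target: list of features.
    Returns one label per target feature, in input order.
    """
    labelled = list(source)
    labels = [None] * len(target)
    remaining = list(range(len(target)))
    while remaining:
        # Toy cost: Euclidean distance between labelled and unlabelled features.
        cost = [[math.dist(feat, target[t]) for t in remaining]
                for feat, _ in labelled]
        newly = []
        for row, col in min_cost_assignment(cost):
            t = remaining[col]
            labels[t] = labelled[row][1]
            newly.append(t)
        # Newly labelled target faces join the pool for the next round.
        labelled += [(target[t], labels[t]) for t in newly]
        remaining = [t for t in remaining if labels[t] is None]
    return labels


# Toy data: one "IMDB" example per actor, and TV-series faces that drift away.
source = [((0.0, 0.0), "A"), ((10.0, 0.0), "B")]
target = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (9.0, 0.0), (8.0, 0.0), (7.0, 0.0)]
result = hungarian_self_label(source, target)
```

In this toy run the faces closest to the labelled examples are matched first, and their labels then spread to faces further from the original examples. In practice an optimised solver such as SciPy's `linear_sum_assignment` could replace the pure-Python routine.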





This work was supported by the iMinds HiViz project. The first author’s PhD is funded by an FWO scholarship.

Supplementary material

Supplementary material 1: 416261_1_En_31_MOESM1_ESM.pdf (PDF, 1870 KB)



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Rahaf Aljundi (1)
  • Punarjay Chakravarty (1)
  • Tinne Tuytelaars (1)
  1. KU Leuven, ESAT-PSI, iMinds, Leuven, Belgium
