
Role of Group Level Affect to Find the Most Influential Person in Images

  • Shreya Ghosh
  • Abhinav Dhall
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11130)

Abstract

Group affect analysis is an important cue for predicting various group traits. In general, group affect, emotional responses, eye gaze and the positions of people in an image are the key cues for identifying an important person in a group. The main focus of this paper is to explore the importance of group affect in finding the representative of a group. We call this person the “Most Influential Person” (for the first impression) or the “leader” of the group. To identify the main visual cues for the “Most Influential Person”, we conducted a user survey. Based on the survey statistics, we annotated the influential persons in 1000 images of the Group AFfect database (GAF 2.0) via the LabelMe toolbox, yielding the proposed “GAF-personage database”. To identify the “Most Influential Person”, we propose a deep-neural-network-based Multiple Instance Learning (Deep MIL) method that takes deep facial features as input. To obtain these features, we first predict each person’s emotion probabilities via a capsule network (CapsNet) and rank the detected faces on that basis. We then extract deep facial features of the top-3 faces via a VGG-16 network. Our method performs better than maximum-facial-area and saliency-based importance methods, and achieves human-level perception of the “Most Influential Person” at the group level.
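As a rough illustration of the pipeline the abstract describes (ranking faces by CapsNet emotion confidence, extracting VGG-16 features for the top-3 faces, and pooling them as a MIL bag), here is a minimal sketch in Keras, which the paper cites [3]. It is not the authors’ implementation: `detect_faces` and `emotion_capsnet` are hypothetical placeholders, and the max-pooled MIL head is one common MIL pooling choice, not necessarily the exact Deep MIL model used in the paper.

```python
# Minimal, hypothetical sketch of the described pipeline; "detect_faces"
# and "emotion_capsnet" are placeholders, not real published models.
import numpy as np
from tensorflow.keras import Model, layers
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

# VGG-16 penultimate layer ("fc2") as a 4096-d deep facial feature extractor.
vgg = VGG16(weights="imagenet", include_top=True)
feature_net = Model(vgg.input, vgg.get_layer("fc2").output)

def top3_bag(image, detect_faces, emotion_capsnet):
    """Rank detected faces by CapsNet emotion confidence; return a 3-instance bag."""
    faces = detect_faces(image)                        # 224x224x3 crops, e.g. via MTCNN [37]
    probs = emotion_capsnet.predict(np.stack(faces))   # per-face emotion probabilities
    top3 = np.argsort(probs.max(axis=1))[::-1][:3]     # keep the 3 most confident faces
    crops = preprocess_input(np.stack([faces[i] for i in top3]).astype("float32"))
    return feature_net.predict(crops)                  # bag: 3 instances x 4096 features

# MIL head: score each instance, then max-pool over the bag so the bag-level
# prediction ("this set contains the influential person") follows the best instance.
bag = layers.Input(shape=(3, 4096))
instance_scores = layers.TimeDistributed(layers.Dense(1, activation="sigmoid"))(bag)
mil_model = Model(bag, layers.GlobalMaxPooling1D()(instance_scores))
mil_model.compile(optimizer="adam", loss="binary_crossentropy")
```

Usage would be, e.g., `mil_model.predict(top3_bag(img, detect_faces, emotion_capsnet)[None])`; in the paper’s setting, such a model would be trained against the GAF-personage annotations.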

Keywords

Important person · Group of people · Group-level affect

Notes

Acknowledgement

We acknowledge the support of NVIDIA for providing us with a TITAN Xp G5X 12 GB GPU for this research. We thank the anonymous reviewers for their insightful comments and helpful suggestions, which improved the quality of this paper. We would also like to thank the members of the LASII lab for their support.

References

  1. Andrews, S., Tsochantaridis, I., Hofmann, T.: Support vector machines for multiple-instance learning. In: Advances in Neural Information Processing Systems, pp. 577–584 (2003)
  2. Barsade, S.G., Gibson, D.E.: Group emotion: a view from top and bottom. Composition (1998)
  3. Chollet, F., et al.: Keras (2015)
  4. Dhall, A., Goecke, R., Gedeon, T.: Automatic group happiness intensity analysis. IEEE Trans. Affect. Comput. 6(1), 13–26 (2015)
  5. Dhall, A., Goecke, R., Ghosh, S., Joshi, J., Hoey, J., Gedeon, T.: From individual to group-level emotion recognition: EmotiW 5.0. In: ACM ICMI (2017)
  6. Dhall, A., Joshi, J., Radwan, I., Goecke, R.: Finding happiest moments in a social context. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012. LNCS, vol. 7725, pp. 613–626. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37444-9_48
  7. Dietterich, T.G., Lathrop, R.H., Lozano-Pérez, T.: Solving the multiple instance problem with axis-parallel rectangles. Artif. Intell. 89(1–2), 31–71 (1997)
  8. Elazary, L., Itti, L.: Interesting objects are visually salient. J. Vis. 8(3), 3–3 (2008)
  9. Ertugrul, I.O., Jeni, L.A., Cohn, J.F.: FACSCaps: pose-independent facial action coding with capsules
  10. Gallagher, A.C., Chen, T.: Understanding images of groups of people. In: IEEE CVPR (2009)
  11. Garcez, A.D., Zaverucha, G.: Multi-instance learning using recurrent neural networks. In: 2012 International Joint Conference on Neural Networks (IJCNN), pp. 1–6. IEEE (2012)
  12. Ge, W., Collins, R.T., Ruback, R.B.: Vision-based analysis of small groups in pedestrian crowds. IEEE Trans. Pattern Anal. Mach. Intell. 34(5), 1003–1016 (2012)
  13. Ghosh, S., Dhall, A., Sebe, N.: Automatic group affect analysis in images via visual attribute and feature networks. In: IEEE International Conference on Image Processing (ICIP). IEEE (2018)
  14. Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: Advances in Neural Information Processing Systems, pp. 545–552 (2007)
  15. Hernandez, J., Hoque, M.E., Drevo, W., Picard, R.W.: Mood meter: counting smiles in the wild. In: ACM UbiComp (2012)
  16. Hou, X., Harel, J., Koch, C.: Image signature: highlighting sparse salient regions. IEEE Trans. Pattern Anal. Mach. Intell. 34(1), 194–201 (2012)
  17. Huang, X., Dhall, A., Zhao, G., Goecke, R., Pietikäinen, M.: Riesz-based volume local binary pattern and a novel group expression model for group happiness intensity analysis. In: BMVC (2015)
  18. Hwang, S.J., Grauman, K.: Learning the relative importance of objects from tagged images for retrieval and cross-modal search. Int. J. Comput. Vis. 100(2), 134–153 (2012)
  19. Ilse, M., Tomczak, J.M., Welling, M.: Attention-based deep multiple instance learning. arXiv preprint arXiv:1802.04712 (2018)
  20. Jiang, M., Xu, J., Zhao, Q.: Saliency in crowd. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8695, pp. 17–32. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10584-0_2
  21. Li, S., Deng, W., Du, J.: Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2584–2593. IEEE (2017)
  22. Li, W.H., Li, B., Zheng, W.S.: PersonRank: detecting important people in images. In: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 234–241. IEEE (2018)
  23. Mou, W., Gunes, H., Patras, I.: Alone versus in-a-group: a comparative analysis of facial affect recognition. In: ACM Multimedia (2016)
  24. Ramachandran, P., Zoph, B., Le, Q.V.: Swish: a self-gated activation function. arXiv preprint arXiv:1710.05941 (2017)
  25. Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3043–3053 (2016)
  26. Redl, F.: Group emotion and leadership. Psychiatry 5(4), 573–596 (1942)
  27. Russell, B.C., Torralba, A., Murphy, K.P., Freeman, W.T.: LabelMe: a database and web-based tool for image annotation. Int. J. Comput. Vis. 77(1–3), 157–173 (2008)
  28. Sabour, S., Frosst, N., Hinton, G.E.: Dynamic routing between capsules. In: Advances in Neural Information Processing Systems, pp. 3856–3866 (2017)
  29. Smith, E.R., Seger, C.R., Mackie, D.M.: Can emotions be truly group level? Evidence regarding four conceptual criteria. J. Pers. Soc. Psychol. 93(3), 431–446 (2007)
  30. Solomon Mathialagan, C., Gallagher, A.C., Batra, D.: VIP: finding important people in images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4858–4866 (2015)
  31. Spain, M., Perona, P.: Measuring and predicting object importance. Int. J. Comput. Vis. 91(1), 59–76 (2011)
  32. Wu, J., Yu, Y., Huang, C., Yu, K.: Deep multiple instance learning for image classification and auto-annotation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3460–3469 (2015)
  33. Wu, J., Zhao, Y., Zhu, J.Y., Luo, S., Tu, Z.: MILCut: a sweeping line multiple instance learning paradigm for interactive image segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 256–263 (2014)
  34. Xu, Y., et al.: Deep learning of feature representation with multiple instance learning for medical image analysis. In: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1626–1630. IEEE (2014)
  35. Yamaguchi, K., et al.: Understanding and predicting importance in images. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3562–3569. IEEE (2012)
  36. Zhang, C., Platt, J.C., Viola, P.A.: Multiple instance boosting for object detection. In: Advances in Neural Information Processing Systems, pp. 1417–1424 (2006)
  37. Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Sig. Process. Lett. 23(10), 1499–1503 (2016)
  38. Zhou, Z.H., Zhang, M.L.: Neural networks for multi-instance learning. In: Proceedings of the International Conference on Intelligent Information Technology, Beijing, China, pp. 455–459 (2002)
  39. Zhu, J.Y., Wu, J., Xu, Y., Chang, E., Tu, Z.: Unsupervised object class discovery via saliency-guided multiple class learning. IEEE Trans. Pattern Anal. Mach. Intell. 37(4), 862–875 (2015)
  40. Zhu, W., Lou, Q., Vang, Y.S., Xie, X.: Deep multi-instance networks with sparse label assignment for whole mammogram classification. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017. LNCS, vol. 10435, pp. 603–611. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66179-7_69

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Learning Affect and Semantic Image analysIs (LASII) Group, Indian Institute of Technology Ropar, Rupnagar, India
