Abstract
The visual focus of attention (VFOA) of participants in multi-person discussions is related to many types of social interaction, such as dominance/deference, like/dislike, and trust/distrust relationships. However, identifying a person's VFOA is challenging because it changes rapidly in dynamic discussions. We propose ICAF (Iterative Collective Attention Focus), a system that simultaneously tracks the VFOA and speaking probabilities of all people. To apply the system to previously unseen videos, we propose a lightly supervised technique for training the ICAF model, with performance that is only slightly worse than in the fully supervised case. Our system can visualize the predicted VFOA, speaking probabilities, and interaction networks in videos.
Acknowledgement
We are grateful to the Army Research Office for funding much of the work reported in this book under Grant W911NF-16-1-0342.
Funding Disclosure
This research was sponsored by the Army Research Office and was accomplished under Grant Number W911NF-16-1-0342. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.
Copyright information
© 2021 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Bai, C., Kumar, S., Leskovec, J., Metzger, M., Nunamaker, J.F., Subrahmanian, V.S. (2021). Iterative Collective Classification for Visual Focus of Attention Prediction. In: Subrahmanian, V.S., Burgoon, J.K., Dunbar, N.E. (eds) Detecting Trust and Deception in Group Interaction. Terrorism, Security, and Computation. Springer, Cham. https://doi.org/10.1007/978-3-030-54383-9_8
DOI: https://doi.org/10.1007/978-3-030-54383-9_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54382-2
Online ISBN: 978-3-030-54383-9
eBook Packages: Computer Science, Computer Science (R0)