Bag of Deep Features for Instructor Activity Recognition in Lecture Room
This research explores contextual visual information in the lecture room to help an instructor assess the effectiveness of a delivered lecture. The objective is to enable a self-evaluation mechanism through which instructors can improve lecture productivity by understanding their own activities. An instructor's effectiveness has a remarkable impact on students' performance and on their academic and professional success. The process of lecture evaluation can therefore contribute significantly to improving academic quality and governance. In this paper, we propose a vision-based framework that recognizes instructor activities for self-evaluation of delivered lectures. The proposed approach computes motion templates of instructor activities and describes them through a Bag-of-Deep-Features (BoDF) representation. Deep spatio-temporal features extracted from the motion templates are used to compile a visual vocabulary, which is quantized to optimize the learning model. A Support Vector Machine classifier is trained on this representation to predict instructor activities. We evaluated the proposed scheme on a self-captured lecture-room dataset, IAVID-1. Eight instructor activities: pointing towards the student, pointing towards the board or screen, idle, interacting, sitting, walking, using a mobile phone and using a laptop, are recognized with 85.41% accuracy. As a result, the proposed framework enables instructor activity recognition without human intervention.
Keywords: Human activity recognition · Instructor activity recognition · Motion templates · Academic quality assurance
Sergio A. Velastin has received funding from the Universidad Carlos III de Madrid, the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 600371, el Ministerio de Economía, Industria y Competitividad (COFUND2014-51509), el Ministerio de Educación, Cultura y Deporte (CEI-15-17), and Banco Santander.
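To make the pipeline described in the abstract concrete, the following is a minimal sketch in Python with scikit-learn of the BoDF steps: quantizing deep spatio-temporal descriptors into a visual vocabulary, encoding each video as a histogram of visual words, and classifying with an SVM. The helper names, vocabulary size, and kernel choice are illustrative assumptions rather than the authors' implementation, and deep feature extraction from motion templates is assumed to happen upstream.

```python
# A minimal sketch of a BoDF-style pipeline, under the assumptions stated
# above; not the authors' released code.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_vocabulary(descriptor_sets, k=200):
    """Quantize deep spatio-temporal descriptors into k visual words.

    descriptor_sets: list of (n_i, feature_dim) arrays, one per training
    video (e.g. CNN activations pooled from motion templates). The
    vocabulary size k is a tunable assumption.
    """
    return KMeans(n_clusters=k, n_init=10).fit(np.vstack(descriptor_sets))

def encode_bodf(video_descriptors, vocabulary):
    """Histogram of visual-word occurrences for one video's descriptors."""
    words = vocabulary.predict(video_descriptors)
    hist, _ = np.histogram(words, bins=np.arange(vocabulary.n_clusters + 1))
    return hist / max(hist.sum(), 1)  # L1-normalize the histogram

# Training and prediction over IAVID-1-style data (labels such as
# "walking" or "pointing towards board"):
# vocab = build_vocabulary(train_descriptor_sets)
# X = np.array([encode_bodf(d, vocab) for d in train_descriptor_sets])
# clf = SVC(kernel="linear").fit(X, train_labels)
# pred = clf.predict([encode_bodf(test_descriptors, vocab)])
```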