Abstract
In the modern era, surveillance is increasingly needed to ensure people's safety and security. Real-time object detection is crucial for many applications, such as traffic monitoring, security, search and rescue, vehicle counting, and classroom monitoring. In a smart campus, computer-enabled laboratories are generally equipped with video surveillance cameras. However, the existing literature shows that such surveillance data is seldom used for unobtrusive behavioral analysis. Although several works recognize students' and teachers' behavior using devices such as the Kinect and handheld cameras, no existing work extracts video surveillance data to predict the behavioral patterns of both students and teachers in real time. Hence, in this study, we unobtrusively analyze the behavioral patterns of students and teachers inside a teaching laboratory, considered as an indoor scenario of a smart campus. We propose a deep convolutional network architecture, based on a modified Single Shot MultiBox Detector (SSD), to classify and localize objects in this indoor teaching-laboratory environment. We defined six class labels for predicting the behavioral patterns of both students and teachers, and we created a dataset with these six labels for training the deep learning architecture. The performance evaluation demonstrates that the proposed method performs well, achieving an accuracy of 0.765 for classification and localization.
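An SSD-style detector of the kind described above predicts, for every default box, a class score over the behavior labels and a box offset, then filters overlapping detections with greedy non-maximum suppression (NMS). The sketch below illustrates only that standard NMS post-processing step; it is not the authors' implementation, and the thresholds are typical SSD defaults rather than values from the paper.

```python
import numpy as np

def iou(box, boxes):
    """Intersection-over-union of one box [x1, y1, x2, y2]
    against an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_thresh=0.45, score_thresh=0.5):
    """Greedy per-class NMS as commonly used in SSD post-processing:
    keep the highest-scoring box, drop boxes that overlap it heavily,
    and repeat on the remainder."""
    keep = scores >= score_thresh
    boxes, scores = boxes[keep], scores[keep]
    order = np.argsort(scores)[::-1]  # indices, best score first
    kept = []
    while order.size:
        i = order[0]
        kept.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) < iou_thresh]
    return boxes[kept], scores[kept]

# Two heavily overlapping detections of one person plus one distinct
# detection elsewhere in the frame: NMS should keep two boxes.
boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 150, 150]], float)
scores = np.array([0.9, 0.8, 0.7])
kept_boxes, kept_scores = nms(boxes, scores)
print(len(kept_boxes))  # → 2
```

In a full pipeline this filtering would run once per behavior class over the detector's raw outputs, yielding the final localized, labeled detections.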
Notes
The word “multimodal” used in the proposed methodology refers to intra-image multimodality, where the features of the head, hand gesture, and body posture of each student present in a single image frame are combined.
More details are provided in the supplementary document.
More details on methodology are provided in the supplementary document.
Ethics declarations
Ethical approval
The authors obtained all ethical approvals from the Institutional Ethics Committee (IEC) of the National Institute of Technology Karnataka, Surathkal, Mangalore, India, and written consent was also obtained from the human subjects.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 2 (mp4 10691 KB)
Cite this article
Banerjee, S., Ashwin, T.S. & Guddeti, R.M.R. Multimodal behavior analysis in computer-enabled laboratories using nonverbal cues. SIViP 14, 1617–1624 (2020). https://doi.org/10.1007/s11760-020-01705-4