
Multimodal behavior analysis in computer-enabled laboratories using nonverbal cues

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Surveillance is increasingly needed to ensure people's safety and security, and real-time object detection is crucial for many applications such as traffic monitoring, security, search and rescue, vehicle counting, and classroom monitoring. In a smart campus, computer-enabled laboratories are generally equipped with video surveillance cameras, yet the existing literature shows that this surveillance data is seldom used for unobtrusive behavioral analysis. Although several works recognize students' and teachers' behavior using devices such as Kinect and handheld cameras, no existing work extracts video surveillance data and predicts the behavioral patterns of both students and teachers in real time. Hence, in this study, we unobtrusively analyze the behavioral patterns of students and teachers inside a teaching laboratory, treated as an indoor scenario of a smart campus. We propose a deep convolutional network architecture based on a modified Single-Shot MultiBox Detector (SSD) to classify and localize objects in this environment. We created a dataset with six class labels describing the behavioral patterns of students and teachers and used it to train the deep learning architecture. The performance evaluation demonstrates that the proposed method achieves an accuracy of 0.765 for classification and localization.
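The paper does not publish its implementation, but two building blocks common to any SSD-style detector like the one described above are default ("prior") box generation over a feature map and per-class non-maximum suppression. The sketch below is illustrative only: the feature-map size, scale, and aspect ratios are hypothetical choices, not the authors' settings.

```python
# Minimal sketch (not the authors' implementation) of two SSD building
# blocks: default-box generation and non-maximum suppression (NMS).
import numpy as np

def default_boxes(fmap_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Generate SSD default boxes as (cx, cy, w, h) in [0, 1] coordinates,
    one set of aspect ratios per feature-map cell."""
    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            cx, cy = (j + 0.5) / fmap_size, (i + 0.5) / fmap_size
            for ar in aspect_ratios:
                boxes.append((cx, cy, scale * np.sqrt(ar), scale / np.sqrt(ar)))
    return np.array(boxes)

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thr=0.5):
    """Keep the highest-scoring boxes, dropping any box whose overlap
    with an already-kept box exceeds iou_thr."""
    order = np.argsort(scores)[::-1]  # indices, best score first
    keep = []
    while len(order):
        best = order[0]
        keep.append(best)
        order = np.array([k for k in order[1:]
                          if iou(boxes[best], boxes[k]) < iou_thr])
    return keep
```

In a detector such as the one proposed here, each default box would receive a score for each of the six behavior classes, and NMS would then prune overlapping detections of the same student or teacher.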

Notes

  1. The word “multimodal” in the proposed methodology refers to intra-image multimodality, where the features of the head, hand gestures, and body posture of each student are present in a single image frame.

  2. More details are provided in the supplementary document.

  3. More details on methodology are provided in the supplementary document.


Author information

Corresponding author

Correspondence to T. S. Ashwin.

Ethics declarations

Ethical approval

The authors obtained all ethical approvals from the Institutional Ethics Committee (IEC) of the National Institute of Technology Karnataka, Surathkal, Mangalore, India, and written consent was also obtained from the human subjects.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 502 KB)

Supplementary material 2 (mp4 10691 KB)

About this article

Cite this article

Banerjee, S., Ashwin, T.S. & Guddeti, R.M.R. Multimodal behavior analysis in computer-enabled laboratories using nonverbal cues. SIViP 14, 1617–1624 (2020). https://doi.org/10.1007/s11760-020-01705-4
