
Multimodal behavior analysis in computer-enabled laboratories using nonverbal cues

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Surveillance is increasingly needed to ensure people's safety and security, and real-time object detection is crucial for many applications such as traffic monitoring, security, search and rescue, vehicle counting, and classroom monitoring. In a smart campus, computer-enabled laboratories are generally equipped with video surveillance cameras, yet the existing literature shows that this surveillance data is seldom used for unobtrusive behavioral analysis. Although several works recognize students' and teachers' behavior using devices such as Kinect and handheld cameras, no existing work extracts video surveillance data and predicts the behavioral patterns of both students and teachers in real time. Hence, in this study, we unobtrusively analyze the behavioral patterns of students and teachers inside a teaching laboratory, treated as an indoor scenario of a smart campus. We propose a deep convolutional network architecture based on a modified Single-Shot MultiBox Detector (SSD) to classify and localize objects in this environment. We created a dataset with six class labels describing the behavioral patterns of students and teachers and used it to train the deep learning architecture. The performance evaluation demonstrates that the proposed method achieves an accuracy of 0.765 for classification and localization.
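The paper does not publish its implementation, but two building blocks common to any SSD-style detector like the one described above are default ("prior") box generation over a feature map and per-class non-maximum suppression. The sketch below is illustrative only: the feature-map size, scale, and aspect ratios are hypothetical choices, not the authors' settings.

```python
# Minimal sketch (not the authors' implementation) of two SSD building
# blocks: default-box generation and non-maximum suppression (NMS).
import numpy as np

def default_boxes(fmap_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Generate SSD default boxes as (cx, cy, w, h) in [0, 1] coordinates,
    one set of aspect ratios per feature-map cell."""
    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            cx, cy = (j + 0.5) / fmap_size, (i + 0.5) / fmap_size
            for ar in aspect_ratios:
                boxes.append((cx, cy, scale * np.sqrt(ar), scale / np.sqrt(ar)))
    return np.array(boxes)

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thr=0.5):
    """Keep the highest-scoring boxes, dropping any box whose overlap
    with an already-kept box exceeds iou_thr."""
    order = np.argsort(scores)[::-1]  # indices, best score first
    keep = []
    while len(order):
        best = order[0]
        keep.append(best)
        order = np.array([k for k in order[1:]
                          if iou(boxes[best], boxes[k]) < iou_thr])
    return keep
```

In a detector such as the one proposed here, each default box would receive a score for each of the six behavior classes, and NMS would then prune overlapping detections of the same student or teacher.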

Notes

  1. The word “multimodal” in the proposed methodology refers to intra-image multimodality, where the features of the head, hand gestures, and body posture of each student are present in a single image frame.

  2. More details are provided in the supplementary document.

  3. More details on methodology are provided in the supplementary document.


Author information

Corresponding author

Correspondence to T. S. Ashwin.

Ethics declarations

Ethical approval

The authors obtained all ethical approvals from the Institutional Ethics Committee (IEC) of the National Institute of Technology Karnataka, Surathkal, Mangalore, India, and written consent was also obtained from the human subjects.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 502 KB)

Supplementary material 2 (mp4 10691 KB)

About this article

Cite this article

Banerjee, S., Ashwin, T.S. & Guddeti, R.M.R. Multimodal behavior analysis in computer-enabled laboratories using nonverbal cues. SIViP 14, 1617–1624 (2020). https://doi.org/10.1007/s11760-020-01705-4
