Abstract
The analysis of teacher behavior in massive collections of teaching videos has recently attracted a surge of research interest. Traditional methods rely on careful manual analysis, which is extremely complex and time-consuming at this scale. Existing action recognition methods, however, are difficult to transfer to teacher behavior recognition, because it is hard to extract the teacher’s behavior from a complex teaching scenario and because teacher behaviors carry professional educational semantics. These methods therefore do not meet the needs of teacher behavior recognition. We propose a novel and simple method for recognizing teacher behavior in actual teaching scenes for massive teaching videos, which provides technical assistance for analyzing teacher behavior and fills the gap in automatic recognition of teacher behavior in actual teaching scenes. First, we identify an educational pattern that we name the “teacher set”, that is, the spatial region in a video of the whole class where the teacher should appear. Based on this pattern, we develop a teacher set identification and extraction algorithm (Teacher-set IE algorithm) to identify the teacher in the teaching video and reduce interference from the classroom background. Then, we present an improved behavior recognition network based on 3D bilinear pooling (3D BP-TBR), which enhances the fused representation of three-dimensional features in order to identify the category of teacher behavior. Experiments show that 3D BP-TBR achieves better performance on public data and on the self-built TAD-08 dataset. Overall, by deeply integrating educational characteristics with action recognition technology, our approach increases the recognition accuracy of teacher behavior in actual teaching scenes.
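To make the idea of bilinear pooling over three-dimensional features more concrete, the following is a minimal NumPy sketch of the generic bilinear pooling operation applied to spatiotemporal feature maps. It is not the 3D BP-TBR network itself, whose architecture the abstract does not specify. The function name `bilinear_pool_3d`, the `(C, T, H, W)` feature layout, and the signed square root plus L2 normalization step are illustrative assumptions.

```python
import numpy as np


def bilinear_pool_3d(feat_a, feat_b, eps=1e-12):
    """Generic bilinear pooling over spatiotemporal feature maps.

    Hypothetical helper for illustration only, not the paper's 3D BP-TBR.
    feat_a: array of shape (C1, T, H, W)
    feat_b: array of shape (C2, T, H, W)
    Returns a normalized vector of length C1 * C2.
    """
    if feat_a.shape[1:] != feat_b.shape[1:]:
        raise ValueError("feature maps must share the same (T, H, W) grid")

    c1, c2 = feat_a.shape[0], feat_b.shape[0]

    # Flatten every (t, h, w) location into one axis: (C, N) with N = T*H*W.
    a = feat_a.reshape(c1, -1)
    b = feat_b.reshape(c2, -1)
    n_locations = a.shape[1]

    # Average of the outer products a_n b_n^T over all locations: (C1, C2).
    bilinear = (a @ b.T) / n_locations

    # Signed square root followed by L2 normalization (assumed here as the
    # usual post-processing for bilinear features).
    vec = bilinear.reshape(-1)
    vec = np.sign(vec) * np.sqrt(np.abs(vec))
    return vec / (np.linalg.norm(vec) + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical 3D feature maps: 4 and 5 channels on a 2x3x3 grid.
    fa = rng.standard_normal((4, 2, 3, 3))
    fb = rng.standard_normal((5, 2, 3, 3))
    print(bilinear_pool_3d(fa, fb).shape)
```

Under these assumptions, the pooled vector captures pairwise channel interactions aggregated over time and space, and it could then be passed to a classifier over the behavior categories.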
Acknowledgements
This work was supported by the National Natural Science Foundation of China project “Research on Automatic Segmentation and Recognition of Teaching Scene with the Characteristics of Teaching Behavior” [61977034]; by the Humanities and Social Science project of the Chinese Ministry of Education, “Research on Outdoor Experiential Learning Environment Construction Method Based on Scene Perception” [17YJA880104]; and by the project “Research on Key Technology of Intelligent Education Evaluation and Service Based on Blockchain Technology” [CCNU20ZN004], financially supported by the self-determined research funds of CCNU from the colleges’ basic research and operation of MOE. We also thank the anonymous reviewers for their valuable comments and suggestions.
Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Gang, Z., Wenjuan, Z., Biling, H. et al. A simple teacher behavior recognition method for massive teaching videos based on teacher set. Appl Intell 51, 8828–8849 (2021). https://doi.org/10.1007/s10489-021-02329-y
DOI: https://doi.org/10.1007/s10489-021-02329-y