An English video teaching classroom attention evaluation model incorporating multimodal information

  • Original Research
  • Journal of Ambient Intelligence and Humanized Computing

Abstract

Traditional video surveillance methods for detecting and identifying abnormal behavior suffer from low detection efficiency and long processing times. To address this, a multimodal abnormal behavior detection and identification method based on video surveillance is proposed and applied to the task of evaluating college students' concentration in online English video classes. The model captures abnormal behaviors and facial expressions and builds a joint network that fuses the two. Tests on two open-source datasets and self-built real-time classroom datasets show that the model achieves better recognition performance than current mainstream models while maintaining real-time performance. The proposed model offers a new approach to building smart classrooms.
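
As a rough illustration of what combining the two cues can look like, the sketch below merges a per-frame behavior score and a per-frame facial-expression score with a weighted average (simple late fusion). It is not the authors' joint network, which the abstract does not specify; the function names `fuse_attention_scores` and `mean_attention`, the `behavior_weight` parameter, and the 0-to-1 score scale are all assumptions made for illustration.

```python
from typing import List, Sequence


def fuse_attention_scores(
    behavior_scores: Sequence[float],
    expression_scores: Sequence[float],
    behavior_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> List[float]:
    """Late fusion: weighted average of two per-frame scores in [0, 1]."""
    if len(behavior_scores) != len(expression_scores):
        raise ValueError("both modalities need one score per frame")
    if not 0.0 <= behavior_weight <= 1.0:
        raise ValueError("behavior_weight must lie in [0, 1]")
    expression_weight = 1.0 - behavior_weight
    return [
        behavior_weight * b + expression_weight * e
        for b, e in zip(behavior_scores, expression_scores)
    ]


def mean_attention(fused: Sequence[float]) -> float:
    """Average fused score over a clip; 0.0 for an empty clip."""
    return sum(fused) / len(fused) if fused else 0.0
```

A learned joint network would replace the fixed weight with parameters trained on labeled classroom footage; the fixed average here only shows the shape of the data flowing through a fusion step.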

Author information

Corresponding author

Correspondence to Lemin Li.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Miao, Q., Li, L. & Wu, D. An English video teaching classroom attention evaluation model incorporating multimodal information. J Ambient Intell Human Comput (2024). https://doi.org/10.1007/s12652-024-04800-3

  • DOI: https://doi.org/10.1007/s12652-024-04800-3
