Abstract
More and more educational institutions are making lecture videos available online. Since more than 100 empirical studies document that captioning a video improves comprehension of, attention to, and memory for the video [1], it makes sense to provide those lecture videos with captions. However, studies also show that in human communication the words themselves contribute only 7% to making a message clear, while tone, intonation, and verbal pace contribute 38% [2]. Consequently, in this paper we address the question of whether an AI-based visualization of voice characteristics in captions can further improve students' watching and learning experience with lecture videos. To visualize the speaker's voice characteristics in the captions, we use the WaveFont technology [3–5], which processes the voice signal and intuitively displays loudness, speed, and pauses in the caption font. Our survey of 48 students shows that in all surveyed categories (visualization of voice characteristics, understanding the content, following the content, linguistic understanding, and identifying important words) a significant majority of the participants prefers WaveFont captions for watching lecture videos.
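The abstract does not spell out how WaveFont's processing works; the details are in [3–5]. As a rough illustration of the general idea only, the following minimal sketch (not the authors' implementation) assumes a mono waveform plus force-aligned word timings and maps per-word loudness, speaking rate, and preceding pauses onto caption styling. All function names, mapping constants, and the CSS-like output fields are illustrative assumptions.

```python
# Illustrative sketch only -- not the authors' WaveFont implementation.
# Assumes a mono waveform and force-aligned word timings (text, start, end).
import numpy as np

def style_captions(samples, sample_rate, words):
    """Map per-word loudness, speaking rate, and pauses to caption styling."""
    samples = np.asarray(samples, dtype=np.float64)
    styled, prev_end = [], 0.0
    for text, start, end in words:
        segment = samples[int(start * sample_rate):int(end * sample_rate)]
        rms = float(np.sqrt(np.mean(segment ** 2))) if segment.size else 0.0
        duration = max(end - start, 1e-3)
        chars_per_sec = len(text) / duration   # crude speaking-rate proxy
        pause = max(start - prev_end, 0.0)     # silence before this word
        styled.append({
            "text": text,
            # Louder words -> heavier font weight (illustrative constants).
            "font_weight": int(np.clip(300 + 6000 * rms, 100, 900)),
            # Faster speech -> tighter letter spacing.
            "letter_spacing_em": float(np.clip(0.3 - 0.01 * chars_per_sec, 0.0, 0.3)),
            # Longer pauses -> wider gap rendered before the word.
            "gap_em": round(min(pause, 1.0), 2),
        })
        prev_end = end
    return styled
```

The styled tokens could then be rendered, for example, with a variable font whose weight axis encodes loudness, which is the kind of intuitive per-word display the abstract describes.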
Notes
- 1. In this research, we refer to interlingual translation as subtitles and to same-language transcription as captions.
- 2. BMBF funding number: 16DHB3006; funding period: 1.1.2020–31.12.2022.
References
Gernsbacher, M.A.: Video captions benefit everyone. Policy Insights Behav. Brain Sci. 2(1), 195–202 (2015)
Marteney, J.: Verbal and nonverbal communication. ASCCC Open Educational Resources Initiative (OERI). https://socialsci.libretexts.org/@go/page/67152 (2020)
Wölfel, M., Schlippe, T., Stitz, A.: Voice driven type design. In: International Conference on Speech Technology and Human-Computer Dialog (SpeD), Bucharest, Romania (2015)
Schlippe, T., Wölfel, M., Stitz, A.: Generation of text from an audio speech signal. US Patent 10043519B2 (2018)
Schlippe, T., Alessai, S., El-Taweel, G., Wölfel, M., Zaghouani, W.: Visualizing voice characteristics with type design in closed captions for Arabic. In: International Conference on Cyberworlds (CW 2020), Caen, France (2020)
United Nations: Sustainable Development Goals: 17 goals to transform our world. https://www.un.org/sustainabledevelopment/sustainable-development-goals (2021)
Correia, A.P., Liu, C., Xu, F.: Evaluating videoconferencing systems for the quality of the educational experience. Distance Educ. 41(4), 429–452 (2020)
Koravuna, S., Surepally, U.K.: Educational gamification and artificial intelligence for promoting digital literacy. Association for Computing Machinery, New York, NY, USA (2020)
Chen, L., Chen, P., Lin, Z.: Artificial intelligence in education: A review. IEEE Access 8, 75264–75278 (2020). https://doi.org/10.1109/ACCESS.2020.2988510
Rakhmanov, O., Schlippe, T.: Sentiment analysis for Hausa: Classifying students’ comments. In: The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL 2022), Marseille, France (2022)
Libbrecht, P., Declerck, T., Schlippe, T., Mandl, T., Schiffner, D.: NLP for student and teacher: Concept for an AI based information literacy tutoring system. In: The 29th ACM International Conference on Information and Knowledge Management (CIKM2020). Galway, Ireland (2020)
Sawatzki, J., Schlippe, T., Benner-Wickner, M.: Deep learning techniques for automatic short answer grading: Predicting scores for English and German answers. In: The 2nd International Conference on Artificial Intelligence in Education Technology (AIET 2021). Wuhan, China (2021)
Schlippe, T., Sawatzki, J.: Cross-lingual automatic short answer grading. In: The 2nd International Conference on Artificial Intelligence in Education Technology (AIET 2021). Wuhan, China (2021)
Bothmer, K., Schlippe, T.: Investigating natural language processing techniques for a recommendation system to support employers, job seekers and educational institutions. In: The 23rd International Conference on Artificial Intelligence in Education (AIED) (2022)
Bothmer, K., Schlippe, T.: Skill Scanner: Connecting and supporting employers, job seekers and educational institutions with an AI-based recommendation system. In: Proceedings of The Learning Ideas Conference 2022 (15th Annual Conference), New York, 15–17 June (2022)
Schlippe, T., Sawatzki, J.: AI-based multilingual interactive exam preparation. In: Guralnick, D., Auer, M.E., Poce, A. (eds.) TLIC 2021. LNNS, vol. 349, pp. 396–408. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-90677-1_38
Wölfel, M.: Towards the automatic generation of pedagogical conversational agents from lecture slides. In: International Conference on Multimedia Technology and Enhanced Learning (2021)
Ou, C., Joyner, D.A., Goel, A.K.: Designing and developing video lessons for online learning: A seven-principle model. Online Learn. 23(2), 82–104 (2019)
Wang, J., Antonenko, P., Dawson, K.: Does visual attention to the instructor in online video affect learning and learner perceptions? An eye-tracking analysis. Comput. Educ. 146 (2020). https://doi.org/10.1016/j.compedu.2019.103779
Perego, E., Del Missier, F., Porta, M., Mosconi, M.: The cognitive effectiveness of subtitle processing. Media Psychol. 13, 243–272 (2010)
Linebarger, D.L.: Learning to read from television: The effects of using captions and narration. J. Educ. Psychol. 93, 288–298 (2001)
Bowe, F.G., Kaufman, A.: Captioned Media: Teacher Perceptions of Potential Value for Students with No Hearing Impairments: A National Survey of Special Educators. Described and Captioned Media Program, Spartanburg, SC (2001)
Guo, P.J., Kim, J., Rubin, R.: How video production affects student engagement: An empirical study of MOOC videos. In: Proceedings of the First ACM Conference on Learning @ Scale (L@S ’14), pp. 41–50 (2014). https://doi.org/10.1145/2556325.2566239
Alfayez, Z.H.: Designing educational videos for university websites based on students’ preferences. Online Learn. 25(2), 280–298 (2021)
Persson, J.R., Wattengård, E., Lilledahl, M.B.: The effect of captions and written text on viewing behavior in educational videos. Int. J. Math Sci. Technol. Educ. 7(1), 124–147 (2019)
Vy, Q.V., Fels, D.I.: Using placement and name for speaker identification in captioning. In: Miesenberger, K., Klaus, J., Zagler, W., Karshmer, A. (eds.) ICCHP 2010. LNCS, vol. 6179, pp. 247–254. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14097-6_40
Brown, A., et al.: Dynamic subtitles: The user experience. In: TVX (2015)
Fox, W.: Integrated titles: An improved viewing experience. In: Eyetracking and Applied Linguistics (2016)
Ohene-Djan, J., Wright, J., Combie-Smith, K.: Emotional subtitles: A system and potential applications for deaf and hearing impaired people. In: CVHI (2007)
Rashid, R., Aitken, J., Fels, D.I.: Expressing emotions using animated text captions. In: Computers Helping People with Special Needs (ICCHP 2006). LNCS, vol. 4061. Springer, Heidelberg (2006)
Bessemans, A., Renckens, M., Bormans, K., Nuyts, E., Larson, K.: Visual prosody supports reading aloud expressively. Visible Lang. 53, 28–49 (2019)
El-Taweel, G.: Conveying emotions in Arabic SDH: The case of pride and prejudice. Master thesis, Hamad Bin Khalifa University (2016)
de Lacerda Pataca, C., Costa, P.D.P.: Speech modulated typography: Towards an affective representation model. In: International Conference on Intelligent User Interfaces, pp. 139–143 (2020)
de Lacerda Pataca, C., Dornhofer Paro Costa, P.: Hidden bawls, whispers, and yelps: Can text be made to sound more than just its words? (2022). arXiv:2202.10631
Bringhurst, R.: The Elements of Typographic Style, version 3.2, pp. 55–56. Hartley & Marks Publishers (2008)
Unger, G.: Wie man’s liest, pp. 63–65. Niggli Verlag (2006)
Bai, Q., Dan, Q., Mu, Z., Yang, M.: A systematic review of emoji: Current research and future perspectives. Front. Psychol. 10, 2221 (2019). https://doi.org/10.3389/fpsyg.2019.02221
Rayner, S.G.: Cognitive styles and learning styles. In: Wright, J.D. (ed.) International Encyclopedia of Social and Behavioral Sciences, vol. 4, 2nd edn, pp. 110–117. Elsevier, Oxford (2015)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Schlippe, T., Fritsche, K., Sun, Y., Wölfel, M. (2023). AI-Based Visualization of Voice Characteristics in Lecture Videos’ Captions. In: Cheng, E.C.K., Wang, T., Schlippe, T., Beligiannis, G.N. (eds) Artificial Intelligence in Education Technologies: New Development and Innovative Practices. AIET 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 154. Springer, Singapore. https://doi.org/10.1007/978-981-19-8040-4_8
DOI: https://doi.org/10.1007/978-981-19-8040-4_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8039-8
Online ISBN: 978-981-19-8040-4
eBook Packages: Engineering, Engineering (R0)