Using texts generated by STR and CAT to facilitate student comprehension of lecture content in a foreign language

  • Rustam Shadiev
  • Ai Sun


In this study, we combined speech-to-text recognition (STR) and computer-aided translation (CAT) technologies during lectures delivered in English as a foreign language to facilitate student comprehension of the lecture content. The instructor lectured in English, the STR system generated texts from the voice input, and the CAT system simultaneously translated the STR texts into the students’ native language. To test the feasibility of this combined approach, we designed an experiment with three groups of twenty students each. All students attended the same lectures: (a) students in the control group attended lectures without any support, (b) students in experimental group 1 attended lectures with STR support (i.e., they were presented with texts in English generated from the instructor’s speech by STR), and (c) students in experimental group 2 attended lectures with STR and CAT support (i.e., they were presented with texts in their native language, translated by CAT from the English texts generated by STR). We compared the posttest results of the students in the three groups. We also explored the effects of our approach with respect to different levels of foreign language ability. Finally, we surveyed the perceptions of students in experimental group 2 about the usefulness of the translated texts for their learning. Our results showed that applying STR and CAT technologies together was a useful approach: the translated texts significantly improved student learning performance compared to that of the students in the control condition.
Translated texts were beneficial because they allowed students (a) to confirm words that the instructor had not spoken clearly or to find the meaning of unfamiliar words and (b) to complement spoken lecture content with translated content, which helped information processing and enhanced comprehension. When comparing students with different language abilities, we found that students with low language ability benefited from the translated texts the most. Before the experiment, the low-ability students’ language ability was significantly lower than that of the high-ability students; after the experiment, however, the low-ability students’ learning performance showed no significant difference from that of the high-ability students. Finally, most students perceived translated texts as useful for their learning, and they intended to use the texts in the future for learning purposes.
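
The three lecture conditions described above can be sketched in code. This is only an illustrative sketch, not the system used in the study: the group labels, the `support_text` function, and the `fake_recognize` and `fake_translate` stand-ins are all hypothetical names introduced here, and a plain string stands in for the instructor's audio.

```python
from typing import Callable, Optional

# Condition labels follow the abstract; the string values are made up for this sketch.
CONTROL = "control"
STR_ONLY = "experimental_1"
STR_AND_CAT = "experimental_2"


def support_text(
    group: str,
    utterance: str,
    recognize: Callable[[str], str],
    translate: Callable[[str], str],
) -> Optional[str]:
    """Return the on-screen text a student in `group` would see, or None."""
    if group == CONTROL:
        return None  # control group: no text support
    english_text = recognize(utterance)  # STR step: speech -> English text
    if group == STR_ONLY:
        return english_text
    if group == STR_AND_CAT:
        return translate(english_text)  # CAT step: applied to the STR output
    raise ValueError(f"unknown group: {group}")


# Hypothetical stand-ins; a real setup would call actual STR and CAT engines.
def fake_recognize(utterance: str) -> str:
    return utterance.strip()


def fake_translate(text: str) -> str:
    return f"[translated] {text}"
```

Passing the recognizer and translator in as callables keeps the sketch independent of any particular STR or CAT product, which the abstract does not name. It also makes visible that the translated text depends on the STR output.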


Keywords: Comprehension, Lecture, Foreign language, STR, CAT



There is no funding to report for this study.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Education Science, Nanjing Normal University, Nanjing, China
  2. School of Humanities and Social Sciences, Harbin Institute of Technology in Shenzhen, Shenzhen, China
