Encyclopedia of Educational Innovation

Living Edition
Editors: Michael A. Peters, Richard Heraud

Multimedia in the Teaching and Assessment of Listening

  • Ruslan Suvorov
Living reference work entry
DOI: https://doi.org/10.1007/978-981-13-2262-4_87-1

Framing the Issue

Listening is a complex, multidimensional skill that plays a pivotal role in second language (L2) acquisition and is critical for overall communicative competence. Listening can be defined as “the process of receiving, constructing meaning from, and responding to spoken and/or nonverbal messages” (ILA 1995, p. 4). However, listening is not limited to the cognitive processes of receiving and processing auditory and visual information; it is also affected by the listener’s short-term and long-term memory, background knowledge, affective factors (such as motivation or anxiety), communicative context (e.g., listening to an academic lecture or an informal conversation), and other situational variables (such as the accent and speech rate of the interlocutor). Despite the growing body of research on various aspects and dimensions of listening, it is still widely recognized as the most undervalued, under-researched, and neglected language skill. Furthermore, due to its complex and fleeting nature, listening is considered to be one of the most challenging skills to teach and assess.

There are three main approaches used in L2 listening instruction: a bottom-up approach, a top-down approach, and an integrated approach (Wolvin 2010). A bottom-up approach focuses on the development of L2 learners’ ability to extract lexical information from the stream of sound, including word-recognition and word-segmentation skills. On the other hand, the primary focus of a top-down approach is the development of metacognitive knowledge about listening, such as L2 learners’ understanding of the types of listening skills required to complete a particular listening task and effective strategies for improving listening comprehension. In an integrated approach, both top-down and bottom-up aspects of listening instruction are addressed.

While technology has been an integral part of listening instruction ever since the recording and transmission of acoustic signals became a reality, the inclusion of multimedia in the teaching and assessment of listening represents a more recent development. Multimedia can be defined as the presentation of “both words (such as spoken text or printed text) and pictures (such as illustrations, photos, animation, or video)” (Mayer 2005, p. 2). That is to say, the use of multimedia entails presenting content in both verbal and pictorial forms. In the context of L2 listening instruction, multimedia also refers to the integration of text, graphics, and/or video for scaffolding purposes in the form of captions, subtitles, transcripts, and annotations (Mohsen 2016).

Some empirical evidence suggests that using multimedia in the teaching and assessment of L2 listening can be advantageous for language learners. Such advantages include the potential of multimedia to enhance the input, make it more comprehensible, and increase its salience. Video use, for instance, can provide learners with paralinguistic cues such as kinesics (i.e., body language, hand gestures, and facial expressions) that can reinforce or complement the auditory information. Given that in most real-life contexts listening comprehension is multimodal and mobilizes both a verbal and a visual channel, including a multimedia component in listening instruction can increase the authenticity of listening activities (Ockey and Wagner 2018). At the same time, enhancing listening instruction with multimedia elements may pose certain challenges, as evidenced by the examination of existing empirical research and theoretical perspectives on multimedia and L2 listening provided in the next section.

Making the Case

One of the most dominant theoretical perspectives to explicate how the human mind processes multimedia input is Mayer’s (2005) cognitive theory of multimedia learning. Underlying this theory are three assumptions: (a) humans process visual and auditory information via two separate channels (dual channels assumption), (b) humans can process only a limited amount of information via each channel at a time (limited capacity assumption), and (c) humans engage in active learning by employing a panoply of cognitive processes to attend to, organize, and integrate incoming information with existing knowledge (active processing assumption; Mayer 2005). Related to the dual channels assumption is Paivio’s (1986) dual coding theory, which postulates the existence of verbal and nonverbal (i.e., visual) information processing channels and suggests that providing information in both modes can aid learning. Both Mayer’s (2005) and Paivio’s (1986) theories have guided many studies exploring multimedia in L2 listening and have been influential in shaping the understanding of how L2 learners process auditory and visual information.

Driven by these theoretical perspectives, many studies exploring the use of multimedia in L2 listening appear to be based on the assumption that multimedia is beneficial for the teaching and assessment of this skill. One of the main perceived benefits of multimedia is visual cues that are believed to make multimodal listening easier than listening in an audio-only mode for L2 learners. Another reported benefit is that the authenticity of visuals makes L2 listening activities more closely resemble the type of listening that people are normally exposed to in most real-life contexts (with a few exceptions, such as talking on the phone or listening to a radio). While there is some empirical evidence of multimedia use being beneficial for the teaching and assessment of listening, research findings indicate that the overall state of affairs is a lot more complex.

Existing research on multimedia in L2 listening can be roughly divided into three main strands: (a) research on the role and impact of multimedia on L2 listening comprehension and development, (b) research on the use of audio-visual texts in L2 listening tests, and (c) research on the use of multimedia as help options for L2 listening comprehension. In the first strand of research, studies exploring emic perspectives on the usefulness of multimedia for L2 listening comprehension have found that, overall, L2 learners view multimedia favorably. However, the quantitative results have been mixed regarding the supporting role of multimedia in L2 listening comprehension and development. Kinesics, for instance, are generally deemed to have a facilitative effect on the processing and interpretation of auditory messages; however, there can be significant variations among L2 listeners in the ways they use paralinguistic information and in the extent to which such information aids their listening comprehension. Whether multimedia has beneficial or deleterious effects can depend on the types of visuals, their semantic complexity, the degree of congruence between the visual and the auditory information, and the learner’s overall proficiency in the target language. Less proficient L2 learners are more easily distracted by visuals or overwhelmed by the need to process both the verbal and the visual input, which imposes an additional cognitive load. Learners’ individual differences, including prior knowledge, learning styles, learning strategies, and affective responses such as motivation and interest, have also been found to contribute to differential effects of multimedia on L2 listening comprehension and development (Vandergrift and Goh 2012).

The second strand of research deals with the use of audio-visual texts in L2 listening tests and the implications of including nonverbal components for the construct of L2 listening ability measured by such tests. Traditionally, research in this area has been limited to comparative studies investigating differences between L2 learners’ performance on audio-only listening tests versus listening tests mediated by images or video. The results have been contradictory and inconclusive: While some studies have revealed that the inclusion of the visual channel increases L2 learners’ test performance, other studies have found no difference in test performance, and still others have found decreased test performance (Ockey and Wagner 2018). The reasons for such disparate findings are numerous and stem mostly from the lack of homogeneity among the studies, including the use of different types of visuals and item formats in listening tests, variations in the reliability of listening tests, as well as differences in proficiency levels of L2 learners. One of the key limitations of such comparative studies is the underlying assumption that learners always interact with visuals. However, there is strong empirical evidence from eye-tracking research (Suvorov 2015) suggesting significant variability in viewing behavior, with some L2 test-takers not looking at the visual input at all during L2 listening tests.

In the third strand of research are studies that have explored the use of multimedia for scaffolding purposes in the form of captions, subtitles, transcripts, and annotations to aid L2 learners’ listening comprehension and development. The availability of such help options provides learners with control over the extent of their interaction with the listening content and enables them, for instance, to look up the meaning of unknown vocabulary items (annotations) or discern what was said in the audio stream by reading its written form (transcripts). Despite somewhat mixed results in this area of research (Mohsen 2016), evidence suggests that, in general, multimedia help options can be beneficial for L2 listening comprehension and vocabulary acquisition, although L2 learners vary in how they use them and to what extent. The extent to which this facilitative effect can be attributed to the improved ability to comprehend the acoustic signal rather than to the ability to read the written text in the help options, however, remains a moot point. In fact, overreliance on help options has been criticized as a potential distraction that encourages L2 learners to resort to reading instead of utilizing their listening skills.

Educational Innovations and Implications

Given the proliferation of technology in the increasingly multimedia-rich globalized world, multimedia-enhanced L2 listening instruction is becoming the norm rather than the exception. The omnipresence of mobile devices in daily lives and untrammeled access to the Internet have created unique affordances for educational innovations and ubiquitous learning, or learning outside of the spatial and temporal confinements of a language classroom. Nowadays, the web offers ample opportunities for the development of multimodal listening, including online resources for practicing L2 listening comprehension, tools for creating multimedia content and engaging in multimodal communication, and innovative technologies that can enable new types of interactive listening activities.

There is a plethora of online resources for the development of L2 listening comprehension, many of which are offered as open educational resources (OER) that are freely available for educational purposes. These resources include authentic multimedia materials such as TED Talks (https://www.ted.com/talks) and academic lectures from open courseware and projects such as Khan Academy (https://www.khanacademy.org), MIT OpenCourseWare (https://ocw.mit.edu/index.htm), and Open Yale Courses (https://oyc.yale.edu). Video podcasts (or vodcasts), which are short videos on specific topics, are another valuable resource for multimodal listening. Unlike TED Talks and academic lectures that have to be watched online, vodcasts can be subscribed to and downloaded for future listening or watching. Finally, there are websites with listening materials and activities designed specifically for language learning purposes, such as Randall’s ESL Cyber Listening Lab (https://esl-lab.com) and esl-lounge (http://www.esl-lounge.com).

In addition to using the existing multimedia resources for listening, language educators and learners can create their own multimedia listening materials. Technologies available for this purpose comprise (a) tools for recording and/or editing audio and multimedia content, such as Vocaroo (https://vocaroo.com), Audacity (https://www.audacityteam.org), and QuickTime (https://support.apple.com/quicktime); (b) mobile instant messaging applications that allow for recording and transmitting multimedia content, such as WhatsApp (https://www.whatsapp.com) and Viber (https://www.viber.com); (c) web-conferencing tools that enable multimodal communication, such as Skype (https://www.skype.com) and Appear.in (https://appear.in); and (d) platforms for multimodal interactive discussions, such as Flipgrid (https://flipgrid.com) and VoiceThread (https://voicethread.com).

More importantly, rapid technological progress is constantly creating affordances for innovations in L2 listening instruction and assessment. Augmented reality (AR) and virtual reality (VR) applications, for instance, can generate meaningful, contextualized, and place-based language learning experiences that immerse language learners in communicative scenarios simulating real-world situations. Furthermore, innovative open-source authoring tools such as H5P (https://h5p.org/) can be leveraged to create interactive multimedia content that can prompt L2 learners to complete specific tasks in an interactive video. Such H5P-enabled interactive videos can support the development of interactive listening skills and encourage L2 learners to use metacognitive processes and reflect on their own listening comprehension.

The theoretical perspectives and research evidence discussed in the previous section have several important implications for the use of multimedia in L2 listening. First, language instructors should integrate multimedia in L2 listening materials in order to create comprehensible input for their students. In doing so, instructors need to ensure semantic congruence between the visual and the verbal input, which – as research evidence suggests – has the strongest beneficial effect on L2 listening. Furthermore, multimedia integration can also enhance L2 learners’ noticing of certain aspects of the auditory input. However, while multimedia may look enticing and promising, its use does not automatically facilitate listening comprehension or enhance the development of L2 listening skills, and, consequently, multimedia should not be viewed as a panacea. It is therefore essential for instructors to teach their students how to increase their metacognitive awareness, how to listen effectively, and how to avail themselves of the multimedia help options, when the latter are available.

Although empirical evidence suggests that multimedia help options may be beneficial for L2 listening development – and particularly helpful for lower-level L2 learners who may need assistance with specific vocabulary and comprehension of connected speech – such help options should be used sparingly and only for scaffolding purposes. Overreliance on help options can be detrimental to the development of listening skills because it encourages reading at the expense of listening. It also does not represent what happens in most authentic communicative situations, where listening is primarily an acoustic activity that is not accompanied by a written version of the auditory input. Taking into consideration that multimedia help options may also pose additional cognitive constraints, language learners should be provided with an option to turn them off when they find them to be distracting or unhelpful.

When teaching L2 listening with multimedia, language educators should consider adopting an integrated model of teaching that supports the use of both bottom-up approaches (e.g., with a focus on phonetic and lexical recognition and segmentation) and top-down approaches (e.g., with a focus on metacognitive awareness and reflection). For instance, L2 learners can be asked to watch a video with subtitles that have omitted words and write down those omitted words while watching the video (a bottom-up approach). Learners can also be invited to watch an interactive video with embedded tasks that encourage them to answer a question about what they have just heard or reflect on their own listening comprehension (a top-down approach).
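The bottom-up subtitle activity described above can be prepared with a short script. The following is only a minimal sketch in Python, not a tool referenced in this entry: the function name `make_gapped_subtitles`, its parameters, and the rule of blanking every n-th word of at least a given length are all illustrative assumptions, and the pattern used to find words handles unaccented Latin letters only.

```python
import re


def make_gapped_subtitles(subtitle_lines, every_n=5, min_length=4, blank="_____"):
    """Blank out every n-th eligible word; return the gapped lines and an answer key.

    Hypothetical helper for illustration. A word is "eligible" if it has at
    least `min_length` characters, so short function words are never removed.
    """
    gapped_lines = []
    answer_key = []
    eligible_seen = 0  # running count of eligible words across all lines

    def replace(match):
        nonlocal eligible_seen
        word = match.group(0)
        if len(word) < min_length:
            return word  # too short: keep as-is
        eligible_seen += 1
        if eligible_seen % every_n == 0:
            answer_key.append(word)  # record the omitted word for the teacher
            return blank
        return word

    for line in subtitle_lines:
        # Assumption: words consist of ASCII letters and apostrophes only.
        gapped_lines.append(re.sub(r"[A-Za-z']+", replace, line))
    return gapped_lines, answer_key


if __name__ == "__main__":
    lines = ["Listening is a complex skill", "that plays a pivotal role"]
    gapped, key = make_gapped_subtitles(lines, every_n=2)
    for text in gapped:
        print(text)
    print("Answer key:", key)
```

The gapped lines would be shown as subtitles while learners watch the video, and the answer key kept for feedback. Using a fixed interval rather than random selection keeps the worksheet reproducible; an instructor might instead choose target vocabulary by hand.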

Another implication for the use of multimedia is the importance of interactive listening. With interactive listening being an active, rather than a passive, skill that requires the listener not only to “hear” the auditory message but also to engage with it and the interlocutor, the positive impact on the development of L2 listening comprehension is most prominent when learners actively interact with the multimedia input rather than just function as passive listeners. Interactive listening can be best effectuated through technology that enables the creation and delivery of interactive multimedia content.

There are also several important caveats regarding the use of multimedia in L2 listening assessment. First, whether to integrate multimedia in a listening test or not should depend on whether the target language use (TLU) domain – that is, a real-life context in which an L2 user would be expected to perform the same language tasks – utilizes visuals. For instance, if a particular assessment task aims to measure the test-takers’ ability to talk on the phone, no visuals should be included in the task. However, if the purpose of the assessment task is to test L2 learners’ ability to comprehend an academic lecture, integration of the visuals is essential for the authenticity of the task. Furthermore, language instructors need to make an informed decision regarding the types of visuals to be used for L2 listening assessment and have a valid justification for their choice of visuals. Visuals should be selected and/or designed carefully so that they resemble as much as possible the characteristics of the visual input an L2 learner would normally encounter in a TLU domain. For example, a video of a professor in front of the board containing some visual content is a more authentic representation of an academic lecture setting than a video of the professor’s talking head. Finally, language instructors should avoid the use of multiple-choice questions in multimedia-enhanced L2 listening assessment tasks designed to check their students’ listening comprehension. When answering multiple-choice questions, language learners oftentimes over-rely on the use of test-wiseness strategies such as guessing, a problem that is particularly prevalent among students with lower-level proficiency in the target language. Designing listening assessment tasks that elicit open-ended responses from L2 learners can help obviate this problem.
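To illustrate why guessing can inflate scores on multiple-choice items, the following minimal Python sketch computes what blind guessing alone would yield. It assumes an idealized test-taker who picks every answer at random with equal probability across options. This is a simplification introduced here for illustration, not a model taken from the studies cited above, and the function names are hypothetical.

```python
from math import comb


def expected_guessing_score(num_items, num_options):
    """Expected number of correct answers from blind guessing alone."""
    if num_items < 0 or num_options < 1:
        raise ValueError("num_items must be >= 0 and num_options >= 1")
    return num_items / num_options


def probability_of_at_least(min_correct, num_items, num_options):
    """Binomial probability of guessing at least `min_correct` items correctly."""
    p = 1 / num_options  # chance of a correct guess on a single item
    return sum(
        comb(num_items, k) * p**k * (1 - p) ** (num_items - k)
        for k in range(max(min_correct, 0), num_items + 1)
    )


if __name__ == "__main__":
    # Hypothetical test: 10 items, 4 options each.
    print(expected_guessing_score(10, 4))
    print(round(probability_of_at_least(5, 10, 4), 4))
```

Under these assumptions, a share of every multiple-choice score reflects chance rather than listening comprehension, whereas open-ended responses offer no comparable baseline for guessing.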

The final recommendation for language instructors is that any use of multimedia in L2 listening instruction should be learner-centered rather than technology-centered. In other words, when pursuing the idea of multimedia implementation in their teaching and assessment practices, language instructors should be driven not by the desire to incorporate multimedia because of some new capabilities of the latest cutting-edge technology but by a clear understanding of how – and to what extent – such capabilities can foster the development of their students’ L2 listening skills in a pedagogically sound way that is also in line with current theoretical perspectives on how the human mind processes multimedia input.

Undoubtedly, listening is indispensable for interpersonal and interpretive modes of communication. Considering the potential and perceived benefits of multimedia discussed above, its implementation in L2 listening instruction appears to be essential for improving L2 learners’ overall communicative competence.

References



  1. International Listening Association (ILA). (1995). An ILA definition of listening. The Listening Post, 53(1), 4–5.
  2. Mayer, R. E. (Ed.). (2005). The Cambridge handbook of multimedia learning. New York: Cambridge University Press.
  3. Mohsen, M. A. (2016). The use of help options in multimedia listening environments to aid language learning: A review. British Journal of Educational Technology, 47(6), 1232–1242.
  4. Ockey, G. J., & Wagner, E. (2018). Assessing L2 listening: Moving towards authenticity. Philadelphia: John Benjamins.
  5. Paivio, A. (1986). Mental representation: A dual coding approach. New York: Oxford University Press.
  6. Suvorov, R. (2015). The use of eye tracking in research on video-based second language (L2) listening assessment: A comparison of context videos and content videos. Language Testing, 32(4), 463–483.
  7. Vandergrift, L., & Goh, C. C. M. (Eds.). (2012). Teaching and learning second language listening: Metacognition in action. New York: Routledge.
  8. Wolvin, A. D. (Ed.). (2010). Listening and human communication in the 21st century. Oxford, UK: Wiley-Blackwell.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. University of Hawaiʻi at Mānoa, Honolulu, USA

Section editors and affiliations

  • Okim Kang
  • Alyssa Kermad (1)
  1. Languages, Literatures, and Cultures, Appalachian State University, Boone, USA