Encyclopedia of Educational Innovation

Living Edition
Editors: Michael A. Peters, Richard Heraud

Technology and Second-Language Listening

  • Paul Gruba
  • Ruslan Suvorov
Living reference work entry

DOI: https://doi.org/10.1007/978-981-13-2262-4_142-1

Introduction: A Brief Historical Overview

For millennia, humans have used all of their senses to understand their environment and to communicate through touch, sight, smell, taste, and hearing. The increasing use of technology, particularly from the Industrial Age to the present, has focused ever greater attention on the concept of “listening.” Stethoscopes, for example, were invented through a realization that the body could express symptoms through sound, and it took nearly a century of socialization for audiences to learn to “listen” in silence during musical performances (Hendy 2013). Technology, then, forged a “new type of listening” that accelerated with the invention of audio recording and the telephone in the late nineteenth century. Listening became a distinct skill that needed its own pedagogy and materials, and educators soon began to stress the need for attentive listening to master difficult concepts. By the turn of the twentieth century, audio recordings were introduced to classrooms, and soon the widespread use of radio was common in educational settings (Hendy 2013).

Listening instruction posed a particular challenge for the growing numbers of immigrants, or second-language (L2) learners, and this challenge spurred innovation. Technology allowed exact phrases to be repeated and speech to be carefully analyzed, and it provided exposure to a variety of speech styles and voices. By the 1930s, for example, the Walt Disney Studios had created educational films intended solely for use with non-native speakers of English. Second and foreign language education made use of television services, including closed-circuit broadcasts that were introduced in the mid-1940s. For the most part, though, language teachers remained skeptical of “telecourses” for the next few decades until the introduction of accessible video equipment in the late 1970s and early 1980s. One key benefit of video-based instruction, according to language educators, was the marked improvement of L2 listening comprehension skills. Meanwhile, the invention of reel-to-reel tapes, cassette tapes, and CDs in the second half of the twentieth century continued to fuel the popularity of language labs and the use of the audiolingual method, which emphasized listening comprehension practice through pattern drills and repetition (Jones 2008).

Since the introduction of the World Wide Web to the general public in the early 1990s, digital audio and video media have become increasingly popular; in early 2019 on one large video-sharing site, YouTube, over one billion hours of content were watched daily across 91 countries in 80 languages (YouTube for Press n.d.). With pervasive and ubiquitous mobile computing having gradually become an integral part of many people’s daily lives, language learners now have diverse opportunities for listening outside of the classroom in the form of podcasts and streaming multimedia. Contemporary virtual reality and augmented reality now blur the boundaries between digital and real-world experiences, which may afford more opportunities for second-language listening education in ways yet to be explored.

The Complex Nature of Second-Language Listening

Listening is widely recognized as a complex cognitive activity comprising neurological, linguistic, semantic, and pragmatic processing (Rost 2011). While there is no unanimously recognized definition of listening, perhaps the most widely agreed view is that listening can be understood as “the process of receiving, constructing meaning from, and responding to spoken and/or nonverbal messages” (International Listening Association [ILA] 1995, p. 4), with recognition of the role of memory in this process.

Several approaches have been proposed to explain the process of second-language listening. According to the bottom-up processing approach, a listener processes the acoustic input by first analyzing individual phonemes; then combining them into words, phrases, and sentences; and, ultimately, using verbal information to build more complex mental representations. In the top-down processing approach, processing of the input starts with the activation of background knowledge, or schema activation, and entails using this knowledge to help decode the acoustic signal. The third approach, called an interactive or integrated approach, combines the processes underlying the first two approaches and is considered to be more flexible in explaining individual variation in second-language listening (Flowerdew and Miller 2005).

The complexity of listening as a language skill is further compounded by the existence of multiple factors that affect listening comprehension. These factors can be divided into four broad categories: (a) individual listener characteristics such as working memory, overall proficiency in the target language, motivation, anxiety, and metacognitive strategies; (b) individual speaker characteristics such as speech rate, accent, pauses, and gestures; (c) text characteristics such as text length, text type, text complexity, and the use of visuals; and (d) task characteristics, including control of the number of replays, time limit for completing the task, type of expected or required response, and ability to take notes during the task. Unsurprisingly, another layer of complexity in second-language listening is created by technology itself.

Technology in Second-Language Listening

Advances in technology continue to change how second-language listening is practiced, taught, assessed, and researched. The introduction and widespread use of personal mobile devices have encouraged language learners to listen to various types of content on the go and practice their skills outside of the classroom. As listening is understood to be developed alongside speaking, learners can now leverage web-conferencing and VoIP technologies for audio- and video-mediated interaction in the target language. The availability of a wide variety of technologies has also engendered a new generation of active language learners who own, personalize, and appropriate technologies to suit their individual learning needs (Conole 2008).

New technology continues to enhance opportunities for L2 listeners. First, contemporary learners now access web-based resources for listening that expose them to different contexts, speakers, accents, rates of speech, content types, and levels of text complexity. Second, technology offers them a range of enhanced capabilities for interacting with the media and for listening support, including manipulation of the speed of the audio/video track, embedded interactive elements in the video such as comprehension questions, links to external web resources, automatic subtitles, and speech-to-text translation. A third benefit is the relative ease of recording, storing, and streaming media content such as podcasts and vodcasts using personal devices. Furthermore, technology provides a plethora of affordances for L2 listening education. Examples of such affordances (or mediating characteristics; Hubbard 2017) include archiving and indexing of sound files, streaming and downloading media files, pausing and replaying, and transforming text to speech.

To better understand the role of technology in second-language listening, this topic can be examined from three perspectives: technology for the learning and teaching of L2 listening, technology for L2 listening assessment, and technology for research on L2 listening.

Technology for the Learning and Teaching of L2 Listening

Because it offers instructors the ability to play, pause, replay, and distribute listening texts, technology has a prominent position in the learning and teaching of second languages. Teachers can draw on global sites such as YouTube or Vimeo to source authentic materials for use in the classroom. Furthermore, motivated and self-directed students themselves can now locate and use existing content for autonomous learning activities, such as curated collections of podcasts hosted on iTunes or similar platforms.

Teachers can also make use of technology to create, edit, and publish their own listening resources that are tailored to fit a particular purpose. For example, a teacher may record a series of differing accents as a way to expose L2 listeners to the need to adjust their skills to a variety of speakers. Audio files are often created using open-source tools such as Audacity that allow for recording and editing audio files. A wealth of commercial tools exist, too. For the creation of video-based listening materials, educators must take into account what platform will host the video content (hosting), whether the video content should be available for downloading or streaming (format), whether access to the content should be restricted in some way (access), as well as how the content will be used over the long term (sustainability).

Generally speaking, listening for the purposes of L2 instruction can be characterized along three main factors: degree of formality, options for interaction, and number of speakers who are involved in the event. For example, a recorded academic lecture would likely have a formal register, be presented by a single lecturer, and have few, if any, options for interaction and questioning. On the other hand, a listener at a lively party may encounter lots of informal talk among many people and be expected to interact in ways that correspond to the topics being discussed. A range of listening options could be devised using these three factors. Other factors such as age, class, and gender could be taken into account as well when searching for or creating new listening materials.
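As a rough illustration of how a range of listening options could be devised from these three factors, the sketch below enumerates every combination of them. The factor names come from the text; the specific levels for each factor (for instance "limited" interaction or "two" speakers) and the names ListeningEvent and all_listening_events are assumptions made purely for this example.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical levels for each factor. The entry names the three factors
# but does not prescribe these particular values.
FORMALITY = ("formal", "informal")
INTERACTION = ("none", "limited", "open")
SPEAKERS = ("one", "two", "many")


@dataclass(frozen=True)
class ListeningEvent:
    """One listening situation described along the three factors."""
    formality: str
    interaction: str
    speakers: str


def all_listening_events():
    """Return every combination of the assumed factor levels."""
    return [
        ListeningEvent(f, i, s)
        for f, i, s in product(FORMALITY, INTERACTION, SPEAKERS)
    ]


# The two examples from the text, expressed with the assumed levels.
lecture = ListeningEvent("formal", "none", "one")
party = ListeningEvent("informal", "open", "many")

events = all_listening_events()
```

A materials designer might use such a grid as a checklist, noting which combinations are already covered by existing recordings and which still need to be sourced or created.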

Beyond relatively simple audio and video recordings, current applications allow instructors to create or leverage existing immersive virtual reality experiences, augmented reality experiences, and interactive multimedia texts such as video materials that have embedded captions, external weblinks, interactive activities, and assessment tasks. The potential of listening materials to be augmented has eased the production of a number of “help options” that can assist listeners with developing their proficiency in ways that align with the principles of second-language acquisition. Such options, for example, may help students to “notice” particular linguistic features and include glosses, captions, subtitles, transcripts, and annotations, as well as “cultural notes” that may provide an explanation of the appropriate meanings of utterances, phrases, or actions within a specific context (Mohsen 2016).

Technology for L2 Listening Assessment

When used for assessment purposes, technology takes on roles that may differ from those inherent in learning and teaching. Specifically, technology now becomes a much more prominent factor in the “ecosystem” of listening in that its very use and presence may influence assessment outcomes. Recordings must now be of professional quality and have elements in them (such as specific terms or phrases, types of accent, appearance of an actor) that have been identified as salient to the determination of a listening score and relevant to the construct that the assessment instrument aims to measure. When using technology to create listening prompts, a series of factors need to be considered. These factors may require making decisions on whether to record fully scripted or unscripted texts, what types of speech varieties to use, and what types of visuals, if any, to integrate in the prompts in order to increase the authenticity of the listening assessment. If listening is part of a larger construct of oral communication, integrated and interactive oral tasks will need to be developed.

Because of the inherent variability of listening texts in factors ranging from rate of speech to choice of topic and linguistic complexity, test developers require that the material used to prompt a test candidate be closely analyzed. Briefly, the material must provide a sufficient basis on which to demonstrate a certain level of proficiency, yet be generic enough that any resulting score is reliable and valid in other contexts. While test developers may decide, for instance, to use video texts instead of audio texts as prompts in order to increase the authenticity of the assessment instrument and ensure that its design takes into account the characteristics of the target language use domain, the use of visuals may introduce additional context-specific variables that threaten the validity of inferences and interpretations that can be made about L2 learners’ overall listening abilities on the basis of their test scores.

Once the test material has been produced in accord with test specifications, its delivery will need to involve technology, too. Compliance with standards such as the Web Content Accessibility Guidelines (WCAG) becomes a greater issue when the assessment is to be used across global networks, and aspects of human-computer interaction design may also need to be taken into account to ensure that the test is well designed. The security of computer-delivered tests is another important issue that must be addressed during test development and test administration.

Technology in L2 Listening Research

In the field of L2 education, and its associated discipline of applied linguistics, there has been relatively little research about listening compared to other traditional language skills. Of the research that does exist, much has involved the use of recorded materials with a focus on the pedagogical or assessment implications of listening comprehension. Given this focus, researchers have sought to ensure that their work with participants can be repeated; that is, recorded materials have been needed to isolate variables or factors that may influence listening abilities. Both audio and video technologies are used as modes of presentation in L2 listening research.

Research designs in L2 listening studies often seek to compare one element to another in an effort to determine which of them most influences listening comprehension. Such studies have compared, for instance, the extent to which listener performances are affected by the presence or absence of help options such as captions, subtitles, and transcripts; other studies have examined the ways audio texts may affect listener ability when set against video texts. A longitudinal view of comparison studies used in listening research, however, reveals that their results are oftentimes mixed and inconclusive. Factors such as small sample sizes, poor construct definitions, and the inability to disentangle multiple variables have plagued efforts in the field.

Second-language listening research involves attention to the interaction of three factors: the listener, the text, and the task. Second-language listeners are often screened for research purposes on the basis of their language proficiency, first language and cultural background, and level of education. The text is scrutinized on the basis of topic, complexity, and presumed level of difficulty (unlike reading research, no common metric exists to set the level of listening text difficulty). Finally, a task must be constructed such that it stimulates a response from the listener in relation to the text; task construction is grounded in designs that align with second-language acquisition principles, including how an element may be noticed by a listener. Technology, often computer based, is used in the presentation of both the text and the task, informed by work in human-computer interaction and related fields.

When researching pedagogical goals, researchers who make use of digital audio technologies often seek to better understand how aural information is best presented to L2 listeners; that is, their work may isolate variables related to speed, accent, complexity, or tone such that any variation in one factor may cause comprehension difficulties. Researchers using digital video often seek to investigate how, and why, visual elements may influence the way a listener interprets a message. Outcomes of such research not only spur theory development but also contribute to the development of learning and teaching materials.

Concerns in research on the assessment of L2 listening often center around the issues of practicability, validity, and reliability. Listening subsections of large-scale commercial language tests, such as the Test of English as a Foreign Language (TOEFL) or the International English Language Testing System (IELTS), provide audio texts and a static photograph to test candidates. Their use of audio files is justifiable in that small files are more practical for global instruments and, importantly, measures of validity and reliability are more sound with a reduced set of variables. Research on the use of audiovisual texts as a mode of presentation in listening assessment has produced conflicting results, and it is fair to say that further research on listening to audiovisual texts is sorely needed.

Listening research involves technology at each phase of its development. At the start, researchers use technology to record material that they think is suitable to the intended participants. After recording, some editing may be required to trim mistakes, select key elements, or even add captions to an audiovisual text. Technology, often in the form of a personal computer along with headphones, is then used to present the material to individual participants. The listeners also use the computer to respond to listening tasks. From there, the collected data are analyzed with the use of quantitative and/or qualitative analysis tools, and the results are written up for publication.

Technology also plays a pivotal role in validation research on L2 listening. Eye tracking, for instance, can be leveraged to gather validity evidence based on language learners’ response processes. Specifically, this technology can be used to record direct evidence of the elements of the audiovisual text that language learners were looking at during the listening assessment, how and to what extent they interacted with individual test items, and how much time they spent completing each listening task. Eye-movement recordings can also be used as a stimulus to retrospectively elicit information about the test-taking strategies or cognitive processes that language learners utilized while taking the listening test.
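One common way to summarize what learners were looking at is to total fixation time per "area of interest" on the screen. The sketch below shows that idea under stated assumptions: the Fixation and AreaOfInterest record layouts, the screen coordinates, and the sample values are all hypothetical and are not taken from any particular eye tracker or study.

```python
from collections import defaultdict
from typing import NamedTuple


class Fixation(NamedTuple):
    """A single fixation: screen position in pixels and duration in ms."""
    x: float
    y: float
    duration_ms: float


class AreaOfInterest(NamedTuple):
    """A named rectangular screen region (hypothetical layout)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fx):
        return self.left <= fx.x < self.right and self.top <= fx.y < self.bottom


def dwell_time_by_area(fixations, areas):
    """Sum fixation durations per area; unmatched fixations count as 'outside'."""
    totals = defaultdict(float)
    for fx in fixations:
        for area in areas:
            if area.contains(fx):
                totals[area.name] += fx.duration_ms
                break
        else:
            totals["outside"] += fx.duration_ms
    return dict(totals)


# Illustrative values only: a video region above a test-item region.
areas = [
    AreaOfInterest("video", 0, 0, 800, 450),
    AreaOfInterest("items", 0, 450, 800, 600),
]
fixations = [
    Fixation(100, 100, 200),
    Fixation(400, 500, 300),
    Fixation(900, 100, 150),
    Fixation(10, 10, 50),
]
dwell = dwell_time_by_area(fixations, areas)
```

With these sample values, the video region accumulates 250 ms, the test items 300 ms, and 150 ms falls outside both regions. Real eye-tracking exports typically need further steps, such as fixation detection and calibration checks, before a summary like this is meaningful.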

In little over a century, the use of technology has become increasingly intertwined with the concept of listening, and now the fast pace of globalization has fostered an ever-increasing interest in L2 skills in areas of learning, assessment, and research. Trends in each domain of listening now align with those in the wider field of educational technology and include a stronger emphasis on social interaction, the latest technological innovations such as augmented and virtual reality, and the use of global resources.



References

  1. Conole, G. (2008). Listening to the learner voice: The ever changing landscape of technology use for language students. ReCALL, 20(2), 124–140.
  2. Flowerdew, J., & Miller, L. (2005). Second language listening: Theory and practice. New York: Cambridge University Press.
  3. Hendy, D. (2013). Noise: A human history of sound and listening. New York: Ecco.
  4. Hubbard, P. (2017). Technologies for teaching and learning L2 listening. In C. A. Chapelle & S. Sauro (Eds.), The handbook of technology and second language teaching and learning (pp. 93–106). Oxford: Wiley-Blackwell.
  5. International Listening Association (ILA). (1995). An ILA definition of listening. The Listening Post, 53(1), 4–5.
  6. Jones, L. C. (2008). Listening comprehension technology: Building the bridge from analog to digital. CALICO Journal, 25(3), 400–419.
  7. Mohsen, M. A. (2016). The use of help options in multimedia listening environments to aid language learning: A review. British Journal of Educational Technology, 47(6), 1232–1242.
  8. Rost, M. (2011). Teaching and researching listening (2nd ed.). Harlow: Pearson.
  9. YouTube for Press. (n.d.). Retrieved March 27, 2019, from https://www.youtube.com/yt/about/press/

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. University of Melbourne, Melbourne, Australia
  2. University of Hawaiʻi at Mānoa, Honolulu, USA

Section editors and affiliations

  • David Parsons (The Mind Lab, Auckland, New Zealand)