Introduction

According to the sociocultural theory of second language acquisition (SLA), social context and interaction mediate language learning and are therefore considered significant in the SLA process (Ellis 2008). Additionally, the person and the world are connected in an inseparable relationship (Lantolf 2005). Context-dependent social interaction is central to SLA because it provides L2 learners with the scaffolding essential for acquiring an L2 (Vygotsky 1978). Swain (2000) suggested that language learning occurs both inside the head of the learner and in the world in which the learner experiences the learning. In short, internal mediation (mental activity) originates in external mediation (Ellis 2008).

The emphasis in sociocultural SLA on the inseparability of external and internal mediation during context-dependent interaction is in line with the argument of embodied cognition. Embodied cognition emphasizes the formative role that the environment (context) plays in the development of cognitive processes (Cowart 2005), specifically the “interaction between perception, action, the body and the environment” (Barsalou 2008); this differs from the traditional perspective that the body plays a small role in cognition. Studies in line with embodied cognition have observed different roles of actions in cognitive processes and have suggested that the human mind is closely connected to sensorimotor experience. Several general theories of embodied cognition, such as those proposed by Glenberg et al. (Glenberg et al. 2004; Glenberg and Goldberg 2011; Glenberg and Kaschak 2002) and Barsalou (2008), argue that cognitive processes develop when a tightly coupled system emerges from interactions between organisms and their environment, with the interactions being real-time and goal-directed (Cowart 2005).

Findings on embodied language processing show that a person’s bodily sensations and actions affect how he or she comprehends language. This is consistent with the indexical hypothesis, which states that an understanding of language results from a simulation of the actions implied by the meaning of the sentence (Glenberg and Kaschak 2002). Accumulating evidence from embodied cognition research supports the argument that action enhances comprehension (Asher 1977; Glenberg and Goldberg 2011; Glenberg et al. 2004; Tellier 2008). In recent years, findings from brain research have also echoed the view that language processing is an embodied process (Aziz-Zadeh and Damasio 2008; Willems and Casasanto 2011), that is, that bodily action in the contextual environment and the person’s perceptual experiences are inseparable during the cognition process. Rueschemeyer et al. (2010) state that the brain resources that the motor system uses for intentional actions are also engaged in lexical-semantic processing and language comprehension. Additionally, the motor system is automatically activated in the following three situations: when a person (a) observes manipulable objects; (b) processes action verbs; and (c) observes the actions of another individual (Mahon and Caramazza 2008).

Virtual immersion environments, such as Second Life (SL, a multiuser virtual environment) or Massively Multiplayer Online Role-Playing Games (MMORPGs), have gained the attention of cross-disciplinary researchers (Lan 2014; Wang and Burton 2013) because they make both avatar-self movement and different interactions between the learner and the virtual environments possible (Lan et al. 2013). Thus, such environments have been recognized as providing learners with an embodied learning experience (Schubert et al. 1999). In contrast to controlling an avatar via a mouse or a keyboard (as in SL and MMORPGs), gesture-based technologies (such as PlayStation Move and MS Kinect), which mainly involve gestures or body motion, have also been widely used to support the physical effects on learning (Chao et al. 2013; Chang et al. 2013; Hung et al. 2014). However, the abovementioned embodied motion and the interaction obtained in a virtual environment (VE) are accomplished not via the learners’ physical bodies but via their avatars. Thus, we may wonder whether avatar-based embodied motions are strong enough to give rise to the essential internal mediation in learners’ brains and consequently to affect language comprehension and acquisition. More cross-disciplinary evidence is needed to answer this question and to add to the knowledge base of embodied cognition and language learning in virtual worlds.

This special issue aims to provide a platform for researchers to present studies that may offer insights into the relationship between virtually embodied cognition and language acquisition. The questions raised above remain open and worthy of further exploration. We expect that the publication of this special issue will help develop a further understanding of embodied cognition and language learning. After a rigorous review process, eight high-quality research papers have been accepted for publication in this special issue, and these papers explain the relationship between embodied cognition and language learning in VEs from different perspectives. We hope that these studies will inspire future research in this direction.

In the first paper, entitled “The effects of collaborative models in Second Life on French learning,” Hsiao, Yang, and Chu investigated the effects of employing different collaborative models on learners’ French performance and their perceptions about learning French. Twenty-three college students participated in the study. An analysis of student-created movies and the students’ collaborative process showed that virtually involving the students in context-inclusive collaboration can positively influence their speaking capability and inter-peer collaboration. In the second paper, entitled “Second language acquisition of Mandarin Chinese vocabulary: context of learning effects,” Lan, Fang, Legault, and Li built three learning contexts (zoo, supermarket, and kitchen), in both the virtual and the traditional web-based environments, for learners of Chinese as a foreign language. Thirty-one monolingual English speakers, randomly assigned to the VE and the traditional learning contexts, participated in a training study in which they learned 90 Chinese words. After the experiment, data on the participants’ behavioral performance with regard to accuracy, reaction time, and exposure were collected and analyzed using analysis of variance and mixed-effects modeling. The results showed a larger acceleration in the learning trajectory for the participants in the VE context than for those in the traditional learning context. The results suggest that simulated embodied experience in the VE may have aided vocabulary acquisition in a second language. In the third paper, entitled “A scaffolding strategy to develop handheld sensor-based vocabulary games for improving students’ learning motivation and performance,” Huang and Huang confirmed the effect of embedding a scaffolding strategy in a handheld sensor-based vocabulary game on the learning motivation and performance of students of English as a foreign language. They found that, with the support of scaffolding, the low-achieving students’ motivation and performance in vocabulary learning improved significantly. Next, in the fourth paper, Pasfield-Neofitou, Huang, and Grant investigated the relationship between virtual embodiment and language learning. Through two case studies, they found that the multimodal communication established by the participants in virtual worlds blurred the distinction between participants’ real identities and their virtual avatars. In other words, the boundaries between the real world and the VE are highly permeable, with students moving freely between the two, and the actions of their avatars in the VE having a direct and real cognitive impact on the students themselves.

In the fifth paper, Lin, Chao, and Huang investigated the effects of one of the most discussed affective factors, anxiety, on college students’ learning of the Japanese language. They developed an intelligent affective tutoring system that recognizes the facial expressions of learners of Japanese in order to provide them with adequate feedback. The results show that the proposed system benefits the learning of Japanese by reducing learning anxiety and improving learning effectiveness. Unlike the preceding papers, which take general languages (Chinese, English, French, or Japanese) as their target languages, in the sixth paper, Hung, Hsu, Chen, and Kinshuk investigated the effects of a situated embodiment-based strategy with flag semaphore on learning sign languages. A total of 60 college students with no experience in learning flag semaphore participated in their study. The results show that the proposed strategy with situated embodiment-based learning effectively improved participants’ procedural knowledge construction and enhanced their attention level, with a lower extrinsic cognitive load in the learning process. In the seventh paper, entitled “Vocabulary learning in massively multiplayer online games: context and action before words,” Zheng, Bischoff, and Gilliland investigated how vocabulary was learned by FL learners in World of Warcraft (WOW), a massively multiplayer online game, by analyzing both chat and avatar action data obtained from a 2-hour co-play session between two players. The analysis offers readers an alternative explanation of how players embodied in their avatars appropriated semiotic resources imbued in WOW and made vocabulary learning salient in this context. Last but not least, in the eighth paper, Hong, Hwang, Tai, and Lin investigated the relationship among self-efficacy, competitive anxiety, and gameplay interest of elementary students in a one-on-one competition setting. A total of 278 fifth and sixth graders participated in their study. The results show that both self-efficacy and gameplay interest are negatively associated with students’ competitive anxiety.

The abovementioned papers will likely provide readers with a deep and extensive understanding of the relationship between embodied cognition and language learning in VEs. The target languages covered include Chinese, English, French, and Japanese, as well as sign and non-verbal languages. Moreover, the participants were drawn from a range of ages, from elementary school students to college students. The investigated variables relating to language learning are varied, including performance, motivation, anxiety, self-efficacy, and sense of identity. Although the papers included in this special issue have covered a broad range of issues in embodied cognition and language learning in virtual worlds, there are other issues that may further attract researchers’ attention in the future. For example, longitudinal studies are needed, rather than short-term studies lasting several hours, days, or one or two months. Additionally, the analysis of learners’ learning behaviors in virtual worlds relating to SLA, such as their communication patterns and strategy usage, is essential. As argued by Lan (2015), learning a language in virtual worlds can improve learners’ performance and motivation and provide them with authentic contexts for conducting embodied and game-like learning activities that meet people’s growing experiences in the digital era. Researchers’ efforts to explore the potential of virtual worlds for SLA will be worthwhile because such worlds present a potential solution to the problems encountered in today’s second language education.