Learning across media in a second language

The present study investigated the impact of the medium on learning in L2. Specifically, learning performances from L2 material were compared across three media: text, video, and subtitled video. The participants were 126 undergraduate students who were randomly assigned across three conditions: a text condition, a video condition, and a subtitles condition. First, students were asked to complete three questionnaires on control variables. Second, students were asked to read/watch a learning material and answer comprehension, recall, transfer, and calibration questions twice: immediately and a week after. Results reveal that the participants in the video condition outperformed those in the text condition in delayed comprehension and recall. Overall, learning performances were substantially equivalent across media when assessed immediately after processing the material, but subtitled videos had the potential to boost deeper learning performances only in highly skilled learners.


Introduction
Learners across the world rely on the web to complete assigned projects and study. Around 61% of websites whose content language is known are in English (W3Techs, n.d.), so most learning materials can be expected to be produced in this language as well. This means that non-native English speakers access most of the learning material available on the web in their second language (L2). For instance, in the popular online encyclopedia Wikipedia, most of the articles are written in English (6,313,265 articles, Wikimedia, n.d.), more than double the articles written in the second most popular language, German (2,584,986). Moreover, since the popularization of video-sharing platforms (e.g., YouTube in 2005), videos have become a major medium for accessing educational material (Belt & Lowenthal, 2021). However, most of the available platforms for educational videos are mainly designed for English-speaking learners (e.g., Coursera, TED, or Khan Academy).
Despite the exponentially growing literature on digital and multimedia learning, it is still unclear to what extent, and in what direction, the medium (i.e., a channel or system of communication, information, or entertainment, such as text or video) used to access the material influences learning performance (Salmerón et al., 2020; Wannagat et al., 2017). The L2 issue calls into question the role of subtitled videos, as non-native English speakers may activate subtitles (often also in English) when watching educational videos in L2. The issue is relevant for inclusivity (Lambert, 2020), especially considering that L2 presentation of academic material decreases content learning (Roussel et al., 2017). Indeed, the Web Content Accessibility Guidelines 2.0 (World Wide Web Consortium, 2008) prescribe subtitles for any video published on the web to ensure accessibility to diverse audiences and support non-native English speakers.
The present study investigated the impact of the medium on learning in L2. Learning performances from L2 material were compared across three media: text, video, and subtitled video.

Learning in L2
The increased internationalization of higher education institutions (see European cooperation programs, study abroad programs, educational mobility, joint degrees, and MOOCs) has stimulated the growth of multilingual learning environments (Henderikx & Jansen, 2018). Students are increasingly exposed to educational texts and videos in L2 (oftentimes English). Learning in L2 may not involve cognitive processes in the same way as learning in L1 does.
The most prominent model for learning from text is Kintsch's (1998) foundational model, according to which three types (or levels) of memory representations of the text can be constructed by the reader: surface level, textbase level, and situation model level. The surface level is a representation of the words included in the text on the basis of decoding processes. The textbase level is a representation of the network of concepts and propositions included in the text. The situation model level is a coherent representation of the events described by the whole text, which requires the integration of textual information with prior knowledge. The situation model is formed through different types of inference.
When reading in L1, the construction of a coherent representation of the text is a manageable task given that word identification processes are automatic and require little cognitive effort (Tomasello, 2000). When reading in L2, instead, processes are less automatic, even for highly proficient bilinguals (MacWhinney, 2001). Lack of automaticity leads to the consumption of cognitive resources (Hasegawa et al., 2002), which, in turn, means that fewer cognitive resources are left to construct higher discourse-level representations (Rai et al., 2011). According to the competition model (Bates & MacWhinney, 1982; MacWhinney, 2001), L2 reading is even more complicated for late L2 learners as transfer and interference from L1 always occur to some degree (Grabe & Stoller, 2011), even for proficient L2 readers (MacWhinney, 2001).
The most prominent model for learning from video (and text with images) is Mayer's cognitive theory of multimedia learning (CTML; Mayer, 2002), which is based on Paivio's dual-coding theory (1991). According to the CTML, learners process multimedia by coordinating dual channels for visual/pictorial and auditory/verbal processing. Each channel has limited resources to dedicate to processing; thus, learners should select relevant information and organize it in a coherent representation that integrates the verbal and pictorial representations with each other and, in turn, with their prior knowledge. Narrated videos are processed in the auditory-verbal channel for their (oral) textual component and in the visual-pictorial channel for their pictorial component (Mayer, 2002). While videos have the potential to boost learning performances (e.g., showing authentic situations, demonstrating procedures, providing a narrative for understanding complex phenomena; Derry et al., 2014), they may also pose challenges to students, especially if produced in L2. L2 students first need to integrate unfamiliar speech features presented to the audio channel (e.g., speech rate and prosody) and then integrate them with other unfamiliar features across the audio and visual channels (e.g., vocabulary, syntactic structure).
Learning from videos in L2 calls listening comprehension into question. As a comprehension process, listening shares many important processes with reading (Kintsch, 1998). Indeed, according to the Simple View of Reading model (Hoover & Gough, 1990), reading is based on oral language comprehension processes, in interaction with word identification processes. On the other hand, listening is a more cognitively demanding process than reading (Vandergrift & Goh, 2011). As a real-time and transient process, listening cannot be reviewed if comprehension is lost and allows little control over the pace (Vandergrift & Baker, 2015).
Subtitled videos (or on-screen-texted videos) differ from narrated videos as they are processed in the visual-verbal and in the visual-pictorial channels (if the audio is missing; otherwise, they involve the auditory-verbal channel too). Subtitled videos also differ from static texts as they offer learners fleeting text on a dynamic background. Thus, learners have to adjust their reading pace to the pace with which subtitles appear on the screen. While same-language subtitles have the potential to improve students' learning processes (Matthew, 2020), this result may not extend to subtitles in L2.
L2 subtitled videos have been found to have positive effects on language learning (Montero Perez et al., 2013); however, this effect may not transfer to content learning in L2 (van der Zee et al., 2017). The few studies that investigated this issue have shown that students have better learning performances after watching L2 videos when subtitles are enabled (Hayati & Mohmedi, 2011; Markham, 1999).

Learning in L2 Across Media
When learning across media, students are asked to integrate text and graphics information into coherent mental models (Hochpöchler et al., 2013). The research on learning across media in L1 is characterized by contradictory findings. Some evidence suggests that videos are more effective than texts as they reduce cognitive load (Mayer, 2002) and increase learners' attention (Alley et al., 2014) and affective engagement (Yadav et al., 2011). Conversely, other studies suggest that these effects do not transfer to effective learning (Caspi et al., 2005). Finally, some evidence hints at a substantial equivalence between videos and texts in terms of learning if videos are interactive, thus handing the viewer more control over the processing, just as happens with texts (Merkt et al., 2011).
Integrating text and graphics can be more complex when learning in L2. Concerning the comparison between videos with and without subtitles, Chan et al. (2020) compared learning performances in L2 in undergraduate students assigned to the following conditions: video with foreign-accented narrated voice and full-text subtitles, video with foreign-accented narrated voice and summarized subtitles, video with foreign-accented narrated voice without subtitles, video with native-accented narrated voice and full-text subtitles, video with native-accented narrated voice and summarized subtitles, and video with native-accented narrated voice without subtitles. According to the findings, subtitles hindered learning performances when compared to the no-text conditions. Interestingly, subtitles showed a significant negative impact on transfer accuracy but not on retention accuracy, suggesting that on-screen texts only negatively impact deeper processing of the materials and transfer of knowledge to problem-solving in a new context. Negi and Mitra (2022) randomly assigned participants (16-18 years old) to L1 subtitles, L2 subtitles, and video conditions. According to the results, the subtitles conditions were characterized by higher learning gains than the video condition. No differences between the two subtitles conditions were found.
Concerning the comparison between the text and the video condition, Schroeders et al.'s (2010) study compared viewing comprehension and reading comprehension (listening comprehension was included too) in high school students, although the dependent variable was L2 competence and not content learning. The authors found a high correlation between viewing and reading comprehension, a result that was interpreted as evidence in favor of a higher-order ability to comprehend content regardless of the sensory input (Buck, 2001; Schroeders et al., 2010).
Still concerning the comparison between videos with or without subtitles, past studies have shown that L2 learners spend a significant amount of time looking at the subtitles when learning from videos (43%; Kruger et al., 2014). Subtitles are supposed to be beneficial when learning from videos in L2 as reading comprehension skills are generally more developed than listening comprehension in L2 students (Danan, 2004). In a study on the effects of subtitles on learning from online educational resources, no significant effect was found, contradicting the lines of research supporting a beneficial or detrimental effect of subtitles. Moreover, L2 competence did not moderate the effect of subtitles (van der Zee et al., 2017). Lee and Mayer (2018) investigated learning from video in L2 across three media: narrated video, subtitled video, and subtitled narrated video. According to their results, providing subtitles was associated with better performances than the other two conditions. According to the authors, on-screen text, which is detrimental in L1 learners, becomes useful for L2 learners as it gives them more time to process unfamiliar or difficult-to-encode words.
The number of studies that investigated learning performances in L2 across different media is very low and, to the best of our knowledge, no previous study has compared learning from text, narrated video, and subtitled video. A study with university students was only focused on L1 (Tarchi et al., 2021). It revealed a substantial equivalence across conditions (digital text, narrated video, same-language subtitled video) when questions were asked immediately after the learning phase, whereas the subtitled condition was associated with lower performances for deeper comprehension a few weeks after the learning phase.
The media effect on learning may depend on prior knowledge. According to the cognitive theory of multimedia learning (Mayer, 2002), design principles that are effective for low-knowledge students may not work well for high-knowledge students. For instance, while low-knowledge students may benefit from a picture-plus-text presentation of learning material, high-knowledge students may learn better when presented with diagrams only (Mayer, 2002). This phenomenon is also known as the expertise reversal effect (Kalyuga, 2007).
According to the concept of level or depth of processing, which finds its roots in cognitive psychology originally developed for L1 (Craik & Lockhart, 1972) but then also applied to L2 (e.g., Leow & Mercer, 2015), remembering information depends on the depth of information processing, besides the attention paid during its occurrence. If learners process incoming information in L2 using their prior knowledge and employing cognitive effort, they are more likely to retain such information (Leow & Mercer, 2015). Thus, according to this theory, prior knowledge plays an even more crucial role in L2 than it does in L1.

The Present Study
The main aim of the present study was to investigate the effect of media on students' performances when learning academic content in L2. Specifically, we compared the effect of three media (text, narrated video, and subtitled video) on measures of comprehension, recall, and transfer both immediately after being exposed to the learning material (immediate assessment) and a week after (delayed assessment). Moreover, we assessed the effect of the medium on students' calibration, that is, the contrast between predicted and actual performance (Alexander, 2013). Indeed, nowadays, students perceive themselves as digital natives and may over-judge their competences in learning from digital sources (List, 2018), but this preference may not transfer to better performances (Singer & Alexander, 2017). In the subtitled condition, the audio was removed to increase the equivalence across the three conditions. Indeed, we were interested in verifying the effect of the medium given only one source of verbal information.
When investigating content learning in L2, three variables need to be taken into consideration: students' perceived competence, as it represents a motivational resource for strategic and sustained effort in learning (Liu, 2013), L2 competence (Leow & Mercer, 2015), and prior knowledge. These last two variables may interact in influencing comprehension performances. Indeed, L2 competence is directly associated with content learning in L2, but its effect may be moderated by students' prior knowledge (Leow & Mercer, 2015). It is unclear, however, whether this moderation effect may depend on the medium in which learners process the material.
Overall, the following research questions were investigated:
RQ1: Does the medium in which the learning material is presented influence students' comprehension, recall, transfer, and calibration of performance immediately after watching/reading the material (immediate assessment)?
RQ2: Does the presentation medium of the learning material influence students' comprehension, recall, transfer, and calibration of performance a week after watching/reading the material (delayed assessment)?
RQ3: Does the medium in which the learning material is presented moderate the interaction between L2 competence, prior knowledge, and students' performances in the immediate assessment, the delayed assessment, or both?
Concerning RQ1, we expected the subtitled video condition to be associated with better learning performances than the narrated video condition, as more evidence has been found in support of the beneficial effect of subtitles hypothesis (Danan, 2004; Lee & Mayer, 2018) compared to the detrimental effect of subtitles hypothesis (van der Zee et al., 2017). Moreover, based on previous studies in L1, we expected the text condition to be associated with better learning performances than the subtitled video condition (Tarchi et al., 2021). In contrast, no substantial differences in performance between the text and the narrated video conditions were expected, as suggested by previous studies conducted in L2 (Buck, 2001; Schroeders et al., 2010). We also expected worse calibration of performance in the narrated video condition as compared to the other two conditions (List, 2018; Singer & Alexander, 2017).
Concerning RQ2, some studies suggest that effects in a delayed assessment may differ from those in an immediate assessment when it comes to comparing learning performances across conditions (Tarchi et al., 2021). Following the reasoning outlined for RQ1, we expected the beneficial effect of the subtitled condition as compared with the other conditions to be higher in the delayed assessment than in the immediate assessment.
Finally, concerning RQ3, past studies suggested that a moderation effect of prior knowledge on the association between L2 competence and learning performances should be expected (Leow & Mercer, 2015). Moreover, past studies on multimedia learning suggested that the effect of media on learning may not be equivalent in students with different levels of prior knowledge (Kalyuga, 2007; Mayer, 2002). Given this, we expected a moderating effect of condition on the moderation exerted by prior knowledge on the association between L2 competence and comprehension. However, no specific hypothesis could be formulated on whether these interactions differ across conditions (see Fig. 1).

Participants
The participants in the study were 126 undergraduate students enrolled in a public university in central Italy (mean age = 23.40 ± 2.88; 83 females, 40 males, 1 preferred not to declare the gender, 2 did not choose any option). Students were enrolled in different bachelor's and master's degree courses. All participants were Italian and spoke Italian as their primary language. The study followed all the indications of the Declaration of Helsinki (World Medical Association, 2013) and was approved by the Ethics Committee of the University of Florence (Italy). Participation was anonymous. The data of two students were excluded from the statistical analysis as they reported having a learning disorder. Our sample size was justified by an a priori power analysis performed in G*Power (Faul et al., 2007), based on α = 0.05, 1 − β = 0.85, and an estimated medium effect size (f = 0.25).
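As a reading aid for the effect size used in the power analysis, the relation between Cohen's f and the proportion of explained variance (η²) can be written out. This is a standard conversion rather than a step reported by the authors, and the test family selected in G*Power is not stated in the text.

```latex
f = \sqrt{\frac{\eta^{2}}{1-\eta^{2}}}
\quad\Longleftrightarrow\quad
\eta^{2} = \frac{f^{2}}{1+f^{2}},
\qquad
f = 0.25 \;\Rightarrow\; \eta^{2} = \frac{0.0625}{1.0625} \approx 0.059
```

Under this conversion, the assumed medium effect corresponds to condition explaining roughly 6% of the variance in an outcome.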

Procedure
The data were collected online and remotely through the platform Qualtrics. The participants received a link to the study and could complete the tasks autonomously. The data were collected over 2 weeks in October 2020. The participants were randomly distributed across three conditions: a text condition (n = 41), a video condition (n = 42), and a subtitle condition (n = 41). First, students were asked to complete three questionnaires on control variables. Second, students were asked to read/watch a learning material, answer a series of multiple-choice and open-ended questions, and judge their performance. A week later, students were asked the same multiple-choice and open-ended questions again. Through the analysis of reading/viewing times, we verified that participants did not pause or rewind the video to improve their understanding of the content.
Of note, control variable measures were assessed in L1 (Italian), except for the L2 reading and listening comprehension test (English). The learning materials and the assessment questions were presented in L2 (English). The questions to assess students' judgment of comprehension (i.e., calibration error) were asked in L1. See the supporting materials for the tests and texts given to the participants.

Learning material
Students were assigned a material about the topic of stress and memory. This topic was relevant for some of the participants' area of study (e.g., psychology); thus, we included a prior topic knowledge test. The original source was a TED-Ed video (https://www.youtube.com/watch?v=hyg7lcU4g8E). The video discussed the stages of how memory stores information and how short-term stress impacts this process. The video was created as an animated slideshow with an embedded narrating voice presenting information in English. In the video condition, students were provided with the original video, which was 4 min and 43 s long. The narration included 712 words. In the subtitles condition, the audio track was removed, and subtitles were added, reproducing the exact content of the original audio track, in sync with the corresponding slide. The amount of text in each slide was similar to closed-captioned videos (1-2 lines of text). In the text condition, participants received a text to read which reproduced the exact content of the original audio track (712 words). To maintain equivalence across conditions, in the text condition we also included 24 significant images from the video. Students were not encouraged to take notes or implement any strategy while viewing/reading the learning material.

Outcome variables
Immediate comprehension, recall, and transfer
After reading/watching the learning material, participants were asked to answer a series of multiple-choice and open-ended questions (see supporting material). The questions were designed by two professors in psychology, experts in the topic. To assess immediate comprehension, we asked 14 literal comprehension questions (e.g., "Corticosteroids are: A. Hormones; B. Neurotransmitters; C. Type of brain cells; D. Organs") and five inferential multiple-choice questions (e.g., "Suggesting someone to think harder may: A. Decrease their retrieval performance; B. Increase their retrieval performance; C. Act as a facilitator for memorization; D. Be a useful strategy for memorization"). The reliability of this measure was acceptable (α = 0.74).
To assess immediate recall, three open-ended questions were asked: "How does stress affect the three stages of memory?", "Why doesn't some stress help us to remember facts?", and "How can physical exercise regularly affect your memory when taking a test?". Each answer was coded by two independent raters, who achieved a high inter-rater agreement (k = 0.97). All disagreements were discussed and resolved. Each answer received a score from 0 to 2: 0 points were awarded for incorrect answers; 1 point was awarded for partially correct answers (in which some key elements were mentioned, whereas some others were neglected); 2 points were awarded for correct and complete answers. For instance, answers to the first question ("How does stress affect the three stages of memory?") were awarded two points if the participants mentioned that "moderate stress can actually help experiences enter your memory" and "even though some stress can be helpful, extreme and chronic stress can have the opposite effect." If only one of these elements was mentioned, the answer was awarded 1 point. The reliability of this measure was acceptable (α = 0.70). The scores obtained for each answer were summed to calculate a composite score (range = 0-6).
To assess immediate transfer, four open-ended questions were asked: "How may long-term stress impact learning?", "Does stress influence memory in a time-dependent manner?", "What happens when we are presented with completely new information that does not relate to any of our current memories?", and "Why do you think our memory performs well under controlled amounts of stress but then gets worse as stress levels rise?". Each answer was coded by two independent raters, who achieved an acceptable inter-rater agreement (k = 0.92). All cases of disagreement were discussed and resolved. Each answer received a score from 0 to 3: 0 points were awarded for incorrect answers; 1 point was awarded for partially incorrect answers (in which some elements from the material were vaguely used); 2 points were awarded for partially correct answers (in which only one key element from the material was used for reflection); 3 points were awarded for correct answers (in which all the relevant elements from the learning material were used for reflection). For instance, answers to the second question ("Does stress influence memory in a time-dependent manner?") were awarded three points if the participant referred to "stress long before learning," "consolidation of information," and "memory encoding" in their answer. Two points were awarded if only some of these elements were mentioned and elaborated. One point was awarded if these elements were mentioned but not elaborated. The questions were asked in the following order: first comprehension, then recall, and lastly transfer questions. The reliability of this measure was acceptable (α = 0.71). The scores obtained in each question were summed to calculate a composite score (range = 0-12).
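The inter-rater agreement reported above can be illustrated with a short sketch. It assumes that "k" refers to Cohen's kappa for two raters, which the text does not state explicitly; the function name and the example codes are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical codes to the same answers."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of answers")
    n = len(rater_a)
    # Observed agreement: share of answers given the same code by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over codes.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical code throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes (0, 1, or 2 points) for ten answers; illustration only.
scores_rater_1 = [0, 1, 2, 2, 1, 0, 2, 1, 1, 2]
scores_rater_2 = [0, 1, 2, 2, 1, 0, 2, 1, 2, 2]
kappa = cohens_kappa(scores_rater_1, scores_rater_2)
```

In this made-up example the raters disagree on one answer out of ten, which gives an observed agreement of 0.90 and a chance-corrected kappa below that value.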

Delayed comprehension, recall, and transfer
The same questions were asked again to the participants one week later. The following measures achieved acceptable levels of reliability: delayed comprehension (α = 0.70) and delayed transfer (α = 0.73). Although the reliability for the delayed recall was lower than desirable (α = 0.68), it can still be considered within the acceptable range for measures developed and used for research purposes (Nunnally, 1978).
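For readers less familiar with the reliability coefficient reported for these measures, the following is a minimal sketch of Cronbach's α computed from item-level scores. The data layout (one row per participant, one column per item) and the example scores are assumptions made for illustration; they are not the study's data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one row per participant, one column per item."""
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_variances = [
        variance([row[i] for row in item_scores]) for i in range(n_items)
    ]
    # Sample variance of the participants' total (summed) scores.
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores (0-2 points) of five participants on three recall questions.
example_scores = [
    [2, 2, 1],
    [1, 1, 0],
    [2, 1, 2],
    [0, 0, 0],
    [1, 2, 1],
]
alpha = cronbach_alpha(example_scores)
```

The coefficient increases when items covary strongly relative to their individual variances, which is why short scales with few items, such as the three recall questions, tend to yield lower values.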

Calibration error
To assess calibration error, we followed a standard procedure (Schraw, 2009). The participants were asked to judge on a 1-10 scale the level of correctness of their answers to the questions asked after having read/watched the learning material (0 = no correct answer; 10 = all questions are correctly answered). The difference between judgment of comprehension and correct answers in the immediate comprehension test was calculated to determine the calibration error (calibration error = judgment of comprehension − comprehension performance). This procedure was followed in both assessment stages to calculate the immediate calibration error and the delayed calibration error.
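The calibration error defined above can be sketched as a small function. How the number of correct answers was placed on the same metric as the judgment is not described in the text, so the linear rescaling to the judgment scale used here is an assumption, and the function and parameter names are hypothetical.

```python
def calibration_error(judgment, n_correct, n_questions, scale_max=10):
    """Signed calibration error: judged performance minus actual performance.

    Assumption (not stated in the source): the number of correct answers is
    rescaled linearly to the judgment scale before taking the difference.
    """
    if n_questions <= 0 or not 0 <= n_correct <= n_questions:
        raise ValueError("n_correct must lie between 0 and n_questions")
    performance = n_correct / n_questions * scale_max
    return judgment - performance


# Hypothetical participant: judges 8 out of 10, answers 12 of the 19
# comprehension questions correctly.
error = calibration_error(judgment=8, n_correct=12, n_questions=19)
```

With this sign convention, positive values indicate overestimation of one's own performance and negative values indicate underestimation.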

Control variables
Perception of competence in L2 (i.e., English)
This variable was assessed through four items to be rated on a 6-point Likert scale (1 = minimum, 6 = maximum). Students were asked to self-report their perceived competence in generic L2 reading comprehension, topic-specific L2 reading comprehension, learning from L2 textbooks, and learning from L2 videos. A principal component analysis was performed to extract a composite score for participants' overall perception of competence in learning in L2 [KMO = 0.87; Bartlett sphericity test, χ² = 591.70, df = 15, p < 0.001].

Competence in reading and listening comprehension in L2
Reading and listening comprehension in English were assessed through two IELTS (International English Language Testing System) tests, which are designed to assess the language ability of candidates who need to study or work where English is the language of communication. The reading test presented the participants with a 767-word text followed by 14 questions in different formats (multiple-choice, yes/no, grid) (see supporting material). The listening test presented the participants with an audio recording 7 min and 38 s long, followed by eight questions in different formats (multiple-choice, yes/no, grid). The order of presentation of these two tests was counterbalanced across participants.

Prior (topic) knowledge
Prior knowledge was assessed through 10 multiple-choice questions on the topic of stress and memory (e.g., "Which of the following one is a cognitive consequence of stress? A. frustration and aggressivity; B. scarce memory; C. heart attack; D. alcohol or drug abuse"). The questions were designed by two professors in psychology, experts in the topic. Although the reliability for the prior knowledge test was modest (α = 0.58), reliability estimates in the 0.50s can still be considered within the acceptable range for measures developed and used for research purposes (Nunnally, 1978).

Results
The descriptive results are reported in Table 1 and Table 2. The correlational analysis is reported in Table 3.
The correlational analysis confirmed an involvement of control variables in immediate and delayed outcomes. All the immediate outcomes (comprehension, recall, and transfer) were positively associated with the three control variables (L2 perceived competence, L2 competence, and prior knowledge). Among the delayed outcomes, comprehension was positively associated with all three control variables, recall was positively associated with L2 competence, and transfer was positively associated with L2 competence and prior knowledge. Immediate calibration error was negatively associated with prior knowledge, whereas delayed calibration error was negatively associated with L2 competence.

RQ1: Learning medium and immediate outcomes
To answer the first research question, we conducted an ANCOVA with condition included as a factor, perceived competence in L2, competence in L2 reading and listening comprehension, and prior knowledge as covariates, and outcome measures as dependent variables (see Table 4). The ANCOVA model was significant for all the outcome variables, except for calibration error. However, the condition was not significantly associated with any of the outcome measures. The perceived competence in L2 was significantly associated with immediate comprehension only. The competence in L2 reading and listening comprehension was significantly associated with immediate comprehension, recall, and transfer. Prior knowledge was significantly associated with all the immediate outcomes. Overall, these results suggest that learning performances in L2 immediately after processing the material are not influenced by the medium. Conversely, it is competence in L2 and, to a minor extent, prior knowledge that contribute to participants' performance.
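The structure of the analysis described above can be summarized in a model equation. This is a sketch under two assumptions that the text does not confirm: that one univariate model was fitted per outcome, and that covariates entered as linear main effects without interactions with condition. The symbols are introduced here for illustration only.

```latex
Y_{i} = \mu + \alpha_{c(i)} + \beta_{1}\,\mathrm{PC}_{i} + \beta_{2}\,\mathrm{L2}_{i} + \beta_{3}\,\mathrm{PK}_{i} + \varepsilon_{i},
\qquad c(i) \in \{\text{text},\ \text{video},\ \text{subtitles}\}
```

Here Y is one outcome (comprehension, recall, transfer, or calibration error), α is the effect of the assigned condition, and PC, L2, and PK stand for perceived competence in L2, competence in L2 reading and listening comprehension, and prior knowledge. A non-significant condition effect corresponds to failing to reject the hypothesis that the three α terms are equal once the covariates are taken into account.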

RQ2: Learning medium and delayed outcomes
To answer the second research question, we conducted an ANCOVA with condition included as a factor, perceived competence in L2, competence in L2 reading and listening comprehension, prior knowledge, and performance in the immediate assessment as covariates, and outcome measures as dependent variables (see Table 5). The ANCOVA models were statistically significant for all the outcome measures. Condition was significantly associated with delayed comprehension and recall. The perceived competence in L2 was significantly associated with delayed calibration error only. The competence in L2 reading and listening comprehension was significantly associated with both delayed comprehension and calibration error. Prior knowledge was not significantly associated with any of the outcome variables. Moreover, each delayed outcome was positively associated with its respective immediate outcome.
The post hoc tests confirmed that the participants in the video condition outperformed those in the text condition in delayed comprehension (mean difference = 1.01, p = 0.02) and delayed recall (mean difference = 0.68, p = 0.01). All the other comparisons between conditions were not statistically significant. Overall, the findings indicate that learning from videos in L2 is associated with better delayed comprehension and recall than learning from texts. Moreover, while perceived competence in L2 leads to an overestimation of learning performance, actual competence in L2 was once again positively associated with learning performance.

RQ3: The moderation effect of medium on the interaction between L2 competence, prior knowledge, and learning outcomes
To answer the third research question, a moderated moderation analysis was conducted with the PROCESS macro for SPSS (Hayes, 2012). We estimated whether condition moderated the moderation effect of prior knowledge on the association between L2 competence and immediate (see Table 6) and delayed learning outcomes (see Table 7). The model with immediate comprehension as the dependent variable was statistically significant, R² = 0.45, F(7, 111) = 11.63, p < 0.001, and the interaction between the independent variable and the two moderators was significant too, R² change = 0.03, F(1, 111) = 4.99, p = 0.03. Prior knowledge moderated the effect of L2 competence on immediate comprehension in the text (β = −0.18, F = 10.34, p = 0.002) and video conditions (β = −0.09, F = 7.18, p = 0.01) but not in the subtitles condition (β = −0.003, F = 0.003, p = 0.95). Specifically, in the subtitles condition, L2 competence was associated with immediate comprehension regardless of prior knowledge levels. In the text and video conditions, prior knowledge compensated for low levels of L2 competence in immediate comprehension (see Fig. 2).
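For readers unfamiliar with moderated moderation, the model can be written as a regression with a three-way interaction. The notation below is our own shorthand rather than the study's: X is L2 competence, W is prior knowledge, Z is condition, and Y is the learning outcome. We assume here that condition enters as a single coded variable, which would be consistent with the seven numerator degrees of freedom reported for the immediate models; the text does not state the coding explicitly.

```latex
\begin{aligned}
Y &= b_0 + b_1 X + b_2 W + b_3 Z + b_4 XW + b_5 XZ + b_6 WZ + b_7 XWZ + \varepsilon, \\
\theta_{XW \mid Z} &= b_4 + b_7 Z .
\end{aligned}
```

Under this form, the "R² change" test evaluates the contribution of the single term b_7 XWZ, and the coefficient reported for each condition corresponds to the conditional X by W interaction at that value of Z.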
The model with immediate recall as the dependent variable was statistically significant, R² = 0.40, F(7, 111) = 8.65, p < 0.001, and the interaction between the independent variable and the two moderators was significant too, R² change = 0.07, F(1, 111) = 4.99, p = 0.001. Prior knowledge moderated the effect of L2 competence on immediate recall in the text (β = −0.07, F = 8.01, p = 0.01) and subtitles conditions (β = 0.06, F = 5.01, p = 0.03) but not in the video condition (β = −0.01, F = 0.29, p = 0.59). Specifically, in the video condition, L2 competence was associated with immediate recall regardless of prior knowledge levels. In the text condition, high prior knowledge compensated for low L2 competence. In the subtitles condition, prior knowledge did not compensate for L2 competence deficits, but it boosted learning performances in students with high levels of L2 competence (see Fig. 3).
The model with immediate transfer as the dependent variable was statistically significant, R² = 0.27, F(7, 111) = 4.45, p < 0.001, but the interaction between the independent variable and the two moderators was not significant, R² change = 0.03, F(1, 111) = 3.05, p = 0.08.
The model with delayed comprehension as the dependent variable was statistically significant, R² = 0.65, F(8, 111) = 21.61, p < 0.001, but the interaction between the independent variable and the two moderators was not significant, R² change = 0.002, F(1, 111) = 0.60, p = 0.44.
The model with delayed recall as the dependent variable was statistically significant, R² = 0.43, F(8, 111) = 7.87, p < 0.001, but the interaction between the independent variable and the two moderators was not significant, R² change = 0.02, F(1, 111) = 3.29, p = 0.07.
The model with delayed transfer as the dependent variable was statistically significant, R² = 0.45, F(8, 111) = 8.04, p < 0.001, and the interaction between the independent variable and the two moderators was significant too, R² change = 0.03, F(1, 111) = 4.58, p = 0.04. Prior knowledge moderated the effect of L2 competence on delayed transfer in the subtitles condition (β = 0.11, F = 6.57, p = 0.01) but not in the text (β = −0.03, F = 0.43, p = 0.52) or video conditions (β = 0.04, F = 1.76, p = 0.19). Specifically, in the text and video conditions, L2 competence was not associated with delayed transfer at any of the prior knowledge levels. In the subtitles condition, prior knowledge did not compensate for L2 competence deficits, but it boosted learning performances in students with high levels of L2 competence (see Fig. 4). Overall, the results confirm that in most cases the learning performance of university students with varying levels of L2 competence and prior knowledge is influenced by the medium. As a general trend, prior knowledge can compensate for low L2 competence levels only in the text condition, and sometimes in the video condition, but never in the subtitles condition.
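To make the compensation pattern concrete, the sketch below computes conditional effects from a regression with a three-way interaction, Y = b0 + b_x·X + ... + b_xw·XW + b_xz·XZ + b_xwz·XWZ, where X is L2 competence, W is prior knowledge, and Z is condition. Everything in it is hypothetical: the class and function names, the coefficient values (chosen only to mimic the shape of the immediate comprehension results), and the coding of condition as 1 = text, 2 = video, 3 = subtitles with W centered.

```python
from dataclasses import dataclass

@dataclass
class Coefs:
    """Illustrative coefficients only; not estimates from the study."""
    b_x: float    # L2 competence (X)
    b_xw: float   # X by prior knowledge (W)
    b_xz: float   # X by condition (Z)
    b_xwz: float  # three-way interaction X by W by Z

def conditional_interaction(c: Coefs, z: float) -> float:
    """X-by-W interaction at a given condition value z."""
    return c.b_xw + c.b_xwz * z

def simple_slope(c: Coefs, w: float, z: float) -> float:
    """Effect of L2 competence on the outcome at prior knowledge w and condition z."""
    return c.b_x + c.b_xw * w + c.b_xz * z + c.b_xwz * w * z

# Hypothetical values; condition is ASSUMED to be coded 1 = text, 2 = video, 3 = subtitles.
coefs = Coefs(b_x=0.5, b_xw=-0.27, b_xz=0.0, b_xwz=0.09)

for name, z in (("text", 1), ("video", 2), ("subtitles", 3)):
    low = simple_slope(coefs, w=-1.0, z=z)   # prior knowledge 1 unit below the mean
    high = simple_slope(coefs, w=1.0, z=z)   # prior knowledge 1 unit above the mean
    print(f"{name:9s} XW = {conditional_interaction(coefs, z):+.2f}  "
          f"slope(low PK) = {low:.2f}  slope(high PK) = {high:.2f}")
```

A negative X-by-W term means that L2 competence matters less as prior knowledge increases, which is the compensation pattern; a term near zero means that L2 competence matters equally at every prior knowledge level; a positive term means that prior knowledge amplifies the benefit of high L2 competence.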

Discussion
All around the world, higher education institutions are relying on digital resources at an increasing pace. Most digital educational resources are developed in English, which is a second or foreign language for a high percentage of students inside and outside English-speaking countries. Moreover, providing students with academic content in L2 is considered a way to increase the internationalization of colleges and universities. However, digital educational resources can take many forms (texts, texts with pictures, narrated videos, subtitled videos, and the like), and it is not clear to what extent learning performances depend on the medium. The present study aimed to address this issue by comparing immediate and delayed learning performances when learning academic content in L2 in three media: text, video, and subtitled video.
The first research question asked whether the medium in which the academic content is presented influences students' learning. Our hypothesis was not confirmed, as learning performances did not differ across media. Students' perceived competence in L2 was involved at a surface level of learning (i.e., immediate comprehension), whereas L2 competence and prior knowledge were involved at deeper levels of learning (i.e., immediate recall and transfer). The result is consistent with the idea that students are becoming more expert in learning across media, at least when their performances are assessed immediately after exposure to the content. Conversely, the results also confirm that students are not accurate in judging their own competence across domains. Although perceived competence in L2 and L2 competence were correlated, the former did not contribute to learning outcomes at deeper levels. However, students' difficulty in judging their competence did not differ across media, contrary to what was hypothesized for the first research question.
The second research question focused on delayed learning. Condition contributed at the comprehension and recall levels. Surprisingly, the video condition was associated with better learning performances than the text condition, whereas we expected a substantial equivalence. The result can be interpreted in light of the cognitive theory of multimedia learning (Mayer, 2002). Both conditions presented verbal and pictorial information; however, the text condition engages only the visual channel, so that verbal and pictorial information compete for its resources, whereas in the video condition the information is split between the visual and auditory channels. This may have a facilitating effect when learning in L2, which is particularly resource-demanding compared with learning in L1. Indeed, videos presenting academic content in L2 have the potential to boost learning performances (e.g., by showing authentic situations or demonstrating procedures) if properly designed (Derry et al., 2014). Once again, condition was not associated with calibration error.
The impact of the control variables on learning outcomes decreased in the delayed assessment. Perceived competence and prior knowledge were not significantly associated with delayed learning performances, and L2 competence was associated only with surface levels of learning. Of course, their variance may have been absorbed by the inclusion of immediate learning outcomes as covariates in the ANCOVA models.
The third research question was based on the notion that reading processes in L2 are not as automatic as they are in L1. This leads to higher consumption of cognitive resources, which are then not available for the deeper elaboration of the learning material (Hasegawa et al., 2002). This effect applies to proficient L2 learners too (MacWhinney, 2001). However, high levels of prior knowledge may moderate this effect (Leow & Mercer, 2015) by compensating through a more automatic integration of information into existing schemas. We investigated whether this pattern is influenced by the medium in which students are learning. The results substantially confirmed the compensation hypothesis (prior knowledge moderates the association between L2 competence and learning outcomes) and offered evidence supporting differences across conditions. The compensation hypothesis was verified for immediate comprehension and recall but not for transfer, which may be too cognitively demanding for compensation to occur. Moreover, the compensation effect emerged for the text and video conditions but not for the subtitles condition. Subtitled videos stand out as the most cognitively demanding medium for students lacking either L2 competence or prior knowledge.

Limitations and future research
When interpreting the findings of the current study, some limitations should be taken into account. Firstly, reading texts is a naturally self-paced process, whereas narrated and subtitled videos have a system-imposed pace that the learner must override. This may represent a confounding variable that could be resolved either by presenting dynamic text to learners in the text condition or by prompting learners to pause and play the video to adjust it to their pace. For instance, Merkt et al. (2011) demonstrated that the effect of interactive videos on learning is at least comparable to that of print. In other words, instructional videos may have a detrimental effect on learning because they reduce the amount of control the recipient exerts on information processing.
Secondly, to ensure a high equivalence across conditions, we presented the participants with a subtitled video without audio. Most of the videos available online provide learners with both informational channels; thus, future studies should include this condition in their analyses. According to the redundancy effect hypothesis, having two sources of verbal information (oral and written) may overload the learners' cognitive system and hinder comprehension. Indeed, on-screen text (i.e., subtitles) may compete with visual information from the animation and/or with the narrated text (Zheng et al., 2022).
Finally, learning material in L2 can vary in several other aspects that may affect performances, besides the ones investigated in the present study. For instance, processing a foreign accent requires additional effort in elaborating information. Thus, we could expect students to perform worse when learning with a foreign-accented narration than with a native narration without on-screen text (Chan et al., 2020).
Overall, evidence for an effect of the medium on learning seems to be limited to experimental studies, which may lack ecological validity. Future research should provide more insight into the effect of the medium on learning from materials in L2 in naturalistic settings.

Conclusions
The present study contributes to the issue of learning across media in L2. Our results confirmed the substantial equivalence of learning performances across media when the assessment takes place immediately after reading/watching the learning material. However, it is noteworthy that students' perception of their own competence is associated only with shallow levels of comprehension, whereas deeper levels of learning are related to their actual L2 competence and prior knowledge. This raises the question of what the thresholds in students' competences and knowledge should be for them to deeply learn academic content in L2. When learning performances were assessed a week after exposure to the learning material, the narrated video condition was associated with better learning performances in L2. This is a comforting result, as educational videos are increasingly present in the syllabi of courses all over the world. Videos in L2 often offer the possibility of activating subtitles, following the indications of the Web Content Accessibility Guidelines 2.0 (WCAG, 2008). Videos with subtitles in L2 have the potential to boost learning performances, but this seems to apply only to highly skilled learners.
Funding Open access funding provided by Università degli Studi di Firenze within the CRUI-CARE Agreement.

Fig. 1 Expected moderated moderation model

Table 5
Results of ANCOVAs for delayed comprehension, recall, transfer, and calibration

Fig. 2 Plot of the moderated moderation analysis for immediate comprehension

Fig. 3 Plot of the moderated moderation analysis for immediate recall

Fig. 4 Plot of the moderated moderation analysis for delayed transfer

Table 4
Results of ANCOVAs for immediate comprehension, recall, transfer, and calibration error

Table 6
Results of the moderated moderation analysis on immediate comprehension, recall, and transfer