Participants in Experiment 2 were 140 Japanese university learners of English. Of these 140 students, data from 32 were excluded because of their error rates on the semantic or form processing tasks during the learning session. Further, data from 19 participants were excluded because of their error rates on the LDT. Although this attrition rate is somewhat higher than in previous studies of PLE, it is important to note that these individuals were learning novel L2 words, which is considerably more difficult than what the earlier L1 studies required. Data from the remaining 89 participants are included in the following analyses (25 participants in the semantic processing condition, 31 participants in the form processing condition, and 33 participants in the control condition). The background questionnaire revealed that most of them started learning English at school when they were 12 or 13 years old, and the majority of them had at least 6 years of formal instruction in English (see Footnote 3). Only one participant reported having studied English for only 4 years. Their TOEIC scores indicated that their English proficiency was at the intermediate level. These results suggest that the participants in Experiment 2 belonged to a population similar to that of Experiment 1. The background information of the participants in this experiment appears in Table 4.
The power analysis for Experiment 2 indicated that a sample size of 27 in each condition allowed for the detection of small to moderate (.2 or larger) effects with 80% power. Although the sample size for the semantic condition falls slightly below this value (25 rather than 27), we accepted this small difference for present purposes, especially given the nature of the results (see Results section).
Stimuli and design of the experiment
This experiment consisted of a learning session followed by a testing session. In the learning session, there were three word-learning conditions: semantic processing, form processing, and control. In both the free and cued recall tests, we expected that participants focused on form would recall more L2 words, as they would devote a greater percentage of resources to processing L2 word forms. Conversely, we expected that individuals in the semantic learning condition would recall more L1 words as they would focus on meaning, rather than form. Following the recall testing session, there were two counterbalanced conditions in the masked form priming experiment: List 1 and List 2, as was the case with Experiment 1. Therefore, there were six between-subject conditions in total. The stimuli used in Experiment 1 were also used in this experiment. Additional stimuli used in the learning session were selected as follows:
Semantically related words to the 24 prime words, which were used in the semantic processing task, were chosen by ensuring that semantic relatedness existed between the two words in the WordNet lexical database (Princeton University, 2010) or Longman Roget’s Thesaurus dictionary. When there were several candidates, words whose familiarity ratings were relatively high in Yokokawa’s (2006) English word familiarity rating list for Japanese learners of English were chosen to better guarantee that these semantically related words were known by the participants.
Formally related words used in the form processing task were the same as target words in the masked form priming LDT.
Distractor words in the semantic and form processing tasks (described later, in the account of the learning session) were words that were similar neither to the semantically related words nor to the formally related words, but that had the same word length as the corresponding semantically or formally related words and similar levels of familiarity (Yokokawa, 2006) and frequency (Balota et al., 2007).
The participants were randomly assigned to one of the six between-subject conditions. The counterbalancing in the testing session was achieved in the same way as in Experiment 1.
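The crossing of three learning conditions with two priming lists can be sketched as follows. This is a minimal illustration only: the function name `assign_to_cells`, the seed, and the round-robin dealing of shuffled participants are assumptions made for the example, since the text states only that assignment was random.

```python
import itertools
import random

LEARNING_CONDITIONS = ("semantic", "form", "control")
PRIMING_LISTS = ("List 1", "List 2")

# 3 learning conditions x 2 counterbalancing lists = 6 between-subject cells.
CELLS = list(itertools.product(LEARNING_CONDITIONS, PRIMING_LISTS))


def assign_to_cells(participant_ids, seed=None):
    """Randomly assign participants to the six cells (hypothetical helper).

    Assumption: participants are shuffled and then dealt round-robin, so
    cell sizes differ by at most one. The source does not describe the
    actual randomization procedure.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: CELLS[i % len(CELLS)] for i, pid in enumerate(ids)}


assignment = assign_to_cells(range(140), seed=0)
sizes = {cell: 0 for cell in CELLS}
for cell in assignment.values():
    sizes[cell] += 1
print(sizes)
```

Unequal final group sizes, such as those reported here, can still arise after exclusions even when the initial assignment is balanced.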
Apparatus and procedure
The same apparatus as in Experiment 1 was used in this experiment. As described above, the experiment consisted of a learning session and a testing session. Before the learning session began, the participants were instructed to remember 24 English words and their L1 translations as well as possible. They were told that they could use any strategy to learn them except reading them aloud or writing them down. They were also informed that they would be tested afterwards on the L2 words and the corresponding L1 translations but were not told exactly what kinds of tests they would take.
Figure 2 displays a schematic of the learning phase. The learning session was composed of four blocks with 24 trials in each block. Each trial consisted of two subparts for each to-be-learned word: the study phase and the judgment phase. In the study phase, a to-be-learned English word and its Japanese translation were shown on a PC monitor for five seconds (e.g., stow 詰める). Then, the judgment phase began. For the semantic processing task, two English words were shown on the PC monitor (e.g., wife pack), and the participants were asked to indicate which word was more similar in meaning to the to-be-learned word by pressing the right or left control button on the keyboard. For the form processing task, two English words were shown on the monitor (e.g., wife stop), and the participants were asked to indicate, during the 2.5-second judgment phase, which word was more similar in form to the to-be-learned word, again by pressing the right or left control button on the keyboard. The distractor word was the same in both groups (e.g., wife). The participants in the control group did not have any additional task, and a blank screen was shown for 2.5 seconds. After the 2.5-second judgment phase, the study phase for the next to-be-learned word began, and so on. The RTs and error rates of the judgments were measured for each trial.
Participants first completed a practice session with six words in order to familiarize themselves with the procedure of the learning session. After the practice session, the main experiment began. The participants were exposed to each to-be-learned word four times (24 words × four blocks = 96 trials in total). The order of presentation was pseudo-randomized. That is, the 24 pairs of English to-be-learned words and their Japanese translations were randomly presented to the participants within each block, and the order of the four blocks was also randomized for each participant. Further, for the semantic and form processing tasks, the position of the two judged words presented on the monitor was balanced so that words that were semantically or formally similar to the to-be-learned words were shown on the right and left sides equally often.
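The pseudo-randomization described above might be sketched as follows. This is an illustration under stated assumptions, not the presentation software actually used: the helper `build_learning_trials`, the placeholder word labels, and the choice to balance left/right position within each word (two blocks per side) are all hypothetical, as the text says only that the related word appeared on each side equally often.

```python
import random


def build_learning_trials(words, n_blocks=4, seed=None):
    """Return n_blocks shuffled blocks of (word, side_of_related_option).

    Assumption: side is balanced per word, i.e. the related option is on
    the left in half of the blocks and on the right in the other half.
    """
    if n_blocks % 2 != 0:
        raise ValueError("n_blocks must be even to balance left/right")
    rng = random.Random(seed)

    # Pre-assign a balanced, shuffled side sequence to every word.
    sides_by_word = {}
    for word in words:
        sides = ["left", "right"] * (n_blocks // 2)
        rng.shuffle(sides)
        sides_by_word[word] = sides

    blocks = []
    for b in range(n_blocks):
        order = list(words)
        rng.shuffle(order)  # random order of the 24 words within a block
        blocks.append([(w, sides_by_word[w][b]) for w in order])

    rng.shuffle(blocks)  # random order of the blocks per participant
    return blocks


# Placeholder labels standing in for the 24 to-be-learned words.
words = [f"word{i:02d}" for i in range(1, 25)]
blocks = build_learning_trials(words, seed=1)
print(sum(len(block) for block in blocks))  # 96 trials in total
```

Calling the function with a different seed for each participant yields a different within-block order and block order while keeping the left/right balance intact.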
After the learning session, the testing session began. The testing session consisted of two kinds of tests: four recall tests and the masked form priming LDT. The participants first completed L2 and L1 free recall for 2 minutes each and then L1-to-L2 and L2-to-L1 cued recall for 4 minutes each. For the L2 free recall, they were instructed to remember and write down as many English to-be-learned words as possible in any order, whereas for the L1 free recall, they were instructed to remember and write down as many Japanese translations of the to-be-learned words as possible in any order. For the L1-to-L2 cued recall, the 24 Japanese translations of the to-be-learned words were displayed on the monitor, and the participants were asked to write down the corresponding English equivalents, whereas for the L2-to-L1 cued recall, the 24 English to-be-learned words were shown on the monitor, and they were asked to write down the corresponding Japanese translations. Additionally, for the L2-to-L1 cued recall, the participants were asked to indicate whether there were any to-be-learned words that they had known before the experiment. Any words so indicated were excluded from the analyses. Note that the free recall tasks were designed to assess the extent to which participants had learned new word form versus new word meaning (free recall in L1 relying largely if not completely on semantically oriented learning and free recall in L2 relying largely if not completely on form-oriented learning), whereas the cued recall tasks were included as measures of vocabulary learning overall, given that they rely on the mapping component of L2 vocabulary learning.
After the four recall tests, the masked form priming LDT was administered. For this task, the 24 to-be-learned words in the learning session (e.g., stow) were used as primes. Their orthographic neighbors (e.g., STOP) were used as targets. As in Experiment 1, the primes were presented for 67 ms.
Data were not included in the analyses if (1) error rates were more than 10% for the judgment phase in the semantic or form processing tasks (15 participants in the semantic processing group and 17 participants in the form processing group); (2) error rates were more than 20% in the LDT (five participants in the semantic processing group, four participants in the form processing group, and 10 participants in the control group); (3) participants indicated they had prior knowledge about to-be-learned words (37 observations in the total 2,136 responses, 1.73%).
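The exclusion counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative.

```python
# All counts are taken from the text; names are illustrative only.
recruited = 140
excluded_judgment = {"semantic": 15, "form": 17}            # criterion (1)
excluded_ldt = {"semantic": 5, "form": 4, "control": 10}    # criterion (2)
final_n = {"semantic": 25, "form": 31, "control": 33}

n_judgment = sum(excluded_judgment.values())   # 32 participants
n_ldt = sum(excluded_ldt.values())             # 19 participants
remaining = recruited - n_judgment - n_ldt     # 89 participants
assert remaining == sum(final_n.values())

# Criterion (3) removes single observations, not participants.
n_items = 24
total_responses = remaining * n_items          # 2,136 responses
known_before = 37
pct_known = 100 * known_before / total_responses
print(round(pct_known, 2))  # 1.73
```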
When scoring the four recall tests, one point was given for each correctly recalled word. All other responses were treated as errors. The results of the L2 free recall, L1 free recall, L1-to-L2 cued recall, and L2-to-L1 cued recall are shown in Fig. 3. The descriptive statistics of the four recall results are shown in Appendix Table 8.
We fitted logit mixed-effects models to the four sets of recall data separately (L2 free recall, L1 free recall, L1-to-L2 cued recall, L2-to-L1 cued recall) using the lme4 package (Bates et al., 2015) with participants and items as crossed random factors. The models we first fitted to the data had learning condition (semantic, form, control) as the fixed effect and had a by-subject intercept and a by-item intercept and slope of learning condition. When a model did not converge, we simplified the random effects structure step by step until it converged. Based on the predictions of the TOPRA model and the results of previous TOPRA-model-based studies (e.g., Barcroft, 2002, 2003, 2004; Kida, 2020; Kida & Barcroft, 2018), our primary comparison of interest differed according to the type of processing and the type of recall. Therefore, learning condition was dummy coded differently depending on recall type: when the target language was L2 (i.e., L2 free recall and L1-to-L2 cued recall), the form processing condition was the reference level, whereas when the target language was L1 (i.e., L1 free recall and L2-to-L1 cued recall), the semantic processing condition was the reference level. We also checked variance inflation factor (VIF) values for each factor (all values were under 2.0).
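The effect of switching the reference level can be illustrated with a small sketch of dummy (treatment) coding. It is written in plain Python for illustration and does not reproduce the lme4 analysis; the helper `treatment_code` and the dictionary `REFERENCE_BY_TEST` are hypothetical names introduced here.

```python
CONDITIONS = ("semantic", "form", "control")

# Reference level per recall test, as described in the text.
REFERENCE_BY_TEST = {
    "L2 free recall": "form",
    "L1-to-L2 cued recall": "form",
    "L1 free recall": "semantic",
    "L2-to-L1 cued recall": "semantic",
}


def treatment_code(condition, reference):
    """Dummy-code one observation's condition against a reference level.

    Returns one 0/1 indicator per non-reference level. The reference
    level itself is represented by all indicators being 0, so each model
    coefficient compares one condition with the reference condition.
    """
    if condition not in CONDITIONS or reference not in CONDITIONS:
        raise ValueError("unknown learning condition")
    return {
        f"{level}_vs_{reference}": int(condition == level)
        for level in CONDITIONS
        if level != reference
    }


# With form as reference, the two coefficients are semantic vs. form
# and control vs. form.
print(treatment_code("semantic", REFERENCE_BY_TEST["L2 free recall"]))
# With semantic as reference, they are form vs. semantic and
# control vs. semantic.
print(treatment_code("form", REFERENCE_BY_TEST["L1 free recall"]))
```

Under this coding, a model with two dummy variables yields exactly the two contrasts against the reference condition that are reported for each recall test.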
Results for free recall
For L2 free recall, the model included learning condition as a fixed factor and by-subject and by-item intercepts. The results indicated that the differences between semantic processing and form processing (z = −2.14, p = .03) and between form processing and control (z = 2.53, p = .01) were significant. These results indicated that recall in the form processing group was higher than that in the semantic processing group, and that recall in the control group was the highest. Overall results appear in Appendix Table 9.
For L1 free recall, the model included learning condition as a fixed factor, a by-subject intercept, and a by-item intercept and slope of learning condition. The results indicated that the difference between semantic processing and form processing was significant (z = −4.95, p < .01) but that the difference between semantic processing and control was not (z = −0.60, p = .55). These results indicated that, contrary to the results for L2 free recall, recall in the semantic processing group was higher than that in the form processing group. Overall results appear in Appendix Table 10.
Results for cued recall
The same model construction procedure was carried out for the cued recall results. For L1-to-L2 cued recall, the model included learning condition as the fixed factor, a by-subject intercept, and a by-item intercept and slope of learning condition. The results indicated that the difference was significant between semantic processing and form processing (z = −2.49, p = .01) and between form processing and control (z = 3.38, p < .01). These results indicated that, as in the results for L2 free recall, recall in the form processing group was higher than that in the semantic processing group, but that recall in the control group was the highest. Overall results appear in Appendix Table 11.
Finally, for L2-to-L1 cued recall, the model included learning condition as the fixed factor, a by-subject intercept, and a by-item intercept and slope of learning condition. The results indicated that the difference between semantic processing and form processing was not significant (z = −0.13, p > .90) but that the difference between semantic processing and control was significant (z = 3.50, p < .01). These results indicated that recall in the control group was higher than that in the semantic and form processing groups. Overall results appear in Appendix Table 12.
In sum, the results of the two free recall tests indicated that (a) the form condition outperformed the semantic condition for L2 free recall whereas (b) the semantic condition was better than the form condition for L1 free recall. The results of the two cued recall tests indicated that (c) the form condition outperformed the semantic condition in L1-to-L2 cued recall, whereas (d) no significant difference was found between the semantic and form conditions in L2-to-L1 cued recall. These results generally confirmed that participants in the semantic processing and the form processing tasks engaged in the appropriate type of processing during the learning phase.
Masked form priming lexical decision task
Results of reaction time
The descriptive statistics of raw RT data for the results of the masked form priming LDT appear in Table 5.
Analyses were conducted only for the correct responses on word trials. Because the data in Experiment 2 were more strongly right-skewed than those in Experiment 1, we applied an inverse transformation to the RTs (−1,000/RT) to meet the distributional assumption of LME (Kezilas et al., 2017; Nakayama et al., 2016) before the data analyses reported hereafter. Because we transformed the RT data, we did not exclude outliers in Experiment 2. Fixed factors were learning condition (semantic, form, control), prime type (related, unrelated), and their interaction. Regarding learning condition, the control condition was set as the reference level. Prime type was contrast coded (related = 0.5, unrelated = −0.5). The random effects structure comprised a by-subject intercept and slope of prime type, and a by-item intercept and slope of the interaction between learning condition and prime type. We also checked VIF values for each factor (all values were under 4.0).
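The transformation and coding scheme described above can be sketched as follows. This is a minimal illustration with made-up RT values, not the authors' analysis script; the names `inverse_rt` and `PRIME_CODE` are hypothetical.

```python
def inverse_rt(rt_ms):
    """Inverse-transform a reaction time in milliseconds as -1000 / RT.

    The negative sign preserves the ordering of the raw data: slower
    responses still receive larger (less negative) values.
    """
    if rt_ms <= 0:
        raise ValueError("RT must be positive")
    return -1000.0 / rt_ms


# Contrast coding of prime type as stated in the text.
PRIME_CODE = {"related": 0.5, "unrelated": -0.5}

# Illustrative RTs (not data from the study).
fast, medium, slow = inverse_rt(500), inverse_rt(1000), inverse_rt(2000)
print(fast, medium, slow)  # -2.0 -1.0 -0.5

# The 1000 ms gap in the slow tail shrinks to 0.5 units, while the
# 500 ms gap among fast responses spans 1.0 unit: long RTs are
# compressed, which reduces right skew.
print(slow - medium, medium - fast)  # 0.5 1.0
```

Because the two prime-type codes are centered on zero and one unit apart, the prime-type coefficient estimates the related-minus-unrelated difference on the transformed scale, evaluated at the reference level of learning condition when the interaction is in the model.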
The results indicated that the interaction between learning condition and prime type was not significant (p > .10). Learning condition was not significant either for the difference between semantic processing and control (t = 1.02, p = .31) or for the difference between form processing and control (t = 0.35, p = .73). Additionally, contrary to the results of Experiment 1, prime type was not significant (t = 1.12, p = .27). Overall results appear in Appendix Table 13.
As in Experiment 1, we conducted simple comparisons between the related and unrelated prime conditions in each learning group. The results indicated that prime type was not significant in any of the learning groups (z = 0.62, p = .53 for the semantic processing group, z = 1.12, p = .26 for the form processing group, and z = −1.12, p = .26 for the control group). These results indicate that, in contrast to the results of Experiment 1, as a result of novel word training during the learning phase, the significant facilitative priming effect disappeared in Experiment 2.
Error rates for word targets were also analyzed using the lme4 and lmerTest packages. The model structure was the same as that for the RT data, but the model failed to converge. As in Experiment 1, we therefore changed the optimizer to bobyqa to avoid convergence failure. We also checked VIF values for each factor (all values were under 3.0). The results indicated that the interaction between learning condition and prime type was not significant (p > .20). Learning condition was not significant either for the difference between semantic processing and control (z = 1.30, p = .20) or for the difference between form processing and control (z = 0.17, p = .87). Prime type was also not significant (z = −1.64, p = .10). The overall results appear in Appendix Table 14.
These results can be discussed from two perspectives: (a) acquisition of word knowledge for to-be-learned words (lexical configuration) and (b) lexicalization of these words (lexical engagement). In terms of lexical configuration (L2 and L1 free recall, and L1-to-L2 and L2-to-L1 cued recall), the results revealed that the semantic processing group outperformed the form processing group on the L1 recall test, while the opposite results were obtained for the L2 recall test. This double dissociation in the effects of increased semantic versus increased form processing, a pattern also demonstrated by Barcroft (2002), is fully consistent with the predictions of the TOPRA model. In addition, the results of the recall tests in the present study revealed that performance of the control group was the highest except in the case of L1 free recall. One possible reason for this result is that the participants in the control group did not have any specific processing task during the learning phase. Therefore, the blank screen presented in the judgment phase of the learning session (see Fig. 2) allowed them to rehearse the to-be-learned words and their translations more than was possible in the other two conditions. Participants in the semantic processing group and the form processing group, on the other hand, performed an additional task and had less opportunity to rehearse the to-be-learned words and their translations.
As for lexical engagement (masked form priming LDT), the mixed-effects models indicated that there was no significant two-way interaction between learning condition and prime type. Simple comparisons indicated no significant priming in any condition. It is interesting to note that, although the effect was not significant, only the control condition showed numerically inhibitory priming. Consistent with these findings, in the original Barcroft (2002) study that included pleasantness ratings (semantic) versus letter counting (formal) versus control (no task), the control condition led to the highest level of vocabulary learning, a finding that, as Barcroft noted, should be interpreted with caution given that no task at all was performed in the control condition, allowing learners simply to attend to the novel words as input more than in the other conditions. During the additional “no-task” time in the control condition, participants could rehearse the L2 words more, whereas in the other two conditions they were asked to perform specific tasks that apparently interfered with their lexical input processing. These points being made, the present findings are consistent with the pattern observed in the 2002 study but add to it by suggesting that higher levels of vocabulary learning seem to co-occur with increased levels of lexicalization. It is for this reason that patterns consistent with more lexicalization were highest, at least numerically, in the control condition in the present study. Contrary to the findings of some previous studies (e.g., Nakayama & Lupker, 2018; Qiao & Forster, 2017), this experiment demonstrated (a) the PLE in L2 learners functioning in L2 (at least partially) and, moreover, (b) the possibility that lexical competition can operate in L2 learners whose L1 script is different from that of their L2. Possible reasons for these findings are discussed further in the next section.