Introduction

It has long been known that lexical knowledge plays a pivotal role in developing ESL/EFL reading proficiency as an indispensable variable that positively affects students’ academic reading (Cheng and Matthews, 2018; McLean et al., 2020; Nation, 2013; Qian, 2002). However, mental vocabulary is complicated, and the fundamental nature of vocabulary knowledge has often puzzled language teachers and researchers (Schmitt, 2014). Therefore, vocabulary experts have developed numerous frameworks mainly comprising the differences between vocabulary breadth or quantity (the number of words learners know) and vocabulary depth or quality (the extent to which learners know these words) (Anderson and Freebody, 1981; Nation, 2013). These vocabulary frameworks have been widely applied to language proficiency development research, especially reading proficiency (Cheng and Matthews, 2018; Moinzadeh and Moslehpour, 2012; Qian, 2002; Qian and Lin, 2020; Stæhr, 2008). However, relative to these studies investigating the role of receptive word knowledge in reading ability, research on how productive vocabulary knowledge relates to reading proficiency is scarce, especially when taking both receptive and productive word knowledge into account concurrently. Furthermore, vocabulary fluency is another critical dimension of vocabulary knowledge (Kremmel and Schmitt, 2016; Nation, 2013; Schmitt and Schmitt, 2014). However, far less research considers vocabulary fluency as a research variable in exploring the link between word knowledge and reading proficiency.

Additionally, because Chinese and English belong to different language families, it is difficult for Chinese ESL/EFL learners to develop their L2 word knowledge and improve their reading ability. Although Chinese university students spend much time memorizing new words from dictionaries or textbook word banks, they often cannot recognize and understand what these words mean when reading in context. The main reason is likely that these students do not know what vocabulary knowledge truly involves and lack a purposeful method for learning words to develop their reading skills. Therefore, it is necessary to explore an effective method that can help ESL/EFL learners in China and elsewhere develop their word knowledge and improve their reading proficiency.

Literature review

L2 vocabulary knowledge

Vocabulary knowledge is complicated and multifaceted. In attempting to clarify the complicated nature of word knowledge, scholars have proposed several theoretical concepts (Anderson and Freebody, 1981; Cronbach, 1942; Henriksen, 1999; Laufer and Goldstein, 2004; Nation, 2013; Paribakht and Wesche, 1993; Qian, 2002; Qian and Lin, 2020; Schmitt, 2014; Wesche and Paribakht, 1996). These concepts have typically been divided into two main approaches: the developmental approach and the dimensional approach (Read, 2000; Yanagisawa and Webb, 2020).

The developmental approach, also called the cumulative approach, presents vocabulary knowledge as a process of cumulative development. Typical research employing the developmental approach uses scales that signal the developmental stage of word knowledge, such as the vocabulary knowledge scale (VKS) (Paribakht and Wesche, 1993; Wesche and Paribakht, 1996). The VKS presented below was developed to capture the developmental stages of lexical knowledge, ranging from no knowledge to fully developed knowledge.

1. I do not remember having seen this word before.

2. I have seen this word before, but I do not know what it means.

3. I have seen this word before, and I think it means______. (Synonym or translation).

4. I know this word. It means_______. (Synonym or translation).

5. I can use this word in the sentence: ________. (If you complete this section, please also complete Section IV).

The five divisions of this scale represent how well test takers know each word, and the scale has been widely used in lexical research (Read, 2000; Schmitt, 2010). However, it has been criticized for two main reasons: first, the VKS assumes that the difference in difficulty between adjacent divisions of the scale is equal; second, it assumes that word knowledge improves linearly from one division to the next (Yanagisawa and Webb, 2020). In practice, different degrees of difficulty might be involved in the development process, and linear development may not always occur. For example, a learner who does not know the meaning of a word may still be able to use it with accurate grammar. Nevertheless, despite these apparent drawbacks, the developmental approach remains meaningful for assessing vocabulary acquisition because learning is a gradual process from unfamiliarity to familiarity.

The dimensional approach, also called the components approach, divides word knowledge into multiple dimensions or components. Since Cronbach (1942) first proposed five word-knowledge dimensions, many other researchers have developed the dimensional approach further (Anderson and Freebody, 1981; Henriksen, 1999; Laufer and Goldstein, 2004; Nation, 2013; Qian, 2002; Qian and Lin, 2020; Schmitt, 2014). Taken together, their approaches cover three main dimensions: breadth, depth, and fluency. Nation (2013) has described the most complete and widely cited theoretical framework of word knowledge, which classifies lexical knowledge into three dimensions: word form, word meaning, and word use. Each dimension includes three aspects of knowledge, and each aspect is divided into receptive and productive proficiency (Table 1). Receptive knowledge involves knowing words when they are encountered during reading or listening, whereas productive knowledge involves using words in speech or writing. Receptive and productive knowledge have been noted to lie on a continuum of levels of knowledge, beginning with unfamiliarity with a word and ending with the ability to use the word freely in language use.

Table 1 Word knowledge dimensions.

Nation’s classification of word knowledge is comprehensive because it includes almost all aspects of lexical knowledge; thus, many researchers have adopted his dimensional framework to investigate L2 word knowledge (Cheng and Matthews, 2018; Li and Kirby, 2015; McLean et al., 2020). However, it is difficult for language teachers to apply all the aspects to vocabulary teaching and for researchers to use all of them in one study. Therefore, researchers have typically focused on only certain aspects in their studies.

In the dimensional approach, it is essential to distinguish vocabulary breadth and depth of knowledge (Qian, 2002; Qian and Lin, 2020; Schmitt, 2014). Qian (2002) proposed four dimensions of word knowledge: breadth, depth, organization, and receptivity–productivity. According to Qian (2002), vocabulary breadth is the quantity of vocabulary in learners’ mental lexicon. It typically represents the form-meaning link because it counts learned word items. At the same time, vocabulary depth is the extent to which learners know words, described by all vocabulary features, such as phonetics, morphology, syntax, semantics, and collocations. Qian noted that it is essential for language learners to command both the breadth and depth of word knowledge and to reach their goals of vocabulary learning from receptivity to productivity. Schmitt (2014) not only stressed the importance of the three dimensions of word knowledge covering vocabulary breadth, vocabulary depth, and vocabulary fluency but also pointed out that vocabulary breadth tests focus on explicit aspects such as the form-meaning link. In contrast, vocabulary depth tests mainly focus on implied collocation and association aspects. Hence, designing appropriate lexical breadth and depth measurement instruments is meaningful based on the research aims and other related factors. Qian and Lin (2020) stressed the importance of vocabulary breadth and depth combined with receptive and productive vocabulary. The authors noted that receptive vocabulary forms the breadth and depth of a learner’s vocabulary in the mental lexicon; receptive vocabulary breadth and depth knowledge must be further developed into productive vocabulary breadth and depth knowledge for communication. Based on Qian and Lin, the four dimensions of receptive vocabulary breadth, receptive vocabulary depth, productive vocabulary breadth, and productive vocabulary depth are closely related to language proficiency. 
Accordingly, many researchers have focused on the link between word knowledge and language proficiency, especially reading. However, this research has largely been limited to the role of receptive lexical knowledge in reading. To our knowledge, few scholars other than Cheng and Matthews (2018) have explored the role of productive word knowledge in reading. Cheng and Matthews found that productive vocabulary breadth plays the most significant role in reading, which prompted our interest in exploring whether receptive or productive lexical knowledge is more critical for improving reading proficiency.

In addition to vocabulary breadth and depth, vocabulary fluency is another critical dimension of word knowledge (Nation, 2020; Qian and Lin, 2020; Schmitt, 2014). Nation (2020) reaffirmed the importance of fluency, which accounts for one-third of a well-balanced lesson, having equal status with meaning-input (receptive knowledge) and meaning-output (productive knowledge). Hence, language teachers should pay more attention to strengthening the fluency development of learners because developing fluency may improve the automaticity of vocabulary use; therefore, vocabulary fluency development is supposed to be the learners’ ultimate goal (Qian and Lin, 2020; Schmitt, 2014). Li and Zhang (2019) acknowledged that lexical fluency determines how fast learners can process words when using language. Here, vocabulary fluency is defined as the speed of recognizing, retrieving, and producing target words in language use.

Vocabulary fluency has been acknowledged as an essential dimension of vocabulary knowledge (Schmitt, 2014; Nation, 2020). Schmitt (2014) notes that words must be processed quickly to communicate smoothly; that is, both speakers and listeners must recognize words with sufficient speed. Nation (2020) holds that fluency development should account for one-third of a balanced lesson, alongside meaning input and meaning output. However, only a handful of scholars have considered it a research variable influencing language proficiency. Li and Zhang (2019) investigated the connection between three dimensions of lexical knowledge (breadth, depth, and fluency) and listening proficiency and found that vocabulary breadth is most important for L2 listening (β = 0.36), followed by vocabulary depth (β = 0.17) and vocabulary fluency (β = −0.22). The authors concluded that all three dimensions significantly predict L2 listening, even though the β value for vocabulary fluency is negative, which raises the question of whether vocabulary fluency is appropriate as an independent variable. As mentioned before, vocabulary learning starts from unfamiliarity and ends with familiarity: learners first acquire breadth knowledge, then gradually develop depth knowledge, and finally, through practice, produce words fluently in language use. In other words, vocabulary breadth and depth belong to knowledge, whereas fluency is a competence. Therefore, fluency may play a facilitating role in the development process from receptivity to productivity. By analogy, even with skillful techniques, a cook cannot make delicious food without the ingredients. To borrow an old Chinese saying, the role of vocabulary fluency is “the icing on the cake” rather than “a gift in the snow”.
Hence, this study treats vocabulary fluency as a moderator variable to determine whether it strengthens the effect of the fundamental dimensions of word knowledge (lexical breadth and depth) on reading proficiency. However, to avoid Type I errors, we first examine the effect of vocabulary fluency as an independent variable.

Vocabulary and reading

Over the past two decades, many scholars have examined the link between lexical knowledge and reading ability (Cheng and Matthews, 2018; McLean et al., 2020; Jeon and Yamashita, 2014; Qian, 2002; Stæhr, 2008; Zhang and Zhang, 2020). Qian (2002) investigated the effect of receptive breadth and depth of word knowledge on reading, adopting the Vocabulary Levels Test (VLT) (Nation, 1983) to evaluate receptive vocabulary breadth and using the Depth of Vocabulary Knowledge (DVK) measure, derived from the WAT developed by Read (1993, 1995), to assess receptive vocabulary depth in terms of synonymy, polysemy, and collocation. Qian found that the Pearson correlation with reading was r = 0.77 for receptive vocabulary depth and r = 0.74 for receptive vocabulary breadth, indicating that the DVK relates more strongly to TOEFL reading scores than the VLT does. Stæhr (2008) examined the correlations between receptive vocabulary breadth and listening, reading, and writing in EFL using the VLT version designed by Nation (1983) and improved by Schmitt et al. (2001). He found that the Pearson correlation between receptive lexical breadth and reading was r = 0.83, higher than the correlations for writing and listening, indicating that developing receptive vocabulary breadth is most critical for reading. Cheng and Matthews (2018) examined the link between L2 word knowledge, comprising receptive/productive vocabulary breadth and productive phonological vocabulary knowledge, and L2 reading by using the VLT (Schmitt et al., 2001), the controlled-production vocabulary levels test (Laufer and Nation, 1999), and partial dictation (Matthews and Cheng, 2015). They found that productive vocabulary breadth has the strongest correlation with L2 reading (r = 0.57). This finding prompted strong interest in investigating which dimension of L2 word knowledge is most important for reading ability, given that reading is a receptive language skill.
Through a meta-analysis, Jeon and Yamashita (2014) investigated the overall average correlations of vocabulary knowledge and grammar knowledge with reading. They found that L2 grammar knowledge had a stronger relationship with reading (r = 0.85) than L2 lexical knowledge did (r = 0.79). However, vocabulary competence is usually regarded as more essential than syntactic knowledge in communication (Laufer and Goldstein, 2004; Qian and Lin, 2020). Zhang and Zhang (2020) also examined the link between L2 lexical knowledge and reading through a meta-analysis. They found that the overall average correlation between vocabulary knowledge (VK) and L2 reading was r = 0.57, which is 0.22 lower than the corresponding value in Jeon and Yamashita’s meta-analysis. The results of the two meta-analyses differ considerably, possibly because of different research samples; therefore, further empirical work is needed to ascertain the role of lexical knowledge in reading proficiency. McLean et al. (2020) explored the predictive strength of various test formats of lexical knowledge, covering recall and recognition tests. They found that written form-recall tests and written meaning-recall tests had a stronger connection with reading ability than written meaning-recognition tests, similar to the finding of Laufer and Goldstein (2004).

Overall, this section has reviewed research addressing the role of L2 lexical knowledge in reading ability. However, few studies have drawn attention to the effect of productive lexical knowledge on reading proficiency, probably because researchers implicitly regard reading as a receptive skill. Furthermore, there is still some uncertainty regarding whether receptive or productive word knowledge better predicts reading ability. Additionally, since L2 vocabulary knowledge has multiple dimensions, it is meaningful to clarify the various roles of these dimensions in reading proficiency. Last, most of the above research adopted written meaning-recognition tests rather than written form-recall or written meaning-recall tests, even though the recall formats have been empirically shown not to inflate estimates of test takers’ vocabulary size.

Vocabulary test

As noted above, vocabulary knowledge is complicated and multifaceted. In attempting to investigate the degree to which learners know words, scholars have developed vocabulary test instruments such as the VKS (Paribakht and Wesche, 1993; Wesche and Paribakht, 1996), VLT (Nation, 1990; Schmitt et al., 2001), and VST (Nation and Beglar, 2007) for measuring learners’ receptive vocabulary size/breadth; the WAF (Read, 1993) and WAT (Read, 1998) for checking learners’ receptive vocabulary depth; and the PLT (Laufer and Nation, 1999) for testing learners’ productive vocabulary size/breadth. In addition, many researchers have designed modalities to measure vocabulary depth according to their research purposes, such as collocation, association, and polysemy. These scales have been very popular vocabulary research tools for researchers and language teachers, and have made the field of vocabulary research fruitful. Over time, however, the most prominent vocabulary tests have been questioned by scholars.

Size/breadth tests have been applied for many research objectives, ranging from serving as a proxy for general language ability to diagnosing whether learners have sufficient lexical resources for developing the four language skills. Kremmel and Schmitt (2016) acknowledged that, on a technical level, the VLT and VST are receptive vocabulary tests in multiple-choice format rather than tests of the practical application of the language. In other words, multiple matching (VLT) and four-option multiple-choice (VST) tasks test passive meaning recognition, which cannot reflect learners’ actual vocabulary size. Laufer and Goldstein (2004) differentiated four vocabulary size test formats and reported that passive recall predicted language proficiency in the classroom. McLean et al. (2020), noting that the VLT and VST may exaggerate estimates of learners’ vocabulary size, investigated how well various test formats of lexical knowledge predict L2 reading and concluded that written meaning-recall tests best indicate reading proficiency, which is congruent with the findings of Kremmel and Schmitt (2016). Stoeckel et al. (2021) and Stewart et al. (2021) asserted that written meaning-recall is the preferred format for tests measuring vocabulary knowledge for reading because, all else being equal, recall items for the same words demonstrate superior quality.

Likewise, vocabulary depth is related to the degree of word knowledge. Hence, its measurement should reveal how well learners know words. Vocabulary depth tests are valuable since they may represent the degree to which learners can use words freely. Read’s WAF/WAT (1993, 1998) has long been a popular tool for measuring vocabulary depth. However, the WAT is regarded as insufficient for representing test takers’ depth of vocabulary knowledge because it involves only collocation and multiple meanings (Yanagisawa and Webb, 2020).

Although previous vocabulary measurement tools covering breadth and depth have been questioned, no uniform test format has emerged. Scholars also suggest that the selection of the test format should conform to the objective of the test and should depend on the type of information that the test designers want to capture. In conclusion, the preceding review shows that most researchers focus on the connection between receptive lexical knowledge and reading ability, and few pay attention to productive lexical knowledge, especially productive vocabulary depth. Furthermore, previous studies have mainly applied outdated measurement instruments. Additionally, the lack of empirical research into the effect of vocabulary fluency is surprising because fluency is a crucial dimension for proficient language learners. In this study, we address this research gap by ascertaining the effect of the four dimensions of L2 word knowledge on reading proficiency and the moderating role of vocabulary fluency.

Conceptual framework

According to the dimensional approach and research aims mentioned above, the conceptual framework of lexical knowledge for the present study is shown in Fig. 1.

Fig. 1: Conceptual framework.

Rec receptive, Pro productive, VK vocabulary knowledge, VF vocabulary fluency, LP language proficiency, RP reading proficiency.

As shown in Fig. 1, the left side of the figure presents the four dimensions of L2 vocabulary knowledge: receptive vocabulary breadth/depth and productive vocabulary breadth/depth. The right side presents reading proficiency, a receptive language skill and one of the four language proficiencies. The double-headed arrow represents the link between L2 word knowledge and language proficiency (reading). Vocabulary fluency (VF), in the middle, plays a moderating role in the relationship between the two variables.

Questions for the research

In the present study, we ascertain the correlational and predictive relations between the four dimensions of vocabulary knowledge and L2 reading proficiency, and we examine the moderating role of lexical fluency in the connection between each dimension of L2 word knowledge and reading ability. A quantitative research method is adopted to analyze the link between the four dimensions of L2 lexical knowledge and reading ability based on the conceptual framework in Fig. 1.

According to the conceptual framework, the following three research questions are presented:

RQ1. How important are the four dimensions of L2 lexical knowledge for reading proficiency?

RQ2. To what extent can reading proficiency be predicted by L2 lexical knowledge?

RQ3. Is vocabulary fluency more effective as a moderator or independent variable?

Methodology

Participants

The participants in this study were recruited in two steps. The first step was purposive sampling: a minimum sample size of 300 was targeted at the university where the researcher worked, in line with sample size requirements for structural equation modeling (SEM) (MacCallum et al., 1996). The second step was simple random sampling. Six classes with 52 students per class were randomly selected from six majors: Chinese, preschool education, business, computer, biology, and chemistry. This yielded a total of 312 Chinese university students at one local university in China. Their average age was 20.5 years, and they had no overseas learning or life experiences. Note that the purpose of randomly selecting six majors was not to compare the majors but to compensate for the limitation of having only a single research site.

Instruments

Numerous researchers have observed that written form-recall tests and written meaning-recall tests used for measuring vocabulary breadth and depth correlate better with reading ability than meaning-recognition tests (González-Fernández and Schmitt, 2019; Laufer and Goldstein, 2004; McLean et al., 2020). Meaning-recognition tests, such as the VST and VLT, are suspected of allowing participants to guess words blindly (Kremmel and Schmitt, 2016). Therefore, we tried to avoid these pitfalls when designing the test in the present study.

Receptive vocabulary breadth test (VLT)

McLean et al. (2020) state that written meaning-recall tests require test takers to write the L1 meaning of a provided L2 word form. In this study, the VLT (a meaning-recognition test) developed by Schmitt et al. (2001) was adapted to the meaning-recall format, a receptive recall format also recognized by Laufer and Goldstein (2004). It includes the 2000, 3000, 5000, 10,000, and academic vocabulary levels, with 30 items per level. The target word is presented in an English sentence, and test takers translate it into Chinese. An example is presented below:

That impudent boy stuck his tongue out at me. _____________.

(Please translate the bold and underlined word into Chinese).

Receptive vocabulary depth test (VDT)

According to Yanagisawa and Webb (2020), it is infeasible to measure all aspects of vocabulary depth; hence, the word-knowledge components to be measured should be chosen based on the testing aims. In this study, the WAT designed by Read (1998) was adapted into a testing instrument covering word parts, multiple meanings, and collocations to measure receptive vocabulary depth, because these aspects are the most fundamental prerequisite knowledge in reading for Chinese learners. An example is shown below.

Sudden (adj.)

Word parts: noun: ________

Word parts: adverb: __________

Multiple meanings: s_____________(意外)

Collocation: _____________ (突变)

Productive vocabulary breadth test (PLT)

The written form-recall format was adopted for the PLT; it requires test takers to produce an L2 form for a provided L1 meaning. The PLT developed by Laufer and Nation (1999) was adapted to the productive recall format described by Laufer and Goldstein (2004). It includes the 2000, 3000, 5000, 10,000, and academic vocabulary levels, with 18 items per level. The target word is presented in Chinese, and test takers translate it into English. An example is shown below.

战士们宣誓效忠国家。Swear an ________

(Please write one word according to the bold and underlined Chinese phrase).

Productive vocabulary depth test (PVDT)

The Definition Completion Test suggested by Read (1995) was used to measure productive vocabulary depth. Test takers were asked to define each given word and to use it in a sentence, so that word parts, association, collocation, and sentence structure could be checked to assess their vocabulary ability thoroughly. The 20 target words were selected from IELTS reading materials and comprised five each of nouns, verbs, adjectives, and adverbs. An example is shown below.

advent

Definition:

Example:

Vocabulary fluency test (VFT)

A dictation test was used to measure fluency across the phonetic, morphological, pragmatic, and semantic dimensions of vocabulary. Test takers had to supply the missing word or phrase within a time limit while listening to a passage drawn from IELTS listening lecture materials. The test examines the speed of recognizing and retrieving vocabulary; that is, it reflects test takers’ actual ability to move from receptive aural form recognition to meaning recognition and then to productive written form recall under the fleeting conditions of listening.

e.g., So, welcome to your introductory geography__________. We will begin with some basics. First, what do we learn by studying__________?

Test for reading proficiency

IELTS academic reading was used as the testing tool for this research. The test consisted of 40 questions on three texts selected from periodicals, books, magazines, newspapers, etc. At least one of the texts involved detailed logical argument. The question types included multiple choice, short answer, sentence completion, note/summary/flow-chart completion, classification, and matching, as well as choosing suitable paragraph headings from a set of options; together, these tasks assess understanding of the author’s views or opinions and of the content of the texts. For example,

Questions 1–5

Passage 1 has five paragraphs, A, B, C, D, and E. Write the letter represented by the paragraph that matches the information below.

1. Examples of wildlife other than bats that do not rely on vision to navigate ________

Procedure

Data collection

Data were collected through vocabulary and reading tests administered in pen-and-paper format in three lecture halls. The four vocabulary tests took 35 min, and the vocabulary fluency test took 25 min. Finally, the reading test was completed in one hour.

Scoring

The VLT has five levels with 30 items each. One point is awarded for one correct item; thus, the maximum score of the VLT is 150 points.

Each VDT item comprises two parts of speech (noun, verb, adjective, or adverb), one additional meaning (to prevent test takers from writing arbitrarily, the initial letter is provided), and one collocation (the meaning of the phrase is given). Two points are allocated for the two parts of speech, 1 point for the multiple meaning, and 1 point for the collocation; thus, the maximum score of the VDT is 160 points.

The PLT has five levels with 18 items each. One point is awarded for each correct item; thus, the maximum score of the PLT is 90 points.

The PVDT consists of 20 target words, and the participants are asked to explain each presented word and to make a sentence using it. The maximum score is 80 points: each word carries 2 points for explaining the meaning and 2 points for making a sentence. For the definition, 1 point is awarded for proper sentence structure, and 1 point is awarded for an appropriate explanation. For the sentence, 1 point is awarded for proper sentence structure, and 1 point is awarded for a correct collocation or phrase. Half a point is deducted for a misspelling.

The VFT awards 1 point per correct target word. It comprises four passages with 20 missing words in each passage; thus, the maximum score is 80 points.

For the reading test, the maximum score is 100 points: each of the 40 questions is worth 2.5 points, scored against the standard answers provided.
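As an illustration of the arithmetic behind these maxima, the following minimal Python sketch recomputes them from the item counts given above. The function and variable names are hypothetical and are not part of the study's materials. The number of VDT items is not stated in the text, so the value computed for it is only what the stated maximum of 160 points would imply.

```python
# Maximum scores implied by the scoring rules described above.
# Names and layout are illustrative only, not taken from the study.

def max_score(units: int, items_per_unit: int, points_per_item: float) -> float:
    """Return the maximum attainable score for a test."""
    return units * items_per_unit * points_per_item

MAX_SCORES = {
    "VLT": max_score(5, 30, 1),        # five levels, 30 items each, 1 point per item
    "PLT": max_score(5, 18, 1),        # five levels, 18 items each, 1 point per item
    "PVDT": max_score(1, 20, 4),       # 20 target words, 2 + 2 points per word
    "VFT": max_score(4, 20, 1),        # four passages, 20 gaps each, 1 point per gap
    "Reading": max_score(1, 40, 2.5),  # 40 questions, 2.5 points each
}

# Assumption: the VDT item count is not reported. With 2 + 1 + 1 = 4 points
# per item, a maximum of 160 points would imply 160 / 4 = 40 items.
VDT_POINTS_PER_ITEM = 2 + 1 + 1
VDT_IMPLIED_ITEMS = 160 // VDT_POINTS_PER_ITEM

if __name__ == "__main__":
    for test, score in MAX_SCORES.items():
        print(f"{test}: {score:g}")
    print(f"VDT implied items: {VDT_IMPLIED_ITEMS}")
```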

Data analysis

In this research, the four dimensions of L2 vocabulary knowledge (receptive vocabulary breadth, receptive vocabulary depth, productive vocabulary breadth, and productive vocabulary depth) are used as the independent variables; reading proficiency is used as the dependent variable; and vocabulary fluency is used as the moderator variable. We used AMOS 24.0 to analyze the collected data.
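For readers less familiar with moderation, one common way to express a moderation hypothesis is a regression with a product term. The specification below is a generic sketch, not necessarily the model estimated in AMOS for this study. The abbreviations RP, VK, and VF follow Fig. 1, whereas the coefficients \beta_0 to \beta_3, the error term \varepsilon, and the use of a single VK predictor are assumptions made for illustration.

```latex
\[
RP = \beta_0 + \beta_1\,VK + \beta_2\,VF + \beta_3\,(VK \times VF) + \varepsilon
\]
```

Under this specification, a nonzero \beta_3 would indicate that the effect of VK on RP depends on the level of VF, whereas \beta_2 alone captures VF as an independent predictor.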

Results

Composite reliability, convergent and discriminant validity

Confirmatory factor analyses (CFA) were conducted to check all the measurement models. First, almost all the factor loadings were higher than 0.7 (see Table 2). The composite reliability (CR) values were over 0.7, as suggested by scholars (Hair et al., 1997). Additionally, convergent validity, evaluated by the average variance extracted (AVE), was supported, with all AVE values above 0.5 (Fornell and Larcker, 1981). Hence, all data had good reliability and validity.
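To make the two indices concrete, the sketch below computes CR and AVE from standardized factor loadings using the conventional formulas, assuming uncorrelated measurement errors. The loadings are hypothetical and are not the values in Table 2; the function names are likewise illustrative.

```python
from typing import Sequence

def composite_reliability(loadings: Sequence[float]) -> float:
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    # For a standardized indicator, the error variance is 1 - loading^2.
    error = sum(1 - l ** 2 for l in loadings)
    return total ** 2 / (total ** 2 + error)

def average_variance_extracted(loadings: Sequence[float]) -> float:
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardized loadings for one latent variable (not from Table 2).
example_loadings = [0.7, 0.8, 0.9]

if __name__ == "__main__":
    print(round(composite_reliability(example_loadings), 3))
    print(round(average_variance_extracted(example_loadings), 3))
```

With these hypothetical loadings, both indices exceed the thresholds cited above (0.7 for CR and 0.5 for AVE).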

Table 2 Reliability and Validity.

In Table 3, discriminant validity is assessed using the square root of the AVE: all correlations between pairs of variables are lower than the AVE square roots shown in bold. Therefore, it is reasonable to believe that the study has good discriminant validity. Furthermore, the Pearson correlation coefficients between all vocabulary and reading variables are presented. Importantly, all correlation coefficients between independent variables are lower than r = 0.75, indicating that collinearity was absent in the model.

Table 3 Discriminant validity.

Table 3 shows not only discriminant validity but also the correlation between independent variables, all lower than r = 0.75, indicating that multicollinearity does not exist.
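The comparison described for Table 3 can be sketched as a simple check of the Fornell and Larcker (1981) criterion: for each pair of constructs, the correlation should be smaller than the square root of each construct's AVE. The AVE and correlation values below are hypothetical placeholders, not the values in Table 3, and the function name is illustrative.

```python
import math
from typing import Dict, Tuple

def fornell_larcker_ok(ave: Dict[str, float],
                       corr: Dict[Tuple[str, str], float]) -> bool:
    """Return True if, for every pair, |r| is below sqrt(AVE) of both constructs."""
    for (a, b), r in corr.items():
        if abs(r) >= min(math.sqrt(ave[a]), math.sqrt(ave[b])):
            return False
    return True

# Hypothetical values for two constructs (not taken from Table 3).
example_ave = {"VLT": 0.60, "VDT": 0.55}
example_corr = {("VLT", "VDT"): 0.50}

if __name__ == "__main__":
    print(fornell_larcker_ok(example_ave, example_corr))
```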

Pearson correlation analysis

The Pearson correlation coefficients between the observational and the latent variables are presented in Table 4.

Table 4 Pearson correlations between variables and reading.

In Table 4, the VDT has the strongest correlation with reading (r = 0.513, p < 0.001); the VLT ranks second (r = 0.480, p < 0.001); and the correlations of productive vocabulary breadth and depth with reading are r = 0.356 (p < 0.001) and r = 0.364 (p < 0.001), respectively. The VFT has the lowest correlation with reading proficiency (r = 0.275, p < 0.001).
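As a rough plausibility check on the reported significance levels, the sketch below converts each reported correlation into a t statistic using the sample size of 312 from the Participants section. The p values rely on a normal approximation to the t distribution, which is an assumption made here to avoid external libraries; it is not the procedure used in the study.

```python
import math

N = 312  # sample size reported in the Participants section

# Correlations with reading proficiency as reported for Table 4.
REPORTED_R = {"VDT": 0.513, "VLT": 0.480, "PLT": 0.356, "PVDT": 0.364, "VFT": 0.275}

def t_statistic(r: float, n: int) -> float:
    """t = r * sqrt((n - 2) / (1 - r^2)) for testing H0: rho = 0."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def approx_two_tailed_p(t: float) -> float:
    """Normal approximation to the two-tailed p value (reasonable for large df)."""
    return math.erfc(abs(t) / math.sqrt(2))

if __name__ == "__main__":
    for name, r in REPORTED_R.items():
        t = t_statistic(r, N)
        print(f"{name}: r = {r:.3f}, t = {t:.2f}, approx p = {approx_two_tailed_p(t):.2e}")
```

Even the smallest correlation (the VFT) yields a t value of about 5 with 310 degrees of freedom, which is consistent with the reported p < 0.001.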

Although Pearson correlation analysis revealed the degree of correlation between each exogenous latent variable (the four dimensions of L2 vocabulary knowledge) and the endogenous latent variable (reading proficiency), standardized regression analysis is needed to assess the relative importance of the independent variables for the dependent variable. A structural equation model was therefore built based on the conceptual framework to answer the first research question below.

RQ1: How important are the four dimensions of L2 lexical knowledge for reading proficiency?

Structural equation model (SEM)

The objective of the present study was to determine the relationships between the four dimensions of L2 word knowledge and reading proficiency, taking into account the moderation effect of vocabulary fluency. Based on the conceptual framework, an SEM of the link between lexical knowledge and reading proficiency was established. Because it was uncertain whether vocabulary fluency functions better as an independent or a moderator variable, two research models were developed and tested. In Fig. 2, the VFT is included in the model, and the goodness-of-fit indices are as follows: the chi-square to degrees of freedom ratio is 2.095, GFI is 0.878, AGFI is 0.847, CFI is 0.942, TLI is 0.933, RMR is 0.045, and RMSEA is 0.059, which are all acceptable, as suggested by scholars (Fornell and Larcker, 1981). Hence, the research model represents the complete set of hypothesized relationships well. However, as shown in Fig. 2, the standardized regression coefficient of the VFT on reading is only β = −0.02, far smaller than those of the other dimensions of vocabulary knowledge. Thus, although the model fits the data acceptably, it is not appropriate to treat vocabulary fluency as an independent variable (see Fig. 2).

Fig. 2: Structural equation model with VFT.

The model is used to verify the plausibility of the designed research model and to calculate the effect of all independent variables (VLT, VDT, PLT, PVDT, and VFT) on the dependent variable (RP).

Figure 2 shows a research model of the relationship between L2 vocabulary knowledge and reading proficiency, among which vocabulary fluency is treated as an independent variable.

Since vocabulary fluency did not contribute meaningfully as an independent variable, a research model without the VFT was established (Fig. 3). The fit indices are also good: the chi-square divided by degrees of freedom is 2.415, GFI is 0.883, AGFI is 0.849, CFI is 0.940, TLI is 0.930, RMR is 0.067, and RMSEA is 0.045. Among the standardized regression coefficients of the four dimensions of L2 word knowledge on reading ability, the VDT has the highest value (β = 0.38), followed by the VLT (β = 0.33); productive vocabulary breadth and productive vocabulary depth have lower regression values (β = 0.20 and β = 0.19, respectively) (see Fig. 3).

Fig. 3: Structural equation model without VFT.

This is the same research model with one independent variable (VFT) excluded.

Figure 3 shows a research model of the relationship between L2 vocabulary knowledge and reading proficiency, among which vocabulary fluency is not included as an independent variable.

To answer RQ1, a structural equation model was established to explore the effect of the four dimensions of L2 vocabulary knowledge on reading proficiency (see Fig. 3). Among the four dimensions, the standardized regression coefficient of the VDT on reading proficiency is β = 0.38, and those of the other three dimensions, the VLT, the PLT, and the PVDT, are β = 0.33, β = 0.20, and β = 0.19, respectively. Therefore, the four dimensions of L2 lexical knowledge affect reading to different degrees: receptive vocabulary depth (VDT) has the highest regression value and thus plays the most crucial role in reading proficiency, followed by the VLT, whereas the two productive vocabulary dimensions are almost equally important.

Meanwhile, the squared multiple correlation (SMC), that is, R², can be obtained from the standardized regression analysis, which allows the second research question below to be answered.

RQ2: To what extent can reading proficiency be predicted by L2 lexical knowledge?

As presented in Fig. 3, the estimates indicate that the predictors of RP explain 49% of its variance. In other words, the error variance in RP is approximately 51% of the variance in RP itself. The overall predictive strength of the four dimensions of L2 word knowledge for reading proficiency is thus R² = 0.49; that is, overall vocabulary knowledge explains 49% of the variance in reading comprehension scores.
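The relation between explained variance and error variance follows from the definition of the squared multiple correlation. As a brief worked note, taking the 49% of explained variance reported for Fig. 3 at face value, and writing σ²_RP for the total variance of RP and σ²_ε for its error variance (symbols introduced here for illustration only):

```latex
R^{2} = 1 - \frac{\sigma^{2}_{\varepsilon}}{\sigma^{2}_{RP}}
\quad\Longrightarrow\quad
\frac{\sigma^{2}_{\varepsilon}}{\sigma^{2}_{RP}} = 1 - R^{2} = 1 - 0.49 = 0.51
```

Explained and unexplained shares therefore sum to 1, so an explained share of 49% corresponds to an error share of about 51%.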

Since vocabulary fluency showed no meaningful effect as an independent variable, we next tested it as a moderator variable to answer the third research question below.

RQ3: Is vocabulary fluency more effective as a moderator or independent variable?

Moderation effect

A moderator variable affects the strength of the link between the independent and dependent variables. That is, a third variable M influences the connection between variable X and variable Y, so the moderator has an interactive effect with X. Ping’s (1995) single product indicator approach was used to evaluate whether vocabulary fluency influences the relationship between L2 vocabulary knowledge and reading proficiency.
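To make the idea of an interactive effect concrete, the sketch below uses the standard moderated regression form Y = b0 + b_x·X + b_m·M + b_xm·X·M, in which the slope of Y on X depends on the value of M. This is a simplified observed-variable illustration, not the latent-variable procedure used in the study, and the coefficients are hypothetical.

```python
def simple_slope(b_x: float, b_xm: float, m: float) -> float:
    """Slope of Y on X at moderator value m in the model
    Y = b0 + b_x*X + b_m*M + b_xm*X*M."""
    return b_x + b_xm * m


# Hypothetical standardized coefficients (illustrative values only).
B_X = 0.30    # effect of X on Y when M is at its mean (M = 0)
B_XM = 0.10   # interaction coefficient

# Evaluate the X -> Y slope at low, average, and high moderator values.
for label, m in [("low M (-1 SD)", -1.0), ("mean M", 0.0), ("high M (+1 SD)", 1.0)]:
    print(f"{label}: slope of Y on X = {simple_slope(B_X, B_XM, m):.2f}")
```

When the interaction coefficient is zero, the three slopes coincide and M does not moderate the relationship; a significant nonzero interaction coefficient is what the moderation tests below look for.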

The moderating role of vocabulary fluency (VLT vs. RP)

Step 1: We estimate the factor loadings and residual values from the main effect model.

Step 2: The factor loadings and residual values obtained in Step 1 are used to set the corresponding factor loading and residual error of the interaction term. A significant result for the interaction term indicates an interaction (see Fig. 4).

Fig. 4: Main effect model.
This is a main effect model of the relationship of the VLT and VFT with RP.

Figure 4 shows a main effect model of the relationship of the VLT and VFT with reading proficiency, from which all loading and residual values of the VLT and VFT are calculated.

All loading and residual values of the VLT and VFT are entered into Ping’s single-indicator formulas to compute a single loading and a single residual value for the interaction term. In Step 2, these values are set in the model shown in Fig. 5 to test whether the path from the interaction term to the dependent variable is significant (see Fig. 5).
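The computation in this step can be sketched as follows. The formulas are those commonly given for Ping's (1995) single product indicator, assuming mean-centred indicators that are summed within each construct and uncorrelated, normally distributed errors. All numeric inputs are hypothetical and are not taken from Fig. 4 or Fig. 5; the function name is introduced here for illustration.

```python
from typing import Sequence, Tuple


def ping_interaction_parameters(
    loadings_x: Sequence[float], errors_x: Sequence[float], var_x: float,
    loadings_z: Sequence[float], errors_z: Sequence[float], var_z: float,
) -> Tuple[float, float]:
    """Return (loading, error_variance) of the single product indicator x:z.

    loadings_* and errors_* are the unstandardized loadings and error
    variances from the main effect model; var_* are the latent variances.
    """
    lam_x = sum(loadings_x)
    lam_z = sum(loadings_z)
    theta_x = sum(errors_x)
    theta_z = sum(errors_z)

    # Loading of the product indicator on the latent interaction X*Z.
    loading = lam_x * lam_z

    # Error variance of the product indicator.
    error_variance = (
        lam_x ** 2 * var_x * theta_z
        + lam_z ** 2 * var_z * theta_x
        + theta_x * theta_z
    )
    return loading, error_variance


# Hypothetical main-effect estimates for two constructs with two indicators each.
loading, error_variance = ping_interaction_parameters(
    loadings_x=[1.0, 0.9], errors_x=[0.4, 0.5], var_x=1.2,
    loadings_z=[1.0, 0.8], errors_z=[0.3, 0.6], var_z=0.9,
)
print(f"interaction loading = {loading:.4f}, error variance = {error_variance:.4f}")
```

In the second step these two values would be fixed, rather than freely estimated, for the interaction indicator, so that only the structural path from the interaction term to the outcome is tested.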

Fig. 5: Non-linear setting and interaction.

This is a nonlinear setting and interaction model to check the significance of the moderator variable.

Figure 5 is a nonlinear setting and interaction model by which the significance of the moderator variable is verified.

Based on the analysis result shown in Fig. 5, we present the data in Table 5.

Table 5 Moderator.

In Table 5, the p value of the moderator variable (MO) is significant, indicating an interaction; that is, vocabulary fluency moderates the link between receptive lexical breadth and reading ability.

The moderating role of vocabulary fluency (VDT vs. RP)

Step 1: We estimate the factor loadings and residual values from the main effect model.

Step 2: The factor loadings and residual values obtained in Step 1 are used to set the corresponding factor loading and residual error of the interaction term. A significant result for the interaction term indicates an interaction (see Fig. 6).

Fig. 6: Main effect model.
This is a main effect model of the relationship of the VDT and VFT with RP.

Figure 6 shows a main effect model of the relationship of the VDT and VFT with reading proficiency, from which all loading and residual values of the VDT and VFT are calculated.

All loading and residual values of the VDT and VFT are entered into Ping’s single-indicator formulas to compute a single loading and a single residual value for the interaction term. In Step 2, these values are set in the model shown in Fig. 7 to test whether the path from the interaction term to the dependent variable is significant (see Fig. 7).

Fig. 7: Non-linear setting and interaction.

This is a nonlinear setting and interaction model to check the significance of the moderator variable.

Based on the analysis result of Fig. 7, we present the data in Table 6.

Table 6 Moderator.

In Table 6, the p value of the MO is significant, demonstrating that vocabulary fluency moderates the link between receptive vocabulary depth and reading.

The moderating role of vocabulary fluency (PLT vs. RP)

Step 1: We estimate the factor loadings and residual values from the main effect model.

Step 2: The factor loadings and residual values obtained in Step 1 are used to set the corresponding factor loading and residual error of the interaction term. A significant result for the interaction term indicates an interaction (see Fig. 8).

Fig. 8: Main effect model.
This is a main effect model of the relationship of the PLT and VFT with RP.

Figure 8 shows a main effect model of the relationship of the PLT and VFT with reading proficiency, from which all loading and residual values of the PLT and VFT are calculated.

All loading and residual values of the PLT and VFT are entered into Ping’s single-indicator formulas to compute a single loading and a single residual value for the interaction term. In Step 2, these values are set in the model shown in Fig. 9 to test whether the path from the interaction term to the dependent variable is significant (see Fig. 9).

Fig. 9: Non-linear setting and interaction.

This is a nonlinear setting and interaction model to check the significance of the moderator variable.

Based on the analysis result of Fig. 9, we present the data in Table 7.

Table 7 Moderator.

In Table 7, the p value of the MO is significant, indicating that vocabulary fluency moderates the correlation between productive vocabulary breadth and reading.

The moderating role of vocabulary fluency (PVDT vs. RP)

Step 1: We estimate the factor loadings and residual values from the main effect model.

Step 2: The factor loadings and residual values obtained in Step 1 are used to set the corresponding factor loading and residual error of the interaction term. A significant result for the interaction term indicates an interaction (see Fig. 10).

Fig. 10: Main effect model.
This is a main effect model of the relationship of the PVDT and VFT with RP.

Figure 10 shows a main effect model of the relationship of the PVDT and VFT with reading proficiency, from which all loading and residual values of the PVDT and VFT are calculated.

All loading and residual values of the PVDT and VFT are entered into Ping’s single-indicator formulas to compute a single loading and a single residual value for the interaction term. In Step 2, these values are set in the model shown in Fig. 11 to test whether the path from the interaction term to the dependent variable is significant (see Fig. 11).

Fig. 11: Non-linear setting and interaction.

This is a nonlinear setting and interaction model to check the significance of the moderator variable.

Based on the analysis result of Fig. 11, the data in Table 8 can be obtained as follows.

Table 8 Moderator.

In Table 8, p ≤ 0.001 indicates that an interaction exists; that is, vocabulary fluency moderates the relationship between productive vocabulary depth and reading proficiency.

Turning to RQ3, is it more reasonable to use vocabulary fluency as an independent or a moderator variable? We first treated vocabulary fluency as an independent variable, like the four dimensions of L2 lexical knowledge, in Fig. 2. However, the standardized regression coefficient of vocabulary fluency on reading proficiency was negative and close to zero (β = −0.02), which indicates that vocabulary fluency does not have a significant effect. Hence, we further investigated whether vocabulary fluency alters the effect of the four dimensions of L2 vocabulary knowledge on reading proficiency using Ping’s (1995) single product indicator approach, and we found that the interaction term was significant for each of the four dimensions. Therefore, it is advisable to treat vocabulary fluency as a moderator variable that affects the connection between L2 word knowledge and reading capability.

Discussion

In this study, we examined the relationship between the four dimensions of L2 lexical knowledge and reading proficiency among Chinese university EFL learners and obtained several findings relevant to the understanding, teaching, and learning of vocabulary knowledge.

The effectiveness of lexical knowledge on reading

Of the four dimensions of L2 word knowledge examined, the VDT correlated most strongly with L2 reading proficiency (r = 0.51, p < 0.001), followed by receptive vocabulary breadth (r = 0.48, p < 0.001). The correlations of productive vocabulary breadth and productive vocabulary depth with reading were almost equal (r = 0.36, p < 0.001). Based on the criterion suggested by Plonsky and Oswald (2014), under which a correlation coefficient close to r = 0.25 is considered small, r = 0.40 medium, and r = 0.60 large, these results indicate that receptive vocabulary knowledge has a medium correlation with L2 reading, whereas productive vocabulary knowledge has a small correlation. Other researchers have reported much stronger associations: Qian (2002) found Pearson correlations with reading of r = 0.74 for receptive vocabulary breadth and r = 0.77 for receptive vocabulary depth, and Stæhr (2008) reported a correlation of 0.83 between receptive vocabulary breadth and reading. A probable explanation for these large differences lies in the measurement instruments used: those researchers adopted a recognition test format, which permits blind guessing, so test takers may score higher than their actual levels (Kremmel and Schmitt, 2016; McLean et al., 2020). In addition, a Pearson correlation above 0.75 may indicate collinearity.
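The classification used in this paragraph can be sketched as a small helper. The sketch assumes that the Plonsky and Oswald (2014) benchmarks (0.25, 0.40, 0.60) are treated as lower bounds of the small, medium, and large bands; this reading reproduces the labels used above but is an assumption, since the benchmarks are stated as approximate values. The correlations are those reported in Table 4.

```python
def correlation_magnitude(r: float) -> str:
    """Label a correlation using 0.25 / 0.40 / 0.60 as lower bounds
    for small / medium / large (an assumed reading of the benchmarks)."""
    size = abs(r)
    if size >= 0.60:
        return "large"
    if size >= 0.40:
        return "medium"
    if size >= 0.25:
        return "small"
    return "below small"


# Correlations with reading proficiency as reported in Table 4.
table4_correlations = {
    "VDT": 0.513,
    "VLT": 0.480,
    "PLT": 0.356,
    "PVDT": 0.364,
    "VFT": 0.275,
}

for name, r in table4_correlations.items():
    print(f"{name}: r = {r:.3f} ({correlation_magnitude(r)})")
```

Under this reading, the two receptive measures fall in the medium band and the productive and fluency measures in the small band, whereas the correlations of 0.74 to 0.83 reported in earlier studies would all count as large.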

Likewise, in a meta-analysis, Jeon and Yamashita (2014) reported a mean correlation between vocabulary and reading of r = 0.79, whereas Zhang and Zhang (2020) obtained an average correlation of r = 0.57. The two meta-analyses differ so widely because they selected different study samples. That is, most studies in Jeon and Yamashita’s meta-analysis probably used early vocabulary measurement instruments such as the meaning-recognition tests (VLT) designed by Nation (1983, 1990) and Schmitt et al. (2001), whereas Zhang and Zhang included both written meaning-recall tests and written meaning-recognition tests; therefore, the r = 0.57 in Zhang and Zhang’s research may be more reliable. Furthermore, among the previous studies, only Cheng and Matthews (2018) found that productive vocabulary breadth had the strongest correlation with reading (r = 0.57), which is higher than our finding (r = 0.36). However, because reading is a receptive language skill, it is plausible that receptive vocabulary knowledge matters more. In addition to the Pearson correlation, we analyzed the standardized regression coefficient, another key index, to investigate the importance of each independent variable for the dependent variable. Receptive vocabulary knowledge was found to have a stronger effect on L2 reading (β = 0.38 for depth and β = 0.33 for breadth) than productive vocabulary knowledge (β = 0.20 for the PLT and β = 0.19 for the PVDT). That is, in both the Pearson correlation and the regression analysis, receptive lexical depth and breadth play important but moderate roles. Hence, we suggest that developing receptive vocabulary knowledge is the first learning task for successful L2 reading, while the two productive vocabulary dimensions should also be taken into account.
Only by objectively evaluating and balancing the learning of L2 vocabulary knowledge and other linguistic knowledge can Chinese college students make the most of their limited learning time, correctly understand the effect of L2 vocabulary knowledge on reading proficiency, resolve the conflict between their learning needs and their lack of effective vocabulary learning methods, respond positively and flexibly, and experience greater learning achievement and satisfaction.

The predictive value of vocabulary knowledge on reading

We demonstrated the value of lexical knowledge in predicting reading proficiency: overall vocabulary knowledge explained 49% of the variance in reading comprehension scores. By comparison, Qian (2002) found that vocabulary depth and breadth explained, on average, 56% of the variance in reading. Additionally, receptive lexical breadth has been found to explain 72% of the variance in L2 reading (Stæhr, 2008), and productive lexical breadth 33% (Cheng and Matthews, 2018). There are many possible reasons for these different predictive values, but the most common may be the use of different lexical measurement instruments, which are usually designed as recognition tests. Since scores on overall vocabulary knowledge explain almost 50% of the variance in reading test scores, we suggest that Chinese university students give vocabulary learning due weight within a balanced learning process. In particular, because Chinese university students currently pay much more attention to grammar, we suggest that they devote more time and energy to vocabulary than to other linguistic knowledge. The findings help clarify the predictive strength of L2 vocabulary knowledge for reading proficiency, deepening Chinese college students’ understanding of vocabulary learning from a comprehensive perspective.

The moderating role of vocabulary fluency

The third finding of this study is the moderating effect of vocabulary fluency on the link between L2 vocabulary knowledge and reading ability. To avoid a Type I error, we first included vocabulary fluency as an independent variable in the main SEM to measure the respective effects of the five dimensions of L2 vocabulary knowledge on reading proficiency. This yielded a negative standardized regression coefficient (β = −0.02) between vocabulary fluency and reading ability; that is, it is inappropriate for vocabulary fluency to enter the main model as an independent variable because it does not play an important role there. In contrast, Li and Zhang (2019) found that among three dimensions of lexical knowledge, vocabulary breadth/size had the strongest effect on L2 listening (β = 0.36), followed by vocabulary depth (β = 0.17) and vocabulary fluency (β = −0.22). The authors concluded that all three dimensions of lexical knowledge significantly predicted L2 listening proficiency. Because β is negative, however, the dependent variable does not increase as the independent variable increases. Nevertheless, the importance of vocabulary fluency is widely recognized; thus, it has the potential to play another essential role.

Given that vocabulary fluency proved unsuitable for inclusion in the main structural model, we examined it as a moderator of the effect of word knowledge on reading proficiency by testing the interaction effect. The results show that vocabulary fluency potentially moderates the relationship between L2 word knowledge and reading proficiency, which aligns with the view mentioned in the literature review that fluency is not a gift in the snow but icing on the cake; that is, it enhances existing knowledge rather than substituting for it. The primary task of learners is therefore to master receptive vocabulary knowledge and to develop it into productive vocabulary knowledge. As they move from receptive to productive knowledge, they become more fluent and more proficient. Fluency, then, acts as a “lubricant” or “booster”.

Conclusion

We demonstrated that the four dimensions of L2 word knowledge correlate differently with IELTS reading scores and that vocabulary fluency has a moderating effect. Our findings affirmed the contribution of receptive vocabulary knowledge, especially receptive vocabulary depth, to L2 reading, with a medium correlation. However, productive vocabulary knowledge was only weakly correlated with reading. Another contribution of our findings is that vocabulary fluency effectively accelerates the development of vocabulary knowledge along the continuum from receptive to productive.

First, the findings reflect the emphasis that should be given to receptive lexical knowledge in evaluating L2 reading. Second, the findings show the potential importance of overall vocabulary knowledge. Finally, our findings highlight the critical role of vocabulary fluency as a moderator variable. Some suggestions can therefore be offered to teachers and students for the teaching and learning of vocabulary knowledge. Teachers and students should teach and learn vocabulary selectively instead of attempting to cover all the words in a textbook word list or dictionary; for example, receptive vocabulary depth and breadth should be prioritized. Furthermore, teachers and students should focus on the various ways of acquiring vocabulary knowledge, covering not only form-meaning links but also other aspects of knowing a word, in order to command vocabulary effectively and use it in communicative contexts. Additionally, much more attention should be given to developing vocabulary fluency so that L2 vocabulary knowledge contributes more fully to reading proficiency.

Through these findings, Chinese university students’ vocabulary learning experiences can be gradually enriched, self-evaluation of vocabulary learning can be constantly revised, and effective vocabulary learning methods can ultimately be achieved.

Of course, this study has some limitations that hinder its generalizability. First, although the large research sample was drawn from students of six majors, it came from a single site; therefore, multisite samples should be recruited in future research. Second, the receptive vocabulary depth test covered only three aspects of word knowledge: word parts (two kinds), multiple meanings (one), and collocation (one). Given that vocabulary depth should include all aspects of vocabulary knowledge, our test items were relatively inadequate. Future studies should consider other aspects of vocabulary depth, chosen according to the research purpose, to obtain a more precise picture of test takers’ depth of lexical knowledge.