Reading comprehension depends on the ability to read words accurately and fluently (García & Cain, 2014). However, there has been little research on what specific levels of word recognition ability must be achieved for progression in reading. To date, six studies have addressed this question in Danish (Juul et al., 2014), English (Magliano et al., 2022; O’Connor, 2018; Wang et al., 2019), and German (Karageorgos et al., 2019, 2020). Each has located inflexion points in the relationship between word recognition and reading comprehension, or between word reading accuracy and word reading speed, confirming that readers must surpass a basic word recognition threshold to advance in the development of reading competence. In addition, one study identified an upper word recognition threshold, after which better word recognition does not result in improvements in reading comprehension (O’Connor, 2018).

In the current study we propose and test the existence of a new critical moment in reading development at which the level of word recognition allows comparable performance between listening and reading comprehension of narrative texts (Sánchez & García, 2021): The oral-written matching threshold. This is in accordance with the Simple View of Reading (SVR: Hoover & Tunmer, 2018). We hypothesize the existence of two versions of this threshold, depending on the mastery of word recognition required: The functional and the efficient threshold. Our second aim is to confirm that each threshold represents an increasing challenge, with the functional and the efficient oral-written matching thresholds sitting between the basic and the upper ones. A third aim is to test whether these thresholds depend on the reading comprehension materials.

Finding and ordering thresholds and milestones in learning to read is important for understanding reading development, and for informing assessment and interventions to remediate reading difficulties (Karageorgos et al., 2020). Identification of the oral-written matching thresholds will enable the specification of a well-defined trajectory for early reading development.

Prior research examining word recognition thresholds

Studies that have confirmed the existence of a basic and an upper word recognition threshold have defined these thresholds differently and focused on different measures of word recognition. Two definitions have been proposed for the basic threshold and one for the upper threshold.

Basic word recognition threshold as the level that enables the self-teaching mechanism

The first definition of the basic threshold is the accuracy level of decoding required before word-reading speed starts to improve (Juul et al., 2014; Karageorgos et al., 2019, 2020). This principle is in line with the Self-Teaching Hypothesis (Share, 1995), according to which accurate word recognition allows students to match unfamiliar printed words with spoken words, enabling readers to accumulate the lexical knowledge necessary to use orthographic representations to read fluently.

Juul et al. (2014) studied beginner readers and assessed both accuracy and speed of reading aloud of words in lists. They found the expected “accuracy-before-speed pattern”: Speed did not improve until participants reached an accuracy level of 70%, and the correlation between accuracy and speed was not significant amongst participants scoring below this level. In addition, participants who took longer to reach this level gained less in speed than those who reached this level earlier.

Karageorgos et al. (2019) reproduced the basic threshold finding with grade 4 readers, confirming that a specific word-recognition accuracy level (in this case, 71% success in a lexical decision task) was required before word-reading speed (in the same lexical decision task) started to improve. They also found that children above this basic accuracy level achieved higher reading comprehension scores than those below this threshold, after an intervention to improve accuracy and speed. In a second study, Karageorgos et al. (2020) found that students who achieved the basic word-recognition accuracy threshold by the end of grade 1 improved more (and performed better by the end of grade 4) in word-recognition speed and reading comprehension, than those who reached the threshold at later grades (or who did not reach it).

In sum, according to the first definition of the basic threshold, the critical indicator is accuracy of word reading (as a prerequisite for speed). Studies to date have measured this indicator using two methods of decontextualized word reading: A list of words (Juul et al., 2014) or a lexical decision task (Karageorgos et al., 2019, 2020).

Basic word recognition threshold as the level that enables reading comprehension processes

The second definition of the basic word recognition threshold is the minimum level of word recognition sufficient to facilitate comprehension (Magliano et al., 2022; O’Connor, 2018; Wang et al., 2019). This definition is coherent with the Lexical Quality Hypothesis which emphasises the importance of high-quality lexical representations to enable accurate and efficient identification of words and their meanings, freeing up cognitive resources for higher order comprehension processes (Perfetti, 2007).

O’Connor (2018) assessed the speed of reading aloud passages and sentences in grade 2 and 4 students with and without reading difficulties. She found a lower threshold band, below which reading speed was not sufficient to facilitate comprehension. The relationship between speed and comprehension varied as a function of grade, student type, and comprehension materials (passages or sentences). For students with reading disabilities, the basic word recognition threshold was around 40 correct words per minute (wpm); for typical readers, there was no evidence of a basic level inflexion point, perhaps because all typical readers in this sample had exceeded this threshold.

Similarly, Wang et al. (2019) found that, among students in grades 5 to 10, reading comprehension was only significantly related to decoding when accuracy was above a score of 235 (on a standardized scale with a mean of 250 and a standard deviation of 15) in a word recognition and decoding task: A task where participants had to decide whether each item in a list of real words, nonwords, and pseudohomophones was a real word, was not a real word, or sounded like a real word. Further, students below this decoding threshold did not improve their reading comprehension score between grades 5 and 10, while their peers above the decoding threshold did.

Magliano et al. (2022) confirmed the existence of a threshold for college students after which there was a relationship between word recognition and reading comprehension scores: A cutoff score of 20 (out of a maximum of 52) in the same word recognition and decoding task used by Wang et al. (2019; see Footnote 1). However, this threshold was not found for all tasks: It was evident for a task in which participants read short passages and answered literal and inferential multiple-choice questions, but not for a complex literacy task in which participants had to evaluate, integrate, synthesize, and reason with information from various sources to reach a meaningful reading goal (to solve a problem). The authors speculate that this finding arose because the latter task relies strongly on prior knowledge.

These three studies adopt the same definition of the basic threshold: One that enables processing for meaning (reading comprehension). However, the word reading measures used in these studies (correct wpm for sentences and passages: O’Connor, 2018; and score in a word recognition and decoding task: Magliano et al., 2022; Wang et al., 2019) differ in two potentially critical ways: Contextualized vs. decontextualized measures and reading speed vs. reading accuracy. An assessment of accuracy is compatible with the first definition of the basic word recognition threshold, discussed earlier; the basic word recognition threshold may be the minimum level of word recognition accuracy needed to enable both the self-teaching mechanism and higher order comprehension processes (Karageorgos et al., 2020; Wang et al., 2019).

Upper word recognition threshold as the level at which the relationship between word reading and reading comprehension breaks down

Some authors argue for an upper-bound inflexion point; the point beyond which the relationship between word recognition and reading comprehension ceases to be significant (Wang et al., 2019). However, only O’Connor (2018) has identified an upper boundary of reading rate related to reading comprehension after which faster speeds did not positively impact comprehension in grades 2 and 4 students. Word reading speed was assessed while reading passages or sentences (a contextualized measure). For students with reading disabilities, this upper bound was 75–90 correct wpm; for typical readers it was 110–150 correct wpm.

In conclusion, prior research indicates that the relation between word reading and reading comprehension is not linear; there are some critical levels or thresholds (which can be identified) in the development of word recognition ability and reaching them has important consequences for further improvements in reading. This is consistent with the conclusion that word recognition is a “pressure point” for reading comprehension (Compton & Pearson, 2016). However, these studies have not considered nor measured a crucial skill in their definition of the thresholds: Listening comprehension. Listening comprehension is central to the SVR. A consideration of listening comprehension allows us to propose and explore a new threshold that could be situated between the basic and the upper thresholds (the oral-written matching threshold: Sánchez & García, 2021) and it provides an alternative way to define the basic word recognition threshold.

A different way of defining thresholds: specifying the meaning of 1 and 0 for the word recognition component in the SVR

According to the SVR, reading comprehension is the product of two broad cognitive capacities: Word recognition (see Footnote 2) and language comprehension (Hoover & Tunmer, 2018). Assessment of the reading and language comprehension components requires parallel assessments, presented visually and aurally (for reading and listening comprehension, respectively), which is why many use the term ‘listening comprehension’ to contrast with other assessments of language, such as vocabulary knowledge. According to the SVR, each component can range in value from 0 (skill not present) to 1 (perfect performance). This implies that there is a level (or threshold) of word recognition that allows for matching between listening and reading comprehension, at least in early reading development and with narrative/simple texts (Clinton-Lisell, 2022; Singh & Alexander, 2022). At this level, it is assumed that a reader will generate meaning-based representations of a text for reading and listening that are of a comparable quality. Below this level, a reader’s reading comprehension will be limited by word recognition and will be lower than their comprehension of the same text when it is read aloud to them. We refer to this level as the oral-written matching threshold (Sánchez & García, 2021), which should be situated between the basic and the upper thresholds already identified.
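The multiplicative formulation can be illustrated with a minimal sketch. The function name and the component values below are illustrative assumptions, not values from the present study; the sketch only shows that, under the SVR product, reading comprehension approaches listening comprehension as word recognition approaches 1.

```python
def svr_reading_comprehension(word_recognition: float,
                              listening_comprehension: float) -> float:
    """Simple View of Reading: RC = WR x LC, each component scaled 0 to 1."""
    for value in (word_recognition, listening_comprehension):
        if not 0.0 <= value <= 1.0:
            raise ValueError("components must lie between 0 and 1")
    return word_recognition * listening_comprehension


# Hypothetical good oral comprehender (LC = 0.9) at three levels of
# word recognition: absent, partial, and perfect.
for wr in (0.0, 0.5, 1.0):
    rc = svr_reading_comprehension(wr, 0.9)
    print(f"WR = {wr:.1f}: RC = {rc:.2f} vs. LC = 0.90")
```

In this toy formulation, reading and listening comprehension match only when word recognition equals 1; the thresholds proposed here ask which observable level of word reading is sufficient in practice for that matching.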

Our main aim is to confirm the existence of the oral-written threshold and to illustrate how to locate it for a group of readers from grades 1 to 3 (6–9 years old). Some studies suggest that listening and reading comprehension are comparable by around grades 3 or 4 (Cain & Bignell, 2014; Diakidoy et al., 2005). However, the level of word reading ability required to enable this was not explored in these studies nor in two meta-analyses testing which moderators affect the comparison between listening and reading comprehension (Clinton-Lisell, 2022; Singh & Alexander, 2022). We sought to determine if it is possible to identify this level of word recognition by analyzing when good oral comprehenders achieve comparable performance in listening and reading comprehension. This matching could be addressed in two ways.

One option is to develop an acceptable level of word reading, although this will not be equal to the delivery rate of an expert reading aloud. We call this the functional word recognition threshold. On reaching this threshold, the product of reading and listening comprehension for good oral comprehenders reading suitable texts will be comparable, although reading the text may take longer (and, consequently, require more effort) than listening.

The second option is to read the words in the text close to the delivery rate of an expert reading the text aloud. We refer to this as the efficient word recognition threshold because it means that processing written words in a text does not consume more time than processing aurally presented words.

To test the existence and significance for reading development of these two versions of the oral-written matching threshold it is important to compare them with the thresholds identified in other studies. In this sense, the SVR provides another useful way to define the basic word recognition threshold. That is, the multiplicative formulation of the SVR also implies that children will not be able to exploit their listening comprehension capacity until reaching some word recognition capability. Thus, the basic word recognition threshold can also be considered as performing sufficiently beyond zero for word recognition, for the reader to benefit from their listening comprehension skills. Below this threshold, listening and reading comprehension will be dissociated; above this threshold, they will be related.

There is some indirect evidence to support this conclusion: (1) Listening comprehension is more related to reading comprehension some years after learning to read (e.g., Diakidoy et al., 2005; García & Cain, 2014); (2) listening comprehension is a good predictor of reading comprehension for readers of shallow orthographies earlier than for readers of deep orthographies (Florit & Cain, 2011); and (3) the slope of the prediction of reading comprehension from listening comprehension increases with increases in word recognition level (Chen & Vellutino, 1997). In sum, if the strength of the association between listening and reading comprehension increases with word recognition improvement, there will be a specific level of word recognition where the correlation between the two becomes significant. In this paper we calculate this level to compare it with the oral-written matching thresholds. This way of defining the basic word recognition threshold is similar to the definition proposed by others (e.g., Magliano et al., 2022; O’Connor, 2018; Wang et al., 2019); but a definition that includes the listening comprehension component facilitates an understanding of why there is a point where word recognition starts to be related to reading comprehension.

Word recognition skills and measures

A specific requirement for situating the oral-written matching thresholds in relation to other thresholds is to compare the critical level of performance and measure of word reading skill associated with each threshold. Prior studies have used different and unique measures of word recognition: Accuracy of reading words in a list (Juul et al., 2014); accuracy on a lexical decision task (Karageorgos et al., 2019, 2020); accuracy on a word recognition and decoding task (Magliano et al., 2022; Wang et al., 2019); or correct wpm when reading text (O’Connor, 2018). Each measure taps different cognitive processes, although they also share substantial variance (Juul et al., 2014; Wang et al., 2019). In addition, prior studies have not included any measure of only pseudoword reading, which captures the use of phonological recoding for word reading and is considered a suitable measure of word recognition for younger readers (Hoover & Tunmer, 2018; see Footnote 3). In European languages with a shallow orthography, like Spanish (but also German and Dutch), the phonological route can provide sufficiently accurate access to understand most printed words and mobilize reading comprehension (Daniels & Share, 2018; Müller & Brady, 2001). Thus, the phonological route may be more related to reading comprehension in Spanish than in other European languages with a deep orthography (like English), and, therefore, a certain level of pseudoword reading could act as a basic word recognition threshold.

For our purposes, we included a range of word recognition measures, as recommended by Wang et al. (2019) - accuracy and speed of reading words and pseudowords in lists and texts - and controlled the contribution of each measure to reading comprehension to determine which measure was critical for each threshold. Our comprehensive set of measures captures different levels of proficiency in the development of word reading, because accuracy precedes speed (Juul et al., 2014; Karageorgos et al., 2019; Share, 1995), the phonological route is the basis for the lexical one (Share, 1995), and reading in context is easier (and more ecologically valid) than reading isolated words (Fernandes, 2018). Consequently, our set of measures enables us to determine the hierarchy between the different thresholds.

The present study

Spanish readers from grade 1 to 3 (6–9 years old) read one of two narrative texts, listened to the other one, and were assessed on four decontextualized measures of word recognition (accuracy and time taken to read lists of words and pseudowords) and two contextualized measures of word recognition (accuracy and time taken to read the texts). There were three objectives. The first was to test for the existence of the oral-written matching threshold: The level of word recognition that enables comparable performance between listening and reading comprehension when young students read narrative texts. We assumed that, for good oral comprehenders, this matching could be achieved in two ways: (1) Reaching an acceptable level of word reading although not equal to the listening delivery rate (the functional threshold) or (2) reading the text close to the pace of listening when an expert reader reads it aloud (the efficient threshold).

The second objective was to place the oral-written matching thresholds with respect to the basic and the upper word recognition thresholds. We expected to find progression from the easiest to the most challenging threshold: Basic threshold, functional threshold, efficient threshold, and upper threshold. Support for this hypothesis would come from the critical reading measure or skill (e.g., accuracy for the basic threshold vs. speed for the other thresholds) and level (e.g., exact value of speed) associated with each threshold, and from the number of students achieving each threshold (globally and grade by grade). Thus, we compared the results obtained when searching for the functional and efficient thresholds with the results obtained in the current study when searching for a basic word recognition threshold and also with the upper word recognition threshold identified by O’Connor (2018).

The third objective was to test whether the thresholds identified in the current study (basic, functional, efficient) depend on the text being read. The SVR recognizes that performance is dependent on the materials (Hoover & Tunmer, 2018, supported empirically by O’Connor, 2018 and Magliano et al., 2022). To test for this, the two narrative texts differed in difficulty, word frequency, and length. We expected participants would need to be better readers to achieve the thresholds in the more difficult text.

Method

Participants

Participants were 344 students from four primary schools in Salamanca (Spain): Grade 1 (6–7 years, N = 104), Grade 2 (7–8 years, N = 96), Grade 3 (8–9 years, N = 144). One school was a state school in a rural town near the city; the others were urban schools supported by both public and private funds. We assessed all students enrolled in these grades, except six who were repeating grades, one who was not able to read, and three absentees. All were native Spanish speakers or had a good level of Spanish.

Variables and instruments

Reading and listening comprehension

We used two texts from the standardized Spanish PROLEC-R battery (Cuetos et al., 2007): Carlos and Marisa’s birthday (henceforth Carlos and Marisa: See Online Resource 1). Both have a narrative structure: The main characters responded to an initiating event generating goals that motivated actions to achieve them. Carlos was shorter (94 words) than Marisa (134 words) with a greater proportion of high frequency words (75.6% vs. 61.3%) according to Martínez and García’s (2004) dictionary. Carlos has a lower Crawford Index of formal readability (4.3 compared with 5 for Marisa). In line with these metrics, Carlos is referred to as the easy text, Marisa as the difficult text, although this does not mean that the first one represents the kind of easy texts students face at these ages and the second one represents the kind of difficult texts that they read.

Each student read aloud only one text randomly assigned for the reading comprehension assessment (visual presentation mode) and listened to the other text read aloud by the evaluator for the listening comprehension assessment (aural presentation mode). Oral reading allows the assessment of contextualized word reading (see below) and prevents beginning readers from skipping words (Cao & Kim, 2021). Nevertheless, it may impact reading comprehension in a different way than silent reading. Thus, we will discuss the implications of this decision in the limitations and future directions section.

After each text, participants answered orally, without returning to the text, six open-ended questions (four inferential, two literal) read by the evaluator. The literal questions were created for this study to increase the range of scores. One point was awarded for each correct answer. Two independent judges analyzed the answers of 21 randomly selected participants. A kappa of 1 was obtained for both texts. For reading comprehension, Cronbach’s alpha was 0.48 for Carlos and 0.56 for Marisa; for listening comprehension, it was 0.48 for Carlos and 0.50 for Marisa. Given the low number of items and the diverse information tapped by them, these values were considered acceptable.
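For readers unfamiliar with the reliability index reported above, the following is a minimal sketch of the standard Cronbach's alpha formula applied to a six-item test scored 0 or 1. The function name and the response matrix are hypothetical; the data are not from the present study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-participant item-score lists.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five participants to six questions (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 0],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

With only six dichotomous items, alpha is bounded by the small number of items, which is one reason modest values are common for short comprehension tests.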

Each participant was assessed by one of two evaluators who read the texts aloud. The listening comprehension scores of participants assessed by each evaluator were comparable: Carlos, Ms = 4.71 and 4.81, p = .59; Marisa, Ms = 4.25 and 4.59, p = .09.

Contextualized measures of word reading

Participants were recorded while reading their allocated reading text to obtain two contextualized measures: Word reading accuracy (number of words read correctly) and word reading speed (time, in seconds, taken to read the whole text). From these we calculated correct wpm to enable comparison of participants’ and evaluators’ reading speed and to determine the oral-written matching thresholds.
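As a minimal sketch of this conversion, assuming the usual definition of correct wpm (words read correctly divided by reading time in minutes), the calculation is as follows. The function name and the example reader are hypothetical.

```python
def correct_wpm(words_read_correctly: int, seconds: float) -> float:
    """Correct words per minute from an accuracy count and a reading time."""
    if seconds <= 0:
        raise ValueError("reading time must be positive")
    return words_read_correctly / seconds * 60.0


# Hypothetical reader: 90 of the 94 words of the easy text read correctly
# in 75 seconds.
rate = correct_wpm(90, 75)
print(f"{rate:.1f} correct wpm")
```

Because the result is expressed per minute, it is independent of text length and can be compared across the two texts and with the evaluators' delivery rate.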

Decontextualized measures of word reading

Participants read aloud two lists of 40 words (not included in the texts) and 40 pseudowords from the Spanish PROLEC-R battery (Cuetos et al., 2007). They were asked to read as fast as they could, trying not to make mistakes. Four measures were recorded: Word and pseudoword reading accuracy (number of words/pseudowords read correctly), and word and pseudoword reading speed (time, in seconds, to read all 40 words/pseudowords). Cronbach’s alpha for this test is 0.74 for word reading and 0.68 for pseudoword reading (Cuetos et al., 2007).

Procedure

Parental consent was obtained through head teachers. Two trained graduates administered the tasks during school time in a quiet place in schools. There was a single individual session at the end of the academic year (May/June), of approximately 20 min. The sessions were audio-recorded to facilitate scoring. Participants read the two lists of words and pseudowords, and then they read and listened to each text. A cross-over design was employed to control for order and presentation mode. Thus, half of the sample listened to the easy text and read the difficult text and the other half did the opposite.

Results

We present three sets of results: First, the descriptive statistics and correlations between all variables for the whole sample; second, the tests for the basic word recognition threshold above which the relationship between listening and reading comprehension is significant; and third, the examination of evidence for the functional and the efficient oral-written matching thresholds, with just the good oral comprehenders. We did not remove outliers because we were interested in all levels of the skills measured, to explore whether the relationship between listening and reading comprehension changed with the level of word recognition. The rejection level for all analyses was set at .05. The data files and analysis scripts are available at https://osf.io/udw72/?view_only=18ae88e2594445c284452580add0eb95.

Descriptive statistics and correlations

Table 1 presents the descriptive statistics for the whole sample. There were no significant differences between participants who read the easy text and listened to the difficult text and participants who read the difficult text and listened to the easy text on any decontextualized measure of word reading: Fs(1, 344) ≤ 0.23, ps ≥ 0.632. Scores for decontextualized words and pseudowords were inside the appropriate range reported in the PROLEC-R manual and showed substantial variability, although comparison of our sample’s mean word and pseudoword reading speed with those reported in the manual suggests that our participants were faster readers than the normative population.

Table 1 Means, standard deviations, and range for study variables

For reading and listening comprehension, students showed good understanding, with higher scores for the easier text in both modalities. The mean and SD of the reading comprehension score for the easy text suggest a possible ceiling effect, although skewness was −0.99, indicating an acceptable distribution (Barzilai & Eshet-Alkalai, 2015). We conducted a 3 × 2 ANOVA for each modality separately, with grade (1, 2, 3) and text (easy, difficult) as between-subjects factors. For listening, there were significant main effects of grade [F(2, 338) = 17.37, p < .01, partial η2 = 0.09] and text [F(1, 338) = 7.57, p < .01, partial η2 = 0.02]. Likewise for reading, there were significant main effects of grade [F(2, 338) = 23.89, p < .01, partial η2 = 0.12] and text [F(1, 338) = 4.47, p = .03, partial η2 = 0.01]. The interaction was not significant in either analysis: Listening [F(2, 338) = 2.03, p = .133]; reading [F(2, 338) = 2.61, p = .07]. Thus, the advantage of the easy text was similar in all grades.
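Effect sizes of this kind can be checked from the reported test statistics. The sketch below assumes the standard identity for between-subjects designs, partial η2 = (F × df effect) / (F × df effect + df error); the function name is our own.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# F ratios reported above for the main effects on reading comprehension
# and for the effect of text on listening comprehension.
reported = {
    "reading, grade": (23.89, 2, 338),
    "reading, text": (4.47, 1, 338),
    "listening, text": (7.57, 1, 338),
}
for label, (f_value, df_effect, df_error) in reported.items():
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: partial eta squared = {eta:.2f}")
```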

Correlations are presented in Table 2. Reading comprehension of both texts was correlated with listening comprehension and all measures of word recognition (most correlations were moderate). All measures of word recognition were correlated with each other, with some values higher than 0.70, indicating a risk of multicollinearity (Tabachnick & Fidell, 2007). Therefore, we isolated or centered some variables, where required.

Table 2 Spearman intercorrelations for all variables in the study

Evidence for a basic word recognition threshold

To determine the presence of the basic word recognition threshold, we explored for each text if any of the word recognition measures, after controlling the other ones, moderated the relation between listening and reading comprehension. If moderation was present, we identified the word recognition threshold above which the relationship between listening and reading comprehension was statistically significant.

First, we conducted regression analyses to determine the word recognition measures that had to be controlled for in the moderation analysis; these were the word recognition measures that accounted for unique significant variance in reading comprehension (see Online Resource 2). The tolerance values of contextualized word reading speed and decontextualized word reading speed were below 0.10, and their variance inflation factors were larger than 10. Therefore, we conducted different models for each of these regressors excluding the other one. Contextualized word-reading speed and decontextualized word-reading accuracy each made a unique contribution to the reading comprehension of the easy text, and the two contextualized measures of word reading plus decontextualized word-reading speed made a unique contribution to the reading comprehension of the difficult text. Therefore, these variables were included in the corresponding moderation analysis as covariates.
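The two collinearity diagnostics mentioned above are linked by a simple identity, sketched below under the standard definitions: Tolerance is 1 − R2 from regressing one predictor on the remaining predictors, and the variance inflation factor (VIF) is its reciprocal. The function name and the R2 value are hypothetical.

```python
def tolerance_and_vif(r_squared: float):
    """Tolerance and VIF for a predictor, given the R2 obtained when that
    predictor is regressed on all the other predictors."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    tolerance = 1.0 - r_squared
    return tolerance, 1.0 / tolerance


# Hypothetical case: 92% of the variance of one speed measure is shared
# with the other predictors.
tolerance, vif = tolerance_and_vif(0.92)
print(f"tolerance = {tolerance:.2f}, VIF = {vif:.1f}")
```

A tolerance below 0.10 and a VIF above 10 are therefore the same criterion stated in two ways.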

We then conducted moderation analyses and employed the Johnson-Neyman technique using the PROCESS 3.1 macro for SPSS (Hayes, 2018). Specifically, six models (one for each word recognition measure) were constructed for each narrative text (easy and difficult) with reading comprehension as the dependent variable, listening comprehension as the independent variable, each of the measures of word reading as possible moderators, and the covariates noted above when they were not being tested as moderators. In the moderation analyses, the listening comprehension variable and each word recognition measure were centered before the interaction terms were created to prevent multicollinearity among the first-order terms and the interaction terms.
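The logic of the Johnson-Neyman technique can be sketched as follows. This is not the PROCESS macro's code: It is a minimal illustration that assumes a fitted model with a listening comprehension coefficient b1 and an interaction coefficient b3, so that the conditional effect of listening comprehension at moderator value W is b1 + b3W, with variance var(b1) + 2W cov(b1, b3) + W² var(b3). The boundaries of the regions of significance are the values of W at which the conditional effect divided by its standard error equals the critical t. All function names and numerical values are hypothetical.

```python
import math


def conditional_t(b1, b3, var_b1, var_b3, cov_b1_b3, w):
    """t ratio of the conditional effect of the predictor at moderator value w."""
    effect = b1 + b3 * w
    standard_error = math.sqrt(var_b1 + 2 * w * cov_b1_b3 + w * w * var_b3)
    return effect / standard_error


def johnson_neyman_bounds(b1, b3, var_b1, var_b3, cov_b1_b3, t_crit):
    """Moderator values where the conditional effect is exactly at the
    significance boundary (roots of a quadratic in w)."""
    a = b3 ** 2 - t_crit ** 2 * var_b3
    b = 2 * (b1 * b3 - t_crit ** 2 * cov_b1_b3)
    c = b1 ** 2 - t_crit ** 2 * var_b1
    if a == 0:
        return [] if b == 0 else [-c / b]
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []  # no transition point: effect is significant everywhere or nowhere
    root = math.sqrt(discriminant)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


# Hypothetical estimates on a centered moderator scale; t_crit is an
# assumed two-tailed .05 critical value for the residual degrees of freedom.
estimates = dict(b1=0.30, b3=0.05, var_b1=0.01, var_b3=0.0004, cov_b1_b3=0.0)
bounds = johnson_neyman_bounds(t_crit=1.97, **estimates)
for w in bounds:
    print(f"boundary at centered moderator value {w:.2f}")
```

Because the moderator is centered, a boundary returned by such a calculation has to be added to the sample mean of the moderator to express it in raw units (for example, number of pseudowords read correctly).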

A summary of the moderation analyses with reading comprehension of the easy text as the dependent variable is reported in Table 3. All models were statistically significant and, in all, listening comprehension made a significant contribution to reading comprehension. There was only one significant interaction. This was between listening comprehension and contextualized word-reading accuracy, but it was in an unexpected direction: The Johnson-Neyman technique showed that regions of significance were under 93.85 words. This means that the relationship between listening and reading comprehension was significant under this threshold and not significant when students accurately read all 94 words of the easy text. This pattern may have arisen because, for these accurate word readers, the variability in reading comprehension was very low: They obtained a mean score of 5.05 (SD: 0.95, maximum = 6) and 76.53% of these students achieved a score of 5 or 6. With so little variability it is difficult to find a correlation between listening and reading comprehension in this subgroup.

Table 3 Summary of moderation analyses. Dependent variable: reading comprehension of the easy text

A summary of the moderation analyses with reading comprehension of the difficult text as the dependent variable and the two contextualized measures of word reading as covariates (see Footnote 4) is reported in Table 4. All models were statistically significant and listening comprehension made a significant contribution to reading comprehension in each. This contribution was qualified by a significant interaction with pseudoword reading accuracy. Specifically, the Johnson-Neyman technique showed that regions of significance were over 29.09 pseudowords, meaning that the relationship between listening and reading comprehension was significant only for students accurately reading more than 72.7% of items in the pseudoword list. The interaction between listening comprehension and pseudoword accuracy on reading comprehension of the difficult text is shown in the figure in Online Resource 3. Note that only readers with high listening comprehension and pseudoword reading accuracy inside the regions of significance outperformed readers with low listening comprehension.

Table 4 Summary of moderation analyses. Dependent variable: Reading comprehension of the difficult text

The same pattern was found when contextualized word reading speed was replaced as a covariate by decontextualized word reading speed to predict reading comprehension of the difficult text (see Table 5). The only significant moderator of the relationship between listening and reading comprehension was pseudoword reading accuracy, and the regions of significance were over 29.12 pseudowords (72.8% of the pseudowords in the list read correctly).

Table 5 Summary of moderation analyses. Dependent variable: Reading comprehension of the difficult text

In sum, accuracy of around 73% for pseudoword reading was identified as the basic word recognition threshold. More than 85% of participants surpassed this threshold: 77.9% of the first graders; 89.6% of the second graders; and 88.2% of the third graders.

Evidence for the oral-written matching word recognition thresholds: functional and efficient

To explore whether there are levels of word recognition that allow participants to perform equally well on the comprehension questions after listening and after reading, we followed three steps. First, we selected the readers with a positive Z score in listening comprehension independently of the text they listened to. This selection resulted in a subsample of 214 students. The descriptive statistics of this subsample are presented in Table 6. There were no significant differences between participants who read the easy text and listened to the difficult text and participants who read the difficult text and listened to the easy text on any of the decontextualized measures of word reading: Fs(1, 212) ≤ 3.59, ps ≥ 0.059. These data are very similar to those of the whole sample (Table 1), except for the higher performance on comprehension measures (due to the selection criterion for this subsample). The mean and SD of the reading comprehension score for the easy text suggest a possible ceiling effect, although skewness was −1.12, indicating an acceptable distribution (Barzilai & Eshet-Alkalai, 2015). The two contextualized measures of word recognition (word reading accuracy, word reading speed) were transformed to correct wpm to provide a measure on the same scale independent of the text read (a requirement for further analysis) and to enable us to compare this with the reading rate of the evaluators (reported in Table 6 as a reference for the expected efficient threshold).

Table 6 Means, standard deviations, and range for study variables in the subsample with positive Z score in listening comprehension and for the evaluators reading the texts

Correlations between all variables for the subsample are reported in Table 7. Reading comprehension of the easy text (above diagonal) showed small correlations with the measures of word recognition speed; reading comprehension of the difficult text (below diagonal) showed moderate correlations with all measures of word recognition (accuracy and speed). Almost all measures of word recognition were intercorrelated. Some correlations were higher than 0.70, so, as before, multicollinearity problems were avoided by isolating or centering some variables when needed. Logically, as variability in listening comprehension is lower in this subsample than in the whole sample, listening comprehension was not significantly correlated with reading comprehension.

Table 7 Spearman intercorrelations for all variables in the study in the subsample with positive Z score in listening comprehension

Second, to search for the measures of word recognition that had to be controlled for in further analysis, we conducted regression analyses to establish which measures of word recognition accounted for unique significant variance in reading comprehension (see Online Resource 4). In these regressions, the tolerance values were over 0.10, and the highest value of the variance inflation factors was far from 10, indicating the absence of multicollinearity. Decontextualized word reading speed and pseudoword reading speed made unique contributions to reading comprehension of the difficult text and were, therefore, included in the following moderation analysis as covariates.
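The multicollinearity criteria cited above (tolerance over 0.10, variance inflation factor far from 10) can be illustrated for the simplest possible case. The following is a minimal sketch, assuming a model with exactly two predictors, in which the R² of one predictor on the other equals their squared Pearson correlation. The function name and the example correlation of 0.70 are hypothetical and are not taken from the analyses reported in Online Resource 4.

```python
def tolerance_and_vif(r):
    """Tolerance and variance inflation factor (VIF) for a model with
    exactly two predictors whose Pearson correlation is r.

    Assumption: with only two predictors, the R^2 obtained by regressing
    one predictor on the other is simply r squared.
    """
    tolerance = 1.0 - r ** 2
    if tolerance <= 0.0:
        raise ValueError("predictors are perfectly collinear")
    return tolerance, 1.0 / tolerance


# Hypothetical example: two word-reading measures correlated at 0.70
tol, vif = tolerance_and_vif(0.70)
print(f"tolerance = {tol:.2f}, VIF = {vif:.2f}")
```

In this special case, a correlation of 0.70 gives a tolerance of 0.51 and a VIF of about 1.96, well inside the conventional cut-offs. With more than two predictors, r² would have to be replaced by the R² from regressing each predictor on all the others.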

We employed PROCESS 3.1 (Hayes, 2018) to conduct moderation analyses with comprehension scores for each text (easy, difficult) as the dependent variables and presentation mode (aural, visual) as a between-subjects independent variable. Each word reading measure was tested as a possible moderator in a different model, and decontextualized word reading speed and pseudoword reading speed were introduced as covariates when predicting comprehension of the difficult text if they were not being tested as moderators. We coded presentation mode as one dummy variable (aural = −0.05; visual = 0.05). In this way, an interaction between presentation mode and word reading would indicate that the impact of presentation mode was not the same across the whole range of word recognition skills. The Johnson-Neyman technique was used to locate values in the range of word recognition where presentation mode ceased to influence comprehension (i.e., where listening and reading comprehension were statistically equal). In the moderation analyses, presentation mode (aural, visual) and each word recognition measure were centered before the interaction terms were created to prevent multicollinearity among the first-order terms and the interaction terms.
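The logic of the Johnson-Neyman step can be sketched as follows. In a model of the form Y = b0 + b1·X + b2·W + b3·X·W, the conditional effect of X (here, presentation mode) at a moderator value W is b1 + b3·W, and the technique solves for the values of W at which this effect divided by its standard error equals the critical value. This is a minimal sketch under stated assumptions, not the PROCESS implementation: the function names are hypothetical, the coefficients and (co)variances are invented for illustration and are not estimates from this study, and a large-sample critical value of 1.96 is assumed in place of the exact t value.

```python
import math


def conditional_effect(w, b1, b3, var_b1, var_b3, cov_b1_b3):
    """Effect of X at moderator value w, and its standard error."""
    effect = b1 + b3 * w
    variance = var_b1 + 2.0 * w * cov_b1_b3 + (w ** 2) * var_b3
    return effect, math.sqrt(variance)


def johnson_neyman_bounds(b1, b3, var_b1, var_b3, cov_b1_b3, t_crit=1.96):
    """Moderator values where |effect / SE| equals t_crit.

    Solves (b1 + b3*w)^2 = t_crit^2 * Var(b1 + b3*w), a quadratic in w.
    Returns a sorted list with zero, one, or two boundary values.
    """
    t2 = t_crit ** 2
    a = b3 ** 2 - t2 * var_b3
    b = 2.0 * (b1 * b3 - t2 * cov_b1_b3)
    c = b1 ** 2 - t2 * var_b1
    if a == 0.0:
        # Degenerate case: the equation is linear in w
        return [] if b == 0.0 else [-c / b]
    disc = b ** 2 - 4.0 * a * c
    if disc < 0.0:
        return []  # significance status never changes across w
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


# Hypothetical estimates (NOT from this study): effect of presentation
# mode (b1), its interaction with reading speed (b3), and their sampling
# variances and covariance.
est = dict(b1=-1.0, b3=0.01, var_b1=0.04, var_b3=0.00001, cov_b1_b3=-0.0005)
bounds = johnson_neyman_bounds(**est)
for w in bounds:
    effect, se = conditional_effect(w, **est)
    print(f"boundary at W = {w:.2f}: effect = {effect:.3f}, SE = {se:.3f}")
```

Only boundaries that fall inside the observed range of the moderator are interpretable as regions of significance. In practice, the coefficients and their covariance matrix would be taken from the fitted regression model.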

A summary of the moderation analyses with comprehension of the easy text as the dependent variable is reported in Table 8. All models were statistically significant. There were two significant interactions between presentation mode and word reading, one with the measure of decontextualized word reading speed and the other with the measure of correct wpm. Specifically, the Johnson-Neyman technique showed no differences between students who read the text and those who listened for (a) students who took 36.31 s or less to read the word list (see figure in Online Resource 5), or (b) those who read the text at or above 140.28 correct wpm (see figure in Online Resource 6). The regression slopes were negative and significant for participants under these thresholds and not significant for participants above the thresholds, which means that for readers with low and average speed the aural presentation was more beneficial than the visual presentation (but for high-speed readers the presentation mode did not impact their comprehension).

Table 8 Summary of moderation analyses in the subsample with positive Z score in listening comprehension. Dependent variable: Score in the comprehension questions of the easy text

A summary of the moderation analyses with comprehension of the difficult text as the dependent variable and decontextualized word reading speed as covariate is reported in Table 9. All models were statistically significant. None of the interactions was significant. Thus, the impact of presentation mode was not moderated by word recognition and participants who listened to the text outperformed those who read it, regardless of word reading skill (see negative B value associated with presentation mode).

Table 9 Summary of moderation analyses in the subsample with positive Z score in listening comprehension. Dependent variable: Score in the comprehension questions of the difficult text

In conclusion, taking 36.31 s to read a 40-word list was identified as a plausible candidate for the functional threshold. Of the whole sample, 42.2% reached this threshold: 6.7% of first graders; 52.1% of second graders; and 61.1% of third graders. Reading text at 140.28 correct wpm could be considered the efficient threshold. Of the sample, 27% reached this threshold: 1.9% of first graders; 38.5% of second graders; 37.5% of third graders.

Discussion

Previous work has identified thresholds of word recognition performance that allow reading comprehension to emerge and permit progression in reading ability (Juul et al., 2014; Karageorgos et al., 2019, 2020; Magliano et al., 2022; O’Connor, 2018; Wang et al., 2019). In this paper we have provided support for our proposal for a new threshold, the oral-written matching threshold (Sánchez & García, 2021), and two conceptually different versions of that threshold: Functional and efficient. In addition, we have calculated a basic word recognition threshold to accomplish our second objective: Placing the new threshold (and its two versions) in relation to the basic word recognition threshold and the upper word recognition threshold (identified in prior research). This permits a more nuanced description of the critical milestones in word recognition that support reading comprehension. Also, in line with our third objective, we demonstrated that the thresholds identified here depend on the text being read.

With respect to the first objective, our analyses yielded the two anticipated oral-written matching thresholds. The functional oral-written matching threshold was defined as the level of word recognition at which listening and reading comprehension performance are comparable in readers with good listening comprehension skills, although with a reading rate below the delivery rate of an expert reading aloud. Accordingly, moderation analyses and the Johnson-Neyman technique showed that such matching is possible when readers read a list of 40 words in ~36 s. This is equivalent to reading at ~66 wpm (“adagio tempo”) or taking around 900 milliseconds to pronounce each word, which is far from automaticity (skilled adult readers require around 400–500 milliseconds per word: Kintsch, 1998). In fact, readers over this threshold read the easy text at an average of ~129 correct wpm and readers on the edge of this threshold at ~100 correct wpm, while the evaluators read at an average of ~163 wpm. Consequently, we assume that good oral comprehenders over the functional threshold can achieve comparable oral and written comprehension, but only by investing more time (and, presumably, more effort) in written word processing than is required in oral word processing and/or by drawing on their good top-level comprehension processes (Moojen et al., 2020; Sánchez et al., 2007). Surprisingly, according to the 2018 National Assessment of Educational Progress (NAEP), the average speed of passage reading of grade 4 American students performing at the NAEP basic level of comprehension (the level that allows students to locate relevant information and make simple inferences) is 123 correct wpm (White et al., 2021).
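The conversion from the list-reading time at the functional threshold to a reading rate and a per-word time is simple arithmetic, sketched below. The helper names are hypothetical; the inputs (a 40-word list read in 36.31 s) are the values reported in the Results.

```python
def words_per_minute(n_words, seconds):
    """Reading rate in words per minute."""
    return n_words / seconds * 60.0


def ms_per_word(n_words, seconds):
    """Average time spent on each word, in milliseconds."""
    return seconds / n_words * 1000.0


# Functional threshold: a 40-word list read in 36.31 s
list_wpm = words_per_minute(40, 36.31)
list_ms = ms_per_word(40, 36.31)
print(f"{list_wpm:.1f} wpm, {list_ms:.0f} ms per word")
```

This gives roughly 66 wpm and 908 ms per word, about twice the 400–500 ms per word cited for skilled adult readers.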

The efficient oral-written matching threshold was defined as the level of word recognition that allows comparable listening and reading comprehension performance for good comprehenders who spend the same time (and, presumably, effort) reading a text as listening to an expert read it aloud. Moderation analyses and the Johnson-Neyman technique showed that this threshold may be equivalent to reading words in a text at a rate of ~140 correct wpm. This reading rate is 1.44 SDs below the rate of the evaluators in the listening comprehension task, but Brysbaert’s (2019) review found that audiobooks are spoken at a rate of 140–180 wpm. In addition, the average speed of passage reading of grade 4 American students performing at the NAEP proficient level of comprehension is 142 correct wpm (White et al., 2021), and the average speed of passage reading of American adults from the main assessment of the National Assessment of Adult Literacy is 154 correct wpm, which corresponds to the passage reading rate of those who have an intermediate level of ability to search, comprehend, and use information from continuous texts (Baer et al., 2009). Taken together, these converging data suggest that a speed of over 140 correct wpm is a suitable rate at which the cognitive resources spent on processing written and oral words in natural communicative situations could be comparable (and, therefore, so could the resources left available for comprehension processes).

To locate these new thresholds in a developmental trajectory (our second objective) we must compare them with a basic and an upper threshold. With respect to the basic word recognition threshold, this was defined in this study as the level of word reading above which listening and reading comprehension are related. This means that children will not be able to draw on their oral language competencies to support reading comprehension until they have achieved a basic level of word recognition. Our estimate of this threshold through the moderation analyses and the Johnson-Neyman technique was accurate reading of more than 73% of the pseudowords in a list. This value is again in line with that found in previous research with measures of real word reading (e.g., Juul et al., 2014; Karageorgos et al., 2019, 2020), demonstrating reproducibility in another language and confirming that accuracy precedes speed (Juul et al., 2014; Karageorgos et al., 2019, 2020; Magliano et al., 2022; Share, 1995). In a shallow orthography like Spanish, good phonological recoding is sufficient for reading most printed words, allowing oral language competencies to support reading for meaning.

The above findings support our predicted progression from the easiest to the most challenging threshold: Basic threshold, functional threshold, and efficient threshold. Three key findings support this progression. First, accuracy is the critical reading indicator for the basic threshold, whereas speed (at different rates) is the critical indicator for the thresholds beyond it. Second, 85.5% of the sample reached the basic threshold, 42.2% the functional threshold, and 27% the efficient threshold. Third, by grade, the basic threshold was reached by the majority of first graders (77.9%), the functional threshold by around half of the second and third graders (52.1% and 61.1%), and the efficient threshold by only a minority of second and third graders (38.5% and 37.5%).

With respect to the upper word recognition threshold, the value of the efficient threshold reported here is similar to the upper word recognition threshold found by O’Connor (2018) with typical readers of similar age (second and fourth graders). Differences in depth of orthography and materials make it hard to compare findings, and further studies are required. However, if word recognition contributes to reading comprehension even in adults (García & Cain, 2014) and reading comprehension can surpass listening comprehension of demanding texts (Clinton-Lisell, 2022; Sánchez & García-Rodicio, 2008; Singh & Alexander, 2022), it is reasonable to think that the efficient and upper thresholds are conceptually different, although with simple narrative texts and young readers their values can be similar. Thus, word recognition may still improve and contribute to reading comprehension of more demanding texts (e.g., expository texts) after reaching the threshold needed for oral-written matching in simple texts. So, the oral-written matching threshold could be a less challenging threshold than the upper word recognition threshold.

Finally, for our third objective, we confirmed that the exact value of each threshold is dependent on the assessment materials. As expected, participants required superior word recognition skills to achieve the thresholds in the more difficult text, while they may have already exceeded the basic level required to process the easy text for meaning. Indeed, when we applied the Johnson-Neyman technique to explore how the number of pseudowords read correctly moderates the relation between listening comprehension and reading comprehension of the easy text, it yielded a basic threshold value of only 24.94 pseudowords read accurately, a value surpassed by 95.85% of readers of the easy text, although the moderation was not significant. This result is in line with O’Connor (2018), who did not find a basic threshold among typical readers. In contrast, we found evidence for the more advanced oral-written matching thresholds only for the easy text. Our sample read the more difficult text more slowly, in line with expectations, and only 22.68% of our readers read this text above the cut point identified when we applied the Johnson-Neyman technique. It is plausible that, with a bigger and more variable sample, the corresponding moderator analyses would also be significant.

Implications

Our findings have implications for theory, research, and practice. Theoretically, our results provide a way to operationalize the SVR, offering a more nuanced understanding of the relations between word recognition, listening, and reading comprehension. We note, however, that beyond the initial stages in reading development, written language and, especially, “academic language” (Olson, 1977; Uccelli et al., 2015) can enhance and expand language knowledge and, through that, comprehension. Thus, the relation between oral and written language described by the SVR may be expected to change when students systematically face more complex texts and more sophisticated comprehension demands. Second, the different thresholds identified in this study describe the first steps in reading development, but other critical thresholds may be identified. For instance, when reading expository texts, another critical threshold is the moment at which a reader can follow written rhetorical cues (e.g., “the first cause is…”) with the same ease as oral cues provided by teachers or other agents (Sánchez & García, 2021). Third, our results reinforce the proposal that word recognition is a “pressure point” for reading comprehension (Compton & Pearson, 2016), and illustrate how basic processes constrain higher order cognitive processes.

For research, we demonstrate a methodology for identifying critical points in ability measures. Moderation analyses and the Johnson-Neyman technique were shown to be useful for detecting changes in relationships and providing an alternative to quantile analyses.

There are implications for practice relating to evaluation, teaching, and test development. First, the basic, functional, and efficient thresholds can serve as well-defined criteria to classify at-risk beginner readers and improve identification of children with learning disabilities. We found that 22.1% of the first graders in our sample were below the basic word recognition threshold (accurate reading of 73% of pseudowords). Converging support for this comes from the published norms of the assessment used (PROLEC-R), which place appropriate pseudoword reading accuracy for first graders at over 75% correct. The proportion identified as at risk in this study is high, but prevalence rates vary in relation to severity and diagnostic criteria (Wagner et al., 2020) and many children identified with early weak word recognition skills attain typical levels of performance later on (Catts et al., 2012). Thus, according to the Response to Intervention Framework, their responsiveness to general education instruction could be monitored to identify non-responders and provide more intensive intervention, if needed (Fuchs & Fuchs, 2006).

With respect to teaching, the three word recognition thresholds identified in this study are useful for defining concrete, well-ordered, meaningful, and theoretically grounded instructional objectives. For instance, the achievement of the basic word recognition threshold – measured by accuracy – can be considered as the point when “the code has been cracked” (Juul et al., 2014, p. 1104). This is a relevant objective. Our findings suggest that overcoming this accuracy threshold at grade 1 (for shallow orthographies such as Spanish) is also a plausible objective. In contrast, reaching the functional threshold seems to be a more appropriate goal for grade 2 or 3, and the efficient threshold may demand more practice and instruction beyond these grades. In the meantime, this recommends the use of scaffolding aids to minimize the impact of word reading on reading comprehension (Sánchez et al., 2007). That is, in an educational context, where teachers may provide scaffolds to comprehension, the basic, functional, and efficient thresholds will probably be less demanding.

Finally, with respect to test development, we propose that these thresholds be explored in relation to existing reading ability assessments. Intervention for children diagnosed with reading difficulties requires accurate identification of the source of difficulty: Word recognition accuracy or speed, language comprehension skills such as morphosyntactic knowledge or knowledge of connectors, etc. (Catts et al., 2006; Sabatini et al., 2019; Wagner et al., 2020). The determination of these thresholds with standardized assessments will provide critical information to help identify the source of reading comprehension difficulties. To do so, it is important to assess each reading skill while minimizing the influence of the others, as some existing batteries of reading component skills already try to do (e.g., the RISE, Sabatini et al., 2019, the battery employed by Magliano et al., 2022, and Wang et al., 2019, in the identification of the basic word recognition threshold).

Limitations and future directions

In addition to the limitations and future research directions already discussed, we highlight the most pertinent here. First, we used a restricted range of comprehension measures to illustrate how thresholds can be identified, but the exact value of these thresholds may depend on the materials, the way of reading (silently vs. aloud), and the comprehension questions employed. Further research is needed to test the generalizability and range of the critical threshold values reported here. Such work must include robust and parallel measures of listening comprehension and measures of reading comprehension after silent reading (the natural way of reading when someone is alone), because beginning readers tend to achieve better comprehension when reading aloud (Cao & Kim, 2021), which was the option chosen in the current study. Second, some poor word readers obtained good levels of reading comprehension; whether this arose due to passage-independent questions (Keenan & Betjemann, 2006) or superior (and compensatory) background knowledge or top-level skills should be tested in future work. Finally, some participants reached the efficient, but not the functional, threshold, suggesting they could profit from the semantic context more strongly than would be expected from their decontextualized reading skill. Knowing the features of these readers may have theoretical and practical value. Despite these limitations, this study provides proof-of-concept to illustrate the utility of the proposed thresholds and methods for their identification.

Conclusions

This study makes important contributions to the understanding of early reading development. Our findings confirm the existence of a basic word recognition threshold in a new language (Spanish), and the presence and utility of two new thresholds (the oral-written matching thresholds). These different thresholds establish a sequence of well-ordered challenges in word recognition skills that shape the initial stages of reading comprehension development. Achieving the basic word recognition threshold defined in this study (a good level of phonological recoding) allows readers to exploit their listening comprehension capacity, while reaching the functional threshold (an appropriate speed of reading) and, especially, the efficient word recognition threshold (an optimum speed of reading) makes reading simple texts as accessible as listening to them. This last threshold may boost engagement with reading, fostering the development of language and, later on, the more sophisticated skills needed to understand and learn from complex texts and multiple-text task-oriented reading. In contrast, falling below these thresholds could slow down the development of reading competence and its associated skills. Of course, these speculations need to be backed up by future research.