Introduction

We investigate the relationship between two types of developmental difficulties related to language: low oral language skills, the primary symptom of developmental language disorder (DLD), and low reading skills, the primary symptom of developmental dyslexia. Low oral language and low reading skills are distinct but often co-occurring phenomena (e.g. Baird et al., 2011; Bishop et al., 2009; Catts et al., 2005; Fraser et al., 2010; Kelso et al., 2007; McArthur & Castles, 2013; Ramus et al., 2013). Estimates of comorbidity vary between studies: comorbid low reading in children with low oral language ranges from 17% (Catts et al., 2005) to 79% (McArthur & Castles, 2013), while comorbid low oral language in children with low reading skills ranges from 15% (Catts et al., 2005) to 73% (Eisenmajer et al., 2005). These differences in reported comorbidity rates probably result from differences in sampling methods, time of diagnosis, and specific inclusion criteria (Adlof & Hogan, 2018). The level of comorbidity might also depend on language characteristics (e.g., orthographic transparency): the few studies that included non-English-speaking samples reported somewhat lower levels of comorbidity, e.g., 19–59% for Dutch (De Groot et al., 2015), 38–46% for Russian (Rakhlin et al., 2013), and 41–47% for Greek (Spanoudis et al., 2019). In any case, the comorbidity of language and reading difficulties is above the statistical chance level, suggesting that these two types of difficulties may, at least in part, have the same underlying cause.

The list of potential specific underlying deficits contributing to low oral language and low reading skills is long (Ramus et al., 2003; Vellutino & Fletcher, 2008), and the research focused on identifying a single cognitive deficit that clearly explains language and/or reading problems has been inconclusive. An alternative—a probabilistic rather than a deterministic approach—comes from the Multiple Deficit Model—MDM (Pennington, 2006). According to this model, specific disorders result from the atypical development of various cognitive functions, which in turn, are affected by multiple risk and protective factors. None of these factors is necessary or sufficient, and some are shared by different disorders. Thus, comorbidity results from shared cognitive risk factors (Pennington, 2006).

Research on dyslexia within the MDM framework confirms that the picture is indeed quite complex. Although most English-speaking individuals with dyslexia fit within the MDM with various constellations of multiple cognitive deficits, approximately 25% have only a single deficit (usually a phonemic awareness deficit). On the other hand, a group of individuals with dyslexia have no deficit in phonemic awareness (Carroll et al., 2016; McGrath et al., 2020; Pennington et al., 2012). In addition, the use of less strictly defined cut-off points increases the number of cases with multiple deficits (based on Polish data—Dębska et al., 2022).

Polish is a Slavic language with a complex system of inflectional morphology. In terms of orthography, Polish uses an alphabetic writing system based on the Latin alphabet. The Polish alphabet consists of 32 letters with some additional diacritical marks. These diacritics are essential for the correct pronunciation and meaning of Polish words. The spelling of Polish words is largely consistent, with generally reliable grapheme-phoneme correspondences.

Some of the most thoroughly researched cognitive-linguistic skills related to low oral language and/or low reading are skills measured by phonological awareness (PA), rapid automatized naming (RAN), and nonword repetition (NWR) tests (e.g., De Groot et al., 2015; Fraser et al., 2010; Ramus et al., 2013; Snowling et al., 2019; Vandewalle et al., 2012). These skills are also important in the Polish context: PA and RAN deficits are the most common deficits out of seven skills measured in Polish school-aged children with dyslexia (Dębska et al., 2022), while NWR tests are used in the clinical diagnosis of both dyslexia (Bogdanowicz et al., 2011) and DLD (Szewczyk et al., 2015).

Phonological awareness

Phonological awareness (PA) is the ability to discriminate, identify, and manipulate phonological segments of speech (Swanson et al., 2003). The PA deficit is a well-researched phenomenon in dyslexic individuals (large overall mean effect size, d = −1.37, compared to age-matched controls in the meta-analysis by Melby-Lervåg et al., 2012). It is also considered the most common deficit within the multifactorial model of dyslexia (Catts et al., 2017; Pennington et al., 2012), and it is stable across the age range studied (5;4–16;10 in Melby-Lervåg et al., 2012). With regard to orthographic transparency, the role of PA in reading is likely to be less pronounced in more transparent orthographies than in the opaque English orthography (Pfost, 2015; Ziegler et al., 2010; though see Melby-Lervåg et al., 2012). The characteristics of particular PA tasks also seem to modify the measured effect sizes. Comparisons with reading-level controls in transparent languages show that simple tasks (matching, blending and segmentation) produce slightly larger overall effect sizes than complex tasks (elision, substitution and spoonerism; meta-analysis by Parrila et al., 2020).

Within DLD-related research, the status of PA deficits is less clear. English-based studies that take comorbidity into account show that in PA tasks, preschool children with low oral language only perform as poorly as those who will go on to have low reading skills only and those who will go on to have comorbid low oral language and low reading skills (Bishop et al., 2009; Catts et al., 2005; Snowling et al., 2019), but their performance changes with time spent in formal instruction. For school-age groups with low oral language only, the results show mixed patterns. Children with low oral language only outperform the comorbid low language and low reading group but not the low reading only group (De Groot et al., 2015; Fraser et al., 2010; Ramus et al., 2013; Snowling et al., 2019), or they reliably outperform both low reading groups (Catts et al., 2005). This inconsistency in findings could be due to the language studied, the detailed characteristics of the stimuli (Farquharson et al., 2014) or the tasks measuring PA. For example, in a study of Dutch children, the low language only group performed worse on the substitution task than on the elision task, probably due to a significant short-term memory load (De Groot et al., 2015).

Rapid automatized naming

Rapid automatized naming (RAN) is the ability measured by a task in which participants name arrays of familiar objects, colours, letters (RANLetters), or digits (RANDigits) as quickly as possible. A RAN deficit manifests as slow naming times. This task involves accessing phonological representations, integrating phonological and visual information, and allocating working memory (Norton & Wolf, 2012). The similarities between the processes involved in RAN and reading have earned RAN the title of “a microcosm of the reading system” (Norton & Wolf, 2012, p. 448). Indeed, the meta-analysis by Araújo and collaborators shows a moderate-to-strong correlation between RAN and reading performance (Araújo et al., 2014).

Given that RAN is so closely related to reading, it is not surprising that at the group level, children with low reading skills show a well-documented deficit in RAN (d = 1.19, compared to age-matched controls, Araújo & Faísca, 2019). A meta-analysis revealed that the deficit is quite universal (Araújo & Faísca, 2019): it is seen in orthographies of different complexity (Araújo & Faísca, 2019; Landerl et al., 2013), and it generalizes across different stimulus types (both alphanumeric and non-alphanumeric). On the other hand, as predicted by MDM, a multiple case study analysis shows that up to 40% of English-speaking children with dyslexia who show any of the cognitive deficits measured in the study have no deficits in RAN (they only present PA and/or oral language deficits, Pennington et al., 2012). In Polish children with dyslexia, RAN deficits were observed in 26% of children (Dębska et al., 2022).

Within the low oral language only group (9 years old, English-speaking) RANDigits results were similar to typically developing controls (Bishop et al., 2009). Pennington and Bishop (2009) suggest that intact RAN is a protective factor against the development of reading difficulties. In addition, some studies suggest that the results of the low oral language-only group appear to be stimulus dependent, unlike the low reading only group where performance is consistently poor across stimulus types. The results in RANLetters do not help to distinguish between children with low reading skills only and children with low oral language only, whereas in RANDigits, the performance of the low oral language only group is much better, reaching low average levels. The comorbid group score is the lowest (De Groot et al., 2015).

Nonword repetition

Nonword repetition (NWR) is a task in which a participant has to repeat single pronounceable pseudowords. The list of items usually includes pseudowords of 2–5 syllables. Modifying the characteristics of the items appears to change which abilities the task measures. The level of items’ wordlikeness (Archibald & Gathercole, 2006), phonotactic frequency (Munson et al., 2005), phonological complexity (Marshall & Van Der Lely, 2009), and prosodic features (Gallon et al., 2007) influence the overall results and probably the skills needed to complete the task. In older children, NWR is mostly a measure of phonological short-term memory, although it is also affected by long-term knowledge, especially in younger children (Rispens & Baker, 2012). Lexical knowledge is better measured by more word-like items (Archibald & Gathercole, 2006).

The deficit in NWR is described in meta-analyses for English-speaking children with dyslexia (Melby-Lervåg & Lervåg, 2012, d = 1.12) and with DLD (Graf Estes et al., 2007, d = 1.27). Both meta-analyses highlight large variability of the results, which could be caused by ignoring the comorbidity effect: an NWR deficit is no longer seen in the low reading group if children with low reading skills and control samples are matched on nonverbal IQ and oral language skills (Cowan et al., 2017). This suggests that the NWR deficit is characteristic of children with low language only or with comorbid low language and low reading, rather than of children with low reading only.

Children with comorbid low language and low reading skills suffer from a cumulative effect, showing the poorest outcomes compared to the low language only or low reading only groups (Bishop et al., 2009; Catts et al., 2005; Cowan et al., 2017; Ramus et al., 2013; Rispens & Parigger, 2010; Snowling et al., 2019). However, certain task characteristics modify this pattern. The difference between children with comorbid low language and low reading skills and those with low reading skills only becomes significant when the task includes consonant clusters, but not for the task that consists mostly of changes in consonants and vowels—CVCVC structure (Cowan et al., 2017). Moreover, the differences are only observed for longer pseudowords (3–4 syllables but not 1–2 syllables, Catts et al., 2005). The results also seem to be language dependent: a meta-analysis (Melby-Lervåg & Lervåg, 2012) shows that the deficit in NWR is less pronounced in more transparent languages (effect size: d = −0.56). In addition, a high level of wordlikeness should increase the difference between children with low language skills and children with low reading skills or typically developing groups. This is because only children with good lexical knowledge could benefit from high wordlikeness of the item.

The current analysis

Our analysis aimed to describe the relationship between oral language and reading difficulties using three different methods. We used norming data to determine the prevalence rates of comorbid low language and low reading skills in Polish, i.e. the prevalence of low reading skills in children with low language skills among monolingual Polish speakers, and the prevalence of low language skills in children with low reading skills. To the best of our knowledge, there is no published empirical work on the comorbidity of low language and low reading in Polish. We predicted that comorbidity rates should be similar to those in other relatively transparent languages, such as Russian (38–46%; Rakhlin et al., 2013) or Greek (41–47%; Spanoudis et al., 2019).

We also looked at the cognitive-linguistic profiles associated with low language and/or low reading using carefully matched group comparisons. We expected that this analysis would help to distinguish between groups with only low language, only low reading, and comorbid low language and low reading. We hypothesised that typically developing children would have the highest scores on the PA, RAN and NWR tests, whereas children with comorbid low language and low reading skills would have low or the lowest scores on all these measures. In addition, the low language only group would show deficits on the PA subtests that require substantial short-term memory load (i.e., elision but not blending) and on the NWR, more so on items with high wordlikeness level. The low reading only group would show deficits mainly on simple PA subtests and RAN tests, and less so on NWR tests. Furthermore, a multiple case study analysis would reveal multiple deficits in the low language, low reading and comorbid low language and low reading groups, more so for less strict deficit cut-offs.

Method

Participants

Sampling procedure

The current analysis is a secondary analysis based on data collected by the Educational Research Institute in Poland in 2014–15 within a norming study of two comprehensive tests for Polish-speaking monolingual children assessing oral language and early reading skills (Krasowicz-Kupis et al., 2015a, 2015b; Smoczyńska et al., 2015).

In total, 4771 children (50.1% girls, 49.9% boys) aged 4;0 to 8;11 participated in the norming study of the two tests. We applied the following inclusion criteria to the original database (the number of participants left in the database after applying the additional criterion is given in parentheses): complete results available for both the reading and language tests (n = 3706), 6;6 to 8;5 years old (on the day of registration to the study) and attending first grade (summer semester) or second grade (winter semester, n = 1384), Polish monolinguals (n = 1210), no reported uncorrected vision or hearing problems and no neurological disorders (n = 975), no other special educational needs recognized by special education services, including autism spectrum disorder and disorders of intellectual development (n = 963), and IQ > 70 (n = 962; 51.5% girls, 48.5% boys).

Group assignment

Children were assigned to one of four groups based on their scores in the Language and Reading tests, according to the unisex sten-scale (M = 5.5, SD = 2.0). The grouping criteria were established empirically to resemble epidemiological data on dyslexia and DLD. Low reading skills were recognized if a child scored low (≤ 3 sten, corresponding to ≤ 16 percentile) in at least two out of four used reading sub-tests. Low language skills were identified if a child scored low (≤ 3 sten) in at least two out of six language sub-tests. Children who met both the criteria for low reading skills and low language skills were classified as the comorbid low language and low reading group. Participants who did not meet any of the above criteria were considered typically developing.
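
The assignment rule described above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not the code used in the study; the function names (`is_low`, `assign_group`) and the example sten scores are hypothetical.

```python
LOW_STEN = 3          # a sub-test score counts as low at <= 3 sten
MIN_LOW_SUBTESTS = 2  # at least two low sub-tests define a low skill


def is_low(stens):
    """Return True if at least two sub-test sten scores are <= 3."""
    return sum(s <= LOW_STEN for s in stens) >= MIN_LOW_SUBTESTS


def assign_group(reading_stens, language_stens):
    """Assign a child to one of the four groups described in the text."""
    low_reading = is_low(reading_stens)    # four reading sub-tests
    low_language = is_low(language_stens)  # six language sub-tests
    if low_reading and low_language:
        return "low language and low reading"
    if low_reading:
        return "low reading only"
    if low_language:
        return "low language only"
    return "typically developing"


# Hypothetical child: two low reading sub-tests, one low language sub-test.
print(assign_group([3, 2, 5, 6], [3, 5, 6, 7, 5, 4]))
```

Under this rule, a single low sub-test is not enough to place a child in a low-skill group, which is why the hypothetical child above counts as low in reading but not in language.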

Matching groups

The four groups (low language, low reading, low language and low reading, typically developing) differed significantly on controlled variables such as parental education and nonverbal IQ (Table 1). To resolve this problem, for the purpose of further comparisons of the cognitive-linguistic profiles, we used a pairwise participant matching algorithm to match the groups based on age, gender, nonverbal IQ and parental education (n = 38 for each group—equal to the size of the smallest group; see Table 2).

Table 1 Sociodemographic characteristics and overall language and reading skills in the full sample
Table 2 Sociodemographic characteristics and overall language and reading skills in matched groups

Measures

The tasks used in the current analysis are listed below. A detailed description of all tests is available in the Supplementary material 1 (Table S1).

Reading skills were assessed with four sub-tests of ‘Bateria Testów Czytania BTCZ IBE’ [Battery of Reading Tests] (Krasowicz-Kupis et al., 2015a, 2015b): Letter naming (maximum score: max = 32), Timed word reading (for 60 s, max = 56), Pseudoword reading (untimed, max = 24), and Timed pseudoword reading (for 60 s, max = 56).

Language skills were assessed with six sub-tests of ‘TRJ: Test Rozwoju Językowego’ [Test of Language Development] (Smoczyńska et al., 2015): Vocabulary—comprehension (max = 28), Vocabulary—production (max = 25), Sentence repetition (max = 34), Grammar—comprehension (max = 32), Grammar—production (max = 14), and Discourse—comprehension (max = 20).

Phonological awareness was assessed with twelve sub-tests of ‘Bateria Testów Fonologicznych’ [Battery of Phonological Tests] (Krasowicz-Kupis et al., 2015a, 2015b) including Phoneme discrimination, Alliteration, Blending (syllables and phonemes), Segmenting (into syllables or phonemes within words or pseudowords), and Elision (syllables and phonemes).

RAN skills were assessed with digit-based and with letter-based RAN tasks (Fecenec et al., 2013), hence variables: RANDigits and RANLetters.

Two separate nonword repetition (NWR) tests with different levels of wordlikeness were used: ‘Zetotest II Krasowicz-Kupis’, max = 40 (Bogdanowicz et al., 2011)—created as a measure of phonological short-term memory, without considering the level of wordlikeness (hence: NWRLow wordlikeness), and ‘Test Powtarzania Pseudosłów’ [Pseudoword Repetition Test], max = 27 (Szewczyk et al., 2015)—created mainly as a measure of sublexical knowledge, containing items of high wordlikeness (hence variable: NWRHigh wordlikeness).

Nonverbal IQ was assessed with the individually administered version of the CFT 1-R (Koć-Januchta et al., 2013).

Parental education level was assessed via a sociodemographic questionnaire on an 8-point ordinal scale. For 7 participants the data come from fathers, otherwise from mothers.

Procedure

Data collection was carried out by trained psychologists in quiet locations at children’s schools. There were four sessions (45–50 min each) separated by a maximum of one week: session I was devoted to a language test, session II: reading test, some of PA and RAN sub-tests, session III: reading, PA, NWR, and writing tests (not included in this paper), and session IV: IQ test, and PA sub-tests.

Data collection was conducted in accordance with the ethical standards of the Educational Research Institute at the time of data collection (2012–2014). The study was conducted in accordance with the 1964 Declaration of Helsinki and its subsequent amendments. The data analyses presented in this article were approved by the Research Ethics Committee of the Faculty of Psychology at the University of Warsaw.

Data analysis

We present the findings on the between-group differences in the matched groups analysed with a two-way ANOVA (Language × Reading). To further explain the results, we also present post-hoc between-group differences; all p-values are presented with a Holm‒Bonferroni correction for multiple comparisons. All measured variables are converted into Z-scores, i.e., M = 0, SD = 1, relative to the results of the typically developing group, to make the results of different sub-tests comparable on a single scale. Z-scores are calculated based on typically developing group normed sten scores (M = 5.5, SD = 2.0).
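
As a rough illustration of the two transformations mentioned above, the sketch below standardises scores against a reference group and applies the Holm‒Bonferroni step-down adjustment. The analyses in the paper were run in SPSS; the function names here are hypothetical, and the use of the sample standard deviation (n − 1) for the reference group is an assumption, since the text does not state it.

```python
from statistics import mean, stdev


def z_scores(values, reference):
    """Standardise values against a reference group (M = 0, SD = 1 in that group).

    Assumption: the sample standard deviation (n - 1) of the reference group is used.
    """
    m, sd = mean(reference), stdev(reference)
    return [(v - m) / sd for v in values]


def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
print(holm_bonferroni([0.01, 0.04, 0.03]))
```

Standardising against the typically developing group means that a Z-score of −1 on any sub-test can be read as one standard deviation below typically developing peers, regardless of the sub-test's raw scale.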

First, we confirm our initial group selection by analysing the results of the Language and Reading tests that were used to create the groups. Second, we present data on cognitive-linguistic skills (PA, RAN, NWR). For the twelve sub-tests measuring PA skills, Principal Component Analysis (Supplementary material 2, Table S2) allowed us to create three theoretically driven standardized factors.

Between-group analyses were accompanied by the analysis of a multiple case study: the deficit distribution within the groups was counted for both strict (−1.65 SD) and more liberal (−1 SD) cut-off points established for each variable under scrutiny. All analyses were performed using SPSS 28 (IBM Corporation, Armonk, New York).

The complete dataset used for the current analysis is available from an Open Science Framework archive: https://osf.io/6348t/?view_only=0d4a33d98aaa457c801bc80d306a9f6a.

Results

Comorbidity rates

As expected, low oral language and low reading skills co-occurred in the sample (Table 3). A total of 12.9% of the children were classified as having low oral language skills, and 16.4% were classified as having low reading skills. Among children with low reading skills, 24.1% also presented low language skills, while among children with low language skills, 30.6% also presented low reading skills. Approximately equal numbers of girls and boys were classified as typically developing and with comorbid low language and low reading skills. There were more girls than boys in the low language only group (1.21:1). The gender ratio for the low reading only group was 1.31:1 (boys to girls). None of the gender differences were significantly different from the 1:1 ratio: low oral language only, χ2 (1, N = 86) = 0.744, p = 0.388, low reading only, χ2 (1, N = 120) = 2.13, p = 0.144.
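
The gender-ratio tests can be illustrated with a one-degree-of-freedom goodness-of-fit test against a 1:1 split. The per-gender counts below (47 girls and 39 boys; 52 girls and 68 boys) are not stated in the text; they are back-calculated from the reported group sizes and ratios and should be treated as assumptions. The function name is hypothetical.

```python
from math import erfc, sqrt


def chi2_equal_split(count_a, count_b):
    """Goodness-of-fit chi-square (df = 1) of two counts against a 1:1 ratio."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For df = 1 the chi-square upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Assumed counts, inferred from N = 86 and a girls-to-boys ratio of 1.21:1.
print(chi2_equal_split(47, 39))
# Assumed counts, inferred from N = 120 and a boys-to-girls ratio of 1.31:1.
print(chi2_equal_split(68, 52))
```

With these assumed counts the sketch gives values close to the reported statistics, which suggests, but does not confirm, that a test of this kind was used.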

Table 3 Overlap between children meeting criteria for low language and/or low reading skills, with gender ratios

If low language and low reading skills co-occurred only by chance, the probability of such an event would equal 2.12%. In fact, these difficulties co-occurred significantly more often: 3.95%, χ2 (1, N = 962) = 15.6, p < 0.001.
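
The chance level above is the product of the two marginal prevalence rates. The sketch below works through that arithmetic; the counts (124, 158 and 38 children) are assumptions back-calculated from the reported percentages and N = 962. The text does not specify which chi-square variant was used, so the one-degree-of-freedom goodness-of-fit test shown here (comorbid vs. not comorbid, against the chance proportion) is also an assumption, and the function names are hypothetical.

```python
from math import erfc, sqrt

N = 962
LOW_LANGUAGE = 124  # assumed count, approx. 12.9% of 962
LOW_READING = 158   # assumed count, approx. 16.4% of 962
COMORBID = 38       # assumed count, approx. 3.95% of 962


def chance_comorbidity(n, n_a, n_b):
    """Expected proportion with both difficulties if they were independent."""
    return (n_a / n) * (n_b / n)


def chi2_observed_vs_chance(n, n_a, n_b, n_both):
    """Goodness-of-fit chi-square (df = 1): observed vs. chance co-occurrence."""
    expected_both = chance_comorbidity(n, n_a, n_b) * n
    expected_rest = n - expected_both
    observed_rest = n - n_both
    chi2 = ((n_both - expected_both) ** 2 / expected_both
            + (observed_rest - expected_rest) ** 2 / expected_rest)
    # For df = 1 the chi-square upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


chance = chance_comorbidity(N, LOW_LANGUAGE, LOW_READING)
chi2, p = chi2_observed_vs_chance(N, LOW_LANGUAGE, LOW_READING, COMORBID)
print(f"chance: {chance:.2%}, observed: {COMORBID / N:.2%}, chi2 = {chi2:.1f}")
```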

Language and reading skills

Language

The main effect of Language was present for all six sub-tests (large effect sizes, see Supplementary material 3, Table S3). Additionally, the main effect of Reading was present in the Sentence repetition task (small effect size), but no interaction was observed (see Fig. 1). As expected based on group selection, children with low reading skills did not differ significantly from the typically developing group in any of the sub-tests. In most of them, children with low oral language had as poor results as the comorbid low language and low reading group (ranging from −1.17 to −2.22 Z-Scores relative to the typically developing group). Only the Sentence repetition results showed that the comorbid low oral language and low reading group showed significantly lower performance than the low language only group. The two-way ANOVA implies that the Sentence repetition result comes from the presence of two additive main effects: Language and Reading skills.

Fig. 1 Mean Z-Scores in Language sub-tests (low language, low reading, and low oral language and low reading groups relative to typically developing group). Note. Between-group differences as Z-Scores’ results of the Language sub-tests (Vocabulary—comprehension, Sentence repetition, Vocabulary—production, Grammar—comprehension, Grammar—production, and Discourse comprehension). Error bars indicate the 95% confidence intervals. All p-values are Holm‒Bonferroni adjusted. Typically developing group: M = 0, SD = 1. Bracket means a significant difference (p < 0.05)

Reading

The main effect of Reading was present in all four sub-tests’ results (large effect sizes, Table S4 and Fig. 2). Additionally, the main effect of Language and the interactive effect was observed in Timed Word Reading (small effect sizes). In all four sub-tests, the low reading only group had as low results as children with low oral language and low reading skills. In Timed Word Reading, children with low language only had worse results than typically developing children, but they did as well as the typically developing group in all other Reading sub-tests: Letter naming, Pseudoword reading, and Timed pseudoword reading. The result of Timed Word Reading is further explained within two-way ANOVA by the main effects of language and reading skills but also their interaction: low reading skills determine low results regardless of the level of language skills, while typical reading skills are associated with lower results only when accompanied by low language skills.

Fig. 2 Mean Z-Scores in Reading sub-tests (low language only, low reading, and low oral language and low reading groups relative to typically developing group). Note. Between-group differences as Z-Scores’ results for four Reading sub-tests. Error bars indicate the 95% confidence intervals. All p-values are Holm‒Bonferroni adjusted. Typically developing group: M = 0, SD = 1. Bracket means a significant difference (p < 0.05)

Cognitive-linguistic profile

Phonological awareness

Main effects of Language and Reading were present only in Phoneme discrimination (small to moderate effect sizes), Alliteration—pseudowords (small effect size), Rime—fluency (small effect size), Syllable blending—pseudowords (small effect size), Blending phonemes (moderate effect size), and Elision (moderate to large effect size; see Supplementary material 3, Table S3 and Fig. 3). In all sub-tests, children with comorbid low oral language and low reading skills had the lowest mean results, whereas typically developing children mostly had the highest mean results (except for Rime—fluency). Only the Phoneme discrimination, Blending phonemes, and Elision sub-tests showed differences between groups. In the Phoneme discrimination sub-test, children with comorbid low oral language and low reading gained significantly worse results than all other groups that did not differ from each other. This result is further explained within two-way ANOVA by main effects of Language and Reading skills and their interaction: typical reading skills are related to typical test scores irrespective of the level of language skills, while low reading skills are associated with a reduced sub-test score only when accompanied by low language skills. In the blending sub-tests (Syllable blending—pseudowords and Blending phonemes factor), the comorbid low oral language and low reading group had significantly lower results than the typically developing group and lower results than the low oral language only and low reading only groups (although in the Syllable blending—pseudowords sub-test the difference did not reach significance level, p = 0.131). Two-way ANOVA related these results to the main effect of reading skills. 
The Elision factor yielded the strongest effect size, i.e., the typically developing group gained significantly higher results than all other groups, and the comorbid low oral language and low reading group showed significantly lower performance than the low oral language only and low reading only groups. Within the two-way ANOVA, these results were related to the additive main effects of language and reading skills.

Fig. 3 Mean Z-Scores on PA sub-tests (low language only, low reading, and low oral language and low reading groups relative to the typically developing group). Note. Between-group differences as Z-Scores’ results for PA sub-tests (Phoneme discrimination, Alliteration—pseudowords, Alliteration—fluency, Rime—fluency, Syllable blending—pseudowords, Blending phonemes factor, Segmenting syllables factor, Elision factor). Error bars indicate the 95% confidence intervals. All p-values are Holm‒Bonferroni adjusted. Typically developing group: M = 0, SD = 1. Bracket means a significant difference (p < 0.05)

Rapid automatized naming

For RAN, the main effect of Reading was observed (Supplementary material 3, Table S3 and Fig. 4). Pairwise comparisons revealed that in both the RANDigits and RANLetters sub-tests, the low oral language only group performed as well as the typically developing controls, whereas the results of the low reading only group were significantly lower and did not differ from those of the comorbid low oral language and low reading group.

Fig. 4 Mean Z-Scores in RAN sub-tests (low language only, low reading, and low oral language and low reading groups relative to typically developing group). Note. Between-group differences as Z-Scores’ results for the RAN sub-tests. Error bars indicate the 95% confidence intervals. All p-values are Holm‒Bonferroni adjusted. Typically developing group: M = 0, SD = 1. Bracket means a significant difference (p < 0.05)

Nonword repetition

The main effect of Language was observed for both NWR sub-tests (Supplementary material 3, Table S3 and Fig. 5). Both the NWRLow wordlikeness and NWRHigh wordlikeness tests produced significant group differences. In both tests, the low reading only group had results equal to those of the typically developing group. In NWRLow wordlikeness, the low reading only group outperformed the low language only group, whereas in the NWRHigh wordlikeness test, the same trend did not reach statistical significance. The low language only group and the comorbid low language and low reading group had the lowest mean results.

Fig. 5 Mean Z-Scores in NWR sub-tests (low language only, low reading, and low oral language and low reading groups relative to typically developing group). Note. Between-group differences as Z-Scores’ results for NWR sub-tests. Error bars indicate the 95% confidence intervals. All p-values are Holm‒Bonferroni adjusted. Typically developing group: M = 0, SD = 1. Bracket means a significant difference (p < 0.05)

Multiple case study

We grouped individual cases based on the cognitive-linguistic profiles observed (PA, RAN, NWR; cut-offs set either at −1.65 SD or −1 SD for each sub-test under scrutiny). We chose these two cut-off points because previous studies, including those on the Polish language (e.g., Dębska et al., 2022; Reid et al., 2007), used identical cut-off points to identify children and adults with dyslexia who manifest different cognitive deficits, and using the same cut-off points may increase the comparability of study results. A participant was considered to have a deficit in a particular skill if the result of at least one sub-test measuring that skill fell below the cut-off point (see Table 4). For both thresholds, the typically developing group showed only a single deficit (mainly PA) or no deficits. The low oral language only group also mostly had no deficit or a single deficit in PA at the −1.65 SD threshold, but at the more liberal −1 SD threshold, they exhibited mild multiple deficits. Most of the children in the low reading only group fell within the single deficit category for the −1.65 SD cut-off point (mainly RAN), but they fell within the multiple deficit category based on PA + RAN deficits for the −1 SD cut-off. As many as 60.5% of the comorbid low oral language and low reading group exhibited multiple deficits at the strict −1.65 SD cut-off point, and 89.5% exhibited multiple deficits at the more liberal threshold (mainly PA + RAN + NWR). The proportion of children with multiple deficits in the comorbid low language and low reading group is significantly higher than in both the low language only group, χ2 (2, N = 76) = 17.2, p < 0.001, and the low reading only group, χ2 (2, N = 76) = 9.8, p = 0.007.
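
The case-level rule described above (a skill counts as deficient if at least one of its sub-tests falls below the cut-off) can be sketched as follows. The function names and the example child are hypothetical, and the sketch assumes that all Z-scores are oriented so that lower values mean poorer performance; for RAN naming times this would require reversing the sign before classification.

```python
STRICT_CUTOFF = -1.65   # strict threshold, in SD units
LIBERAL_CUTOFF = -1.0   # more liberal threshold, in SD units


def skill_deficits(z_by_skill, cutoff):
    """Return the skills with at least one sub-test Z-score below the cut-off."""
    return sorted(
        skill for skill, zs in z_by_skill.items() if any(z < cutoff for z in zs)
    )


def deficit_category(z_by_skill, cutoff):
    """Classify a participant as having no, a single, or multiple deficits."""
    n_deficits = len(skill_deficits(z_by_skill, cutoff))
    if n_deficits == 0:
        return "no deficit"
    if n_deficits == 1:
        return "single deficit"
    return "multiple deficits"


# Hypothetical child; Z-scores are assumed to be oriented so that lower = poorer.
child = {"PA": [-1.2, -0.3, -1.7], "RAN": [-1.1, -0.8], "NWR": [-0.4, -0.9]}
print(deficit_category(child, STRICT_CUTOFF), skill_deficits(child, STRICT_CUTOFF))
print(deficit_category(child, LIBERAL_CUTOFF), skill_deficits(child, LIBERAL_CUTOFF))
```

The same hypothetical child moves from the single deficit category to the multiple deficit category when the threshold is relaxed, which mirrors the shift reported for the group-level distributions.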

Table 4 Distribution of deficits in cognitive-linguistic skills (% of participants) within groups (−1.65 SD and −1 SD thresholds established for each task)

Discussion

This analysis aimed to describe the prevalence and characteristics of low language and low reading skills comorbidity in Polish. We compared four groups of monolingual Polish-speaking children sampled from a population-based study: typically developing, with low language skills, low reading skills, and comorbid low language and low reading skills assessed by comprehensive normed tools. We established how often each difficulty occurred in the sample, and described participants’ cognitive-linguistic skills assessed with another set of normed tools.

First, we found that the low oral language and low reading comorbidity rate in the whole sample (i.e., 3.95%) exceeded the chance level: 24.1% of children with low reading skills also had low oral language skills, and 30.6% of children with low oral language skills also had low reading skills. These numbers seem to be lower than in most studies conducted in English (from 30 to 79% low reading skills among children with low language skills; from 28 to 73% low oral language skills among children with low reading skills) and more similar to those reported for other transparent languages such as Russian (38 and 46%, respectively, Rakhlin et al., 2013) and Greek (41 and 47%, respectively, Spanoudis et al., 2019). A meta-analysis would help to establish whether there is truly a difference in the proportion of comorbid difficulties between languages with transparent vs opaque orthographies, but we suggest that the lower rate of low oral language and low reading comorbidity in transparent languages might be because reading in more transparent languages relies to a lesser extent on PA skills (Ziegler et al., 2010) as compared to English. The multiple case study method confirms that even at the liberal cut-off point (−1 SD), more than 20% of the low oral language only and low reading only groups in our sample did not have a PA deficit. Perhaps low PA skills would then be a common deficit in children with low oral language and low reading skills in English and less so in more transparent languages. With regard to RAN, we found that a RAN deficit is highly specific to children with low reading skills, which is consistent with previous research (e.g. Araújo & Faísca, 2019; Landerl et al., 2013; but see Ziegler et al., 2010). This may suggest that RAN serves as a protective factor against reading difficulties (Pennington & Bishop, 2009).
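To make the comparison with the chance level concrete, the following sketch back-calculates the approximate prevalence of each difficulty from the percentages quoted above and derives the comorbidity rate expected if the two difficulties were statistically independent. This is an illustration only: the prevalences are inferred from rounded figures rather than taken from the reported results, the function name is hypothetical, and the procedure the authors actually used to test against chance is not described in this section.

```python
def chance_comorbidity(joint: float, lang_given_read: float, read_given_lang: float):
    """Return (P(low reading), P(low language), comorbidity expected under independence)."""
    p_read = joint / lang_given_read   # P(read) = P(both) / P(lang | read)
    p_lang = joint / read_given_lang   # P(lang) = P(both) / P(read | lang)
    return p_read, p_lang, p_read * p_lang


observed = 0.0395  # comorbidity rate in the whole sample
p_read, p_lang, expected = chance_comorbidity(observed, 0.241, 0.306)
ratio = observed / expected

print(f"low reading prevalence  ~ {p_read:.3f}")
print(f"low language prevalence ~ {p_lang:.3f}")
print(f"expected by chance      ~ {expected:.4f}")
print(f"observed / expected     ~ {ratio:.2f}")
```

Under these assumptions, the rate expected by chance is roughly 2.1%, so the observed 3.95% is a little under twice the rate that independence would predict.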

Another interesting result is the increased prevalence of girls in the low language only group in our sample. This finding is contrary to early estimates of the gender-related prevalence of language difficulties obtained for the American English population (Tomblin et al., 1997), although this is not the first population-based study showing that boys do not outnumber girls among children with low language skills (Law et al., 2009, 2013; Wu et al., 2023). The gender ratio for low reading is in agreement with most other studies: more boys than girls show difficulties (Yang et al., 2022).

Second, the comorbid low oral language and low reading group differed from both the low oral language only and low reading only groups in terms of language and reading skills, as well as the cognitive-linguistic profile. As expected based on the group assignment, the comorbid low oral language and low reading group could be distinguished from the low language only group based on low reading results. This between-group difference is smaller for word reading than for pseudoword reading, probably because pseudoword reading is a better measure of decoding skills, whereas word reading also relies on lexical knowledge. The next difference between the low oral language only group and the comorbid low oral language and low reading group is in expressive grammar as measured by the Sentence repetition sub-test (part of the Language test). This is partly in line with a study by Moll et al. (2015), which indicated that difficulties in Sentence repetition are characteristic of language difficulties. On the other hand, our analysis also shows the additive effect of comorbid low language and low reading skills: the results of the low language only group are higher than the results of children with both language and reading difficulties.

Significant differences in cognitive-linguistic profiles include the low language only group’s higher results in tests that are usually found to be more difficult for children with low reading skills: PA and RAN tests (Dębska et al., 2022), although this effect does not hold for every participant of the study: as many as 13.2% of the low language only group had double deficits in PA and RAN. Interestingly, the discrepancy between the comorbid low oral language and low reading group and the low oral language only group decreases with PA task difficulty: the difference in Z-scores in Phoneme discrimination and Phoneme blending is larger than that in the Elision sub-tests. This is probably because relatively simple tasks involve less short-term memory load, and as a result, they are easy for the low oral language only group (De Groot et al., 2015). The small advantage of the low oral language only group over the comorbid low oral language and low reading group in the Elision sub-tests seems to be in contrast to other studies (De Groot et al., 2015; Fraser et al., 2010; Ramus et al., 2013; Snowling et al., 2019), but this probably results from the young age of our sample and only one year of formal literacy instruction. In an English-based study, the advantage of the low oral language group in the Elision task was not visible before the age of 8 (Snowling et al., 2019). Finally, we found no discrepancy between RAN Digits and RAN Letters in the low oral language only group (contrary to the Dutch study: De Groot et al., 2015), as the results in both sub-tests were close to the results of the typically developing group (in line with another study of Dutch: Vandewalle et al., 2012), even though on the individual level, as many as 26.4% of the low language only group had RAN deficits.

Another set of differences, those between the comorbid low oral language and low reading group and the low reading only group, seems to be particularly interesting for practitioners because language difficulties are more difficult to observe from the classroom perspective than reading difficulties (Adlof et al., 2017). Differences between the comorbid low oral language and low reading group and the low reading only group include Language (all sub-tests; this difference is expected due to the group assignment) and NWR tests. A multiple case study revealed that as few as 5% of the low reading only group had a deficit in the NWR test. Given that the NWR Low wordlikeness test in our analysis produced a larger effect size than NWR High wordlikeness, we assume that the NWR tests were better measures of phonological short-term memory than of lexical knowledge in this sample. It is worth noting that our NWR Low wordlikeness measure was created without considering wordlikeness. Its effect size could potentially be even higher if the test only contained items of low wordlikeness. This result corroborates some earlier findings for English-speaking children (Cowan et al., 2017; Ramus et al., 2013; Snowling et al., 2019), indicating that the phonological short-term memory deficit measured by the NWR test is more characteristic of DLD than of dyslexia. Interestingly, the results of the Phoneme discrimination sub-test also allow us to distinguish between the low reading only and comorbid low language and low reading groups (as well as between the low oral language only and comorbid low oral language and low reading groups). This simple task was easy for all groups except the comorbid low oral language and low reading group.

Third, as predicted by MDM, the multiple case study method revealed that children in both low reading groups mostly had multiple deficits, but this was observed for the −1 SD threshold rather than the stricter −1.65 SD threshold. This is in line with earlier Polish research (Dębska et al., 2022) and suggests that using a more liberal −1 SD cut-off might be more useful than stricter cut-offs within MDM research. For the low oral language only group, approximately 80% of children presented with PA deficits, but for 34.2%, it was the only deficit seen. This group warrants further research within MDM. The liberal −1 SD threshold might be especially useful for describing comorbidities: as many as 90% of the comorbid low language and low reading group had multiple deficits, mostly a triple deficit in PA + RAN + NWR. This result corroborates earlier research and might suggest that an accumulation of risk factors leads to more severe outcomes (Evans et al., 2013; Hayiou-Thomas et al., 2021).

In summary, although we did not find any universal patterns of deficits (in line with MDM), certain cognitive-linguistic skills appeared to be less affected in subgroups of children with low language skills only and low reading skills only. Intact RAN may play a protective role against reading difficulties (Pennington & Bishop, 2009), while intact phonological short-term memory, as measured by the NWR task, may be a protective factor against oral language difficulties. Both of them are accompanied by good basic PA skills (Phoneme discrimination and Blending).

The limitations of this analysis include arbitrary criteria for low language/reading labels with no confirmed external diagnosis of DLD and/or dyslexia. Establishing groups based on test results instead of an external psychiatric diagnosis is a frequent study design in the field (e.g., McArthur & Castles, 2013; Snowling et al., 2019). Nevertheless, these might be children with subclinical rather than clinical levels of difficulties. Therefore, the results can only be generalized with caution. Apart from that, our analysis involved a narrow age group in the course of intensive reading skills development, so it is possible that one point in time was not enough to capture the dynamics of development of cognitive-linguistic skills related to reading. To address these limitations, future studies should compare groups of children with clinical rather than subclinical difficulties within a longitudinal design.

Conclusions

Low language and low reading indeed co-occur above chance level in the Polish sample. Mild multiple deficits were observed in both the low oral language and low reading groups, although only children from the comorbid low oral language and low reading group showed all deficits measured in this study (PA, RAN, and NWR). The findings have the potential to inform hypotheses related to deficits characteristic of DLD, dyslexia, and comorbid DLD and dyslexia. This analysis shows that it is important to distinguish and control for language skills in reading research and for reading skills in oral language research. The findings also inform practitioners by showing that it is important to screen children for comorbid low language and low reading skills, and that multiple mild deficits might be a good indicator of such problems. In particular, apart from the Language test, low performance of children with low reading skills in the Phoneme discrimination sub-test and the NWR tests might indicate comorbid low oral language and low reading. Additionally, low performance of children with low oral language skills in the Phoneme discrimination, Blending phonemes, Sentence repetition, and RAN tests might indicate comorbid low language and low reading.