Introduction

Previous longitudinal studies have shown that poor language skills, such as difficulties with expressive grammatical knowledge and receptive vocabulary, are a major risk factor for later reading difficulties (Hjetland et al., 2019; Snowling & Hulme, 2021). In most previous studies, the participants’ mean age was 5.5 years at the first assessment and 8.4 years at the latest reading assessment (Hjetland et al., 2017). This study aims to extend previous research in two ways. First, the timeline is expanded by using data from toddler age (33 months) to 10 years of age. Second, authentic assessment is used to identify children with poor language skills.

Many high-income countries have strengthened aid for early identification because of the need to identify children with poor language skills (Cairney et al., 2021; The Norwegian Directorate for Education and Training, 2023). Although the benefits of early identification of poor language skills and subsequent follow-up treatment are widely recognized, it remains undetermined at what age this focus should be prioritized (Duff et al., 2015; Jin et al., 2020). There seems to be consensus that children’s language development stabilizes between the ages of three and four (Duff et al., 2015), meaning that children will maintain their rank order in their further language development (Bornstein & Putnick, 2012). Furthermore, previous studies have shown that most toddlers with poor language skills attain language scores in the average range by school entry (Dale et al., 2014; Sandvik et al., 2014). At the same time, robust findings indicate that these children continue to have poor language and language-related skills, such as reading (Bleses et al., 2016; Rescorla, 2009). This implies a need for close follow-up of these children.

In ECEC, teachers are required to monitor children’s language development at an earlier stage, as many children start in ECEC when they are one year old (OECD, 2019). It is therefore important to have a tool that can be used by teachers to identify these children so that they can provide a reliable description of children’s early language skills to caregivers and specialists.

There is a known gap between research and practice, and this also applies to language assessments. For children attending ECEC in Norway, it is often the teachers, in cooperation with caregivers, who are responsible for identifying potential developmental delays. In other countries, teachers are responsible for referrals to specialists. In both cases, teachers need to be able to identify these children. While teachers often use authentic assessments to assess children’s language development (Brown & Rolfe, 2005; Nordberg & Jacobsson, 2021; Sandvik et al., 2014), researchers mostly use parent-report or clinical standardized instruments to measure children’s language development and reading outcomes (Armstrong et al., 2017; Dale et al., 2014; Hjetland et al., 2019; Psyridou et al., 2018; Rescorla & Dale, 2013; Snowling & Hulme, 2021). Authentic assessment refers to the systematic observation of children’s functional language skills in daily routines by familiar and knowledgeable teachers (Bagnato & Ho, 2006). To address policy-makers’ focus on the early identification of children at risk for later language disorders and/or reading difficulties, we need knowledge on how authentic assessment can serve as a useful tool for this purpose. The aim of this study is therefore to investigate whether poor language skills measured by teachers using authentic assessment provide a reason for concern about later reading development. Regarding reading development and reading skills, Snowling and Hulme (2021) highlight the need to distinguish between the ability to decode and the ability to read for meaning (reading comprehension), as explained by the simple view of reading: reading = decoding × linguistic comprehension (Gough & Tunmer, 1986). In this study, it was therefore important to use tests that assessed both decoding and language comprehension in school.

Background

There is a large body of evidence on the risk that children with poor language skills will develop developmental language disorders (DLD) and/or reading difficulties (Snowling & Hulme, 2021; Thal et al., 2013). Reading comprehension is more likely to be affected than decoding (Dale et al., 2014). In addition, vocabulary comprehension has been identified as an important predictor of outcomes in the normal range (Thal et al., 2013). Although many toddlers with poor language skills appear to catch up with their peers, they may not achieve skill levels that are characteristic of children with language development within the normal range (Thal et al., 2013). This underlines the importance of adapted provisions in ECEC regardless of whether toddlers will develop later difficulties or not. It is argued that poor language skills in toddlers below the age of 2 ½ years are not sufficient to justify speech–language therapy (Rescorla & Dale, 2013). At the same time, these children should receive increased educational support to reduce the potential risk of later reading difficulties, particularly if poor language skills are combined with other risk factors, such as a family history of language/literacy learning difficulties (Bleses et al., 2016).

Decoding skills rely on phoneme awareness, which is developed at an average age of 4 ½ years. Phoneme awareness relies on the accuracy of articulation, receptive lexical knowledge, and large-segment awareness (Carroll et al., 2003). There is also a strong relationship between early language skills, such as vocabulary knowledge and grammatical skills, and reading comprehension (Hjetland et al., 2017; Hulme et al., 2015). Language assessments in ECEC could therefore benefit from addressing many aspects of language in addition to the underlying language variables.

Norwegian ECEC often uses the observation assessment TRAS (Early Registration of Language Development) (Espenakk et al., 2011) when assessing children’s language development. TRAS is categorized into eight aspects of language skills to provide a broad assessment of a child’s language (Espenakk et al., 2011). ECEC in Norway must document progress in children’s language development (Directorate for Education and Training, 2017), and TRAS can be used for this purpose. The advantages of using authentic assessment conducted by teachers are that teachers have knowledge about both the individual child and multiple children in the ECEC group. This can provide a strong basis for the comparative evaluation of observed skills. By using authentic assessment teachers may gather information about toddlers across various settings in their natural environment (Bagnato & Ho, 2006; Højen et al., 2022) rather than decontextualized settings, as with conventional tests, which may fail to capture real-life skills (Bagnato & Ho, 2006). However, in the case of more persistent language disorders, a norm-referenced assessment is often needed to assess which specific language area children struggle with, and to establish a diagnosis (Sansavini et al., 2021).

In response to the objective of early identification, there is a need to focus on teachers’ early identification of children at risk of DLD and/or reading difficulties in children’s natural learning environment. Therefore, this study aims to examine the extent to which results from authentic language assessment in ECEC conducted by teachers of toddlers with poor language skills can be associated with decoding skills in 2nd and 5th grade and reading comprehension skills in 5th grade. Although a large body of studies has examined the relation between language skills and reading skills, most of these were effect studies designed for research purposes (Hjetland et al., 2019; Rescorla & Dale, 2013), that have addressed only a few language skills (Bleses et al., 2016; Hagen et al., 2017b) and often focused on older children’s language development (Hjetland et al., 2019). Regarding outcomes for toddlers with poor language skills, the samples have been small (Rescorla, 2009), and toddlers’ language skills have been measured with standardized language tests (Bleses et al., 2016) or viewed retrospectively (Reikerås & Dahle, 2022). No known previous studies have used a larger sample to identify toddlers with poor language skills as early as 33 months using authentic language assessments conducted by ECEC teachers as well as follow-up data on both decoding and language comprehension beyond 9 years of age.

Research Questions

  1. To what extent can the results from authentic language assessment in ECEC conducted by teachers of toddlers with poor language skills be associated with children’s decoding skills in 2nd and 5th grade and their reading comprehension skills in 5th grade?

  2. What proportion of toddlers with poor language skills are poor decoders in 2nd and 5th grade and poor readers in 5th grade?

Method

This study is part of the longitudinal Stavanger Project, The Learning Child, which followed over 1000 toddlers (born during the period of July 1, 2005, to December 31, 2007) from the age of two to ten. One of the main objectives of the Stavanger Project was to study the relationship between language and reading (for more information, see Reikerås, Løge and Knivsberg (2012)). The Stavanger Project collected TRAS language data in the ECEC when the children were 33 months old and again when they were 4 years old. Data on reading were collected in school when the children attended 2nd grade and 5th grade. The results from TRAS language at 4 years were not included in the present study due to a ceiling effect, where the majority of the observed children had fully mastered all variables. The test assessing reading comprehension in 2nd grade was excluded due to a lack of data on reliability and validity. We obtained ethical approval from the Norwegian Centre for Research Data (NSD) and parental consent for each child in the study.

Instruments

TRAS (Early Registration of Language Development) is a nonstandardized observation tool developed to observe different aspects of children’s language between the ages of 2 years and 5 years. It is divided into three levels of difficulty. The assessment is based on the observation of children’s use and understanding of language in a natural environment (Espenakk et al., 2011). It can therefore be regarded as an authentic assessment. TRAS was developed for use in ECEC for two reasons: to increase ECEC staff knowledge about children’s language development and to identify children at risk of language disorders (Espenakk et al., 2003). TRAS assesses different aspects of language in three main language areas based on the model of Bloom and Lahey (1978): form (with the partial tests pronunciation, word production and sentence production), content (with the partial tests language comprehension and language awareness) and use (with the partial tests interaction, communication and attention) (Bloom & Lahey, 1978; Espenakk et al., 2011). Of these three, the language area content may be regarded as the most important with regard to reading because it contains the areas of language comprehension and language awareness. Previous research has identified these areas as pathways to reading (Snowling & Hulme, 2021). Therefore, the present study used this division in the data analysis to determine whether the correlation was higher in this area. The validity of TRAS has been reported by the authors (Espenakk et al., 2011) and in other studies (Hansén-Larson et al., 2021; Helland et al., 2017). The reliability (Spearman correlation) has been reported for each level: 0.54 (level 1), 0.40 (level 2) and 0.74 (level 3) (Espenakk et al., 2003).

The word chain test (WCT) (Høien & Tønnesen, 2008) is a screening test that measures pupils’ decoding skills. In a sequence consisting of four words written together (called a word chain), the students must mark with a line where spaces should be. Students have four minutes to mark as many word chains as possible. The test is standardized for pupils aged 8 years to 15 years, in addition to adults (Høien & Tønnesen, 2008). The reliability coefficient from 8 years to 15 years was calculated, with an average value of r = .86 (p < .001), which shows that the test gives reliable results (Høien & Tønnesen, 2008). The average value for validity was calculated against the word decoding test OS-400, with a result of r = .73 (p < .001) (Høien & Tønnesen, 2008).

The national reading test (NRT) is a Norwegian reading test that is nationally administered to provide schools with information on pupils’ reading skills (Directorate for Education and Training, 2022). The NRT is compulsory and is administered in 5th grade. In the NRT, reading comprehension is divided into three reading processes: finding information in text, interpreting and comparing information, and reflecting on and assessing the form and content of texts (Directorate for Education and Training, 2022). The results are given in scale points, which in turn are divided into three achievement levels (Directorate for Education and Training, 2022).

Data Collection

In the Stavanger Project, the ECEC staff attended courses on the use of TRAS before assessing the children. They received training in how to use TRAS, how to interpret the results and how to register the results. During the TRAS observation, each child was observed by two individual ECEC staff members. When the results were coded, 2 points were given for complete mastery, 1 for partial mastery and 0 for skills not yet mastered. The maximum score was 144.

The WCT was conducted in the classroom by the teacher. Students received 1 point for each correct word chain and 0 points for each incorrect word chain. The maximum score was 60 points.

The NRT was administered by teachers in the classroom and had a time frame of 90 min. Depending on the type of test items, students received 1 to 2 points for each correct answer and 0 points for an incorrect answer. The maximum score was 32 points.

Participants

This study used data from toddlers born in 2006 and 2007 due to changes in the National Reading Tests (NRT) between 2015 and 2016 when the Directorate for Education and Training started to link samples between years “…through the use of IRT calibration and scaling methodology, with anchor tasks, scaled scores (scale points), and fixed mastery level limits from year to year” (Directorate for Education and Training, 2022, p. 10). These changes excluded toddlers born in 2005, since their NRT results could not be compared at a scale level to the NRT results of toddlers born in 2006 and 2007. This left 906 toddlers as relevant participants in this study, of whom 147 did not complete TRAS 2.9 and were therefore removed before the preliminary analysis. Due to the complexity of assessing bilingual children suspected of having language disorders (Goldstein, 2006), the focus of this study was toddlers with Norwegian as their first language. Bilingual toddlers (N = 153) were therefore also removed.

An independent t-test was conducted to identify whether there was a significant difference in language skills between the bilingual toddlers and the remaining toddlers. The analysis showed a significant difference between the bilingual group (M = 61.88, SD = 23.8) and the remaining group (M = 71.91, SD = 23.2); t(803) = 4.80, p < .001 (two sided). To interpret the strength of the effect, Cohen’s d was calculated and evaluated against Cohen’s (1988) guidelines. The effect was small, with a Cohen’s d of 0.43.
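As a rough check, the effect size and test statistic can be recomputed from the summary statistics above. The sketch below is not part of the original SPSS analysis; it assumes group sizes of 153 (bilingual) and 652 (remaining), which follow from the reported N = 153 and the 803 degrees of freedom, and the function names are illustrative.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (absolute value)."""
    return abs(m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

def student_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic assuming equal variances."""
    se = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return abs(m1 - m2) / se

# Summary statistics as reported in the text; group sizes are inferred (see lead-in).
bilingual = (61.88, 23.8, 153)
remaining = (71.91, 23.2, 652)

d = cohens_d(*bilingual, *remaining)
t = student_t(*bilingual, *remaining)
print(f"d = {d:.2f}, t = {t:.2f}")
```

Under these assumptions the sketch gives d of about 0.43 and t of about 4.79, close to the reported values; the small gap in t presumably reflects rounding of the published means and standard deviations.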

After these toddlers were removed, 652 toddlers remained. Of these, 128 toddlers dropped out of the study due to a lack of registration by schools or moving. Independent t-tests were conducted to identify whether there was a significant difference in language skills between the toddlers who completed the reading tests and those who did not. The analyses showed no significant difference between the dropout group and the remaining group for the NRT (dropout: M = 70.45, SD = 24.21; remaining: M = 72.26, SD = 22.9; t(650) = 0.787, p = .4, two sided), the 2nd WCT (dropout: M = 71.36, SD = 24.14; remaining: M = 72.02, SD = 23.00; t(650) = 0.268, p = .8, two sided) or the 5th WCT (dropout: M = 71.26, SD = 23.4; remaining: M = 72.06, SD = 23.11; t(650) = 0.342, p = .7, two sided). The final number of participants was N = 515. Of these, 266 were girls and 249 were boys.

To identify the language and reading development of toddlers with poor language skills, the TRAS results were used to divide the participants into three groups based on the level of language skills: poor language, below average and average to high. The cut-off for identifying toddlers with poor language skills was set at the bottom 10% of the TRAS toddler total results, with 12–46% for below average and 48–100% for average to high (using the cumulative percentage). To examine what proportion of the toddlers with poor language skills continued to have poor reading skills, the results from the reading tests were divided into the same categories as the TRAS results (for the mean and SD, see Table 1). In the frequency analysis, there was no cumulative percentage at 10% for the 2nd WCT, and the cut-off was therefore set to the nearest point, which was 12%.
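The grouping by cumulative percentage can be sketched as follows. This is a minimal illustration, not the procedure run in SPSS: the scores are hypothetical, the function names are invented for the example, and the thresholds of 10% and 46% are taken from the cut-offs described above.

```python
from collections import Counter

def cumulative_percent(scores):
    """Map each distinct score to the percentage of participants scoring at or below it."""
    counts = Counter(scores)
    total = len(scores)
    running = 0
    table = {}
    for value in sorted(counts):
        running += counts[value]
        table[value] = 100.0 * running / total
    return table

def assign_groups(scores, poor_max=10.0, below_max=46.0):
    """Label each score by the cumulative percentage of its value."""
    table = cumulative_percent(scores)
    groups = []
    for score in scores:
        pct = table[score]
        if pct <= poor_max:
            groups.append("poor language")
        elif pct <= below_max:
            groups.append("below average")
        else:
            groups.append("average to high")
    return groups

# Hypothetical data: 20 participants with distinct scores, so each score adds 5%.
scores = list(range(1, 21))
groups = assign_groups(scores)
print(Counter(groups))
```

Because tied scores share one cumulative percentage, a distribution may have no value landing exactly on 10%. In that case `poor_max` can be moved to the nearest available point, as was done for the 2nd WCT with 12%.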

Table 1 Mean and SD for all participants and the three groups on toddler language

Data Analysis

IBM Statistical Package for the Social Sciences (SPSS) Statistics version 29 was used in the analysis (IBM-Corporation, 2022). Two research assistants were responsible for entering the data in the database, with one entering the data and the second checking the entry. Ten percent of the participants were randomly selected for a second data entry to compare the degree of deviation, which revealed a consistency of 98%.

Pearson product-moment correlation coefficients were calculated to examine the correlation between language skills and reading skills. Due to multicollinearity, with a bivariate correlation of over 0.70 between the TRAS toddler variables, a regression analysis was dismissed in line with statistical recommendations (Tabachnick et al., 2020). Furthermore, to examine whether the three groups (poor language, below average, and average to high) differed significantly in their mean scores on reading tests, a one-way between-groups multivariate analysis of variance (MANOVA) was conducted. Since the group sizes were unequal, Pillai’s Trace was used, as it is a more robust statistic for testing differences among groups (Tabachnick et al., 2020). MANOVA works best when the dependent variables are moderately correlated. As the preliminary correlations show (Table 2), the three variables for reading skills have correlations between 0.46 and 0.65. With no correlations as high as 0.8 or 0.9, there is no reason for concern (Tabachnick et al., 2020).

Table 2 Pearson product correlations between measures of toddler language and reading in 2nd and 5th grade

Univariate normality (Q‒Q plots) and multivariate normality were checked, and the data were normally distributed. The maximum Mahalanobis distance was 14.618. Using Pallant’s (2020) key values, the critical value was 16.27; therefore, none of the cases exceeded the critical value. The test of homogeneity of variance-covariance matrices had a sig value larger than 0.001 (0.932); therefore, the assumption was not violated. None of the sig values in Levene’s test of equality of error variances were less than 0.05, so the assumption of equality of variances was not violated. To further investigate where the significant differences lay, one-way analysis of variance (one-way ANOVA) was used. Due to unequal group sizes, the Welch and Brown-Forsythe tests (Tomarken & Serlin, 1986) were used for the ANOVA.
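The critical value of 16.27 can be reproduced under the common convention that Mahalanobis distances are compared with the chi-square quantile at alpha = .001, with degrees of freedom equal to the number of dependent variables (three here). The sketch below assumes that convention and uses the closed-form chi-square distribution function for three degrees of freedom together with a simple bisection; the function names are illustrative.

```python
import math

def chi2_cdf_df3(x):
    """Distribution function of the chi-square distribution with 3 degrees of freedom."""
    if x <= 0:
        return 0.0
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def chi2_critical_df3(alpha, lo=0.0, hi=100.0, tol=1e-10):
    """Find x such that P(X > x) = alpha by bisection on the distribution function."""
    target = 1.0 - alpha
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_cdf_df3(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

critical = chi2_critical_df3(0.001)
max_distance = 14.618  # maximum Mahalanobis distance reported in the text

print(f"critical value = {critical:.2f}")
print("outliers present:", max_distance > critical)
```

With three dependent variables the quantile rounds to 16.27, so a maximum distance of 14.618 stays below the threshold.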

To examine what proportion of the toddlers with poor language skills continued to have poor reading skills, a crosstabulation was conducted (Tables 3, 4 and 5).

Table 3 Crosstab toddler language skills and 2nd WCT groups
Table 4 Crosstab toddler language skills and 5th WCT groups
Table 5 Crosstab toddler language skills and NRT groups

Results

Research Question One

The relationship between language, decoding and reading comprehension was investigated using Pearson product-moment correlation coefficients (Table 2). Preliminary analyses were performed to ensure no violation of the assumptions of normality and linearity (Q‒Q plots).

There was a small but significant correlation between TRAS and the WCT in 2nd and 5th grade, with r between 0.17 and 0.20. For the correlation between the different TRAS areas/TRAS total and the NRT, r was between 0.20 and 0.26. This indicates that language skills (as measured with TRAS) only explained later reading skills to a small extent. The correlation for the TRAS content area was not higher than for the other areas.
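One way to make "a small extent" concrete is to square the correlations, which gives the share of variance the two measures have in common. The sketch below applies this standard interpretation to the lowest and highest coefficients reported above; it is an illustration rather than a calculation reported by the authors, and the labels are invented for the example.

```python
# Range of correlations reported in the text (lowest and highest values).
reported_r = {
    "TRAS and WCT (lowest)": 0.17,
    "TRAS and WCT (highest)": 0.20,
    "TRAS and NRT (highest)": 0.26,
}

# Squaring r gives the proportion of shared variance between two measures.
shared_variance = {label: r**2 for label, r in reported_r.items()}

for label, share in shared_variance.items():
    print(f"{label}: r = {reported_r[label]:.2f}, shared variance = {share:.1%}")
```

On this reading, toddler language scores share roughly 3% to 7% of their variance with the later reading measures.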

As shown in Table 6, the mean scores for the language groups on the reading tests increased with toddlers’ language level. The toddlers with poor language skills had the lowest mean scores in the decoding tests in 2nd and 5th grade and in the reading test in 5th grade. To examine whether the language groups differed significantly on any of the reading tests, a MANOVA was conducted. To further investigate where the significant differences lay, a one-way ANOVA with a post hoc Tukey HSD test was used.

Table 6 The toddler language group results on reading skills in 2nd and 5th grade

There was a statistically significant difference between the toddler language skill groups on the combined variables, F (6.102) = 6.52, p < .001, Pillai’s Trace = 0.074. The effect size calculated using partial eta squared was 0.04. This is considered small using Cohen’s (1988) guidelines. A Bonferroni adjustment was applied by dividing 0.05 by 3 (the number of analyses) to reduce the risk of type 1 error. This gave a new alpha level of 0.017. When the results for the dependent variables were considered separately, they all reached statistical significance at p < .001. Regarding effect size, partial eta squared showed a small to medium effect, with the largest effect on the NRT. Since Pillai’s Trace was significant, we could investigate the between-subject effects.
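The adjusted alpha level and the multivariate effect size can be checked with a few lines of arithmetic. The sketch below assumes that partial eta squared was derived from Pillai’s Trace with the commonly used conversion V / s, where s is the smaller of the number of dependent variables and the number of groups minus one; the text does not state how the value was obtained, so this is an assumption, and the function names are illustrative.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha level after a Bonferroni adjustment."""
    return alpha / n_tests

def pillai_partial_eta_squared(pillai_trace, n_dependent, n_groups):
    """Partial eta squared from Pillai's Trace, assuming the conversion V / s."""
    s = min(n_dependent, n_groups - 1)
    return pillai_trace / s

# Values reported in the text: three reading measures, three language groups.
adjusted_alpha = bonferroni_alpha(0.05, 3)
eta_squared = pillai_partial_eta_squared(0.074, n_dependent=3, n_groups=3)

print(f"adjusted alpha = {adjusted_alpha:.3f}")
print(f"partial eta squared = {eta_squared:.2f}")
```

Under these assumptions the adjusted alpha rounds to 0.017 and the effect size to 0.04, matching the reported figures.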

Post hoc comparisons using Tukey’s HSD indicated significant differences among the toddler language skill groups in later reading skills. There was a significant difference between the poor language and average to high groups on the WCT and NRT. In the WCT tests, there was no significant difference between poor language and below average. There was a significant difference between all of the groups on the NRT test (Table 7).

Table 7 Differences in reading skills between toddler language groups calculated by MANOVA

Research Question Two

To examine what proportion of toddlers with poor language skills were poor decoders in 2nd and 5th grade and poor readers in 5th grade, a crosstab analysis was conducted.

As shown in Table 3, the majority of the poor language toddler group fell into the below-average group in the 2nd WCT. A total of 26.5% of the poor language toddler group had poor decoding skills, whereas only 8.6% of the average-to-high TRAS toddlers fell into this category.

As shown in Table 4, the majority of the poor language toddler group and the below-average toddler group had below-average decoding skills in the 5th WCT. In the average-to-high toddler group, 57.9% had decoding skills in the average-to-high group.

As shown in Table 5, 22.4% of the toddler group with poor language had reading skills in the lowest 10% group in 5th grade. If the lowest 10% in NRT 5th grade is combined with the below-average group in NRT 5th grade, the results show that 67% of the poor language toddler group had a below-average result for reading in the NRT. By comparison, 42.5% of the average-to-high toddler group obtained results in the two lowest categories in the NRT 5th grade.

Discussion

The aim of this study was to examine the association between poor language skills and later reading skills, using an authentic assessment as a tool of identification.

The Relationship Between Language Skills and Reading Skills

With regard to the first research question, concerning how well the results from the authentic assessment of toddlers’ language can be associated with their later reading skills, the results indicate a very modest degree of association, reaching a small effect at best in terms of Cohen’s guidelines (Cohen, 1988). The correlation between phonological awareness and decoding and reading comprehension skills was higher in the review by Hjetland et al. (2017), with r = .37. However, these authors excluded studies with at-risk toddlers in addition to using studies with clinical standardized measures (Hjetland et al., 2017). Another possible reason for the small correlation is that 33 months may be too early to measure underlying reading skills, especially those related to decoding skills (Carroll et al., 2003; Duff et al., 2015). The TRAS content area did not appear to be more important for later reading skills than the other TRAS areas. This may be partly because language skills are not stabilized this early (Duff et al., 2015) and partly because of the timing of the development of phoneme awareness. According to Carroll et al. (2003), phoneme awareness does not develop before an average age of 4 ½ years. Measuring toddlers’ ability to rhyme, for example, which is one of the questions in TRAS, is therefore not as relevant. This is also highlighted in the TRAS manual: 2 years is too early to measure language awareness (Espenakk et al., 2011). However, Carroll et al. (2003) found that skills in the accuracy of articulation, receptive lexical knowledge and large-segment awareness were important in the development of phonological awareness. To observe early markers of later reading skills at such a young age, authentic assessments should include observations/assessments of the skills mentioned above. Measuring known underlying language components such as auditory perception, word retrieval, and verbal working memory requires more clinical testing.

The results from the MANOVA confirm the results from the correlations in Table 2, with a larger association between toddlers’ early language skills and later reading comprehension than between toddlers’ early language skills and later decoding skills. There are several possible explanations for this. We know that language disorders often co-occur with reading difficulties (Bishop et al., 2017), and reading comprehension difficulties are more common than decoding difficulties (Hulme et al., 2020). Research also shows that the pathway from language factors to reading comprehension is more stable than the pathway to decoding-related skills (Hulme et al., 2015). The most important cognitive factors in decoding-related skills are phoneme awareness, letter-sound knowledge and RAN (rapid automatized naming) (Snowling et al., 2019). These aspects were not measured in this study due to the nature of the authentic assessment in observing children in their natural environment (Bagnato & Ho, 2006), and because of the ceiling effect on TRAS at 4 years. Assessing these factors might have provided additional information, which underlines the importance of including clinical tests to supply this information. Given the importance of these cognitive factors in reading development, these measurements should be included in further research. Even if the pathway from language factors to reading comprehension is more stable than the pathway to decoding-related skills (Hjetland et al., 2019; Hulme et al., 2015), poor language appears to compromise later decoding skills (Hulme et al., 2015), highlighting the need for early identification and intervention in addition to more research on which of these language factors make the decoding pathway unstable.

When examining whether toddlers with poor language skills continued to have poor reading skills in 2nd and 5th grade, the results confirmed that the toddler group with poor language skills had the lowest reading results in both 2nd and 5th grade. The three TRAS toddler language groups differed significantly from each other on the reading tests, except for poor language and below average on the WCT tests in the 2nd and 5th grades. The results show that toddlers with poor language skills at 33 months have significantly poorer reading results than toddlers with average to high language skills. This is consistent with previous research (Bleses et al., 2016; Rescorla, 2005). As the present study shows, there is a significant chance that toddlers with poor language skills will continue to have poor reading skills later. The effect sizes in the present study were small to moderate. In Rescorla’s (2005) study the effect size was larger, but both longitudinal studies showed a significant association between toddlers’ language skills and later reading skills. Rescorla’s (2005) study had a broader assessment of both language and reading, but the main results showed the same tendency in association. Since no known previous research has used an authentic assessment conducted by teachers to identify the 10% of toddlers with the poorest language skills, these results are important. They reveal that authentic assessment can be used to identify toddlers with poor language skills and therefore can be a supplement to more clinical tests when needed. This further confirms the importance of early intervention, since many toddlers with poor language will continue to struggle with reading. At the same time, there is no knowledge about what services these children have received. Some may have received early intervention, so it cannot be concluded what the results would have been if this had been controlled for.

Reading Outcomes for Toddlers with Poor Language

The second research question asked what proportion of toddlers with poor language skills continued to be poor decoders in 2nd and 5th grade and poor readers in 5th grade. It is interesting that 26.5% of the poor language group of toddlers remained among the 12% with the poorest decoding skills; 76.4% of the poor language group of toddlers had below-average decoding skills in 2nd grade. In 5th grade, the decoding scores were, as expected, much higher, given that decoding skills play a lesser role in reading skills later in education (Snowling & Hulme, 2021). In this study, the results showed that 18.4% of the toddlers in the poor language group remained in the poor decoding skill group.

When considering the NRT results, 22.4% of the toddlers from the poor language group were in the poor reading group in 5th grade. One can argue that this low number shows that 33 months is too early to identify toddlers at risk for reading difficulties, or that using an authentic assessment for this purpose is inadequate. Nevertheless, as many as 67.3% of toddlers in the poor language group had reading skills in the two lowest groups on the NRT in 5th grade. This is consistent with previous studies showing that even if toddlers with poor language at an early age attain language scores in the average range by school entry, they continue to have the poorest language-related skills, such as reading, into adolescence (Rescorla, 2009). On the other end of the scale, only 6.5% of children with average to high language skills had reading skills in the lowest 10% on the NRT in 5th grade. This is consistent with the claim that children with good oral language skills learn to read better than those with poor language skills (Snowling & Hulme, 2021).

Conclusion and Pedagogical Consequences

The present study contributes new and relevant information regarding the early identification of toddlers with poor language skills using authentic assessment. First, there is a small but significant correlation between early language and later reading skills. We can therefore assume that authentic assessment measures a large range of relevant language skills and can be used as the first tool to identify toddlers with poor language skills. Authentic assessments may have many advantages: toddlers are observed by people who know them, and it is possible to avoid placing the child in a test situation. Furthermore, the observations take place in the child’s natural environment. In addition, there was a significant difference in mean reading scores between the poor language, below-average and average-to-high groups. This confirms that there is a risk that toddlers with poor language skills will continue to have poor language and language-related skills. On the other hand, the results are not clinically useful for predicting which toddlers will develop reading difficulties, as only 22.4% of the poor language group of toddlers remained in the lowest 10% for reading skills in 5th grade. To make predictions on an individual level, a more fine-grained tool is needed. The results suggest the importance of both authentic assessment and clinical tests and how they complement each other. While authentic assessment can be used to assess children’s language in everyday situations and as a first step in identification and adjusted adaptation in ECEC, more clinical tests are needed to address language disorders and make clinical decisions. Further studies should focus on which language factors differ between children who develop language scores in the average range by school entry and those who do not.

The results have important implications for practice. First, authentic assessment can be used as a first step in identifying children with poor language. This tool can give ECEC teachers the background needed to provide reliable information about children’s language development. Second, even if poor language skills at 33 months are not sufficient to receive special needs assistance, the results could support an argument for paying extra attention to these children and providing adjusted interventions to support their language development and reduce the risk of later reading difficulties.

Limitations

This was a longitudinal study, and the interval between the early language assessment and the later reading assessments spanned a period of 7 years. We have no information about the pedagogical adjustments or interventions the toddlers with the poorest skills received, the services they received, or whether any of them were diagnosed with DLD or dyslexia. These factors may have affected the results, so the results must be interpreted with caution. Furthermore, the study did not control for background variables. For example, the present study had no access to SES or biological data. Therefore, future studies should include these variables. The present study also did not include vocabulary tests beyond toddler age. Future longitudinal studies should include these when examining the relation between language comprehension and reading comprehension more explicitly.

Further studies should also include bilingual toddlers because of the increasingly multilingual society.