Proper literacy skills are crucial for academic achievement, job opportunities, and social participation (Duncan et al., 2007; Savolainen et al., 2008). Word reading and spelling are two core components of literacy development, representing the building blocks of reading comprehension and writing composition (Berninger et al., 2002; Gough & Tunmer, 1986). Empirical studies have demonstrated that word reading and spelling are intricately associated (e.g., Swanson et al., 2003), relying on the same orthographic, phonological, and semantic components (Apel, 2009; Caravolas et al., 2012; Furnes et al., 2019; Georgiou et al., 2012). However, relatively few of the studies investigating their relationship have been longitudinal, and most have been conducted in English-speaking countries (but see Georgiou et al., 2020).

The present study aimed to examine the stability and developmental relations of word reading and spelling skills in Swedish and U.S. children transitioning from kindergarten to grades 1, 2, and 4. This research can inform theories regarding the interplay between reading and spelling. It can also inform educational strategies to support children learning to read and spell.

The relationship between word reading and spelling

Word reading and spelling can be seen as two sides of the same coin, as both skills rely on understanding the alphabetic principle and the ability to memorize spelling patterns (Ehri, 2000). This relatedness has been shown in correlational, intervention, neuroimaging, and behavior genetic studies (e.g., Bates et al., 2004; Pugh et al., 2006; Rapp and Lipka, 2011; Swanson et al., 2003). Despite these similarities, theorists have argued that reading and spelling have some different properties (e.g., Bosman and Van Orden, 1997; Ehri, 2000). Bosman and Van Orden (1997) propose that reading and spelling are supported by a network of bidirectional relations between phonemic, graphemic, and semantic information. They claim, however, that spelling is more difficult to learn than reading. One reason is that correct spelling relies on the production of an orthographic pattern and therefore requires a precise representation of a word in memory. Such precision is not required to the same extent for reading, as partial cues in words can facilitate recognition (e.g., Conrad, 2008; Frith, 1980; Moll and Landerl, 2009; Ouellette et al., 2017). Another reason spelling is more difficult to learn than reading is that the links between graphemes and phonemes are generally more regular for reading than for spelling in alphabetic writing systems (e.g., Galuschka et al., 2020). In other words, the number of graphemes to choose from when writing phonemes is larger than the number of phonemes available for the pronunciation of graphemes.

Several influential theories have discussed the developmental interplay of reading and spelling. Frith (1986) argues that reading and spelling follow somewhat different developmental trajectories. Her model of literacy acquisition suggests that reading is the pacemaker in the pre-alphabetic stage of literacy development when children use visual (logographic) knowledge to represent a word (Stage 1). In contrast, spelling reinforces the use of alphabetic knowledge to decode words (Stage 2). Specifically, the spelling of simple words helps children understand the alphabetic principle, positively affecting their decoding of common words. Further, when children acquire word-specific representations in memory (orthographic knowledge, Stage 3), these representations are at first generally inaccurate and only suffice to recognize words, not to spell them. According to Frith, it is not until the representations of words are more accurate in children’s memory that this knowledge gradually supports spelling development. Frith’s framework suggests a shift in prediction patterns between reading and spelling once basic literacy skills are achieved.

Ehri (1995) argues that reading and spelling constantly reinforce each other during development and that the progression through the different phases is more or less the same in reading and in spelling. Moreover, Ehri emphasizes that both skills build up children’s word-specific representations, facilitating subsequent reading and spelling development. A similar view has been put forward in the self-teaching hypothesis (Share, 1995), suggesting that the understanding that specific graphemes represent phonemes (reading) and that specific phonemes are represented by graphemes (spelling) enables children to build up high-quality representations of different words on their own. These representations are essential for spelling but also support reading, as activating a word from memory improves reading speed.

The theories above are primarily developed for English and do not explicitly consider differences across alphabetic writing systems. A complementary approach is the orthographic depth hypothesis (Katz & Frost, 1992). This hypothesis posits that children learning to read in less transparent orthographies must develop orthographic reading strategies earlier than children learning to read in more transparent orthographies. If so, one might expect stronger effects of reading on spelling in less transparent orthographies. The rationale is that spelling typically requires a more precise representation of words in memory. If children learning to read in less transparent orthographies use orthographic reading strategies at early phases, these strategies should support their spelling attempts.

Only a few studies have investigated the stability of reading and spelling longitudinally in different orthographies. Some studies report that reading and spelling are highly and equally predictable based on previous performance levels in early school (English: Abbott et al., 2010; Ahmed et al., 2014; Finnish: Lerkkanen et al., 2004; Italian: Pinto et al., 2015), whereas others report that reading is more predictable than spelling (English: Caravolas et al., 2001; Finnish: Leppänen et al., 2006; Dutch: Schaars et al., 2017). Regarding the developmental relations of reading and spelling, findings are also somewhat mixed. Some studies report that the reading-spelling relationship is bidirectional (e.g., Abbott et al., 2010; Pinto et al., 2015). Other studies report that reading primarily predicts spelling (Ahmed et al., 2014; Schaars et al., 2017). Still others report that spelling predicts reading in the initial phase, whereas reading predicts spelling in later phases (Caravolas et al., 2001; Leppänen et al., 2006). Finally, one study reports that reading-spelling relations are bidirectional in the initial phase but unidirectional (reading primarily predicts spelling) in later phases (Lerkkanen et al., 2004). Overall, word reading seems to be a more important predictor of subsequent spelling skills than vice versa. In most studies, spelling only predicts reading in the initial phases of literacy development. After that, reading primarily predicts spelling but not the other way around. These findings align with Frith’s hypothesis on the developmental interplay between reading and spelling. The only exceptions are the studies by Abbott et al. (2010) and Pinto et al. (2015), showing bidirectional reading-spelling relationships from Grade 1 to Grade 7 (Abbott et al., 2010) and from Grade 1 to Grade 2 (Pinto et al., 2015). These findings align with Ehri’s hypothesis. However, in the Abbott et al. study, reading and writing ability were measured with tasks covering different linguistic levels, that is, words (accuracy), sentences, and text, and all these levels were included in their models. Similarly, Pinto et al. found bidirectional relations between reading (accuracy and fluency) and spelling in a free writing task. Hence, mixed findings across studies may reflect different ways of measuring reading and spelling.

Only two of the studies mentioned above assessed reading and spelling skills after Grade 2 (Abbott et al., 2010; Ahmed et al., 2014). In addition, all of the studies were performed within a particular orthography. There is thus a need for more longitudinal studies because they can clarify the direction of the relation between reading and spelling and whether the relationships change over time. Cross-linguistic studies are important as they can determine universal and more language-specific associations (cf. Share, 2008).

As far as we know, only one study has investigated the stability and developmental relations of reading and spelling directly across orthographies (Georgiou et al., 2020). In this study, 942 children learning five writing systems differing in orthographic transparency (English, French, Dutch, German, and Greek) were tested on word and nonword reading fluency and spelling accuracy three times between Grade 1 (spring) and Grade 2 (fall and spring). Georgiou et al. adapted pre-existing reading fluency and spelling-to-dictation tasks for each language. To address any variations in the length of words or nonwords used in each task across languages, the child’s reading score was determined by calculating the total number of syllables in the correctly read words within a set time limit. For spelling, the items in each language were arranged in terms of increasing difficulty, and testing was discontinued after six consecutive errors. The score was the total number of correct responses. Georgiou et al. reported that, across languages, the autoregressive effect of spelling, that is, the stability of individual differences from one occasion to the next, was moderate. In contrast, the autoregressive effect of reading was relatively strong. Moreover, earlier reading predicted subsequent spelling across languages but not the other way around. Additionally, they reported slight variations in the reading-spelling path coefficient across languages. However, overall, the similarities outweighed the differences. Georgiou et al. explained their results on the assumption that spelling relies on precise orthographic representations in many cases, whereas this is not needed to the same extent for reading (cf. Frith, 1980). Hence, reading should be a more important predictor of subsequent spelling skills than vice versa. Georgiou et al. did not report data on children’s reading and spelling skills in kindergarten and early Grade 1. 
As such, they could not test Frith’s hypothesis that spelling might be more critical for reading in the early phase of literacy development. In addition, they could not test the developmental interplay between reading and spelling during the phase when most children use an orthographic strategy to read and spell different words (i.e., from Grade 2 onwards; Kilpatrick, 2015).

Literacy learning and schooling in Sweden and the U.S. (Colorado)

In this study, we investigated the developmental interplay of reading and spelling in Swedish and English. A few important characteristics of these writing systems and the educational context in Sweden and the U.S. must be emphasized to better understand the background of this study. The Swedish orthography is semi-transparent, similar to Norwegian, German, and Dutch, and located between Finnish and English (for key features of the Swedish orthography, see Appendix A).

When data were collected, the school system in Colorado, U.S., from which the English-speaking participants were recruited, provided some daily formal literacy instruction in kindergarten. However, there was no state standard for teaching reading and spelling at this stage, presumably leading to a diversity of alternative educational settings for literacy instruction. In contrast, the Swedish kindergarten curriculum gave more priority to social, emotional, and aesthetic development than to early literacy acquisition. Due to variation in the age at which literacy instruction is introduced across countries, children from the U.S. sample outperformed their Swedish peers in reading and spelling in kindergarten. However, these differences narrowed as Swedish children received formal instruction in reading and spelling approximately one year later (e.g., Furnes et al., 2019).

From age seven, the teaching of reading and spelling in Colorado was informed by a state-wide curriculum. During the early grades, the primary focus was on phonics instruction, which aimed to develop children’s understanding of letter-sound relationships and alphabet knowledge. However, the extent of instruction varied among classrooms, grades, and individual children’s progress. For children who encountered difficulties with phonics, additional support was typically provided, particularly in the first and second grades. Alongside phonics instruction, there was also emphasis on practicing the reading and spelling of common irregular words, such as those requiring sight word recognition and word family knowledge. Frequently used irregular words in English, such as “the,” “are,” and “you,” were gradually prioritized for immediate recognition as sight words. In addition, children were progressively introduced to word family knowledge, which enabled them to identify groups of words that shared similar spelling patterns and sounds. For example, if a child knows how to read and spell “cat,” they can use that knowledge to read and spell “bat,” “rat,” and “hat”. It is important to note that the extent of instruction in these areas also varied across classrooms, grades, and individual children’s progress.

Compulsory education in Sweden commenced at the age of seven. Although there was a specific curriculum informing the teaching of literacy, there have been limited explicit guidelines or instructions on effective methods for teaching early literacy skills in Sweden, both in the present and in the past (Levlin & Nakeva von Mentzer, 2020). Nonetheless, teachers frequently adopted a combination of phonics instruction for reading and spelling, along with a focus on sight word recognition strategies. During the initial two years, there was an emphasis on phonological instruction, which involved reading and spelling based on letter-sound rules. As children developed more specific word representations in memory, orthographic instruction, which involved reading and spelling familiar words through sight recognition or word family knowledge, was gradually introduced.

The present study

This study investigated the stability and developmental interplay of word reading and spelling across a more (Swedish) and a less (English) transparent orthography, spanning from the end of kindergarten to the end of Grade 1, Grade 2, and Grade 4. We expected that word reading and spelling would exhibit moderate to strong stability over time while accounting for each other, with reading demonstrating more stability than spelling across orthographies. Furthermore, we expected that word reading and spelling would be interrelated from the end of kindergarten to the end of Grade 1, with reading subsequently predicting spelling in the following years as children gradually progressed toward the “orthographic stage” of literacy development, in line with Frith’s hypothesis (1986). Regarding cross-linguistic differences, we expected the following: If the orthographic depth hypothesis holds true, reading should be a stronger predictor of spelling in English than in Swedish.

Method

Participants

We used data from a longitudinal twin study of literacy and language conducted in the U.S., Australia, Sweden, and Norway (e.g., Olson et al., 2014). The relevant review boards in each country granted ethical approval for the study. Same-sex twins were recruited from the Medical Birth Registry in Sweden and Norway, the Colorado Twin Registry in the U.S., and the National Health and Medical Research Council’s Australian Twin Registry. Only children for whom the predominant language of their country was the first language spoken at home were selected. The Norwegian and Australian samples are omitted from the current analyses as we do not have data in Grade 4. Therefore, the sample consists of 489 same-sex twin pairs from the U.S. (50% girls) and 191 from Sweden (51% girls). The use of twins for these analyses potentially represents a methodological problem because the scores of the twins in each pair might not fully represent independent observations; that is, twins share genes, home, and school environments. Therefore, one child from each pair was randomly selected by removing Twin 1 from Pair 1, Twin 2 from Pair 2, and so forth.
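The alternating selection rule described above can be sketched as follows. This is a minimal illustration rather than the study's actual procedure; the function name `select_one_twin` and the data layout (a list of pairs, each a tuple of two twin identifiers) are assumptions made for the example.

```python
def select_one_twin(pairs):
    """Keep one child per pair by alternately dropping Twin 1 and Twin 2.

    `pairs` is a list of (twin1, twin2) tuples (hypothetical layout).
    Twin 1 is removed from odd-numbered pairs and Twin 2 from
    even-numbered pairs, so the retained child alternates accordingly.
    """
    kept = []
    for index, (twin1, twin2) in enumerate(pairs, start=1):
        if index % 2 == 1:
            kept.append(twin2)  # Pair 1, 3, ...: Twin 1 removed
        else:
            kept.append(twin1)  # Pair 2, 4, ...: Twin 2 removed
    return kept


# Hypothetical identifiers, for illustration only.
example_pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2"), ("D1", "D2")]
selected = select_one_twin(example_pairs)
print(selected)
```

Alternation of this kind behaves like random selection only if the Twin 1 and Twin 2 labels are themselves arbitrary, which is assumed here.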

There were no significant differences in parents’ years of education across samples (Sweden, M = 13.9, SD = 2.95, and U.S., M = 14.2, SD = 2.22, t = 0.66, p = .51).

Measures

Word reading and spelling tests were originally in English and were translated and adjusted into Swedish for this project. Based on data available for the U.S. sample, Cronbach’s alpha reliability estimates are high for all measures (Samuelsson et al., 2008). We do not have available alpha estimates for the Swedish sample, but we calculated monozygotic twin correlations to provide lower-bound reliability estimates for the measures. These were reasonably high and comparable for all tests across grades and test sites (spelling: 0.76–0.84 in the U.S. sample and 0.67–0.76 in the Swedish sample; reading: 0.75–0.89 in the U.S. sample and 0.80–0.91 in the Swedish sample). As the present study is part of a more extensive study that addresses a wide range of questions, only variables that specifically relate to the current research questions are presented here.

Reading

The Test of Word Reading Efficiency (TOWRE; Torgesen et al., 1999) was used to measure sight word and phonemic decoding efficiency at each testing wave. To assess sight word efficiency, the children read as many words aloud as they could in 45 s from two lists of 104 items each. To assess phonemic decoding efficiency, the children read as many nonwords aloud as possible in 45 s from two lists of 63 items each. In all four lists, words/nonwords were presented in order of increasing difficulty. Sum scores for the number of accurately read words/nonwords from all four lists were used to measure sight word and phonemic decoding efficiency. Only one list for sight word and one for phonemic decoding efficiency was given in Grade 4.

Spelling

Kindergarten spelling was measured by a test developed by Byrne and Fielding-Barnsley (1993) in which children spelled ten real words and four nonwords. For the real words, children heard the word in isolation, then in a sentence, and then in isolation again. The nonwords were pronounced three times. Most of the words had simple sound-to-spelling correspondences (e.g., Swedish: tåg [train], blå [blue], lampa [lamp]; English: man, plug, limp), but some required specific orthographic knowledge (Swedish: ett [one], kom [come]; English: one, come). Scoring was based on the correct representation of the phonemes in each item.

The 45-item spelling subtest from the Wide Range Achievement Test (WRAT; Jastak and Wilkinson, 1984) was used in grades 1, 2, and 4. Each word was first presented in isolation, then in a sentence, and then in isolation again. The score was the number of correct spellings. Testing was stopped if a child made ten consecutive errors. This test includes words with simple sound-spelling correspondences (e.g., Swedish: brev [letter], baka [bake], natur [nature]; English: go, and, him) and words that required specific orthographic knowledge (e.g., Swedish: kjol [skirt], ljuset [the light], officiell [official]; English: light, kitchen, recognize).
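The discontinue rule can be made concrete with a short sketch. The function name `score_with_discontinue` and the representation of responses as a list of booleans are assumptions for illustration; the sketch does not reproduce the WRAT scoring manual.

```python
def score_with_discontinue(responses, max_consecutive_errors=10):
    """Count correct spellings until the discontinue rule is triggered.

    `responses` lists item outcomes in order of administration
    (True = correct). Once `max_consecutive_errors` errors occur in a
    row, testing stops and later items are not counted.
    """
    correct = 0
    consecutive_errors = 0
    for is_correct in responses:
        if is_correct:
            correct += 1
            consecutive_errors = 0  # a correct answer resets the error run
        else:
            consecutive_errors += 1
            if consecutive_errors == max_consecutive_errors:
                break  # discontinue rule reached
    return correct


# Hypothetical child: 12 correct items, then 10 errors in a row.
# The 3 trailing items would never be administered.
example_responses = [True] * 12 + [False] * 10 + [True] * 3
example_score = score_with_discontinue(example_responses)
print(example_score)
```

Because items are ordered by difficulty, the rule mainly shortens testing for children who have reached their ceiling.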

Procedure

Informed consent was obtained in writing from all families who agreed to participate in the study, and the children gave verbal consent. Testing at the end of kindergarten, Grade 1, Grade 2, and Grade 4 was conducted with separate testers for each twin at school or home during a 1-hour session.

Data analytic approach

We tested for autoregressive effects and cross-lagged relations in two steps: within and across the two orthographies. All analyses were performed in Mplus 8.5 (Muthén & Muthén, 1998–2021). To assess model fit, we used the chi-square statistic, the standardized root mean square residual (SRMR), the root mean square error of approximation (RMSEA) with 95% CIs, the comparative fit index (CFI), and the Tucker–Lewis index (TLI). Indicative of acceptable model fit are SRMR ≤ 0.08, RMSEA ≤ 0.06, CFI > 0.95, and TLI > 0.95 (Hu & Bentler, 1999). The Bayesian information criterion (BIC) was used as another index for comparing model fit, with a model difference of at least 5 suggesting practically important differences (Raftery, 1995).

First, path analyses were performed within samples to establish a baseline model for each orthography. These analyses included correlation estimates at the same time point, as well as stability estimates (within-domain; reading or spelling) and cross-lagged paths (cross-domain; reading to spelling and vice versa) to each time point from the previous time point. Next, we performed multigroup analyses to investigate the equivalence of the structural path coefficients across the two languages. The first step in determining the comparability of the models was to establish that the models have the same paths and the same pattern of fixed and free parameters across samples. The next step was to compare this baseline model with a nested model in which the structural paths, that is, contemporaneous correlations, stability estimates, and cross-lagged associations, were held equal across samples. Finally, pairwise comparisons across samples were conducted by constraining each path coefficient to equality while other coefficients were freely estimated. Model comparison was based on chi-square difference testing and change in BIC (> 5).
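In generic form, the within-sample model described above can be written as follows. The notation is introduced here for illustration only and does not appear in the original analysis: $R_t$ and $S_t$ denote word reading and spelling at wave $t$, the $\beta$ terms are autoregressive and cross-lagged path coefficients, and $\psi_t$ is the concurrent association (taken here, as an assumption, to be a residual covariance after kindergarten).

```latex
\begin{aligned}
R_{t} &= \beta_{RR,t}\, R_{t-1} + \beta_{RS,t}\, S_{t-1} + \varepsilon_{R,t},\\
S_{t} &= \beta_{SS,t}\, S_{t-1} + \beta_{SR,t}\, R_{t-1} + \varepsilon_{S,t},\\
\operatorname{Cov}\!\left(\varepsilon_{R,t}, \varepsilon_{S,t}\right) &= \psi_{t},
\qquad t \in \{\text{Grade 1}, \text{Grade 2}, \text{Grade 4}\},\\
\operatorname{Cov}\!\left(R_{\text{KG}}, S_{\text{KG}}\right) &= \psi_{\text{KG}}.
\end{aligned}
```

Under this reading, each sample has $3 \times 4 = 12$ directed paths and 4 concurrent associations, that is, 16 structural parameters that can be constrained to equality across samples.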

Results

Descriptive statistics

Data were screened for missing values, outliers, and normality (see Table 1). The attrition rate was low across grades and samples. Sight word and phonemic decoding efficiency in kindergarten were positively skewed in both languages. The score distribution of the reading and spelling tests in Grades 1 and 2 was approximately normal, except for phonemic decoding in the Swedish sample in Grade 1. The raw score analyses of these measures were repeated with logarithmically transformed scores (Tabachnick & Fidell, 2007). Although transforming the variables that deviated from normality improved their distributions, it did not change the results. Therefore, analyses were performed on raw scores. Before conducting longitudinal analyses, the scores were standardized by regressing the raw scores on age and sex for each variable within each sample and time of assessment.
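The age and sex adjustment can be illustrated with a small ordinary least squares sketch using only the Python standard library. The function names, the 0/1 coding of sex, the example data, and the final z-scoring of the residuals are all assumptions made for the illustration; the source does not specify these details.

```python
from statistics import mean, pstdev


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def residualize(scores, ages, sexes):
    """Regress scores on age and sex; return z-scored residuals."""
    design = [[1.0, age, sex] for age, sex in zip(ages, sexes)]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, scores))
           for i in range(3)]
    coef = solve_linear_system(xtx, xty)
    residuals = [y - sum(c * x for c, x in zip(coef, row))
                 for row, y in zip(design, scores)]
    sd = pstdev(residuals)
    return [r / sd for r in residuals]


# Hypothetical data for six children in one sample at one wave.
ages = [6.8, 7.0, 7.1, 7.3, 6.9, 7.2]
sexes = [0, 1, 0, 1, 1, 0]          # assumed 0/1 coding
raw_scores = [10, 14, 13, 20, 11, 18]
adjusted = residualize(raw_scores, ages, sexes)
```

Applied separately to each variable, sample, and wave, the adjusted scores are by construction uncorrelated with age and sex within that cell.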

Table 1 Descriptive Statistics for the Swedish and U.S. samples

First, we used simple correlations to examine concurrent and cross-lagged associations between reading (word and nonword) and spelling and between two consecutive assessment ages within one domain, that is, word reading and spelling (see Table 2). The mean correlation between reading and spelling skills (word reading, nonword reading, and spelling combined) in kindergarten was 0.75 in the U.S. sample and 0.77 in the Swedish sample. A similar pattern of findings was seen for reading and spelling in Grade 1 (0.84 in both the U.S. and Swedish samples), Grade 2 (0.81 in the U.S. sample and 0.75 in the Swedish sample), and Grade 4 (0.80 in the U.S. sample and 0.74 in the Swedish sample). In general, moderate to strong correlations were found between reading and spelling skills between kindergarten and Grade 1 (0.57–0.67 in the U.S. sample and 0.60–0.62 in the Swedish sample), Grade 1 and Grade 2 (0.70–0.79 in the U.S. sample and 0.50–0.68 in the Swedish sample), and Grade 2 and Grade 4 (0.68–0.80 in the U.S. sample and 0.57–0.66 in the Swedish sample). The stability of sight word and phonemic decoding efficiency between kindergarten and Grade 1 was high, with a mean correlation of 0.64 and 0.72 in the U.S. and Swedish samples, respectively. These correlations were equally high or higher between Grades 1 and 2 (0.82 in the U.S. sample and 0.72 in the Swedish sample) and between Grade 2 and Grade 4 (0.82 in the U.S. sample and 0.77 in the Swedish sample). For spelling, stability was moderate to strong between kindergarten and Grade 1 (0.59 in the U.S. sample and 0.73 in the Swedish sample), Grade 1 and Grade 2 (0.85 in the U.S. sample and 0.73 in the Swedish sample), and Grade 2 and Grade 4 (0.86 in the U.S. sample and 0.71 in the Swedish sample).

Table 2 Correlation Coefficients Between Sight Word Efficiency, Phonemic Decoding Efficiency, and Spelling in the Swedish (below diagonal) and U.S. (above diagonal) samples

Significant correlations between four consecutive time points from reading to reading, spelling to spelling, reading to spelling, or vice versa, are not sufficient evidence of stability and integrated development between these skills over time. Instead, concurrent and autoregressive associations need to be considered to determine the existence of significant cross-lagged associations. Therefore, the relations between word reading and spelling were further modeled with path analysis. Because a total of 16 associations were inspected (concurrent correlations, stability estimates, and cross-lagged associations), we guarded against type I error by using a significance level of 0.003 (0.05/16). Before conducting the path models, we combined the subtests of sight word and phonemic decoding efficiency into a composite word reading score.
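The adjusted threshold follows from a Bonferroni-style division of the nominal alpha level by the number of inspected associations. The symbols $\alpha_{\text{adj}}$ and $m$ are introduced here only for illustration:

```latex
\alpha_{\text{adj}} = \frac{\alpha}{m} = \frac{0.05}{16} = 0.003125 \approx 0.003
```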

Cross-lagged path models of word reading and spelling within orthography

The path models with standardized estimates separately for each orthography are shown in Fig. 1. The models depicting word reading and spelling fitted the data well in the Swedish sample (χ2(12, N = 191) = 26.15, p < .01, CFI = 0.986, TLI = 0.969, SRMR = 0.034, and RMSEA = 0.080 (90% CI = 0.037–0.120)) and the U.S. sample (χ2(12, N = 488) = 27.99, p < .01, CFI = 0.996, TLI = 0.991, SRMR = 0.013, and RMSEA = 0.052 (90% CI = 0.027–0.078)). In each orthography, the autoregressive path estimates for word reading were high over time, and the cross-lagged paths from word reading to spelling were moderate and significant (all p’s < 0.001). For spelling, the autoregressive path estimates in each orthography were moderate, and the cross-lagged paths from spelling to word reading were substantial between kindergarten and Grade 1 (all p’s < 0.001) but, after that, failed to reach significance. The concurrent associations between reading and spelling were moderate to strong in the first grades and relatively weak in later grades (all p’s < 0.001).

Fig. 1

Concurrent, Autoregressive, and Cross-Lagged Associations for Word Reading and Spelling with Coefficient Estimates for each Orthography Separately. Note. U = U.S. sample; S = Swedish sample; WR = Word Reading; SP = Spelling; KG = Kindergarten; 1 = Grade 1; 2 = Grade 2; 4 = Grade 4. All significant paths are presented with solid lines and non-significant ones with dashed lines, together with standardized estimates

***p < .001; *p < .05

Cross-lagged path models of word reading and spelling across orthography

We conducted several chi-square difference tests to ensure that valid comparisons of the model paths could be made across the two orthographies. The baseline model fit the data well (χ2(24, N = 680) = 54.146, p < .001, CFI = 0.994, TLI = 0.986, SRMR = 0.021, and RMSEA = 0.061 (90% CI = 0.039–0.082)), indicating that the same structural paths were applicable across samples. We then compared the baseline model with a more restricted one where stability estimates, cross-lagged paths, and contemporaneous path coefficients at kindergarten, Grade 1, Grade 2, and Grade 4 were constrained to be equal across samples. This model did not fit the data well, χ2 = 136.239, p < .001, and the change in model fit was significant, Δχ2 = 82.093, Δdf = 16, ΔBIC = 23, p < .001. Further analyses, in which we systematically constrained one group of path weights at a time, revealed that this fit difference was driven entirely by a difference in autoregressive paths from spelling in kindergarten to spelling in Grade 1 and a difference in cross-lagged paths from word reading in kindergarten to spelling in Grade 1. Specifically, stability of spelling from kindergarten to Grade 1 was significantly higher for the Swedish sample (r[SE] = 0.54[0.059]) than the U.S. sample (r[SE] = 0.24[0.043]). Conversely, the cross-lagged path from word reading in kindergarten to spelling in Grade 1 was significantly higher for the U.S. sample (r[SE] = 0.53[0.40]) than for the Swedish sample (r[SE] = 0.26[0.063]). To sum up, although we found some minor discrepancies in the developmental interplay of reading and spelling across samples, there were more similarities than differences between the two orthographies.
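The chi-square difference test reported above can be checked by hand. The sketch below recomputes Δχ2 from the two reported chi-square values and evaluates its upper-tail probability. The helper `chi2_sf_even_df` is a hypothetical stand-in written for this example (not an Mplus routine); it uses the closed-form survival function that holds only for even degrees of freedom, which suffices here because Δdf = 16.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate, EVEN df only.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Values as reported in the text.
chi2_baseline = 54.146
chi2_constrained = 136.239
delta_df = 16

delta_chi2 = chi2_constrained - chi2_baseline
p_value = chi2_sf_even_df(delta_chi2, delta_df)
print(f"delta chi2 = {delta_chi2:.3f}, delta df = {delta_df}")
print("p < .001:", p_value < 0.001)
```

A general-purpose statistics library would normally be used for the tail probability; the closed form is chosen here only to keep the example dependency-free.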

Discussion

The present study investigated the stability and developmental relations of word reading and spelling from the end of kindergarten to the end of grades 1, 2, and 4 across a more (Swedish) and a less (English) transparent orthography. In this section, we discuss our findings in relation to existing theory and previous research. Finally, we consider some limitations of the current study and discuss some theoretical and practical implications.

Stability and cross-lagged associations of word reading and spelling across more and less transparent orthographies

This study showed that word reading skills were more stable over time than spelling in both U.S. and Swedish children. This finding fits well with previous research (Caravolas et al., 2001; Desimoni et al., 2012; Landerl & Wimmer, 2008; Pinto et al., 2015; Schaars et al., 2017; but see Abbott et al., 2010). A reasonable explanation is that the correspondences between graphemes and phonemes are generally more predictable for reading than spelling across alphabetic writing systems (e.g., Bosman and Van Orden, 1997; Galuschka et al., 2020). Moreover, as reading is less dependent on a precise orthographic image of a word than spelling (e.g., Ouellette et al., 2017), this might result in earlier consolidation and stabilization of reading skills. Although the predictability of reading was relatively high, it was only moderate between kindergarten and Grade 1. Given that Swedish children rarely receive literacy instruction in kindergarten and that, at the time this study was carried out, there was no state standard for teaching reading and spelling in the U.S. (Colorado) at this age, this finding may suggest that individual differences in literacy skills in kindergarten simply reflect whether or not teaching was provided. In later grades, the regulated teaching of literacy across languages is likely to result in greater consistency in reading skills (cf. Samuelsson et al., 2008).

Regarding the developmental interplay between reading and spelling, we found that the two skills reinforce each other during the initial phase of literacy development, i.e., kindergarten to Grade 1. After that, reading predicted spelling from Grade 1 to Grade 2 and from Grade 2 to Grade 4, but not the other way around. Generally, these findings correspond with most previous research within more and less transparent orthographies showing that reading is important for subsequent spelling skills, whereas the reversed influence of spelling-to-reading is not consistently found (e.g., Ahmed et al., 2014; Caravolas et al., 2001; Leppänen et al., 2006; Lerkkanen et al., 2004). In addition, using a more comprehensive temporal range, i.e., end of kindergarten, to the end of Grade 1, Grade 2, and Grade 4, our results replicate and extend the report by Georgiou et al. (2020) of a unidirectional reading-spelling path between Grade 1 and Grade 2 across five different languages. Our results, however, contradict the orthographic depth hypothesis (Katz & Frost, 1992), which suggests that reading should be a stronger predictor of subsequent spelling in English. Instead, the main explanation of the developmental reading-spelling path reported here seems to be that reading builds word-specific representations in memory over time, irrespective of the transparency of a particular orthography. This explanation fits well with Frith’s theory (1986), suggesting that spelling is particularly important when children discover the alphabetic principle, and that the development of word-specific representations in memory is at first only sufficient to recognize words, not to spell them. Hence, reading development primarily serves spelling development as soon as additional orthographic strategies come into play. This also makes sense as people usually read more words than they can spell (Cunningham & Stanovich, 1990; Ellis, 1994). 
Therefore, better-specified word-specific representations gained through reading can be used in spelling, even in young children during early development. That being said, spelling a word by writing it is never a pure spelling task, as most spellers write the word and then read it to confirm whether it is correct (Ehri, 1997). In most cases, both reading and spelling contribute to the final spelling product. This may also partly explain why reading is more important for spelling than the other way around.

Strengths and limitations

A strength of the present study is the number of same-aged children from different language backgrounds and cultures who were followed over several years. However, some limitations should be considered.

First, the findings may generalize only to orthographies with characteristics similar to those of English and Swedish and to children of similar ages to the present participants. A second limitation relates to a common methodological issue in cross-linguistic research: developing equivalent reading and spelling tests that also consider the unique features of each language. We partly addressed this problem by using existing literacy measures that follow the same administration and scoring procedures across languages. Although one cannot rule out the possibility that the findings reported are partly due to the translation of tests from English to Swedish, any systematic test differences across countries should mainly affect mean performance and not the correlational analyses. A third limitation is that spelling was assessed with only one test at all grade levels, which may have had a negative impact on measurement reliability. Although the MZ-correlations for the spelling test were comparable to those reported for reading, it is possible that the correlational analyses presented in this study were influenced in part by measurement reliability and regression to the mean. Additionally, it cannot be ruled out that the use of additional or alternative spelling tests would have resulted in significant relationships from spelling to reading. For example, the use of a speeded measure could have been beneficial for assessing spelling, but few such tests exist in the literature. Typically, spelling fluency is evaluated through a recognition task that measures response times for each item (e.g., deciding whether “tak” or “talk” is the correct spelling of a word). However, assessing word spelling knowledge through recognition may result in better performance by children because they are able to visually perceive the letters of the correct spelling, rendering the process more akin to reading (see Georgiou et al., 2020, for a similar argument).
A fourth issue is that we considered both accuracy and speed in our reading measure. Although this way of assessing word reading seems meaningful in the age range studied here, one cannot directly compare the outcomes of the current study to some previous studies of reading development that used other measures (e.g., Lerkkanen et al., 2004, and Leppänen et al., 2006, included reading comprehension in their latent reading measure; Caravolas et al., 2001, only assessed reading accuracy).

Conclusions and implications

Our results show that reading and spelling reinforce each other during the early phase of literacy development. After that, reading becomes more critical for subsequent spelling development than vice versa. Specifically, spelling is essential for reading in the initial phases because it helps children discover the alphabetic principle. In later phases, reading boosts spelling because reading exposure leads to a richer lexicon of orthographic representations. Our results thus support the hypothesis put forward by Frith (1986) that spelling is important for reading mainly in the initial phase of literacy development. This developmental pattern applies irrespective of the transparency of a particular orthography. Regarding educational implications, literacy instruction should integrate a focus on reading and spelling over time so that children can benefit from the knowledge learned in both domains. If balanced carefully, such an approach will also trigger and strengthen the connections between phonemic, graphemic, and semantic knowledge (Bosman & Van Orden, 1997).