Introduction

English is used as a lingua franca in many places (MacKenzie, 2014). It is one of the most common additional languages to be learned in many countries around the world, including Spain (Council of Europe, 2001). And yet, children face a challenge when they learn it as a foreign language, partly because of English orthographic depth and inconsistency. Consistency refers to how regular grapheme-phoneme and phoneme-grapheme translation rules are (Alegría & Carrillo, 2014). Orthographic depth is the extent to which the orthography is a phonetic representation of speech (Katz & Feldman, 2017). The more complex and unpredictable an orthographic system is, the less reliable its print-to-sound correspondences are (De Simone et al., 2021; Schmalz et al., 2015). In the case of English, the orthography differs from that of many other European languages (like Italian or Spanish) in terms of syllable complexity and orthographic regularity (Seymour et al., 2003). For instance, Spanish orthography is transparent and highly consistent, as it is characterized by one-to-one correspondences between graphemes and phonemes (with few exceptions). English orthography is more opaque, as more than one grapheme corresponds to a single phoneme (e.g., /i/ can be spelled ee like in see, or ea like in sea). Additionally, graphemes can represent different phonemes depending on the orthographic context (e.g., th may represent /ð/ like in brother, or /θ/ like in thing). The depth of the English orthography also influences other linguistic aspects, apart from grapheme-phoneme correspondences. Some spellings preserve their morphological identity at the expense of phoneme-grapheme consistency (Chomsky & Halle, 1968; Cummings, 1988), as in the case of sign and signature, which exemplifies how the appearance of the root morpheme is conserved (it is also spelled with i), despite different pronunciation (/aɪ/ in sign, and /ɪ/ in signature). 
Moreover, not only does the pronunciation of the vowel vary, but so does that of the consonant (/g/ is pronounced in signature but silent in sign). Additional examples of this double inconsistency in pronunciation (despite preserving the same root morpheme) include signal (/sɪgnəl/) and design (/dɪzaɪn/).

These differences between languages may have an influence on decoding strategies used by foreign language learners (Bhide, 2015; Rau et al., 2015; Ziegler & Goswami, 2005), and may also impact spelling (Watcharapunyawong & Usaha, 2013). The aim of this study was to explore the characteristics of the spelling errors that Spanish children produce in English when it is learned as a foreign language in order to understand the strategies that children rely on, and the effect that a native language background from a script with a transparent orthography may have on English spelling development.

Spelling

Spelling is a cognitive process fundamental to writing (Bourassa & Treiman, 2009). Solid orthographic retrieval skills (that is, being a good speller) allow more cognitive resources to be dedicated to other processes of writing (McMurray, 2006). During spelling acquisition, children rely on different types of information, and different skills support the spelling process. At first, phonological skills are among the most relevant, especially phonological awareness and knowledge of phoneme-to-grapheme correspondences (Blachman, 2000; Ziegler & Goswami, 2005). Novel words are decoded through a sublexical route by applying these phoneme-to-grapheme correspondences (Coltheart et al., 2001). This demonstrates how important phonology is for literacy acquisition. However, this route is not fully efficient, and it is not sufficient for processing words with an irregular orthographic form (Moats, 2010, 2016). Since sublexical processing is serial, applying phoneme-grapheme correspondences takes time and cognitive resources. The retrieval of a word is faster once its orthographic representation is stored in the lexicon, especially for long or irregular words (Barton et al., 2014). Thus, strong orthographic representations for words are essential to save cognitive resources, which can then be dedicated to other processes, like semantic or syntactic processing. Moreover, advanced orthographic skills are required in languages with complex orthographies like English, for example, grasping which letter sequences are legal or illegal, comprehending context-dependent spelling, and processing different patterns like digraphs and clusters (Cassar & Treiman, 1997). Lastly, morphological awareness is also relevant (Carlisle, 2000; Carlisle & Feldman, 1995). Morphology is a source of valuable information for correct spelling, especially for English due to its depth (Chomsky & Halle, 1968) and its morphophonemic orthography (Nunes et al., 2006; Venezky, 1999, 2011). 
For instance, pronunciation of certain graphemes in base words may change after inflection or derivation (magic, magician), but knowing the forms (suffixes) facilitates spelling (Bourassa & Treiman, 2008; Garcia et al., 2010).

Triple word form theory

Three linguistic elements (phonology, orthography, and morphology) comprise the Triple Word Form theory (Bahr et al., 2009; Berninger et al., 2009; Garcia et al., 2010). According to this theory, these three types of knowledge about spelling and word structure are stored and coordinated with each other in order to construct the written form of language. The interrelationship between sounds (phonology) and letters (orthography), and the relationships between base words and affixes (morphology) contribute to spelling. Studies focused on the analysis of spelling errors reveal how children rely on these sources of knowledge along with the processes that they activate during spelling (Bahr et al., 2009; Garcia et al., 2010). They also shed light on the non-linearity of spelling development since error patterns change across grades. For instance, Bahr and colleagues (2012) found a shift in linguistic error distribution. While phonological errors were more frequent than morphological errors among younger children, they found an increase of morphological errors in grades four and five. These authors attributed the shift to children’s vocabulary growth leading to the use of more complex words, whose written form may not be mastered yet. At the same time, growth in the orthographic lexicon after years of reading experience leads to less reliance on phonological information alone and thus to a reduction in phonological errors, which is also found in grades four and five. Given the transfer between spelling and reading (Conrad, 2008), the formation of strong orthographic representations through a self-teaching mechanism facilitates spelling (Shahar-Yames & Share, 2008; Share, 1995).

Error analysis

There are a variety of error collection and analysis methods. Spelling errors obtained from word dictation tasks (Caravolas et al., 2001; Dixon et al., 2012; Fashola et al., 1996; Howard et al., 2012; Raynolds & Uhry, 2010; Russak & Kahn-Horwitz, 2015; Sun-Alperin & Wang, 2008) guarantee the same number of words for all the participants, as well as control for the different orthographic features being targeted. Nevertheless, dictation tasks could artificially inflate the rate of errors and would limit the knowledge that the students can activate, if they do not recognize the target words. Collecting data in a more natural environment, like using narrative writing tasks (Bahr et al., 2012, 2015; Quick & Erickson, 2018), has greater face validity, as it approximates the natural writing process more closely. Although narrative writing can lead to avoidance of certain words which the speller may find difficult to spell, this format also avoids the possibility of encountering words that a student is not familiar with or mis-hears. Narrative writing tasks may allow the use of a wider range of a participant’s vocabulary, avoiding the constraints of researchers’ word selection. Since students will write different numbers of words, the scoring and spelling comparisons must be made by calculating percentages using the total number of words produced, and the total number of errors (Moats et al., 2006).

Regarding error analysis methods, some standardized tests assess spelling accuracy by quantifying errors through a binary measure of correct/incorrect (e.g., Test of Written Spelling-5, by Larsen and colleagues, 2013), but there are researchers who have proposed different scoring systems. Constrained approaches (Caravolas et al., 2001; Treiman & Bourassa, 2000) do not consider acceptable orthographic productions (rane for rain) as errors, as long as phonology is represented through a legal sequence. On the contrary, the aim of unconstrained approaches is to identify the type of information that children rely on during spelling: phonology, orthography, or morphology. Inaccurate spellings are analyzed, even when the result is phonetically plausible, in order to identify the contributions of these three features (Bahr et al., 2012). This latter system considers multiple linguistic sources, which give information about underlying knowledge. Furthermore, the unconstrained approach is useful for measuring students’ spelling ability (Daffern & Ramful, 2020), and the differences before and after intervention (Apel & Masterson, 2001). In foreign language learners, this method also allows for a better understanding of the type of errors that students produce, and thus instructional targets that teachers should focus on (Bahr et al., 2015; Joshi et al., 2008).

Studies of spelling errors among learners of English as a foreign language include speakers from different linguistic backgrounds, like Arabic (Al-Bereiki & Al-Mekhlafi, 2015; Allaith & Joshi, 2011; Russak, 2022), Chinese (Dixon et al., 2012), Hebrew (Kahn-Horwitz et al., 2012; Russak, 2020; Russak & Kahn-Horwitz, 2015), Italian (Palladino et al., 2016) or Spanish (Bahr et al., 2015; Fashola et al., 1996; Howard et al., 2012) (see Figueredo, 2006, for a review). In the case of Spanish, research is relevant because it is the first language of more than 580 million speakers around the world (Fernández Vítores, 2019).

Spanish and English spelling

One of the key interests in exploring spelling errors in foreign language learners is the presence of native language influence. Certain aspects may be transferred across languages (Chung et al., 2019), although this depends on many factors, such as the linguistic proximity between languages (Geva & Siegel, 2000; Kahn-Horwitz et al., 2011). In the case of English and Spanish, both languages use the same alphabet. Despite the use of a similar alphabet, there are English spelling elements that differ from Spanish, and that have been found to present a challenge for Spanish speakers, leading to misspellings that Fashola and colleagues (1996) call “predicted errors” (see Table 1). These spelling errors have been widely described in Spanish-speaking populations (Bahr et al., 2015; Cronnell, 1985; Fashola et al., 1996; Howard et al., 2006, 2012; Lindner et al., 2022; Raynolds & Uhry, 2010; Rolla San Francisco et al., 2006; Sun-Alperin & Wang, 2008; Zutell & Allen, 1988). In what follows, these error types will be described in greater detail.

Table 1 English spelling features which are challenging for Spanish speakers with examples

Phonology

The origin of some of these errors lies in differences in phonological inventories across languages. For instance, there are certain graphemes that are shared across languages, which lead to confusion if their pronunciation is not the same (Howard et al., 2012). One example is the letter j, which represents the sound /x/ in Spanish (e.g., jarrón) but the sound /ʤ/ in English (e.g., jelly). In other cases, some phonemes exist in the foreign language but not in the native language phonemic inventory (novel phonemes), so they may be more difficult to process. This idea was explained in the Linguistic Affiliation hypothesis (Russak & Saiegh-Haddad, 2011; Saiegh-Haddad et al., 2010), and it was also evidenced in English as a foreign language (EFL) learners from other linguistic backgrounds (Allaith & Joshi, 2011; Russak & Kahn-Horwitz, 2015). Furthermore, certain English consonants do not contrast in Spanish (e.g., /ð/ and /d/), and therefore, speakers fail to discriminate between them (Howard et al., 2012; Zutell & Allen, 1988). This also happens with voiced and unvoiced stop consonants (p/b, t/d and k/g), which are usually confused (Raynolds & Uhry, 2010). Errors with allophones (confusing v and b, which have the same pronunciation in Spanish) are very typical as well and have been described by Fashola and colleagues (1996) and Cronnell (1985). Vowels also represent a challenge for Spanish speakers. Spanish has five vowel sounds while English has between fifteen and twenty, depending on the variety (Deterding, 2004; Moats, 2009). As a result of this difference, many errors consist of the substitution of an English-specific phoneme (like /i:/ or /æ/) for spelling of the closest phoneme in Spanish (like /ɪ/ or /ɛ/) (Cronnell, 1985; Fashola et al., 1996; Howard et al., 2012; Sun-Alperin & Wang, 2008; Zutell & Allen, 1988). 
Additionally, vowel length is not distinctive in Spanish, while it is an important variable in English and has an influence on spelling (Fox et al., 1995). Some long vowels, like a /ei/ and i /ai/, are perceived and spelled as diphthongs by Spanish speakers (Fashola et al., 1996; Rolla San Francisco et al., 2006). Finally, confusion between the phonemes /θ/ and /s/ has been considered before (Bahr, 2015), as in South America and certain parts of the south of Spain there is no phonological contrast between these phonemes (Canfield, 1981; Lipski, 2012). However, the two phonemes are distinguished in the Spanish spoken in northern Spain.

Orthography

Apart from the difficulties stemming from phonological differences, some English orthographic patterns are challenging for Spanish speakers. Differences between English and Spanish orthography affect the size of the processing units (see the Psycholinguistic Grain Size theory by Ziegler & Goswami, 2005, 2006). English processing requires larger units, like syllables or rimes, beyond graphemes, which are enough for Spanish. As a result, there are English-specific orthographic features that do not exist in Spanish, and therefore may be difficult for Spanish speakers to process. Although consonant clusters exist in both languages, they are rarely formed by more than two letters in Spanish, and they never occur at the end of a word (only Latin words like bíceps are an exception). In English, however, triconsonantal clusters are more common, in both initial and final positions (street, tasks). Digraphs are also elements that exist in English and Spanish, but their number is more limited in Spanish (rr, ch, ll, qu and gu) and not all digraphs are shared between both languages. Moreover, Spanish digraphs contain at least one consonant, while there are English digraphs formed exclusively by vowels (ea, ee, oo…). A particular case is grapheme doubling: in Spanish this occurs with few consonants, and it is usually associated with a change in pronunciation (caro and carro are pronounced differently). English grapheme doubling depends on rules, and more consonants and vowels are affected. Finally, the letter h is the only silent letter in Spanish. In English, many letters (including consonants and vowels) can be silent (sword, name, castle), and the letter h corresponds to the phoneme /h/ (which does not exist in Spanish). The existence of these unfamiliar new patterns (see Table 1) may explain spelling errors in English as well (Fashola et al., 1996; Howard et al., 2012; Sun-Alperin & Wang, 2008).

Our study

In this study we wanted to explore the type of spelling errors that are most frequent in Spanish-speaking children spelling in English. In particular, we wondered if there were differences in phonological, orthographic, and morphological errors across grades, and to what extent the native language (Spanish) may influence spelling in English as a foreign language (EFL). The majority of studies to date have been carried out in the United States, due to the number of Spanish speakers in the American educational system (Hussar et al., 2020). Little research has been done in Spain, where English is mainly limited to educational environments, and the language of instruction in English classes is usually divided between Spanish and English. Moreover, English acquisition occurs during foundation stages of learning reading and writing in Spanish, producing a unique situation of sequential oral bilingualism and simultaneous biliteracy acquisition. In addition, some pronunciation differences between Spanish varieties (like /θ/ and /s/ phonemes) may not have been considered in previous studies. To our knowledge, only Lahuerta (2015, 2018) has assessed the type of errors that students in Spain produce when spelling in English. Nevertheless, these studies focused on other aspects, like fluency or grammatical and lexical complexity, and spelling remained in the background. Furthermore, the participants were adolescents, who are likely to make fewer errors because they have had more exposure to the written language than young children (Lindner et al., 2022). Therefore, a gap exists in the literature regarding spelling errors among younger Spanish children learning to write in EFL. The present study attempts to fill this gap, in order to glean information about the influence of linguistic variables involved in English spelling. Moreover, studying literacy acquisition by children learning EFL allows a comparison with other studies focusing on native English-speaking children. 
This, together with previous research with Spanish-speaking populations, gives us the opportunity to distinguish between the influence of developmental and linguistic factors. Specifically, we explore spelling development patterns of primary school students with the aim of answering the following research questions:

  • To what extent do Spanish children learning EFL rely on their knowledge of phonology, orthography and/or morphology when spelling in English?

  • Does the pattern of reliance on linguistic categories (phonology, orthography, morphology) change across grades?

  • What type of linguistic features will be most apparent in spelling errors in Spanish children learning EFL?

Based on previous findings, we hypothesize that:

  • In line with the Triple Word Form theory (Bahr et al., 2009; Berninger et al., 2009; Garcia et al., 2010), spelling errors in EFL among Spanish speakers will be evident in phonological, orthographic, and morphological categories.

  • Morphological errors will increase while phonological errors will decrease across grades, following the previously observed developmental pattern.

  • Spelling errors will reflect linguistic features that are challenging for L1 Spanish speakers when spelling in EFL, specifically doubling graphemes, long vowels and vowel digraphs, clusters, and novel phonemes.

Method

Participants

Demographic data

Participants were Primary Education students who attended fourth, fifth and sixth grades in a semi-private school in blinded location (Spain), a region in which a Northern Spain Spanish variety is spoken. Semi-private schools are present in other regions in Spain as well as in other European countries. These schools are partially funded by the Government, and they receive around 25–30% of the Spanish student population (Ministry of Education, 2021). The region is also representative of the Spanish population, with a per capita income similar to the Spanish average (National Statistics Institute, 2021). Contrary to the Spanish-speaking population in the United States, our participants live in a country where Spanish is the official and dominant language (although there are dialects and co-official languages in certain territories). English is acquired as a foreign language at schools. Being proficient in English is highly valued. Nonetheless, its use is mainly relegated to educational or high-profile professional contexts (like business or academic pursuits).

The sample comprised 136 participants. Forty-four students were approximately 9 years old (M = 9.7 years; SD = 3 months), 47 were approximately 10 years old (M = 10.7 years; SD = 3 months) and 45 were approximately 11 years old (M = 11.8 years; SD = 3 months). All of the participants were native speakers of Spanish. None of the participants had cognitive, learning, or behavioral impairments. Furthermore, the socioeconomic status of the students who attend this school was generally middle. All the children’s guardians provided written consent and agreed to participate. The procedure of the study was approved by the Ethics Committee of Research of the Principality of Asturias.

English language learning context

Children attended English lessons for four hours a week. Furthermore, some classes (like arts and science) were given in English in order to increase exposure to the language since the school follows a Content and Language Integrated Learning methodology (CLIL; Martínez Agudo, 2019), frequently used in Spanish schools. English instruction began in kindergarten, focusing primarily on oral communication during the first years, and integrating written content in the following grades. Although teachers were Spanish native speakers, they were proficient in English. Selected grades were chosen in order to explore literacy development stages, but also to allow students to have some experience with English writing prior to testing.

Procedure

Writing samples

Samples of a narrative task were collected, one per student. A printed template with instructions was given to the students by the classroom teachers, who were responsible for administering the task. Students had 12 minutes to produce a handwritten composition. Instructions consisted of the following sentence: “Write about yourself, your family, your house… be creative!”. Samples were transcribed by a bilingual Spanish–English speaker, who identified every misspelled word. One of the researchers reviewed the samples in order to detect any errors that had been missed. The number of words and errors per sample were counted, and the percentage of misspelled words was calculated.

Classification system

Errors were classified on two different levels.

Categories

The first and more general level used the POMAS (Phonological, Orthographic, and Morphological Assessment of Spelling) categories (Bahr et al., 2012; Silliman et al., 2006). POMAS is an unconstrained, qualitative scoring system which embodies the Triple Word Form theory. As an unconstrained system, it considers all misspellings, even those in which the phonology of a word is preserved by an acceptable orthographic representation. In this first level, errors were classified in general categories: Phonological, Orthographic, Phonological-Orthographic, Morphological or Morphological-Orthographic. Errors affecting phonology were those that did not preserve the phonological skeleton (Bourassa & Treiman, 2003) (pay for play). Orthographic errors were those that, despite representing all the phonological elements of the word, did not demonstrate appropriate orthographic notation (rabit for rabbit). Morphological errors implied errors during the processes of derivation or composition, like misspelled prefixes (andemployed for unemployed). Phonological-Orthographic and Morphological-Orthographic categories were for those errors that overlapped between two areas of development. For instance, a Phonological-Orthographic error would be a misspelling affecting the orthography, which at the same time causes a change in the phonological representation of the word (whit for with, or yuo for you). An example of Morphological-Orthographic error would be a word root spelled phonologically with an accurate suffix spelling (recepcionist for receptionist).

POMAS codes

Next, we classified the errors on a more fine-grained level, following the POMAS codes (Bahr et al., 2012; Silliman et al., 2006), which are a detailed classification of errors according to specific linguistic features derived from general American English. For instance, if an error affected grapheme doubling (rabit for rabbit), it was classified into the code OGD (Orthographic grapheme doubling), because it was an omission of an obligatory doubling. If a cluster was misspelled and not all the elements were represented (like in frien for friend), it received a PCR code (Phonological cluster reduction). This level allows comparison with studies that used the same codes, revealing similarities and differences with native speakers and EFL learners from other linguistic backgrounds.

If a word contained multiple errors, the errors were labeled separately. Errors resulting from grammar rule confusion were excluded from the analysis (he go instead of he goes), as well as other errors that were not misspellings but reflected a lack of vocabulary (e.g., use of false friends, code-switching, non-related word substitutions). By the same token, we also excluded verbs in those cases in which the spelling error was an omission (omitting -s for third person, or the whole suffix for past tenses -ed). This was done because it was not possible to distinguish between errors caused by misspelling and errors caused by lack of English grammar knowledge. However, errors affecting verbs that were clearly misspellings were included (learnd for learned). Finally, those errors that were independent of the language, like capital letters or word boundaries, were also excluded. Although other studies have considered these errors (Bahr et al., 2015), we felt that they would not be representative of children’s spelling skills in EFL, but more related to vocabulary and grammar rule knowledge. Since our participants were young EFL learners, a cautious approach was preferred.

We counted the total number of words as well as the number of errors of every participant. With this information, we calculated the percentage of total errors per participant by dividing the number of errors by the number of words. Percentage of errors per category and codes were calculated from the total number of errors. In our analysis we considered the mean number of words and percentage of total errors; percentage of errors per category; and percentage of errors per POMAS code.
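The percentage calculations described above can be sketched in a few lines. This is a minimal illustration in Python, not the scoring script used in the study (the analyses were run in R). The function name `error_percentages`, the input layout, and the sample figures are hypothetical.

```python
from collections import Counter

# Level-1 POMAS categories named in the text.
CATEGORIES = (
    "Phonological",
    "Orthographic",
    "Phonological-Orthographic",
    "Morphological",
    "Morphological-Orthographic",
)


def error_percentages(n_words, error_categories):
    """Return (errors as % of words, % of errors falling in each category).

    n_words: total words in one writing sample (hypothetical input).
    error_categories: one category label per coded error in that sample.
    """
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    n_errors = len(error_categories)
    # Total error percentage: number of errors divided by number of words.
    total_pct = 100.0 * n_errors / n_words
    counts = Counter(error_categories)
    # Category percentages use the participant's total errors as denominator.
    if n_errors == 0:
        by_category = {c: 0.0 for c in CATEGORIES}
    else:
        by_category = {c: 100.0 * counts[c] / n_errors for c in CATEGORIES}
    return total_pct, by_category


# Hypothetical sample: 80 words containing 8 coded errors.
sample_errors = ["Orthographic"] * 5 + ["Phonological"] * 2 + ["Morphological"]
total_pct, by_category = error_percentages(80, sample_errors)
```

Because the category percentages are taken over each participant's own errors, a participant with no spelling errors contributes no usable distribution. The sketch returns zeros in that case, but such samples would more sensibly be left out of category-level summaries.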

Data analysis

Several comparisons were performed on the data, using R (RStudio Team, 2020), to explore the impact of grade (fourth, fifth and sixth grade) on number of words and number of spelling errors. An ANOVA was performed for the number of words. Regarding percentage of errors, the data were tested for normality and homogeneity of variance, and a Shapiro–Wilk test was performed, showing that the distribution departed significantly from normality (W = 0.82, p < 0.001). Based on this outcome, we performed non-parametric analyses. Kruskal–Wallis tests were used to determine the most frequent errors on both levels. Furthermore, differences across grades were also assessed.
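For readers unfamiliar with the Kruskal–Wallis test, the sketch below shows what the statistic computes. It is an illustrative pure-Python version under stated assumptions, not the R code used for the analyses. The function names are hypothetical, and the p-value helper covers only the three-group case, where the chi-square reference distribution has 2 degrees of freedom.

```python
import math


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties.

    groups: list of lists of numeric observations, one list per group.
    Assumes the observations are not all identical (otherwise the tie
    correction would divide by zero).
    """
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}   # value -> mean rank of all observations with that value
    tie_term = 0   # sum of (t^3 - t) over runs of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))


def p_value_three_groups(h):
    """Upper-tail chi-square probability for 2 degrees of freedom."""
    return math.exp(-h / 2)


# Invented percentages of errors for three small groups (not study data).
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
p = p_value_three_groups(h)
```

The test ranks all observations together and asks whether the rank sums differ between groups more than chance would allow, which is why it does not require normally distributed percentages.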

Results

Number of words and percentage of errors

There was a statistically significant difference at the p < 0.05 level in the number of words for the three grade groups: F (2, 133) = 36.81, p < 0.001. On average, students wrote 75 words (SD = 28) in the fourth grade, 113 (SD = 41) in the fifth grade and 141 (SD = 36) in the sixth grade. Post hoc analyses (Bonferroni) showed significant differences between the three grade groups. The differences between fourth and fifth grades were significant (Estimate = 38.2, p < 0.001), as students in fifth grade wrote more words on average than fourth grade students. There were also significant differences between fifth and sixth grades (Estimate = 27.6, p = 0.001) and fourth and sixth grades (Estimate = 65.7, p < 0.001), as students in sixth grade wrote more words than students in fifth and in fourth grades.

Students had an average percentage of errors of 9.24% (SD = 8.10) in the fourth grade, 7.45% (SD = 7.29) in the fifth grade and 4.82% (SD = 3.62) in the sixth grade. A Kruskal–Wallis Test revealed a statistically significant difference in percentage of errors across the three grades (fourth, N = 44; fifth, N = 47; sixth, N = 45), χ2(2, N = 136) = 7.132, p = 0.028. A pairwise post-hoc Dunn test with Bonferroni adjustments indicated that the percentages of errors made by fourth grade students were significantly greater (z = 2.67) than those made by sixth grade (p = 0.022) (see Fig. 1). No other differences were statistically significant. Furthermore, ten participants (3 in fourth, 3 in fifth and 4 in sixth grades) were excluded from the rest of the analysis. Although these ten participants had grammar and lexical errors in their samples and their writing was comparable to their peers, they did not have any errors affecting spelling. Therefore, their production was not analyzed in this study.

Fig. 1 Number of Words (a) and Percentage of Errors (b) per Grade

Level 1—categories

A total of 965 errors were found, representing 6% of the total number of words produced. Errors were found in all the categories, although the percentage of errors in each category differed. The category in which most errors were found was Orthography, as 60.30% (SD = 27.38) of the errors belonged to this category (e.g., rabit for rabbit). Phonological errors were less frequent, 25.58% (SD = 23.20) (e.g., pay for play). Morphological (e.g., andemployed for unemployed), Phonological-Orthographic (e.g., whit for with) and Morphological-Orthographic (e.g., recepcionist for receptionist) errors were least frequent, with 6.73% (SD = 13.77), 4.85% (SD = 10.02) and 2.52% (SD = 8.10), respectively.

The effect of grade was only significant in certain categories. Specifically, the results of the Kruskal–Wallis chi-squared tests were significant for Orthographic (χ2(2) = 6.72, p = 0.034) and Morphological-Orthographic (χ2(2) = 6.15, p = 0.046) errors. A pairwise post-hoc Dunn test with Bonferroni adjustments indicated that the percentage of errors was different for the fourth graders compared to the sixth graders in the Orthographic category (p = 0.036) and in the Morphological-Orthographic category (p = 0.039). Specifically, students in the fourth grade had more Orthographic errors than students in the sixth grade (z = 2.51), while they had fewer Morphological-Orthographic errors (z = 2.47). There were no differences in the remaining categories, as the results of the Kruskal–Wallis chi-squared test were not significant for Phonological, Morphological and Phonological-Orthographic errors.

Level 2—codes

Due to the high number of POMAS codes, we only included those codes that had a percentage of errors of 2% or above (see Table 2). A cut-off criterion was applied following the method by Bahr and colleagues (2015). Contrary to their study, where error samples were collected in Spanish and English, we collected our data in English only. Therefore, we calculated the percentage that allowed at least 20 error instances (instead of 40), which led to a percentage of errors of 2% or above.
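The arithmetic behind the 2% threshold can be checked directly. The short sketch below assumes that the cut-off is taken relative to the pooled total of 965 errors reported in the Results, which the text does not state explicitly. The variable names are hypothetical.

```python
TOTAL_ERRORS = 965   # total spelling errors reported in the Results
MIN_INSTANCES = 20   # minimum number of instances per code stated in the text

# Smallest share of all errors that corresponds to at least 20 instances
# (assumption: the share is computed over the pooled error total).
cutoff_pct = 100 * MIN_INSTANCES / TOTAL_ERRORS

print(f"cut-off: {cutoff_pct:.2f}% of all errors")
```

Under this assumption the threshold is slightly above 2%, which matches the rounded criterion of 2% or above.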

Table 2 Most Common POMAS Codes and Percentage of Errors (Mean and SD)

The most frequent errors were grapheme doubling (OGD) representing 15.05% of the errors (SD = 23.31), followed by other orthographic errors like long vowel digraph misrepresentation (OVDI) with 9.56% (SD = 17.08), stressed short vowel substitution (OSE) with 6.38% (SD = 11.13), unstressed vowel misrepresentation (OUE) with 5.70% (SD = 13.51), and short vowel digraph misrepresentation (OSD) with 4.93% (SD = 11.48). Regarding errors of phonological origin, some of them occurred relatively frequently, like cluster reduction (PCR) with 5.36% (SD = 12.73) and addition of unnecessary letters (PEP) with 6.13% (SD = 12.91). Morphological and phonological-orthographic errors were less common, for example, homonyms (MHOM) with 3.49% (SD = 11.97) and phonological-orthographic reversal (POR) with 3.86% (SD = 7.90) which were the most frequent errors in each category, respectively. Examples of errors classified into every code are provided in Table 2.

With regard to the effect of grade, the results of the Kruskal–Wallis chi-squared test were significant for the orthographic error grapheme doubling (χ2(2) = 7.98, p = 0.018). A pairwise post-hoc Dunn test with Bonferroni adjustments indicated that, while the percentage of errors was similar for fourth and fifth graders, the difference was significant between fifth and sixth graders (p = 0.018). Specifically, fifth grade students made more errors than sixth grade students (z = 2.73). The effect of grade was also significant in the case of the phonological error consonant deletion (χ2(2) = 7.89, p = 0.010). The pairwise post-hoc Dunn test with Bonferroni adjustments also indicated significant differences between students in fourth and sixth grades (p = 0.018), since students in sixth grade made more errors involving consonant deletion (z = 2.73).

Discussion

The aim of this study was to assess the variety of errors that Spanish-speaking children produce when writing in English. To do so, we analyzed and classified spelling errors obtained through a free writing narrative task. The categories that constituted the analysis were based on the POMAS system (Bahr et al., 2012). We hoped to highlight the different sources of knowledge (phonology, orthography, morphology) children rely on when spelling in a foreign language, specifically, a foreign language like English, which differs in terms of orthographic consistency from their native language (Spanish).

Predictably, the mean number of words produced was highest for sixth grade children, while the percentage of errors for this group was lowest. This could be explained by the increase in both writing practice and exposure time to English during their schooling experience, in line with other studies performed with foreign language learners (Chenoweth & Hayes, 2001; Palviainen et al., 2012). Overall performance across grades was good, as the percentage of errors did not exceed 10% in any grade. Nevertheless, the percentage was lower for older children. An improvement in spelling accuracy among older participants, with a continuing presence of errors, has been found in Spanish EFL learners (Lindner et al., 2022) and in EFL learners from other linguistic backgrounds (Hebrew: Russak & Kahn-Horwitz, 2015; Russak, 2022; or Arabic: Russak, 2022). The reason for these findings could be the complexity of the English orthography combined with a lack of explicit instruction at school about specific English spelling rules, such as context-dependent spellings and rimes. Children’s spellings may improve as they build orthographic representations for already known words, but since they keep learning new vocabulary, they may keep making errors.

The first two questions considered whether Spanish children relied on phonology, orthography, and/or morphology during spelling in English, and how their grade may influence the distribution of errors in each category. This was assessed using the first classification level, the POMAS general categories. We found phonological, orthographic, and morphological errors, which confirms our first hypothesis concerning the presence of errors in all the categories. Nevertheless, morphological errors were the least common. This could be due to the choice of words by participants, who (as less competent EFL learners) might have preferred simpler words that did not involve morphological word formation processes like compounding, derivation, or inflection. Phonology was the category with the second lowest percentage of errors. Based on this finding, it appears that these children have partially internalized English phonological rules at this stage, with fewer errors related to phonology than to orthography. Orthographic errors were the most frequent, in line with what Bahr and colleagues (2015) found in Spanish–English students. According to these authors, the origin of the spelling errors could be attributed to a reliance on phonology combined with an incomplete knowledge of English orthography.

Regarding differences across grades, we found an increased percentage of Morphological-Orthographic errors among older children. Their higher proficiency and more frequent use of morphologically complex words could be the reason for this finding, as suggested in previous studies (Bahr et al., 2012; Berninger et al., 2010). Moreover, the percentage of orthographic errors was higher for younger students, as previously found in native speakers (Bahr et al., 2012). We believe that limited experience could explain the high number of misspellings at the younger age. Improved scores among older children are likely to result from deeper knowledge of English orthographic rules (Lindner et al., 2022). Greater exposure to English words may have strengthened the older children’s orthographic representations (Shahar-Yames & Share, 2008; Share, 1995). However, the percentage of phonological errors remained stable across grades. This result contradicts our hypothesis, as well as the results found by Bahr and colleagues (2012), where native speakers became less dependent on phonology while building their orthographic lexicon. In contrast, our participants showed continued dependence on phonology. This suggests that the developmental spelling pattern in EFL among Spanish-speaking children differs from the pattern of native spellers. Among the phonological errors, most occurred with novel phonemes (as in broder for brother, cins for things, or tolk for talk), which strongly supports the Linguistic Affiliation hypothesis (Russak & Saiegh-Haddad, 2011; Saiegh-Haddad et al., 2010). Less familiarity with the novel phonemes and the absence of an identical phoneme in the native language present a challenge for EFL learners (Wade-Woolley & Geva, 2000), as evidenced among Hebrew (Russak, 2022; Russak & Saiegh-Haddad, 2011), Arabic (Russak, 2022), Chinese (Wang & Geva, 2003) and Spanish (Raynolds et al., 2013) speakers.
Moreover, considering that readers of transparent orthographies (like Spanish) rely more on phonology, the continued use of phonological strategies shown by our participants could represent interference from their native language.

The POMAS category classification contributed to answering our first and second research questions about reliance on linguistic knowledge and changes across grades. The third question examined the type of linguistic features that were most apparent in the spelling errors. Code classifications provided a detailed analysis of our participants’ spelling errors, as well as further evidence of native language interference. The most frequent errors were epenthesis (PEP) and cluster reductions (PCR) for Phonology; grapheme doubling (OGD) and errors affecting vowel representation (OVDI, OSE, OUE) for Orthography; phonological-orthographic reversal (POR) for Phonological-Orthographic; and homonyms (MHOM) for Morphology. All these errors were also found to be common in previous studies using the POMAS classification (Bahr et al., 2012, 2015). In what follows, we discuss these errors.

Epenthesis errors (addition of unnecessary letters) were among the most frequent phonological errors. These errors involved consonants, like r (beutirful for beautiful) or d (bildich for village), but also the vowel e (mouthe for mouth, abaute for about). In the case of the letter e, the origin could lie in a misapplication of final silent e. Children seem to be aware of the existence of this feature, but they do not discriminate the cases in which it must be used. Regarding consonants, a specific pattern of misuse was observed for the letter h. Additions of this letter were either arbitrary and isolated (havaut instead of about) or served to form a digraph (thenager instead of teenager, fhather for father). According to Bahr and colleagues (2015), both errors could be considered an influence of language transfer: use of native language linguistic features (the letter h is silent in Spanish) and overgeneralization of English spelling patterns (digraphs), which are novel for the EFL learners. As these authors suggest in their study, the origin of this last pattern could be an insufficient knowledge of English spelling conventions (children are not totally aware that digraphs represent specific phonemes). Errors affecting digraphs were also found by Palladino and colleagues (2016) among Italian EFL learners. As in Spanish, some Italian digraphs are formed with h (ch and gh), although English has specific digraphs that are novel for Italian speakers as well. Another frequent phonological error was cluster reduction. This error seems to be frequent in the L1 spelling of younger children (Bahr et al., 2012; Treiman & Cassar, 1996), but there were no differences across grades in our sample. Furthermore, most of the reductions occurred in final clusters (frien instead of friend), which are very uncommon in Spanish.
Our results are similar to previous findings among Spanish speakers (Fashola et al., 1996; Lindner et al., 2022), and point to L1 interference, supporting what Bahr and colleagues (2015) conjectured in their study. But while clusters and initial h were found to be a challenge for Spanish-speaking children, they were not problematic for speakers with other linguistic backgrounds like Hebrew (Kahn-Horwitz et al., 2011; Russak & Kahn-Horwitz, 2015). This suggests that EFL learners have specific difficulties with different English features, depending on their native language.

In line with earlier findings among non-native English spellers, many of the orthographic errors involved vowels (Bahr et al., 2015; Russak, 2022; Sun-Alperin & Wang, 2008). In particular, errors affecting long vowel digraphs, stressed short vowels and unstressed vowels were the most frequent in our study. Long vowels were sometimes spelled like diphthongs (miusic for music), reflecting the findings of Rolla San Francisco (2006) and Fashola and colleagues (1996). In some cases, the addition did not involve a vowel letter, but a consonant representing a vowel sound (hellow for hello). Sometimes the error implied a total substitution of the grapheme (hay for I). There were no significant differences across grades in errors affecting vowels, in contrast to what Bahr and colleagues (2012) found with native speakers. The status of English in our study as a foreign language (which implies less exposure to the language than native speakers have) may explain the contradictory findings. Regarding homonyms, our results support previous findings with bilinguals (Bahr et al., 2015) and native children (Bahr et al., 2012) suggesting that the selection of an inappropriate word reflects the choice of the most familiar form, instead of semantic processing of the linguistic context (their for there). This is a plausible explanation in our context as well, considering that our participants have had limited exposure to English. Orthographic forms may not be consolidated yet, and thus children transcribe the phonemes to the best of their limited ability.

One particularly interesting finding is the presence of differences across grades only for consonant deletion (PCD) and grapheme doubling (OGD). Grapheme doubling was one of the most frequent errors, especially among younger students. Errors were caused by omission (ofice for office) or addition (preffer for prefer). This confirms that it is a particularly challenging feature among Spanish speakers (Howard et al., 2006). Furthermore, no errors with grapheme doubling were found in other EFL learners, such as Hebrew and Arabic speakers (Russak, 2022). The doubled consonants that exist in Spanish have no bearing on the pronunciation of the preceding vowel (as they do in English). Thus, this specific rule may be difficult for Spanish EFL learners to assimilate if it is not explicitly taught. Less exposure to the English orthography may be another reason why our participants still have difficulties with this feature. As with other orthographic errors, improvement among sixth graders could be due to the growing understanding of specific English spelling rules, or the formation of orthographic representations as a result of increased exposure and practice with the written form of the language (Shahar-Yames & Share, 2008). Nevertheless, there seems to be an emerging awareness of doubled consonants and positional constraints among younger children. As with native speakers, who also struggle with consonant doubling at early stages (Bourassa & Treiman, 2003; Cassar & Treiman, 1997), errors never occurred in word-initial position. This spelling pattern, although not yet mastered, may be beginning to be assimilated by Spanish children spelling in EFL. A specific explanation could be plausible in those cases where doubling is applied to the wrong grapheme in the word (soceer for soccer, funyy for funny): EFL learners (who are not familiar with English phonology) could also be attempting to spell difficult words through visually accurate matches.
A similar strategy is found among native spellers with less developed phonological skills, who are more likely to use visual memory strategies (Lennox & Siegel, 1996). The way that native speakers integrate certain features of orthographic knowledge (Bahr et al., 2009; Wright & Ehri, 2007) could be the same in foreign language learners. Additionally, older children in our study had a higher percentage of errors related to consonant deletion (PCD) (tenni for tennis, birthay for birthday), although this code was not very frequent. This finding is in line with previous research with native and bilingual speakers (Bahr et al., 2012, 2015). Bahr and colleagues suggested that the origin of deletion errors could be an increased focus on what students wanted to say during writing, instead of paying attention to the form of individual words. Considering that our participants are writing in EFL, it is possible that certain consonant deletion errors may be influenced by working memory load. The demands of the task and the redistribution of cognitive resources could also be responsible for some of these and other errors, and not only the lack of knowledge of the spelling patterns themselves.

In sum, this study revealed more information about how Spanish children learning EFL begin to coordinate multiple sources of linguistic knowledge when writing, and how these patterns change across grades. With the help of an unconstrained approach to measuring spelling accuracy, the POMAS system (Bahr et al., 2012), we can conclude that our participants rely on phonology, orthography and morphology during spelling. The distribution of errors varies across grades for morphology and orthography, while remaining constant for phonology. Our findings also suggest that spelling acquisition in EFL learners is not exactly equivalent to that of native speakers, being more affected by linguistic and educational variables than by developmental factors. Moreover, L1 Spanish interference can be confirmed by the presence of the typical Spanish speakers’ errors that were introduced in Table 1, supporting our third hypothesis as well. Novel phonemes and English orthographic conventions represent a challenge for all spellers learning EFL, although which patterns are unfamiliar to them will depend on their native language and its proximity to English. This is strongly evidenced by the similarities and differences found between the present findings and other studies carried out with speakers from different linguistic backgrounds (Bonifacci et al., 2017; Dixon et al., 2010; Palladino et al., 2016; Russak, 2022; Russak & Kahn-Horwitz, 2015).

One of the limitations of this study relates to the category of phonological errors. Since most English teachers in Spain are not native speakers, it is impossible to assess whether children have simply generated wrong phonological representations (Raynolds & Uhry, 2010; Read, 1971, 1986), or whether they have learned the wrong phonological forms from their teachers, as mentioned in previous studies (Russak & Kahn-Horwitz, 2015). In both cases, inaccurate phonological representation could be due to difficulties with discrimination between certain phonemes (novel and familiar), corroborating the Linguistic Affiliation hypothesis (Russak & Saiegh-Haddad, 2011; Saiegh-Haddad et al., 2010). Future studies should take this issue into consideration. Another limitation of this study has to do with one of the disadvantages of narrative tasks, which relates to word choice. Children may avoid certain words if they consider their spellings to be too challenging. This issue could be addressed by including dictation tasks with a closed list of words. Also, including younger and older students could provide a more complete developmental map of spelling errors among Spanish children learning EFL at school. This need is supported by the fact that spelling errors are also found in samples of older participants, like secondary school students (Kiernan & Bear, 2018; Lahuerta, 2019).

Despite these limitations, we believe that our findings have important theoretical and educational implications. Spelling error analysis broadens our knowledge about Triple Word Form theory (Bahr et al., 2009; Berninger et al., 2009; Garcia et al., 2010) in non-native speakers. Moreover, the POMAS system (Bahr et al., 2012; Silliman et al., 2006) has proved to be a useful tool in assessing which linguistic elements are more challenging for children learning EFL, and how these elements may vary depending on grade level. Our results also present a table of spelling error patterns that could be used in comparative studies. In this study the comparison was made with other studies analyzing spelling errors by native speakers, but it could also be made with speakers from other linguistic backgrounds. Regarding educational implications, information about spelling error patterns can help teachers detect which features or aspects should be targeted during instruction (Kiernan & Bear, 2018).

Finally, findings from this study confirm the existence of difficulties originating in differences between Spanish and English linguistic features. Similar percentages of phonological errors across grades also suggest a need for improving phonetic discrimination, as well as consolidation of English phonological knowledge. Given the challenge that novel phonemes present for Spanish-speaking children (a fact explained by the Linguistic Affiliation hypothesis), students could benefit from increased oral exposure (ideally with input from native speakers). Furthermore, explicit training with typical English grain size units, like syllables or rimes, could facilitate the children’s acquisition of the opaque English orthography. The more we know about how written languages are acquired, the more we can support children during the important achievement of learning a second language. As Frank Smith (2014) said, “One language sets you in a corridor for life. Two languages open every door along the way”.