English spelling is a complex system comprising a myriad of phonological, orthographic, and morphological patterns. Although there are some consistent relationships between phonemes and graphemes, many phonemes are spelled in multiple ways (i.e., irregular phoneme-grapheme relationships). For example, it is estimated that 21% of the 3,000 most common English words violate typical spelling-sound rules (Henderson, 1986). Although English is “chaotic” at the level of individual phoneme-grapheme relationships, it is more systematic when larger, context-specific letter combinations and morphological relationships are considered (Treiman, 2017). There are reliable graphotactic patterns involving permissible letter-sequence positions (e.g., consonant doubling rarely occurs in word-initial positions). There are also morphological regularities involving links between phonemes and morphemes (e.g., /e/ is spelled ‹ea› in health due to its root being heal). Along with root consistency, morphological inflectional and derivational suffixes have consistent spelling patterns.

The present study focuses on influences of these larger-unit, morphological and graphotactic regularities on spelling. While sensitivity to English morphological spelling patterns has been extensively studied in first-language (L1) English speakers, there is limited evidence on sensitivity to these spelling regularities in bilinguals whose second language (L2) is English (Figueredo, 2006; Kuo et al., 2017). Globalization of English means that more individuals with differing linguistic backgrounds are learning English as an L2. Late bilinguals, defined as those who acquire L2 after ages six to seven and by then have begun to acquire literacy in their L1, represent a significant portion of this population (Liu et al., 2017). It is important to note that this definition encompasses a broad and heterogeneous population, including adolescent and adult learners. Given this diversity, it is crucial to systematically investigate the acquisition of English spelling among bilinguals, especially late bilinguals (Hu & McKay, 2012; Shakkour, 2014). To contribute to this goal, we compared sensitivity to spelling regularities among late Chinese-English bilinguals and English monolinguals.

Development of spelling skill

Theories of spelling differ in the extent to which they make explicit assumptions about the influence of morphological spelling regularities. While these theoretical models and frameworks are not directly comparable, they contribute to our understanding of the development of spelling proficiency. Phase models identify the progression of spelling skills over time, from preliterate/precommunicative, to alphabetic, to phonetic spelling, and finally orthographic stages, with the last stage integrating phonological and morphological regularities (Ehri, 1998; Nunes et al., 1997; Treiman & Kessler, 2003). Research underscores the importance of developing morphological awareness, i.e., the ability to identify and segment morphemes, for vocabulary growth in both L1 English speakers and bilinguals (Fracasso et al., 2016; Kuo & Anderson, 2006; McBride-Chang et al., 2008). Explicit instruction about inflectional and derivational morphemes enables developing readers to decipher unfamiliar words by deconstructing words into segments (e.g., prefixes, root words, etc.). Exposure to morphological spelling regularities and morphologically complex words not only enhances morphological awareness, thereby improving spelling skills in L1 English-speaking children, but also uniquely predicts spelling ability and listening comprehension in adults (Arnbak & Elbro, 2000; Fracasso et al., 2016; Kirk & Gillon, 2009).

In contrast, the dual-route framework aims to explain the process of spelling various words. The framework posits two parallel routes to spelling: the lexical route, which retrieves whole-word representations from long-term memory for all known words, and the sublexical route, which uses phoneme-grapheme rules for regular words and nonwords that follow typical sound-to-spelling rules (e.g., boat, foat; Houghton & Zorzi, 2003; Martin & Barry, 2012; Tainturier & Rapp, 2001). However, this framework fails to account for systematic graphotactic, morphological, and context-specific influences on spelling (Treiman, 2017).

Statistical learning theories address the limitations of the dual-route framework by proposing that learning occurs through the implicit abstraction of permissible letter sequences in words through exposure to written language (Treiman, 2018; Treiman et al., 2019). In contrast to the dual-route framework’s emphasis on phoneme-to-orthography mapping and whole-word memorization, statistical learning suggests that children first acquire information about the visual form of writing (e.g., how letters combine), then gradually internalize complex patterns with greater exposure to writing, including graphotactic and morphological spelling regularities (Sprenger-Charolles et al., 1998; Treiman, 2017).

Evidence shows that children and adults implicitly acquire and statistically learn graphotactic patterns, such as the position of consonant doublets within words (e.g., ‹ff› rarely occurs in word-initial positions; Hayes et al., 2006; Perry & Ziegler, 2004; Treiman, 1993; Treiman et al., 2002). Morphological regularities also appear to be acquired through statistical learning, with children learning more complex morphological patterns as exposure and experience with spelling patterns increase with age (Deacon & Dhooge, 2010; Levesque et al., 2021; Treiman & Kessler, 2014).

Effects of bilingualism on spelling development

Existing research with bilinguals has primarily focused on the benefits of shared alphabetic writing systems, specifically phonological awareness (i.e., the ability to manipulate speech sounds) due to its importance as a precursor to reading and spelling in alphabetic languages like English (Kruk & Reynolds, 2012; Kuo et al., 2016; Sun-Alperin & Wang, 2011). This reflects the view that bilinguals’ acquisition of reading and spelling involves cross-linguistic transfer, i.e., that L2 acquisition is facilitated by shared linguistic structures between L1 and L2 (Kuo et al., 2016; Yang et al., 2017).

Fewer studies have examined cross-linguistic transfer between English and Chinese, a nonalphabetic language. In contrast to English, written Chinese is morphosyllabic; each character represents a monosyllabic morpheme pronounced as an open syllable. Chinese characters contain component radicals which often convey semantic and phonetic information. However, this information is probabilistic rather than systematic; in contrast to alphabetic languages, it is not possible to reliably determine the pronunciation of an unfamiliar word (Shakkour, 2014; Zhou et al., 2018). Ambiguity in the orthography-phonology relationship in Chinese likely encourages readers to pay greater attention to whole-visual characters and morpho-linguistic units when reading and writing (Wang & Geva, 2003). Moreover, the orthographic structure of Chinese characters, representing thousands of different visual patterns as opposed to English only featuring a visual set of 26 letters, plays a crucial role in reading and writing. This complexity of Chinese orthography contributes to the development of enhanced visual-orthographic skills (McBride, 2016; Wang et al., 2014). Furthermore, the relative importance of phonological versus syllabic awareness differs between English and Chinese (Bruck et al., 1997; McBride, 2016). Chinese also contains many homophones, resulting in less reliance on phonology and greater dependence on orthography, further developing orthographic processing (Kuo et al., 2020). These differences may present challenges in cross-linguistic transfer from Chinese to English.

According to the Shallow Structure Hypothesis (Clahsen & Felser, 2006), based on the dual-route framework, L2 processing overuses the lexical route and underuses the grammatical sublexical route, resulting in the “shallow” processing of L2 morphological information and the indirect influence of L1 on L2 processing. Since Chinese encourages lexical processing due to the lack of a systematic orthography-phonology relationship, Chinese-English bilinguals may apply a more lexical orthographic learning strategy when processing English, which may have implications for their spelling (Ben-Yehudah et al., 2019; Nelson et al., 2009). Nonword spelling data demonstrate that, although Chinese-English bilingual children performed more poorly than English monolingual children when spelling unfamiliar nonwords requiring sublexical processing, they performed better when spelling familiar nonwords requiring lexical processing (Wang & Geva, 2003). This evidence suggests that Chinese-English bilinguals are skilled at acquiring orthographic information from even a brief exposure to nonwords, consistent with previous findings of superior visuo-orthographic processing in Chinese readers (McBride, 2016).

An alternative perspective comes from theories that assume a prominent role for statistical learning (Treiman, 2018; Treiman & Kessler, 2019). For example, Treiman and Kessler’s (2014) Integration of Multiple Patterns framework assumes that the development of spelling involves not just the memorization of word-specific spellings and general spelling rules, but also the acquisition of implicit knowledge about deterministic and probabilistic orthographic patterns through statistical learning. Integration of Multiple Patterns predicts that learning the spelling of a word is easier when more than one source of information supports a particular spelling. For bilinguals, having access to two languages may allow them to notice structural similarities and differences between linguistic systems, developing greater sensitivity to linguistic features at a more abstract level. Structural Sensitivity Theory posits that regular exposure to multiple languages facilitates the detection of structural patterns in linguistic input, including phonological, orthographic, and morphological regularities (Kuo & Anderson, 2010, 2012). Bilinguals first acquire information about the visual form of writing (e.g., how letters combine), then use statistical learning to internalize complex patterns with greater exposure to writing, including graphotactic and morphological spelling regularities (Sprenger-Charolles et al., 1998; Treiman, 2017). The present study focused on bilinguals’ sensitivity to inflectional and derivational spelling regularities, elaborated in the following sections.

Morphological spelling regularities

Conventional spelling development models posit that children learn morphemes, specifically inflectional (e.g., ‹s›, ‹ed›) and derivational endings (e.g., ‹ly›, ‹al›) by the age of nine to 10 years (Kemp et al., 2017; Tyler & Nagy, 1989). These morphological rules aid spelling development as reliance shifts from phonological awareness to morphological awareness for spelling (Fracasso et al., 2016). As implicit morphological sensitivity has not been well studied among late bilingual populations, it is important to assess both inflectional and derivational endings to investigate possible differences in sensitivity between monolingual and late bilingual participants.

Inflectional plural regularity

In English, inflectional regularity governs the spelling of regular plural endings, which must be spelled with ‹s› regardless of the final phoneme (e.g., /s/: cats, /z/: dogs, /ks/: rocks). However, singular /z/ and /ks/-endings do not follow this rule (e.g., ‹z›: quiz, ‹zz›: fizz, ‹ze›: size, ‹se›: nose, ‹x›: box; Berg et al., 2014; Kemp & Bryant, 2003; Mitchell et al., 2011). Studies have demonstrated that people’s spellings of the inflectional plural ending ‹s› may depend on statistical learning of graphotactic regularities. Plural words with voiced consonants (VC; e.g., /b/) preceding the inflectional suffix are always pronounced with a final /z/ and spelled ‹s› (e.g., bugs; Kemp et al., 2017). This graphotactic regularity is not usually explicitly taught, unlike the “plural rule” which dictates that plural words are always spelled with a final ‹s›. Individuals may learn through exposure that /z/ is spelled ‹s› or ‹z›, spelling sequences with a ‹z› following a VC never occur, and plural words with a VC before /z/ are spelled ‹s› (Kemp, 2009). Therefore, correct use of ‹s› rather than ‹z› for these forms may reflect statistical learning rather than the application of the plural rule (Hayes et al., 2006; Perry & Ziegler, 2004; Treiman et al., 2002; Treiman, 2018; Treiman et al., 2019). However, the purely inflectional plural rule is required to determine whether a final ‹s› should be used in a word that includes a long vowel sound (e.g., /iː/) before the final /z/ (e.g., fleas [plural] vs. freeze [nonplural]; Kemp, 2009; Kemp & Bryant, 2003).

Evidence from nonword spelling studies suggests that adult L1 English speakers are influenced by implicitly learned morphological and graphotactic regularities when spelling plural nonwords. For example, Kemp and Bryant (2003) conducted a study in which participants were asked to spell plural nonwords. In one condition, a final /z/ was preceded by a long vowel (e.g., /priːz/), requiring the application of the morphological plural rule for correct spelling. In another condition, the /z/ followed a VC combination (e.g., /brʉgs/), allowing the correct spelling to be determined either through morphological regularity or by avoiding the graphotactically atypical ending VC+‹ze›. Their pattern of results suggested that adults relied on implicit graphotactic and morphological knowledge rather than morphological plural regularity. There was also evidence of a role for English proficiency: Tertiary-educated adults were more likely than adults with a secondary education to apply the morphological plural rule of adding ‹s›.

Mitchell et al. (2011) provided further evidence of adults’ sensitivity to inflectional spelling regularities using a spelling choice task involving inflected and uninflected noun nonwords ending in /ks/ and /z/. Participants were required to choose between two spellings for nonwords embedded in a sentence that clearly indicated whether the nonword was plural or singular (e.g., “Mary has one yocks/yox”). Performance accuracy was similar for both word-endings and significantly above chance-level, suggesting that participants could use morphological information to determine the correct inflectional nonword-ending, and that morphological spelling regularity was implicitly acquired through statistical learning (Deacon & Dhooge, 2010; Levesque et al., 2021; Nunes et al., 2006; Treiman & Kessler, 2014). The finding that spelling ability predicted accuracy in the nonword spelling task suggests that adults with greater spelling ability may have greater exposure to and experience with reading and writing morphologically-based inflectional noun endings (Mitchell et al., 2011).

Derivational adjectival regularities

Derivational suffixes also demonstrate systematic spelling patterns. The word-ending /əs/ is spelled ‹ous› only in adjectives (e.g., nervous), while the same sound is spelled differently in other lexical categories (e.g., ‹us›, ‹is›, and ‹ice› in nouns fetus, crisis, chalice, respectively). Similarly, the /ɨk/ word-ending is almost always spelled ‹ic› in adjectives (e.g., basic) but usually ‹ick› in nonadjectives (e.g., gimmick; Berg & Aronoff, 2017). Spelling patterns in derivational suffixes are usually not explicitly taught, suggesting that individuals most likely acquire them through statistical learning (Aronoff et al., 2016; Berg & Aronoff, 2017).

Consistent with this possibility, Heyer (2021) found that ‹ous› spellings for /əs/-ending nonwords were almost twice as frequent in an adjectival context (e.g., “This is Amy. She is very /krædəs/”) than in a noun context (e.g., “This is Amy. She is a /krædəs/”). Spelling ability among the adult English-speaking and German-English bilingual participants also influenced performance, such that better spellers were more likely to spell nonwords with ‹ous› in adjectival contexts but not nonadjectival contexts. Similarly, Ulicheva et al. (2020) found that spelling ability predicted greater sensitivity to derivational regularity for highly reliable and consistent suffixes. For example, although ‹ous›-ending adjectives are less frequent than ‹ic›-ending adjectives in English, virtually all words spelled with ‹ous› are adjectives. Thus, better spellers’ effective use of consolidated derivational spelling regularities may reflect their greater exposure to and experience with written English.

Although these findings suggest that participants were sensitive to derivational regularities, particularly those with high spelling ability, it does not appear that this sensitivity results in the complete and consistent use of these regularities. Treiman et al. (2021) suggested that this may be due to a lack of attention to sentence context. They found that participants used more ‹ous› and ‹ic›-endings in adjectival than in nonadjectival contexts when asked to write complete, dictated sentences. However, the appropriate use of /əs/ and /ɨk/ spelling was still lower than expected based on the statistical frequency of these spelling patterns. Thus, although statistical learning may yield sensitivity to derivational regularities, application of these regularities may depend on participants’ attention to sentence context and the cognitive demands associated with producing sentences or evaluating the grammatical information conveyed by sentences.

Present study

Sensitivity to spelling patterns has largely been neglected in bilingual research. The handful of studies that have investigated sensitivity to systematic morphological knowledge have mostly focused on early bilinguals (i.e., those simultaneously exposed to both languages before age four; Liu et al., 2017; e.g., Kuo & Kim, 2013). Research on late bilinguals remains limited despite this group’s practical importance, given that many international university students are late bilinguals (de Bruin, 2019). Moreover, although there is growing interest in the morphological knowledge of bilinguals, most studies investigate explicit morphological awareness (i.e., ability to reflect on and manipulate morphemes) rather than implicit sensitivity to systematic morphological regularities in English spelling (Kuo et al., 2020).

The present study compared sensitivity to systematic inflectional and derivational morphological spelling regularities in late Chinese-English bilinguals and English monolinguals. A forced-choice task was used to encourage participants to process sentence context to choose between two nonword spellings. To further encourage attention to the sentence context, participants were allowed to replay a voice recording of the sentences and nonwords before responding. The study comprised two sub-experiments on inflectional plural ‹s› regularity and derivational /əs/ and /ɨk/ regularities. The inflectional sub-experiment included /z/, /ks/, and VC+/z/ word-ending conditions to assess possible differences in the use of morphological and morpho-graphotactic spelling regularities (Kemp & Bryant, 2003; Kemp et al., 2017; Mitchell et al., 2011). The derivational sub-experiment included /əs/ and /ɨk/ word-ending conditions due to their high frequency and differing consistency in derivational spelling regularity (Heyer, 2021; Treiman et al., 2021; Ulicheva et al., 2020).

The present study was designed to answer the following primary research question: To what extent do English monolingual and Chinese-English bilingual participants differ in their sensitivity to inflectional and derivational spelling regularities? Equivalent or better performance by bilinguals versus monolinguals in correctly identifying nonwords in relevant grammatical forms would suggest that bilinguals are better able to detect structural patterns in linguistic input than monolinguals, consistent with the predictions of Structural Sensitivity Theory (Kuo & Anderson, 2010; Kuo & Kim, 2013). However, poorer performance by Chinese-English bilinguals versus monolinguals in both sub-experiments may indicate that bilinguals principally rely on the lexical route when processing English spelling, as posited by the Shallow Structure Hypothesis, as nonword processing involves the sublexical route (Clahsen & Felser, 2018; Martin & Barry, 2012).

A second goal of the present study was to gain insight into the source of bilinguals’ sensitivity to spelling regularities by comparing their performance across sentence contexts and word-endings. If bilinguals are better able to detect and use morphological spelling regularities, as Structural Sensitivity Theory predicts, their performance should be similar across sentence context and word-ending conditions. As the Structural Sensitivity Theory also predicts greater detection of graphotactic regularities through exposure to multiple languages, bilinguals may perform better at VC+/z/ word-endings than other word-endings in the inflectional sub-experiment since the spelling can be determined by either morphological or graphotactic regularity.

Finally, we sought to examine systematic inter-individual differences in performance by conducting an exploratory analysis of the relationship between English spelling ability and sensitivity to morphological spelling regularities. As previous studies have found spelling ability to correlate with morphological spelling sensitivity, monolingual and bilingual participants who score higher on English spelling ability tasks may be more sensitive to morphological regularities than those with lower spelling ability (Heyer, 2021; Kemp & Bryant, 2003; Mitchell et al., 2011).

Method

Participants

The final sample comprised 129 participants (89 female, MAge = 20.39, SDAge = 3.55) from The University of Sydney who participated in the experiment in exchange for course credit (see Footnote 1). A power analysis using G*Power (Faul et al., 2007) showed that a sample size of 105 was needed to detect a small effect size (f = 0.30) for a between-factors estimation with a power of 1-β = 0.90 and α = 0.05. Fifty-one participants were English monolinguals and 78 were Chinese-English bilinguals whose L1 was a Chinese dialect. Most bilingual participants were international students who had moved to Australia to undertake senior high school and/or university studies. Of the 78 bilingual participants, 60 reported that they moved to an English-speaking culture at or after age 16. Thirty-three bilingual participants reported that they learned English through formal education, three stated that they learned through mainly interacting with people, and 42 indicated that they learned from a mixture of both. All participants met English language requirements set by The University of Sydney (see Footnote 2) and were informed that the experiment focused on spelling.

Three English ability measures were used to assess spelling (spelling recognition test) and vocabulary (LexTALE and the Nelson-Denny vocabulary test). Table 1 summarizes the average percentage of correct responses for each test and participant characteristics per language group. As expected, monolinguals had significantly higher spelling recognition, LexTALE, and Nelson-Denny vocabulary scores than the bilingual group. The study was approved by The University of Sydney Human Research Ethics Committee.

Table 1 Mean (and SD) participant characteristics

a Data points were removed from this variable due to missing data (N = 4) and invalid responses (N = 1).

b Perceived English proficiency self-reported on a scale from 1 to 7, with 1 being very poor and 7 being excellent.

Materials

Experimental stimuli

The experimental task consisted of two repeated-measures sub-experiments: an inflectional regularity task involving a 3 (morphological /z/ vs. morphological /ks/ vs. morpho-graphotactic VC+/z/ word-ending monosyllabic nonwords) × 2 (plural vs. singular sentence) design, and a derivational regularity task involving a 2 (/əs/ vs. /ɨk/ word-ending bisyllabic nonwords) × 2 (adjectival vs. nonadjectival sentence) design.

Inflectional items. A set of 60 pairs of monosyllabic nonword spelling choices selected from the materials of Kemp and Bryant (2003), Kemp et al. (2017), and Mitchell et al. (2011) was used for the inflectional regularity sub-experiment. Three conditions were compared: 20 morphological nonword pairs with a long vowel before the final /z/ (morphological /z/; e.g., bloos or blooze), 20 morphological nonword pairs with /ks/-endings (morphological /ks/; e.g., snocks or snox), and 20 pairs of morpho-graphotactic nonwords with a VC preceding the final /z/ (VC+/z/; e.g., broogs or broogze). Each item pair was presented in a sentence context indicating whether the nonword was intended as a plural noun (e.g., “How many prees/preeze can you see?”) or a singular noun (e.g., “He keeps a big prees/preeze in his cupboard”). The mean length of the sentences was 6.65 words. Two counterbalanced lists were constructed to ensure that participants only saw each item pair once, but that each nonword appeared in both sentence contexts across participants. Thus, “prees” was the correct choice in one list, while “preeze” was correct in the other. Mean split-half reliability for the inflectional items was 0.62, indicating good internal consistency (Hedge et al., 2018).
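The counterbalancing described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' stimulus-construction script: the `Trial` class, the function names, and the alternating assignment of contexts to lists are hypothetical. Only the nonword pairs and the rule that the ‹s› spelling is correct in plural contexts come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    context: str    # "plural" or "singular"
    options: tuple  # (s_spelling, alternative_spelling)
    correct: str    # spelling consistent with the sentence context


def correct_choice(s_spelling, alt_spelling, context):
    """Plural contexts call for the <s> spelling, singular contexts for the alternative."""
    if context not in ("plural", "singular"):
        raise ValueError(f"unknown context: {context!r}")
    return s_spelling if context == "plural" else alt_spelling


def build_lists(pairs):
    """Return two lists in which every pair appears once, in opposite contexts.

    Assumption: contexts alternate by item position so that each list holds
    a mix of plural and singular sentences.
    """
    list_a, list_b = [], []
    for index, (s_spelling, alt_spelling) in enumerate(pairs):
        context_a = "plural" if index % 2 == 0 else "singular"
        context_b = "singular" if context_a == "plural" else "plural"
        for target, context in ((list_a, context_a), (list_b, context_b)):
            target.append(
                Trial(
                    context=context,
                    options=(s_spelling, alt_spelling),
                    correct=correct_choice(s_spelling, alt_spelling, context),
                )
            )
    return list_a, list_b


if __name__ == "__main__":
    example_pairs = [("prees", "preeze"), ("snocks", "snox"), ("broogs", "broogze")]
    for trial_a, trial_b in zip(*build_lists(example_pairs)):
        print(trial_a.options, trial_a.context, "->", trial_a.correct,
              "|", trial_b.context, "->", trial_b.correct)
```

Because each pair takes opposite contexts in the two lists, the correct answer for a given pair flips between lists, which is the property the text describes for “prees” and “preeze”.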

Derivational items. A set of 60 bisyllabic nonword pair spelling choices selected from the materials of Heyer (2021), Treiman et al. (2021), and Ulicheva et al. (2020) was used to assess participants’ ability to correctly choose spelling based on derivational regularity. Bisyllabic nonwords were used because spelling differences between adjectival and nonadjectival words arise for bisyllabic rather than monosyllabic words. Thirty nonword pairs featured /əs/ word-endings (e.g., bormous or bormus/bormis), while the other 30 featured /ɨk/ word-endings (e.g., blenic or blenick). Half of the nonderivational /əs/ word-endings were spelled ‹us› and the other half spelled ‹is› because both are frequent nonadjectival spellings. However, both groups were combined in the analyses (Ulicheva et al., 2020). Each item pair was presented in a different sentence context indicating whether the nonword was intended as an adjective (e.g., “The snow looked smepous/smepis from the distance”) or nonadjective (e.g., “The firefighter threw the heavy smepous/smepis”). The mean length of sentences was 7.17 words. The items were presented in two counterbalanced lists and intermixed with the items from the inflectional sub-experiment. Mean split-half reliability for the derivational items was 0.64, indicating good internal consistency (Hedge et al., 2018).
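For readers unfamiliar with split-half reliability, the following sketch shows one common way such a coefficient can be computed: items are repeatedly split at random into two halves, the half scores are correlated across participants, and each correlation is stepped up with the Spearman-Brown formula before averaging. The text does not describe the exact procedure used, so the random-split scheme, the number of splits, and all function names here are assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r):
    """Step a half-test correlation up to full test length (undefined at r = -1)."""
    return 2 * r / (1 + r)


def mean_split_half(responses, n_splits=1000, seed=0):
    """Average Spearman-Brown corrected correlation over random item splits.

    responses: one list of 0/1 item scores per participant, all the same length.
    """
    rng = random.Random(seed)
    n_items = len(responses[0])
    indices = list(range(n_items))
    half = n_items // 2
    estimates = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        first, second = indices[:half], indices[half:]
        score_a = [sum(person[i] for i in first) for person in responses]
        score_b = [sum(person[i] for i in second) for person in responses]
        estimates.append(spearman_brown(pearson(score_a, score_b)))
    return sum(estimates) / len(estimates)
```

Averaging over many random splits avoids dependence on any single arbitrary division of the items, such as odd versus even.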

Language history

A modified version of Dunn and Tree’s (2009) Language History Questionnaire was administered after participants completed the main tasks to determine their previous language history. English monolinguals only answered the first two questions, which asked for their L1 and whether they were fluent or native in a language besides English. Bilingual participants answered additional questions about English acquisition and rated their fluency and usage in each language. Questions included age of first exposure to an English-speaking environment, and instructional methods (formal education, interactions with English-speakers, a combination of both, or other; see Table 1).

English ability

All participants completed the following English vocabulary and spelling ability tests.

LexTALE. The Lexical Test for Advanced Learners of English, an untimed standardized lexical decision task intended for bilingual speakers (Lemhöfer & Broersma, 2012), was administered to assess English vocabulary. Designed for English L2 speakers of high proficiency (i.e., those who formally started learning English from 10 to 12 years and use English in daily life), the test consists of 60 items; participants were asked to determine whether each item was an existing English word (e.g., word: denial, nonword: spaunch). LexTALE scores have been shown to be highly correlated with translation tests and commercially available proficiency tests (Lemhöfer & Broersma, 2012).
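As an illustration of how a LexTALE percentage score is typically derived, the sketch below averages the percentage of words correctly accepted and the percentage of nonwords correctly rejected. The split of the 60 items into 40 words and 20 nonwords and the averaging formula are taken from the published LexTALE scoring instructions rather than from the present text and should be checked against them. Whether the scores reported here were computed this way is an assumption, and the function name is hypothetical.

```python
def lextale_score(words_correct, nonwords_correct, n_words=40, n_nonwords=20):
    """Averaged percentage correct across word and nonword items.

    Weighting the two item types equally corrects for the unequal number
    of words and nonwords in the test.
    """
    if not 0 <= words_correct <= n_words:
        raise ValueError("words_correct out of range")
    if not 0 <= nonwords_correct <= n_nonwords:
        raise ValueError("nonwords_correct out of range")
    pct_words = 100 * words_correct / n_words
    pct_nonwords = 100 * nonwords_correct / n_nonwords
    return (pct_words + pct_nonwords) / 2
```

Under this scoring, a participant who accepts every item scores 50 rather than 67, because all nonwords are missed.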

Nelson-Denny vocabulary test. The vocabulary subtest from the Nelson-Denny Reading Test (Brown et al., 1993), consisting of 80 multiple-choice items, was administered to further assess English vocabulary proficiency, due to the possibility of monolinguals performing at ceiling on the LexTALE. The Nelson-Denny is designed to discriminate among college-aged readers, and Vermeiren et al. (2023) reported test-retest reliability of 0.9 when administered to L1 English-speaking university students. Participants were allowed 7.5 min for the task—a time limit found to yield a more normal distribution of scores for English-speaking university students (Andrews et al., 2020).

Spelling recognition test. Andrews et al.’s (2020) spelling recognition test was administered to assess spelling ability. The test has been found to have relatively high internal consistency (Cronbach’s alpha > 0.8) and high test-retest reliability (r = .93; Andrews et al., 2020; Andrews & Hersch, 2010). Item-level analyses conducted by Andrews et al. suggested that the test may be more discriminating among lower proficiency adult readers, further motivating its use in the present study. The untimed task instructed participants to select all incorrectly spelled items from a list of 88 items (half spelled correctly and half spelled incorrectly).

Procedure

The Qualtrics online survey system was used to administer all tasks, which participants completed online on their own devices and in their own time. After consenting to the experiment, they were informed that they would be presented with sentences containing a pair of “nonsense words” from which they had to select the more appropriate spelling in English. As each item appeared onscreen, participants were instructed to play a voice recording of the researcher (a General American English speaker) saying the sentence and nonword before responding. The tasks were self-paced, and participants were presented with the next item only after they had made a choice.

Upon completing the nonword choice tasks, participants answered demographic questions (age, gender, and handedness) and completed the language history questionnaire. Monolinguals who answered “no” to the question, “Are you fluent or native in any other languages than English?” were automatically redirected to the next task. Bilinguals who answered “yes” answered all remaining questions on the questionnaire. Participants then completed LexTALE, spelling recognition, and the Nelson-Denny vocabulary test. For the timed Nelson-Denny vocabulary test, participants were automatically redirected to the next page after 7.5 min (see Footnote 3).

Transparency and openness

The raw data, analysis scripts, and stimulus materials from the present study are publicly available at: https://osf.io/ke6s2/. The experiments were not preregistered.

Results

Sub-experiment 1: inflections

Table 2 presents the mean accuracy in the experimental spelling task for the three inflectional word-endings (/z/, /ks/, VC+/z/) in each sentence context (plural, singular) for each language group. One-sample t-tests for each condition confirmed that accuracy was significantly above chance, except for the VC+/z/ word-ending in singular contexts (e.g., broogze) for monolinguals.
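The comparison against chance can be illustrated with a short sketch. It assumes that the unit of analysis is each participant's proportion correct within a condition, which the text does not state, and that chance is 0.5 because the task offers two spelling alternatives. The function name and the accuracy values in the usage example are invented for illustration and are not data from the study.

```python
import math
import statistics


def one_sample_t(values, chance=0.5):
    """Return (t statistic, degrees of freedom) for H0: mean equals chance.

    Raises ZeroDivisionError if all values are identical (zero variance).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (mean - chance) / standard_error, n - 1


if __name__ == "__main__":
    # Hypothetical per-participant accuracies for one condition.
    t_value, df = one_sample_t([0.6, 0.7, 0.8, 0.9])
    print(f"t({df}) = {t_value:.3f}")
```

The resulting statistic is compared with the t distribution on n - 1 degrees of freedom; obtaining the p value requires a statistics package and is omitted here.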

Table 2 Mean (and Standard Deviation) accuracy in the inflectional experimental spelling tasks

Trial-level accuracy data were analyzed with a generalized linear mixed-effects model (GLMM) using the lme4 package (version 1.1-33; Bates et al., 2015) in R (version 4.3.1; R Core Team, 2023). The model tested the fixed effects of language group (monolinguals vs. bilinguals), sentence context (plural vs. singular), word-ending (/z/, /ks/, and VC+/z/), and their interactions. Language group and sentence context were specified as effect-coded contrasts (0.5, -0.5). Two word-ending contrasts were tested: (1) the morpho-graphotactic word-ending VC+/z/ vs. the morphological word-endings (average of the /z/ and /ks/ word-endings), and (2) morphological /z/ vs. /ks/ word-endings. The random effects included subject and item random intercepts, by-subject random slopes for sentence context and Context × Word-ending interactions, and by-item random slopes for the Group × Context × Word-ending interactions. Random correlations were not included. Models with more complex random-effects structures failed to converge. The GLMM summary is shown in Table 3.
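The contrast coding described above can be made concrete with a small sketch. Only the (0.5, -0.5) effect codes for group and context and the verbal definitions of the two word-ending contrasts come from the text. The numeric weights for the word-ending contrasts, the choice of which level receives the positive code, and all names below are assumptions for illustration.

```python
# Effect codes stated in the text; which level is positive is an assumption.
GROUP = {"monolingual": 0.5, "bilingual": -0.5}
CONTEXT = {"plural": 0.5, "singular": -0.5}

# Contrast 1: VC+/z/ vs. the average of /z/ and /ks/ (assumed weights).
ENDING_C1 = {"VC+/z/": 2 / 3, "/z/": -1 / 3, "/ks/": -1 / 3}
# Contrast 2: /z/ vs. /ks/, with VC+/z/ left out (assumed weights).
ENDING_C2 = {"VC+/z/": 0.0, "/z/": 0.5, "/ks/": -0.5}


def design_row(group, context, ending):
    """Fixed-effect predictors for one trial, including interaction products."""
    g, c = GROUP[group], CONTEXT[context]
    e1, e2 = ENDING_C1[ending], ENDING_C2[ending]
    return {
        "group": g,
        "context": c,
        "ending_c1": e1,
        "ending_c2": e2,
        "group:context": g * c,
        "group:ending_c1": g * e1,
        "group:ending_c2": g * e2,
        "context:ending_c1": c * e1,
        "context:ending_c2": c * e2,
        "group:context:ending_c1": g * c * e1,
        "group:context:ending_c2": g * c * e2,
    }


def dot(weights_a, weights_b):
    """Inner product of two contrasts defined over the same levels."""
    return sum(weights_a[level] * weights_b[level] for level in weights_a)


if __name__ == "__main__":
    print(design_row("monolingual", "plural", "VC+/z/"))
    print("contrast orthogonality:", dot(ENDING_C1, ENDING_C2))
```

With centered, orthogonal contrasts of this kind, each main effect is estimated at the average of the other factors' levels, which matches the way the effects are described as averaged over group and context.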

Table 3 Generalized linear-mixed effects model summary for analysis of nonword spelling choice accuracy for inflectional word endings

No significant main effect was found for language group (z = -1.92), indicating that the average accuracy of choices in the experimental task was similar between monolinguals and bilinguals. Additionally, no significant main effect was found for sentence context (z < 1), or for the interaction between sentence context and group (z < 1). The difference in accuracy between morpho-graphotactic and morphological word-endings was not significant when averaged over group and context (z < 1). Mean accuracy for /ks/ word-endings was significantly higher than for /z/ word-endings, when averaged across groups and sentence context (z = 2.15). Neither of the word-ending contrasts demonstrated significant interactions with language group (both |z|s < 1.09). However, both word-ending contrasts significantly interacted with sentence context. Morpho-graphotactic word-endings had greater accuracy in plural than in singular contexts, whereas morphological word-endings had greater accuracy in singular than in plural contexts (z = 9.23). In addition, /z/ word-endings had significantly greater accuracy in singular than in plural contexts, whereas /ks/ word-endings showed similar accuracy in singular and plural contexts (z = 4.54).

Finally, there was a significant three-way interaction between group, sentence context, and the morphological vs. morpho-graphotactic contrast (z = -6.44). Separate follow-up GLMMs for the two language groups showed that the difference in mean accuracy between plural and singular contexts was significantly larger for the morpho-graphotactic word-endings (e.g., broogs, broogze) than the morphological word-endings (e.g., bloos/snocks, blooze/snox) among monolinguals (z = 7.69) but not bilinguals (z = 1.15; see Fig. 1).

Fig. 1 Nonword spelling accuracy for inflectional items for bilingual and monolingual groups. Note. Error bars are +/- SEM

Sub-experiment 2: derivations

Table 4 presents the mean accuracy in the nonword spelling task involving the two derivational word-endings (/əs/, /ɨk/) in adjectival or nonadjectival contexts across language groups. Mean accuracy of word-ending item choices was significantly above chance, except for the /ɨk/ word-ending in a nonadjectival context for bilinguals.

Table 4 Mean (and Standard Deviation) accuracy in the derivational experimental spelling tasks

Accuracy data were analyzed with a GLMM, testing the fixed effects of language group (monolinguals vs. bilinguals), sentence context (adjectival vs. nonadjectival), word-endings (/əs/ vs. /ɨk/), and interaction effects. The random effects included subject and item random intercepts, by-subject random slopes for sentence context, word-ending, and the Context × Word-ending interaction, and a by-item random slope for the Group × Context × Word-ending interaction. Random correlations were not included. Models with more complex random-effects structures failed to converge. The GLMM summary is shown in Table 5.

Table 5 Generalized linear-mixed effects model summary for analysis of nonword spelling choice accuracy for derivational word endings

No significant main effect was found for language group (z < 1), indicating that mean accuracy was similar between groups, averaged across sentence context and word-endings. A significant main effect was found for sentence context (z = 6.46), indicating that accuracy was higher for adjectival than nonadjectival sentence contexts when averaged across group and word-ending. There was also a significant main effect of word ending averaged across group and sentence context (z = -3.20), reflecting higher accuracy for /əs/ word-endings than /ɨk/ word-endings. None of the two-way interactions were significant (all z < 1.73). However, the three-way Group × Context × Word ending interaction was significant (z = 3.33). Follow-up GLMMs conducted for each language group demonstrated that this reflected opposite effects for monolinguals versus bilinguals. For monolinguals, classification of /əs/ but not /ɨk/-ending items was more accurate in adjectival than in nonadjectival contexts (b = -0.83, SE = 0.37, z = -2.27). By contrast, classification of /ɨk/ but not /əs/-ending items was more accurate in adjectival than in nonadjectival contexts for bilinguals (b = 0.83, SE = 0.38, z = 2.20; see Fig. 2).

Fig. 2 Nonword spelling accuracy for derivational items for bilingual and monolingual groups. Note. Error bars are +/- SEM

Exploratory analyses

To examine individual differences in experimental task performance, correlations between English ability tests (spelling recognition, Nelson-Denny vocabulary, LexTALE) and experimental task items were separately calculated for monolinguals and bilinguals (Table 6). It was expected that monolinguals and bilinguals with greater spelling and vocabulary ability would perform better across inflectional and derivational sub-experiments, as previous findings suggest English ability to be a significant predictor of performance (Kemp & Bryant, 2003; Treiman et al., 2021).
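As a rough illustration of the per-group correlations, the following sketch computes a Pearson correlation coefficient from paired scores. The source does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the scores and the function name `pearson_r` are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for one language group (one value per participant)
spelling_recognition = [55, 62, 70, 74, 81, 90]
task_accuracy = [0.58, 0.61, 0.66, 0.72, 0.70, 0.83]
r = pearson_r(spelling_recognition, task_accuracy)
```

Running the function separately on each group's scores mirrors the decision to calculate correlations for monolinguals and bilinguals separately rather than pooling them.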

Table 6 Correlations between English ability variables and experimental task items for monolingual and bilingual participants

Spelling recognition, Nelson-Denny vocabulary, and LexTALE scores were all significantly and positively correlated with each other for monolinguals, but not for bilinguals. Moreover, spelling recognition scores had significant moderate to large positive correlations with accuracy in all experimental task conditions for monolinguals. Nelson-Denny vocabulary scores also had significant moderate positive correlations with experimental task conditions, except for /z/ word-ending items. Thus, participants who scored relatively high on spelling recognition and Nelson-Denny vocabulary also tended to correctly identify experimental task items. However, LexTALE scores were not significantly correlated with task items, most likely due to the test’s ceiling effect for monolinguals (approximately 73% of monolinguals scored above 90 out of 100; range: 73–100). All inflectional word-endings exhibited significant moderate to strong positive correlations with each other. Although the association was weaker, the two derivational word-endings were significantly and positively correlated with each other. Therefore, participants who scored relatively high on one inflectional word-ending item tended to also score relatively high on other inflectional word-ending items, and participants who scored relatively high on one derivational word-ending category also tended to score relatively high on the other derivational word-ending. Moreover, a strong, significant positive correlation was found between averaged inflectional and derivational items, indicating that participants who scored relatively high on inflectional items also tended to score relatively high on derivational items (Table 6A).

For bilinguals, neither spelling recognition nor Nelson-Denny vocabulary scores were significantly correlated with experimental task conditions. The Nelson-Denny vocabulary test was likely too difficult for bilinguals. LexTALE scores were not significantly correlated with inflectional experimental items, but were significantly and positively correlated with derivational items. All inflectional word-endings were significantly, strongly, and positively correlated with each other. Although the association was weaker, the two derivational word-endings were significantly and positively correlated with each other. Thus, participants who scored relatively high on one inflectional word-ending tended to also score relatively high on other inflectional word-ending items, and participants who scored relatively high on one derivational word-ending category also tended to score relatively high on the other derivational word-ending. A strong, significant positive correlation was found between averaged inflectional and derivational items, indicating that participants who scored relatively high on inflectional items also tended to score relatively high on derivational items (Table 6B).

Exploratory GLMMs were conducted to assess systematic effects of spelling ability and vocabulary on performance of monolinguals and bilinguals. Because a ceiling effect was observed for LexTALE scores (Mean = 92.03%) among monolinguals, Nelson-Denny vocabulary scores were used to index vocabulary ability. Conversely, LexTALE scores were used to index vocabulary ability among bilinguals, since the Nelson-Denny vocabulary test was possibly too difficult and resulted in low scores (Mean = 39.52%). Spelling recognition scores were used to index spelling ability for both groups. The models included the same fixed and random effects as the main analyses with the addition of the centered, continuous spelling and vocabulary scores as fixed effects. Interactions between experimental effects and the individual-differences measures were not tested due to insufficient statistical power.
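Centering the continuous predictors can be sketched as below. The scores are hypothetical, and the text does not say whether centering was done within each language group or across the full sample, so this example simply centers one list of scores on its own mean.

```python
import statistics

def center(values):
    """Subtract the mean so that 0 corresponds to an average score."""
    m = statistics.mean(values)
    return [v - m for v in values]

# Hypothetical spelling recognition scores for one group
spelling_scores = [62, 70, 75, 81, 88]
spelling_centered = center(spelling_scores)
```

After centering, the other fixed effects in a model are estimated at the average spelling or vocabulary score rather than at a raw score of zero, which no participant would plausibly obtain.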

Inflectional items

For monolinguals, the GLMM for inflectional word-ending items showed that spelling ability significantly predicted higher overall accuracy (b = 0.74, SE = 0.34, z = 2.18). The main effect of vocabulary was not significant (b = 0.34, SE = 0.29, z = 1.14). For bilinguals, neither spelling ability nor vocabulary was a significant predictor of overall accuracy on inflectional items (Spelling: b = -0.22, SE = 0.13, z = -1.64; Vocabulary: b = 0.14, SE = 0.16, z = 0.87).
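To make the reported coefficients easier to read, the sketch below recomputes the Wald z and converts a coefficient to an odds ratio, using the monolingual spelling estimate reported above (b = 0.74, SE = 0.34). It assumes a binomial model with a logit link, which is the usual choice for accuracy data but is not stated explicitly in the text. The function names are hypothetical.

```python
import math

def wald_z(b, se):
    """Wald z statistic: the estimate divided by its standard error."""
    return b / se

def odds_ratio(b):
    """Odds ratio implied by a log-odds coefficient (logit link assumed)."""
    return math.exp(b)

def wald_ci(b, se, crit=1.96):
    """Approximate 95% interval on the odds-ratio scale."""
    return math.exp(b - crit * se), math.exp(b + crit * se)

b, se = 0.74, 0.34            # reported estimate and standard error
z = wald_z(b, se)             # close to the reported z = 2.18
ratio = odds_ratio(b)         # multiplicative change in odds per predictor unit
low, high = wald_ci(b, se)
```

Because the reported b and SE are rounded to two decimals, a recomputed z can differ slightly from the published value. The odds ratio refers to one unit of the centered spelling score, and the scale of that unit is not given in this section.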

Derivational items

For monolinguals, the GLMM for derivational word-ending items showed that spelling ability again significantly predicted higher overall accuracy (b = 0.60, SE = 0.26, z = 2.32). The main effect of vocabulary was not significant (b = 0.37, SE = 0.22, z = 1.65). For bilinguals, there was a significant main effect of vocabulary on overall accuracy (b = 0.29, SE = 0.12, z = 2.38). There was no significant main effect of bilinguals’ spelling ability on overall accuracy on derivational items (b = -0.12, SE = 0.10, z = -1.25).

English exposure

Given the heterogeneity among the late bilingual group, we conducted two further exploratory analyses to examine the possible contributions of English proficiency and exposure to bilingual nonword spelling accuracy. The first set of analyses compared lower and higher proficiency bilinguals based on a median split of LexTALE scores. For the inflectional task, there was no main effect of proficiency on accuracy (z < 1) and no significant interactions between proficiency, sentence context, or word ending (all |z|s < 1.41). Similarly, for the derivational task, there was no main effect of proficiency on accuracy (z = 1.65) and no significant interactions involving proficiency (all |z|s < 1.84).
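A median split of the kind described above might look like the following sketch. The participant IDs and LexTALE scores are hypothetical, and the handling of scores that fall exactly on the median (assigned here to the lower group) is an assumption, since the text does not specify it.

```python
import statistics

def median_split(scores):
    """Label each participant 'higher' or 'lower' relative to the group median.

    Scores equal to the median are placed in the 'lower' group (an assumption).
    """
    cut = statistics.median(scores.values())
    return {pid: ("higher" if s > cut else "lower") for pid, s in scores.items()}

# Hypothetical LexTALE scores (percent correct) keyed by participant ID
lextale = {"p1": 61.25, "p2": 70.0, "p3": 75.0, "p4": 82.5, "p5": 90.0, "p6": 66.25}
proficiency_group = median_split(lextale)
```

The resulting two-level factor can then be entered in place of language group in a model of the same form. Dichotomizing a continuous score discards information, which is one reason such analyses tend to have limited statistical power.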

The second set of analyses assessed the influence of duration of exposure to English by comparing bilinguals who reported moving to an English-speaking culture before the age of 18 (N = 32) to those who moved at or after the age of 18 (N = 42). For accuracy in the inflectional task, the main effect of English exposure was not significant (z < 1), and there were no significant interactions involving English exposure (all |z|s < 1.77). For the derivational task, bilinguals who moved to an English-speaking culture at an earlier age had significantly higher accuracy than those who moved at a later age (b = -0.45, SE = 0.20, z = -2.20). However, English exposure did not significantly modulate any of the experimental effects (all |z|s < 1.14). Thus, there was some limited evidence that time spent in an English-speaking culture affected sensitivity to derivational spelling regularities among the late Chinese-English bilingual participants in the present study.

Discussion

The present study was designed to investigate whether sensitivity to morphological spelling regularities differed between late Chinese-English bilinguals and English monolinguals, and whether sensitivity differed between inflectional and derivational regularities across language groups. A forced-choice task was used to determine whether this testing method increased consistency with the statistical distribution of morphological spelling regularities. Although mean accuracy of the experimental task did not reach the same level of consistency as the statistical distribution of morphological regularities, the results provide compelling evidence of late bilinguals’ sensitivity to and use of morphological spelling regularities to guide spelling choices.

The findings of the present study provide evidence of sensitivity to inflectional and derivational spelling regularities for both bilinguals and monolinguals, consistent with the predictions of Structural Sensitivity Theory (Kemp & Bryant, 2003; Kuo et al., 2016). Our inflectional findings revealed significant differences in the interaction of sentence context for morphological and morpho-graphotactic word-ending items between monolinguals and bilinguals. Monolinguals were more likely to rely on orthographic, graphotactic regularities to guide their spelling choices, whereas bilinguals tended to rely on morphological regularities. In our derivational sub-experiment, mean accuracy was found to be higher for /əs/ than /ɨk/-ending items. However, this performance advantage was greater for monolinguals than bilinguals, indicating that monolinguals’ performance was more consistent with statistical occurrences of /əs/ and /ɨk/ word-endings in real-word nonadjectives (Berg & Aronoff, 2017). The present study extends the understanding of morphological sensitivity in spelling for late bilinguals and helps to clarify potential factors and processes involved when choosing between spellings with different morphological word-endings. In the following sections we discuss the theoretical implications of our findings in more detail.

Sensitivity to morphological word-ending spellings

Both monolingual and bilingual participants were somewhat knowledgeable about inflectional regularities (items ending in /z/, /ks/, and VC+/z/) and derivational regularities (/əs/ and /ɨk/) when deciding between nonword spellings. These findings are consistent with previous studies on English monolinguals (Heyer, 2021; Kemp & Bryant, 2003; Kemp et al., 2017; Mitchell et al., 2011; Treiman et al., 2021; Ulicheva et al., 2020) and extend the findings to bilinguals. However, for both groups, accuracy was lower than the statistical distributions of inflectional and derivational word-ending spellings, suggesting that participants did not always rely on morphological regularity to choose between alternate nonword spellings. Although statistical learning predicts that accuracy should correspond more closely to the statistical distribution of the word-ending spellings, other studies have found a similar underuse of spelling regularities (Heyer, 2021; Kemp & Bryant, 2003; Treiman et al., 2021). This may reflect participants’ use of both their knowledge of the typical context-free spellings (e.g., /z/ being commonly spelled as ‹ze›) and context-specific morphological and morpho-graphotactic patterns (Kemp et al., 2017; Treiman & Kessler, 2006). The extended process of acquiring morphological knowledge requires consideration of information beyond phonemes. Spellers often underutilize phonological context, even when it is immediately adjacent, and broader linguistic context (e.g., morphological and morpho-graphotactic regularities) poses further complexity. Therefore, despite the relatively early acquisition of morphological spelling awareness in childhood, it is a protracted process in English (Treiman et al., 2021).

With respect to effects of language group, the present study demonstrated differential sensitivity to sentence context between bilinguals and monolinguals. Monolinguals appeared to choose nonword spellings based on graphotactic regularity (i.e., not selecting the VC+‹ze› spelling because it did not seem consistent with English), rather than morphological plural regularity. Similarly, in the derivational task, monolinguals were more likely to select the ‹ous› spelling regardless of sentence context. This finding reflects statistical learning, since virtually all English words that end in ‹ous› are adjectives (Berg & Aronoff, 2017). However, monolinguals’ overreliance on ‹ous› spellings suggests that they did not consider context. Instead, they may have believed that ‹ous› word-endings were more “English-like” due to their implicit knowledge that English words with final ‹ous› are adjectives. The effortful process of considering information beyond the nonword and considering the broader context may have been difficult, as studies have shown that English spellers rely less on adjacent phonological and overall broader context than expected based on the language’s statistical structure (Kemp, 2009; Treiman et al., 2021; Treiman & Kessler, 2019).

The overall pattern of monolinguals’ nonword spelling accuracy aligns with patterns of real English words, suggesting that monolinguals rely on a frequency-based orthographic pattern rather than a morphological rule when deciding between spellings of plural and singular nonwords (Kemp & Bryant, 2003; Mitchell et al., 2011). Monolinguals’ underreliance on morphological regularity may also reflect the possibility that these participants were not taught formal grammar at school. In fact, formal grammatical instruction was removed from the national curriculum in the 1970s and has only recently been reintroduced (Australian Curriculum Assessment and Reporting Authority, 2015; Sawyer & Durrant, 2018). Therefore, monolingual participants who have not been explicitly educated in grammatical rules may exhibit limited use of morphological rules, relying more on graphotactic regularity.

In contrast, bilinguals appeared to rely more on morphological rules than graphotactic regularities. For example, bilinguals more accurately identified /ɨk/ word-endings (but not /əs/ word-endings) in an adjectival context than in a nonadjectival context, albeit to a lesser extent than monolinguals. In other words, they were more likely to spell ‹ic› regardless of sentence context. Since ‹ous› spellings are more consistent in signaling adjectival status than ‹ic› spellings, bilinguals may have heavily relied on this morphological spelling regularity when choosing between spellings for /əs/-ending nonwords. This is inconsistent with the prediction of the Shallow Structure Hypothesis that late bilinguals rely more on the nongrammatical, surface-level information of the lexical, heuristic route (Clahsen & Felser, 2018). Evidence of morphological reliance is more consistent with Structural Sensitivity Theory and may suggest that bilinguals are able to detect distributional regularities across Chinese and English, resulting in greater sensitivity to morphological spelling regularities in English, and application across sentence contexts beyond the structural overlap between Chinese and English (Kuo & Anderson, 2010). However, this account would also predict that bilinguals should have been able to detect distributional regularities of graphotactic patterns.

Another plausible explanation is that late bilinguals may rely more on morphological regularities because they often learn L2 through formal education, which explicitly teaches and emphasizes morphological grammatical rules (Williams, 2013). Indeed, virtually all of our bilingual participants learned English through formal education (96%) and had an average of 10.42 years of instruction. Therefore, rather than using graphotactic regularities which are often implicitly acquired, as predicted by Structural Sensitivity Theory and statistical learning, late bilinguals may rely more on the explicitly taught morphological plural rule when deciding between nonword spellings. These results align with those of Kemp et al. (2017), which showed that individuals who have not been explicitly taught these morphological rules do not rely on them when spelling. It is also possible that bilingual participants may have been explicitly taught adjectival endings, including the ‹ous› spelling, resulting in greater reliance on this morphological regularity.

Finally, it is important to consider whether these observed morphological sensitivities are unique to late Chinese-English bilinguals or if they are indicative of a more general bilingual experience. Ramirez et al. (2011) compared morphological awareness between Chinese- and Spanish-speaking English Language Learners (ELLs) and found that the length of stay in an English-speaking environment was a significant predictor of derivational awareness in Chinese-ELLs, suggesting that extensive English exposure is required for developing morphological awareness. This finding aligns with statistical learning and Structural Sensitivity Theory, such that greater exposure to English through living in an English-speaking environment enhances sensitivity to morphological regularities (Kuo & Anderson, 2012; Treiman, 2018). However, the influence of English exposure was lower for Spanish-ELLs than Chinese-ELLs, possibly due to the greater linguistic divergence between Chinese and English than Spanish and English in terms of derivational morphology. The limited role of derivational and inflectional morphemes in Chinese may lead to fewer opportunities for developing these skills in their L1 and therefore less cross-linguistic transfer of morphological awareness in English. On the other hand, the explicit formal English instruction experienced by many bilingual speakers irrespective of L1 may yield morphological sensitivity (McCarty, 2012). Explicit learning of spelling and grammar has been found to facilitate generalized knowledge and application of morphological rules to novel words among monolinguals (Bowers & Kirby, 2009). Future studies should compare morphological sensitivity between late bilingual groups with different L1 scripts and take into consideration whether explicit formal education may play a role in influencing morphological sensitivity (Chung et al., 2019).

Individual differences

Our exploratory individual-differences analyses revealed that spelling ability was a significant predictor of inflectional and derivational item performance for monolinguals, consistent with previous findings (Heyer, 2021; Kemp et al., 2017). Monolinguals who are proficient spellers may have greater experience and exposure to written language than poorer spellers, resulting in greater statistical learning of morphological regularities and a greater ability to apply them. This is also consistent with evidence from both masked priming and eye-movement studies that proficient spellers are sensitive to the orthographic structure of words (e.g., Andrews & Hersch, 2010; Andrews & Lo, 2013; Veldre & Andrews, 2015a, b).

Vocabulary knowledge, measured by the LexTALE task, predicted derivational item performance for bilinguals, possibly reflecting the frequency and consistency of adjectives that end in /əs/ and /ɨk/ in English (Lemhöfer & Broersma, 2012). Bilinguals with greater vocabulary knowledge may know more ‹ous›- and ‹ic›-ending adjectives. Therefore, they may have benefited from greater statistical learning of the implicit rule that ‹ous› and ‹ic› words are usually adjectives (Ulicheva et al., 2020). Greater acquisition and consolidation of derivational regularities among participants with higher vocabulary knowledge may have resulted in greater consideration of the sentence context and thus better performance on the experimental choice task.

However, vocabulary knowledge did not predict sensitivity to inflectional spelling regularities in bilinguals. One possible reason for this discrepancy is differences in formal instruction. Performance on inflectional tasks may not be dependent on vocabulary ability as inflectional rules are often explicitly taught. In contrast, sensitivity to derivational spelling regularities, which are often implicitly acquired, may depend on the amount of exposure, which is reflected in vocabulary ability (Cook, 2016; Williams, 2013). Furthermore, spelling ability was not a significant predictor of inflectional or derivational item tasks for bilinguals. This may also be attributable to formal instruction of morphological rules. Bilingual participants may have been explicitly taught these morphological regularities which they were able to apply to the experimental items, even if their spelling ability of English words was variable.

Study limitations and future directions

A limitation of the present study was that restrictions related to the COVID-19 pandemic meant that data needed to be collected entirely online. This may have affected participant compliance with task instructions, including ensuring that participants attended to the audio recording of each experimental sentence item. Online data collection may also have compromised the integrity of the individual-differences measures.

A further limitation was the appropriateness of the English ability tests used for bilinguals. Although the English ability tests significantly predicted experimental item performance for monolinguals, this was not the case for bilinguals. Neither of the vocabulary ability tests was able to index vocabulary ability appropriately across both language groups, since the Nelson-Denny test was too difficult for bilingual participants and the LexTALE was too easy for monolingual participants. Future studies comparing monolingual and bilingual speakers should utilize language tests that are more appropriate for both groups, such as Nation’s Vocabulary Size Test, which is designed to measure both L1 and L2 learners’ written English receptive vocabulary size (Nation & Beglar, 2007). However, concerns have been raised regarding the extent to which L1 tests can be used with advanced L2 speakers (see Vermeiren & Brysbaert, 2023).

Some consideration should be given to the implications of the heterogeneity of the bilingual sample in the present study. Although the average age of acquisition of English was 7.15 years, there was considerable variability in participants’ exposure to English (e.g., the age at which they first started living in an English-speaking environment and the percentage of time spent speaking English across daily activities). Variability in English exposure may influence sensitivity to spelling regularities, as proposed by statistical learning (Treiman, 2018). Although our post-hoc analysis provided only limited evidence of an influence of time spent in an English-speaking culture on bilinguals’ task performance, future studies could systematically investigate the influence of English exposure on sensitivity to morphological spelling regularities. The bilingual group also showed substantial variability in their English ability test scores. This variability in English proficiency may obscure the results, raising questions about the extent to which the patterns are attributable to English proficiency versus bilingualism per se. Although a post-hoc analysis did not find significant differences in the pattern of effects between higher- and lower-proficiency bilingual groups, this may have reflected limited statistical power. Further investigation of spelling patterns in relation to English proficiency is warranted.

Conclusion

The present study extends support for Structural Sensitivity Theory to morphological regularities in spelling and provides evidence that late bilinguals may be as sensitive to systematic morphological regularities as monolinguals. Our findings contribute to the growing body of evidence on statistical learning in both late bilinguals and monolinguals. The results of the present study also suggest that formal grammatical education may play a pivotal role in shaping sensitivity to morphological spelling regularities and in guiding the application of these regularities during spelling. As such, we recommend the relationship between explicit grammatical instruction and sensitivity to morphological spelling regularities be explored in future research.