Writing development

Becoming a skilled writer is necessary for academic success and participation in the global economy; yet, it is well-established that around 20% of students struggle with the writing process (DfE, 2019; Graham & Harris, 2009; NCES, 2012). The written products of struggling writers (SWs) are exemplified by shorter texts, increased errors in spelling, punctuation and grammar, reduced lexical diversity, poorly constructed, short or incomplete sentences, and poor compositional quality (APA, 2013; Dockrell & Connelly, 2015; Dockrell, Connelly, & Arfe, 2019; Saddler, Behforooz, & Asaro, 2008a, b; Sumner, Connelly, & Barnett, 2014). Understanding the skills that underpin writing development supports the development of interventions for SWs.

Models are designed to capture the skills that children need to produce a written text, the more distal factors which underpin these skills and the wider task environment (Graham et al., 2018). The initial models of writing development identified key components in the writing process (transcription and idea generation) and other factors, such as working memory, which supports written text production (Berninger & Winn, 2006; McCutchen, 1996; Olive, 2014). More recently, researchers have moved towards considering both direct and indirect factors influencing writing development (Dockrell et al., 2019; Kim & Schatschneider, 2017). Proximal factors include those skills which directly impact on the production of written text, such as spelling. By contrast, distal factors are those which indirectly impact on the writing process, such as oral language and reading. SWs often experience associated difficulties in these areas (see O'Rourke, Connelly, & Barnett, 2018 for a review).

The role of reading and oral language in writing interventions remains underexplored. Poor readers produce texts with more spelling errors, less lexical diversity and reduced compositional quality than typically-developing peers (Caravolas, Hulme, & Snowling, 2001; Sumner, 2013; Sumner et al., 2014). Reading abilities are most closely associated with spelling skills (Abbott & Berninger, 1993; Berninger et al., 2008) and so reading may influence writing both directly and indirectly through spelling. By corollary, oral language at word, sentence and text level supports text generation (Abbott & Berninger, 1993; Babayiğit & Stainthorp, 2010; Kim, Al Otaiba, Folsom, Greulich, & Puranik, 2014; Olinghouse & Leaird, 2009; Savage, Kozakewich, Genesee, Erdos, & Haigh, 2017; Sénéchal, Hill, & Malette, 2018). The importance of oral language in supporting written text production is evident from the significant difficulties in writing experienced by children with language problems (for a review, see Graham, Hebert, Fishman, Ray, & Rouse, 2020).

There is evidence that children with poor oral language are less responsive to effective reading interventions (Al Otaiba & Fuchs, 2002) or may need more intensive interventions. As such, poor oral language skills may also affect responsiveness to writing interventions. By extension, reading difficulties may reduce the efficacy of writing interventions given that reading interventions support writing performance (Graham et al., 2018). Thus, understanding the indirect influence of oral language and reading on SWs' response to intervention provides the basis for developing effective and targeted writing interventions. The current study explores the role of reading (both directly and indirectly through spelling) and oral language (listening comprehension and oral expression) in SWs' responsiveness to writing interventions.

Effective interventions for struggling writers

Identifying where to intervene for SWs in upper primary (ages 7 to 11) is challenging as, although many children will have automaticity in the word-level transcription skills (e.g. spelling), classroom teaching has typically moved to higher-level skills, such as morphology and sentence-level skills (Applebee & Langer, 2011; Dockrell, Marshall, & Wyse, 2016). The process of identifying and adapting effective interventions can be considered within the Response to Intervention (RTI) framework (Jimerson, Burns, & VanDerHeyden, 2007). Tier 1 interventions focus on whole-class teaching practices. When children do not respond to regular effective classroom teaching, progressing them to interventions at Tier 2 should be considered. Tier 2 interventions are designed to supplement classroom-based instruction and typically occur in small groups. Finally, those still falling behind need to progress to more intensive and specialised Tier 3 interventions, delivered individually. The current study implements a Tier 2 writing intervention, designed to supplement classroom teaching through small-group sessions that support the writing skills of those students who are most at risk of falling behind their peers (Jimerson et al., 2007).

Sentence combining interventions

Complex, syntactically-correct sentences characterise competent writing. Constructing well-formed sentences can be problematic for SWs. Berninger et al. (2011) found the ability to combine syntactically-correct written sentences develops around seven to eight years of age in typically-developing children. However, this can be delayed for SWs, for whom sentence-level difficulties are a significant weakness up to age 11 and often beyond (Dockrell et al., 2019). Research has established sentence-combining teaching practice as an effective way to develop sentence-level competence for children aged five to 18 years, with moderate to large effect sizes (Andrews et al., 2004; Graham et al., 2012; Graham, Harris, & Hebert, 2011; Graham & Perin, 2007; Saddler, Ellis-Robinson, & Asaro-Saddler, 2018; Santangelo & Olinghouse, 2009). Sentence-combining instruction teaches students to combine two or more simple sentences to make one grammatically correct sentence. Studies have found this leads to significant improvements in sentence-combining ability and the syntactic maturity of written sentences and also improves the compositional quality of children's stories, with moderate to large effect sizes (Andrews et al., 2004; Datchuk & Kubina, 2013; Graham & Perin, 2007; Saddler, Asaro, & Behforooz, 2008a, b).

Morphological spelling interventions

Children who experience significant difficulties with spelling may find sentence-level interventions too challenging (Berninger & Amtmann, 2003). Consequently, a word-level spelling intervention complementing existing classroom-based teaching may be more effective. Morphological spelling interventions reflect the shift in focus of classroom-based instruction in upper primary from phonology to morphology. Research has shown morphological spelling interventions can improve children's spelling, sentence-combining, and text-level writing (Bryant & Nunes, 2000; McCutchen & Stull, 2015; McCutchen, Stull, Herrera, Lotas, & Evans, 2014; Nunes, Bryant, & Olsson, 2003). A series of studies by Nunes and Bryant (2006), with children eight years of age, demonstrated that making children explicitly aware of morphemic spelling principles, such as the use of derivational suffixes, improved children's spelling ability. Similarly, McCutchen et al. (2014) found significant improvements in 10–11-year-olds' use of morphologically complex words in a sentence-combining task and an extended writing measure following a 12-week morphological spelling intervention.

Assessing writing

Measures to identify SWs need to be reliable while capturing the key components of written text production. Measures must also discriminate between typically developing writers and SWs at different points in development (Bew, 2011; Dockrell, Connelly, Walter, & Critten, 2017). Standardised measures often focus on the compositional quality of the text, using holistic or analytical scoring. These can be quick to mark; however, they can be unreliable and often lack the sensitivity to change which is required for evaluating intervention effectiveness and often cannot be administered repeatedly across short time intervals (Dockrell et al., 2017; Dunsmuir et al., 2015).

Curriculum-based Measures of Writing (CBM-W) have a dual focus on the writers' productivity and accuracy; they are quick to administer, reliable, valid, sensitive to change, and able to discriminate between SWs and typically-developing children aged 7–12 at the word-, sentence- and text-level (Dockrell et al., 2017; Gansle, Noell, VanDerHeyden, Naquin, & Slider, 2002; Weissenburger & Espin, 2005). CBM-Ws involve pupils writing for a short time (three to seven minutes) in response to a prompt; texts are then scored on a range of measures (Dockrell et al., 2017; Gansle et al., 2002). These measures can be repeated over time and thus offer a useful tool for evaluating interventions.

Current study

Significant numbers of students struggle to learn to write; therefore, practitioners need access to effective resources that can be utilized as Tier 2 interventions. To date, research suggests that sentence combining interventions have the potential to improve children's writing. Yet little is known about the moderating effect of reading or oral language in response to the sentence combining interventions nor whether word-level interventions would be more effective for these children.

To address these limitations, the current study used an RTI framework to evaluate the effectiveness of a Tier 2 sentence-combining (SC) intervention on written composition skills in comparison to a word-level morphological spelling (MS) intervention and a waiting list control (WLC) group receiving standard classroom teaching. The SC intervention was adapted from the work of Saddler and colleagues (e.g. Graham et al., 2008; Saddler, 2012; Saddler & Graham, 2005) to be a Tier 2, small-group intervention to support SWs. Furthermore, since children's writing is influenced indirectly by oral language and reading, the current study explored the currently neglected role of these skills in children's response to intervention. A CBM-W measure was used to assess writing, capturing both productivity (e.g. total words written, words spelled correctly, and number of sentences) and accuracy (e.g. proportion of correct word sequences) at the word-, sentence- and text-level.

It was hypothesized that the sentence-combining intervention would improve sentence combining ability, compositional quality and measures of productivity and accuracy captured by the CBM-W. It was also expected that children with poorer language and spelling skills would be more resistant to change. The MS intervention was predicted to improve spelling accuracy within the text. It was anticipated that this intervention would be beneficial for weaker spellers.



Participants were drawn from three primary schools in the UK which were also participating in a parallel longitudinal study on children's writing development. Screening for the intervention, conducted at the start of the academic year, was the longitudinal study's first time point. A total of 532 children were screened in years 4 (aged 8–9) and 5 (aged 9–10) and were identified as SWs if, on a standardised writing measure (Progress in English 9, PiE; Kirkup, Reardon, & Sainsbury, 2006), they were in the bottom 20% of their year group from each school.

These 123 identified SWs had significantly lower writing scores on the PiE screening measure than their typically-developing peers, t (326) = 11.72, p < 0.001, d = -1.73. Parental and child consent was provided. Two further exclusionary criteria removed children who were not monolingual English language speakers and children already receiving another writing intervention. Thus, a total of 108 SWs were invited to participate in the intervention, and 76 (70.4%) received parental consent. Attrition rate was 6.6% as five participants failed to complete the intervention through withdrawal (n = 2) or disruptive behaviour (n = 3).

A total of 71 SWs, aged between 7 years 10 months and 10 years 2 months (mean age = 9.08 years, SD = 7.85 months), completed the intervention period and were present for t1 and at least one post-test session (t2 and/or t3). There were 22 girls (10 from school 1, 7 from school 2, 5 from school 3) and 49 boys (24 from school 1, 13 from school 2, 12 from school 3), and no significant school-based differences in t1 performance were found in any measures in the study.

Assessment battery

The assessment battery was administered in class for the PiE, 1:1 for the oral language and reading measures, or in small groups for CBM-W.

Screening and matching

Progress in English (screening)

Children completed the long form for the PiE 9 (Kirkup et al., 2006) in two blocks on consecutive days. This included narrative and non-narrative reading comprehension tasks, a story writing task, a letter-writing task, a ten-word spelling test and a grammar test. SWs were identified using the writing subtests; correlation with teacher assessment levels for writing = 0.65; reliability of the whole PiE assessment battery = 0.93.

Oral expression (t1)

Wechsler Individual Achievement Test 2nd Edition UK (WIAT-II) Oral Expression (Wechsler, 2005) subtests were administered to assess children's ability to communicate using oral language. There are three subtests for this age range: word fluency, visual passage retell and giving directions. The visual passage retell task requires children to tell a story based on a pictorial storyboard. Stories are marked 0, 1 or 2, with the inclusion of specific story elements and elaboration being rewarded. Test–retest reliability = 0.86, internal reliability = 0.83–0.89. The standardised score for this scale was used as a measure of children's oral expression.

Listening comprehension (t1)

WIAT-II Listening Comprehension (Wechsler, 2005) subtests (receptive vocabulary, sentence comprehension and expressive vocabulary) were administered to assess children's ability to understand what they are hearing; reliability = 0.80. The standardised score for this scale was used as a measure of children's listening comprehension.

Single word reading (t1)

The British Ability Scales 2nd Edition (BAS-II; Elliott, Smith, & McCulloch, 1997) word reading subtest was administered to assess children's oral reading of single words, with a focus on their word decoding skills; reliability = 0.93. The measure's ability score was used to assess children's single word reading skills.

Target measures

Sentence combining (t1, t2, t3)

Children completed all five items, including those designed for older children, from the WIAT-II Sentences subtest (Wechsler, 2005). They combined, in writing, a series of five sentences of gradually increasing complexity. Each sentence received a score of 0, 1 or 2. Therefore, the maximum score was 10. Inter-rater reliability κ = 0.86.

Single word spelling (t1, t2, t3)

The BAS-II (Elliott et al., 1997) single word spelling subtest was administered to assess children's spelling ability. Children were asked to write the given words which were read out alone and in the context of a sentence. Words gradually increased in difficulty, ceiling and basal rules were applied, and raw scores were converted to ability scores. The test was discontinued when children passed two or fewer words in a block. Reliability = 0.91.

Writing product: curriculum-based measures of writing (CBM-W)

The outcome measure of intervention effectiveness was the CBM-W narrative writing task (Dockrell, Connelly, Walter, & Critten, 2015) undertaken by the children at all three assessment time points. Children were given five minutes to write in response to a prompt, e.g. One day I had the best weekend ever. This measure was used to establish the extent to which the interventions were effective in generalizing to writing by assessing writing productivity and accuracy.

Compositional quality (t1, t2, t3)

The text-level outcome was the compositional quality of the text; this was scored using an adaptation of the WIAT-II Holistic Scoring criteria for written expression (Wechsler, 2005). The stories were scored on a scale from 0 to 6. A low score indicates a limited attempt to respond without additional details. A high score means the text is well organised, clear, uses effective transitions and vivid vocabulary. Inter-rater reliability, κ = 0.82.

CBM-W accuracy measures (t1, t2, t3)

Accuracy measures for the CBM-W were the proportion of correct word sequences (CWS) and the proportion of words spelled correctly (WSC). A CWS is defined as a pair of consecutive words that are grammatically and syntactically correct within the context of the phrase. Inter-rater reliability (Cohen's Kappa) for the proportion of CWS and WSC was 0.80 and 0.90, respectively. All scoring followed the criteria set out by Dockrell et al. (2015).

CBM-W productivity measures (t1, t2, t3)

Productivity measures for the CBM-W task were the total words written (TWW), the number of complete sentences (CS) and the number of words in complete sentences (WiCS). A sentence was counted as complete if it started with a capital letter, had a recognisable subject and had appropriate ending punctuation. Inter-rater reliability (Cohen's Kappa) for these measures was 1.00, 0.85 and 0.86, respectively. These were scored according to the criteria developed by Dockrell et al. (2015).

CBM-W lexical diversity (t1, t2, t3)

In addition to the established scoring criteria for CBM-W, the narrative scripts' lexical diversity was analysed using the online software Text Inspector. Due to the brevity of the texts and to enable normalised gains to be calculated, Type Token Ratio (TTR) was selected as the measure of lexical diversity. TTR is calculated by dividing the number of different words (types) by the total number of words produced (tokens).
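The TTR calculation described above can be illustrated with a minimal sketch. It is not the Text Inspector implementation: the function name `type_token_ratio` and the tokenisation rule (lower-cased runs of letters and apostrophes) are assumptions made for illustration only, and the sample sentence is invented.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return the number of types divided by the number of tokens."""
    # Assumption: a token is a lower-cased run of letters or apostrophes.
    # The tokenisation used by Text Inspector may differ.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        # An empty script has no tokens, so the ratio is reported as 0.0 here.
        return 0.0
    return len(set(tokens)) / len(tokens)


# Invented sample: 13 tokens, 12 types ("the" occurs twice).
sample = "One day I had the best weekend ever and the sun was out"
print(round(type_token_ratio(sample), 2))
```

Because TTR is bounded between 0 and 1, it has a maximum score, which is what allows normalised gains to be calculated for this measure.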

General procedure

SWs were matched in triads across intervention groups according to their reading and oral language profiles. Children within each triad were randomly assigned to one of the three intervention groups. There were no differences between the groups (SC, MS and WLC) on the oral language and reading measures used for matching (Table 1). The descriptive statistics for these measures suggest many of the SWs also had difficulties with oral language and reading. Furthermore, comparisons between the intervention groups (SC, MS and WLC) at t1 showed they were equivalent in the key measures of WIAT 2 sentence combining (F (2, 70) = 0.63, p = 0.535), BAS spelling (F (2, 69) = 0.63, p = 0.535) and CBM compositional quality (F (2, 62) = 0.41, p = 0.665) (see Table 2).

Table 1 Descriptive Statistics for matching variables by intervention group
Table 2 Raw score means and SD for WIAT-II sentence combining, BAS spelling and CBM-W compositional quality scores at each time point

SWs in the intervention groups were given an intervention targeting either sentence-combining or morphological spelling. Those in the WLC group continued with regular teaching for the duration of the study. The interventions ran twice a week for eight weeks, in small group sessions (4–6 children per group). Sessions lasted 25–30 min. All sessions followed a standard manualized procedure, with a script for each activity, within which it was possible to provide minor adaptations to meet the needs of the individual children. The progress of SWs in the interventions was compared to that of the WLC group at both immediate post-test (t2) and 3-month delayed follow-up (t3). Both interventions were administered by the first author.

Children's writing (spelling, sentence combining and text level), reading, and oral language skills were assessed at baseline (t1, mean age = 9.08 years, SD = 7.93 months). Following the completion of the intervention, writing skills at the word-, sentence- and text-level (spelling, sentence combining and text production measures from the CBM-W) were re-assessed (t2, mean age = 9.12 years, SD = 7.85 months). The post-test assessment battery was repeated at a 3-month follow-up (t3, mean age = 9.05 years, SD = 7.84 months). Testing sessions were conducted over two days. At the end of the study, those in the WLC group received the SC intervention.

Intervention procedures

The SC programme was adapted from Saddler (2012). The alternative, MS intervention programme was adapted from the work of Nunes and Bryant (Nunes & Bryant, 2006; Nunes et al., 2003; Nunes, Bryant, & Olsson, 2009). Adaptations focused on developing Tier 2, small group interventions for those at risk of falling behind, which complemented classroom teaching. A brief outline of the interventions is provided below. Contact the first author for further details.

Target intervention: sentence combining

The main body of each session provided strategies, or techniques, teaching the children how to combine sentences, gradually increasing in complexity using guided practice. In the first session, the researcher explained that the children were going to learn some ways to make their sentences better. Children then practised sentence combining and discussed how they found the activity, before being introduced to the first strategy of identifying important words. Each subsequent session followed a standard format of a revision of the previous session (approximately 3 min), then two or three activities focused on practising verbal and written sentence combining (lasting between 5 and 15 min). The first activity typically began with explicit modelling by the instructor, followed by guided practice. The final activity asked the children to practise the skills independently. All sessions encouraged students to provide peer feedback. Sessions finished with a summary (approximately 2 min). The final two sessions taught children to break long sentences into simple sentences and then improve them. Children were encouraged to discuss answers, provide formative feedback, and write down responses either on whiteboards or on paper.

Table 3 shows the topics for each intervention session. An example script for a whole session is presented in Fig. 1.

Table 3 Overview of sentence combining intervention
Fig. 1 Example session script for Sentence Combining Intervention, taken from Session

Alternative intervention: morphological spelling

The main body of each session used activities or games to teach children morphemes and their spellings. Children were given opportunities to receive feedback and correct their answers. The target morphemic principles, and adapted instructional materials, were taken from Nunes and Bryant (2006), who designed a series of inflectional and derivational morphological spelling interventions for 8-year-olds. In session 1, children were introduced to the structure of the intervention and told they were going to be taught some spelling rules; they discussed how they felt about spelling and discussed the different ways words can be broken down using a short spelling test. Subsequent sessions started with a revision of the previous session and included two or three activities designed to support the learning of the session's principle. An example activity, called the 'analogy task', taken from Nunes and Bryant (2006), is presented in Fig. 2; for this activity, children were asked to find the missing word. Other activities included grouping words into word classes, identifying morphemes, finding the missing word, and discussing and identifying affixes. See Table 4 for an overview of the topics for each session. To control for differences in the amount of time spent writing between the two interventions, children wrote sentences using some of the target words at the end of each session.

Fig. 2 Examples of materials used for analogy game, used to teach suffixes. Adapted from Nunes and Bryant (2006, pp. 71–74)

Table 4 Overview of morphological spelling intervention

Intervention fidelity

One researcher administered the interventions. To ensure intervention fidelity, the researcher completed checklists designed to ensure each child participated and received feedback and timed the sessions. Checklists did not differ between intervention sessions, and there was no difference in the duration of the intervention sessions, t (10) = 0.24, p = 0.817, with the SC and MS interventions lasting an average of 24.95 and 25.07 min respectively. With the exception of one session from the sentence combining intervention, where one activity was cut from the session due to time constraints, all activities from each session were successfully administered.

Data analysis

Two approaches were taken to control for t1 performance. First, for those measures with a maximum score (sentence combining, BAS spelling raw score, compositional quality, CBM-W accuracy measures and CBM-W Lexical Diversity), normalized gain scores were calculated; these were analyzed using two-way analyses of variance (ANOVAs). Second, for measures with no maximum score (CBM-W productivity measures of TWW, CS, and WiCS), t1 performance was entered as a covariate in a series of one-way analyses of covariance (ANCOVAs). Hedges' g was used to establish the size of the effect for each analysis; effect sizes exceeding 0.40 are reported in the text. Finally, for those measures with significant group-level differences, exploratory regression analyses were conducted to explore the role of oral language, spelling and reading skills in SWs' responsiveness to the interventions.
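For readers less familiar with Hedges' g, the sketch below shows one common way of computing it: Cohen's d based on the pooled standard deviation, multiplied by a small-sample correction. The text does not state which variant of the correction was used, so the approximation 1 - 3 / (4(n1 + n2) - 9) is an assumption, as are the function name `hedges_g` and the illustrative scores.

```python
import math
from statistics import mean, stdev


def hedges_g(group_a, group_b):
    """Standardised mean difference (a minus b) with small-sample correction."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    cohens_d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    # Assumed correction factor; it shrinks d slightly when samples are small.
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return cohens_d * correction


# Hypothetical gain scores for two small groups (not data from the study).
print(hedges_g([2, 4, 6], [1, 3, 5]))
```

With group sizes of the order used here (roughly 20 to 25 children per group), the correction factor is close to 1, so g and d differ only slightly.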


The results are presented in two sections; the first examines the impact of the intervention on sentence-combining, compositional quality, spelling and the CBM-W accuracy and productivity variables. The second section uses exploratory regression analyses to examine the role of oral language, spelling, and reading skills in children's response to the interventions for measures where the SC intervention group showed greater gains than the MS or WLC groups.

Intervention effectiveness

To examine differential progress across the three groups between t1 and the t2 and t3 post-tests, normalised gain scores were used for sentence combining, spelling, compositional quality (for raw scores see Table 2), and all the CBM-W accuracy variables (for raw scores see Online Materials 1).

WIAT-II sentence combining

Results from the mixed-measures ANOVA (Table 5) for WIAT-II sentence combining revealed a significant main effect of the intervention group (F (2, 62) = 5.81, p = 0.005, \(\eta_{p}^{2}\) = 0.16). Post hoc comparisons (LSD adjustment) revealed the SC group showed greater gains than the MS group, who in turn showed greater gains than the WLC group (see Fig. 3). There was no significant main effect of time from t2 to t3 (F (1, 62) = 1.57, p = 0.215), and no significant group by time interaction (F (2, 62) = 0.05, p = 0.949).

Table 5 Mean normalised gain scores, effect sizes at each time point for sentence combining, spelling ability and compositional quality
Fig. 3 Normalised gain scores for WIAT-II Sentence Combining for the sentence combining (SC) intervention group compared to the morphological spelling (MS) and waiting list control (WLC) groups

Further analyses, using a Bonferroni correction (α = 0.025), show the difference between groups was present at both t2 (F (2, 62) = 4.54, p = 0.014) and t3 (F (2, 62) = 4.67, p = 0.013). Post hoc tests (LSD adjustment) showed the SC group had greater gains, with moderate to large effect sizes, than both the MS (g = 0.78) and WLC (g = 0.76) groups at t2 (see Table 5). At t3, the MS group had made some small gains, though not enough to reach significance. These gains meant the difference between the SC and MS groups at t3 was no longer significant, while still maintaining a moderate effect size (g = 0.61). The significant difference between SC and WLC groups was maintained at t3, with a large effect size (g = 0.84).

Compositional quality

The mixed-measures ANOVA for compositional quality (Table 5) revealed no significant main effect of group (F (2, 62) = 0.61, p = 0.548) or time (F (1, 62) = 1.21, p = 0.277), and no significant group by time interaction (F (2, 62) = 0.76, p = 0.471). Despite the non-significant main effect, it is worth noting the moderate effect size (g = 0.48) present when comparing the SC and WLC groups at t3 (see Table 5). This suggests meaningful improvements, which did not reach significance, for children in the SC group compared to the WLC group at t3.

BAS spelling

The mixed-measures ANOVA for BAS Spelling ability (Table 5) revealed no significant main effect of group (F (2, 61) = 0.31, p = 0.735), a significant main effect of time (F (1, 61) = 4.44, p = 0.039), with an increase from t2 to t3, and no significant group by time interaction (F (2, 61) = 0.17, p = 0.845).

CBM-W accuracy

Results for the CBM-W accuracy measures are presented in Table 6.

Table 6 Means, SD and effect sizes for the normalised gain scores of CBM-W proportion of words spelled correctly, proportion correct word sequences and TTR

The sample size is smaller for some measures as normalised gain scores could not be calculated for children who had the maximum score at t1.
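The reason a child at ceiling has no gain score can be seen in a minimal sketch of a normalised gain calculation. The text does not give the formula used, so the sketch assumes the common definition, (post - pre) / (maximum - pre), in which the denominator is zero when the t1 score equals the maximum. The function name `normalised_gain` and the example scores are hypothetical.

```python
from typing import Optional


def normalised_gain(pre: float, post: float, max_score: float) -> Optional[float]:
    """Gain achieved as a proportion of the gain that was still possible at t1."""
    # Assumed definition: (post - pre) / (max_score - pre).
    if pre >= max_score:
        # No room for improvement at t1, so the gain is undefined.
        return None
    return (post - pre) / (max_score - pre)


# Hypothetical child on a task with a maximum score of 10,
# such as the sentence combining measure.
print(normalised_gain(pre=4, post=7, max_score=10))
# A child already at the maximum at t1 has no calculable gain.
print(normalised_gain(pre=10, post=10, max_score=10))
```

Under this definition a child who moves from 4 to 7 out of 10 has achieved half of the available improvement, and a decline from t1 yields a negative gain.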

The mixed-measures ANOVA for proportion of Words Spelled Correctly (WSC) revealed no significant main effect of group (F (2, 55) = 1.04, p = 0.360) or time (F (1, 55) = 0.14, p = 0.715), and no significant group by time interaction (F (2, 55) = 0.13, p = 0.884).

The mixed-measures ANOVA on proportion of Correct Word Sequences (CWS) revealed no significant main effect of intervention group (F (2, 55) = 1.04, p = 0.360). Despite the non-significant main effect for the intervention group, there are moderate effect sizes present when comparing the SC to both the MS (g = 0.56) and WLC (g = 0.69) groups at t2 (see Table 6). These effect sizes were not maintained at t3. There was a significant main effect of time (F (1, 55) = 5.07, p = 0.028, \(\eta_{p}^{2}\) = 0.08), where all groups showed greater gains at t3 than at t2, and no significant group by time interaction (F (2, 55) = 1.77, p = 0.180).

The mixed-measures ANOVA for the lexical diversity measure, TTR, revealed the main effect of intervention group was approaching significance (F (2, 46) = 2.83, p = 0.070), no significant main effect of time (F (1, 46) = 0.84, p = 0.364) and no significant group by time interaction (F (2, 46) = 2.95, p = 0.062). Post-hoc tests (LSD adjustment) found the MS group made significantly more gains in TTR than the WLC group with a very large effect size (g = 1.01) at t3. There was also a moderate effect size for the SC group as compared to the WLC group at t3 (g = 0.51). Although not significant, there were moderate effect sizes for the MS group as compared to the SC group at both t2 (g = 0.59) and t3 (g = 0.52).

CBM-W productivity

A series of mixed-measures ANCOVAs, with t1 performance as a covariate, was conducted to explore the performance over time and by intervention group on the CBM-W productivity variables: TWW, CS, and WiCS. There were no significant main effects or interactions. For further details, see Online Materials 2.


To summarise, results for intervention effectiveness found the SC intervention group made significant gains, with moderate to large effect sizes, in comparison to the MS and WLC groups, on the WIAT-II sentence combining task at t2. Differences between the SC and WLC groups were maintained at t3. There was indicative evidence that children in the MS group showed more gains in lexical diversity than those in the WLC group. There were no other group-level differences on the other writing measures (compositional quality, spelling, or CBM-W accuracy and productivity measures).

Exploratory analysis for the role of oral language, reading and spelling

To explore the potential impact of oral language, reading and spelling on intervention efficacy, exploratory hierarchical regression analyses were conducted on WIAT-II sentence-combining gain scores, where group-level differences were significant. Children's oral language, spelling and reading abilities, along with t1 scores for WIAT-II sentence-combining, were entered into the model first. The two dummy coded intervention variables (with the SC intervention as the reference group) were added to the second model. There were no issues of multicollinearity.
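The dummy coding described above can be sketched as follows, assuming the usual scheme in which each non-reference group receives its own 0/1 indicator and the reference group (SC) is coded 0 on both. The function name `dummy_code` and the group labels as strings are illustrative assumptions rather than details of the original analysis.

```python
from typing import Tuple


def dummy_code(group: str) -> Tuple[int, int]:
    """Return (MS indicator, WLC indicator), with SC as the reference group."""
    if group not in {"SC", "MS", "WLC"}:
        raise ValueError(f"Unknown intervention group: {group!r}")
    # SC is coded 0 on both indicators, so each regression coefficient
    # expresses that group's difference from the SC group.
    return (int(group == "MS"), int(group == "WLC"))


for label in ("SC", "MS", "WLC"):
    print(label, dummy_code(label))
```

Under this coding, a negative coefficient for the MS or WLC indicator means that group gained less than the SC group once the other predictors are taken into account.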

At t2, the first model was significant (see Table 7). Children with lower t1 sentence combining scores, lower reading ability and better spelling ability showed greater gains at t2. Adding the intervention groups in the second model significantly improved the fit. In addition to the reading, t1 sentence combining and spelling predictors, both the MS and WLC dummy coded variables predicted children's normalised gain scores on the sentence combining task at t2, suggesting that the SC group showed the greatest gains.
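Whether a second block of predictors "significantly improves" a hierarchical model is conventionally judged with an F test on the change in R². The sketch below shows that standard calculation. All numbers in the example (R² values, sample size, predictor counts) are hypothetical and are not taken from Table 7.

```python
def f_change(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic and degrees of freedom for the R-squared change between nested models."""
    df1 = k_full - k_reduced   # number of predictors added in the second block
    df2 = n - k_full - 1       # residual degrees of freedom of the full model
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2


# Hypothetical values: 4 baseline predictors, then 2 dummy variables added.
f, df1, df2 = f_change(r2_reduced=0.30, r2_full=0.40, n=50, k_reduced=4, k_full=6)
```

The resulting F is compared against the F distribution with (df1, df2) degrees of freedom to obtain the p value for the improvement.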

Table 7 Hierarchical multiple regression for sentence combining normalised gain scores at t2

At t3, the relationships between the predictor variables and children's normalised gain scores changed (see Table 8). At this point, the models remained significant, but the second model did not significantly improve on the first model. Single-word reading was no longer a significant predictor of children's gain scores for the sentence combining task. Instead, listening comprehension, spelling ability and t1 sentence combining predicted gains at t3. Furthermore, at t3, the MS group was catching up with the SC group, as the MS predictor was no longer significant; those in the SC group were still doing better than those in the WLC condition.

Table 8 Hierarchical multiple regression for sentence combining normalised gain scores at t3


The current study aimed to establish the effectiveness of a sentence-level SC intervention in comparison to a word-level MS intervention and a business-as-usual WLC group, and to capture the impact of oral language and spelling skills on the effectiveness of the intervention. It was predicted that the sentence-combining intervention would improve sentence combining ability, compositional quality and measures of productivity and accuracy captured by the CBM-W, and that children with poorer language and spelling skills would be more resistant to change. The MS intervention was predicted to improve spelling accuracy within the text.

As predicted, the SC intervention was more effective at improving the sentence combining ability of SWs in comparison with the MS and WLC groups at t2. The difference between the SC and WLC groups was maintained at t3. However, contrary to expectation, the difference between the SC and MS groups on sentence combining was no longer significant by t3. Contrary to predictions, there was no significant group effect of the interventions on compositional quality. The MS intervention did not lead to significant improvements in children's standardised test spelling scores, but there was weak evidence that the children were producing texts with greater lexical diversity than those in the WLC group. These results are discussed below.

Intervention effectiveness

Consistent with previous research (Andrews et al., 2004; Berninger et al., 2011; Saddler, 2012; Saddler, Asaro, et al., 2008; Saddler & Graham, 2005), SWs who received the SC intervention learned to combine sentences more effectively than those in either the MS or WLC groups. Effect sizes comparing SC and WLC (g = 0.48–0.84) for both sentence-combining and compositional quality measures are comparable to previous research. For example, in their meta-analysis, Graham and Perin (2007) found an average ES = 0.50, which was based on five articles with ES ranging from 0.21 to 0.66.

However, the lack of wider comparative improvements in SWs' written texts (in quality, productivity and accuracy) suggests that the focus on sentence combining alone was not appropriate for these SWs. This may reflect either the length of the intervention or the children's baseline skills. For example, in the current study, SWs received an average of 400 min of teaching in groups of 4 to 6. By contrast, children in Saddler and Graham's (2005) Tier 3 intervention study received nearly twice the amount of instruction (750 min) in pairs. Further studies are needed to capture the effects of intervention dosage for generalization to other aspects of text production. Dosage and children's baseline skills likely interact. Better spellers showed greater gains on the sentence-combining measure, suggesting that they had a greater capacity to benefit from the intervention (McCutchen, 1996).

The weak evidence for greater lexical diversity in the texts of children in the MS group, as compared to the SC and WLC groups, suggests the spelling intervention may have begun to make some text-level impacts. Differences in lexical diversity may result from increased confidence to attempt unfamiliar words (Sumner et al., 2014), or from exposure to new words during the intervention. Alternatively, the measures used may have been insensitive to developmental change as a result of the short time between testing points in the current study.

Exploring the roles of oral language and reading

It was expected that children's oral language, reading and spelling abilities would moderate children's response to the interventions. Many participants had poor oral language (listening comprehension and oral expression), reading and spelling skills. The exploratory hierarchical regressions showed that reading, spelling and t1 sentence combining scores best predicted gains for sentence combining at t2. However, longer-term gains in sentence combining at t3 were predicted by listening comprehension, spelling and t1 sentence combining scores, confirming the importance of oral language to maintaining learning from this intervention.

Poor readers with a reasonable level of listening comprehension seem to have responded better to the intervention on the sentence combining measure, while those with poor oral language skills may be resistant to intervention (Al Otaiba & Fuchs, 2002). Those with poor oral language skills experience difficulties with both word- and sentence-level skills (Graham et al., 2020), and this may explain their resistance to intervention as compared to poor readers. Reading skills are thus important for the initial response to the sentence combining intervention, and there were, in fact, some small gains in reading decoding made by participants, perhaps as a side effect of the reading exposure undertaken during the intervention. However, for maintaining longer-term gains, the distal, but important, language skills measured by listening comprehension were a stronger predictor. This is consistent with other studies on writing growth in children (Dockrell, Connelly, & Arfé, 2019) and with the importance of comprehension for sentence combining (Saddler, Ellis-Robinson, & Asaro-Saddler, 2018).

The consistent influence of the proximal factor of spelling (Dockrell, Connelly, & Arfé, 2019) at t2 and t3 suggests sentence combining interventions may be more appropriate for children without significant spelling difficulties. For poor spellers, it may be more beneficial to use a combined intervention, targeting multiple skills, to help develop their spelling skills alongside other aspects of written text production. The role of handwriting as a pre-, or co-, requisite for spelling development could be a factor depressing spelling levels. However, no measure was taken of handwriting fluency, and this could be examined in future studies.


The small sample sizes mean these analyses are underpowered, so more work is needed to confirm the roles of oral language and reading in children's response to the interventions. Furthermore, the measure used to assess growth in sentence-combining ability had relatively few items in comparison to previous studies. For example, Saddler and Graham (2005) used standard scores from a 20-item standardised measure and found their SC intervention was more effective than a grammar instruction intervention. The brevity of the sentence-combining measure used in the current study gives students minimal opportunity to demonstrate growth. In the future, a longer measure, such as that used by Saddler and Graham (2005), should be used.

Sentence combining is also, in part, a technique that can be used during revision (Saddler, 2012); however, the current study did not encourage children to revise or edit their work. Therefore, adapting the intervention to provide opportunities for children to practice these skills during writing or revision may improve the current sentence-combining intervention's effectiveness. As SWs commonly also neglect to revise or edit their work (Flower & Hayes, 1980), this sort of alteration may be best combined with an intervention aimed at also developing revision skills.

Implications and future research

The inhibiting role of spelling in the sentence-combining models may help explain why gains were not seen in children's compositional quality, as it suggests difficulties with spelling may have limited the ability of these children to access the higher-level skills needed for written text production. A combined intervention approach, targeting spelling and sentence combining, may be more beneficial for these children and so should be explored in future research.

Despite the similarities seen in texts of children with a range of co-morbid difficulties (Connelly & Dockrell, 2016), using a complete literacy profile (including oral language, reading, spelling and writing skills) may help with identifying which intervention(s) may be most effective. Given that, in the current study, longer-term gains in sentence-combining ability were predicted by baseline levels of spelling and listening comprehension, it may not be appropriate to use this intervention with children who are struggling with oral language and have not reached a sufficient level of spelling to allow them to write sentences easily. Researchers should continue to identify which interventions are most effective for which children, to enable educators to adapt or combine effective interventions using techniques such as those suggested by Al Otaiba et al. (2018) within an RTI framework.


The current study adds to the growing body of evidence that sentence-combining interventions can be an effective tool for improving children's sentence-combining skills. However, they may require a long and intensive course of instruction before they impact text-level measures. Findings suggest children may need competence in single word spelling to benefit from sentence combining. Further research is needed to verify this finding and determine the level of single word spelling that is needed. Furthermore, practitioners may find that sentence combining is more appropriate for SWs whose primary area of difficulty is reading, rather than spelling or oral language. The findings suggest researchers should consider language profiles when devising interventions for SWs.