Background

One of the features of clear, intelligible speech is the ability to produce consonant clusters. A cluster is a sequence of two or more consonants within one syllable with no vowel in between, and clusters are common in most languages [1].

Consonant clusters are important to study for a number of reasons. Firstly, consonant clusters are evident in many world languages [2]. Secondly, one third of the English monosyllables commence with a consonant cluster, which would be interesting to compare with the Egyptian ones. Thirdly, consonant clusters dominate word-final position and are particularly important for marking phonetically complex morphophonemes—grammatical morphemes realized by consonant clusters, such as plural, possessive morphemes, and past tense morphemes [3, 4]. Fourthly, the accurate use of consonant clusters has been associated with improved expressive language outcomes (particularly given their role in marking grammatical morphophonemes) and overall long-term superior literacy development [5]. Finally, difficulty with the production of consonant clusters (e.g., reduction of clusters from two or three elements to one) is common in pre-school-aged children with speech sound disorder [6].

As children learn to produce consonant clusters, a range of errors is possible. The most common error is cluster reduction, whereby two or three elements in the cluster are reduced to one or two [7]. In a study of 50 English-speaking children aged 2 years and 10 months–5 years and 2 months, Haelsig and Madison [8] reported that the frequency of occurrence of cluster reduction decreased from 30% at age 3 years to 10% by 4 years.

Phonologically based therapy programs aim to suppress error patterns by using meaningful words to reorganize the sound system [9]. Among the most commonly used approaches are the phonological contrast therapies, contrasting two words different only in one phoneme [10]. Such approaches include minimal [11], maximal [12, 13], and multiple opposition [14], together with empty set [15]. Because Arabic is the first language of nearly 200 million people in about 24 countries, and a religious language in predominantly Islamic countries [16], the efficacy of the previous approaches in Arabic-speaking children needs to be investigated. To the best knowledge of the authors, only one study using a maximal opposition approach in a single-case experimental design was done in this field [17].

Another well-known phonological therapy procedure is auditory bombardment developed by Hodson and Paden [18]. They proposed that it helped children to develop “auditory images,” against which they monitored incorrect productions as opposed to “kinesthetic images” developed by other practice procedures. Using auditory bombardment is common in language therapy focusing on reduction of phonological processes. Its use has been noted to trigger spontaneous rehearsal of bombarded target words by children in the clinic setting [19]. In this approach, many and varied target examples are presented to the child repeatedly and intensively, sometimes in a meaningful context such as a story [20].

Aim of this study

The present study aims to assess the acquisition of consonant clusters in young Egyptian children and to evaluate two different phonological therapies (minimal contrast and auditory bombardment) for remediation of the phonological process of “cluster reduction.”

Methods

Study design

This study was a descriptive, cross-sectional study. It was conducted on 150 monolingual Arabic-speaking Egyptian children aged 30 to 48 months, recruited from phoniatric units in different university hospitals and from some nurseries. They were divided into 3 groups representing 6-month age intervals, and the study extended from December 2019 to March 2022.

Subjects

One hundred fifty typically developing (TD) native Egyptian Arabic-speaking children aged from 30 to 48 months were divided into 3 groups:

  • Group A: 50 subjects aged 30 to 36 months

  • Group B: 50 subjects aged 36 months, 1 day to 42 months

  • Group C: 50 subjects aged 42 months, 1 day to 48 months
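The grouping rule above can be written as a small function. This is a minimal sketch only: the function name `assign_age_group` is hypothetical, and expressing age as a decimal number of months (so that "36 months, 1 day" is any value just above 36) is an assumption made for illustration.

```python
from typing import Optional


def assign_age_group(age_months: float) -> Optional[str]:
    """Map an age in months to study group A, B, or C.

    Returns None for ages outside 30-48 months, which the study excludes.
    Decimal months are an illustrative assumption, not the authors' coding.
    """
    if 30 <= age_months <= 36:
        return "A"
    if 36 < age_months <= 42:
        return "B"
    if 42 < age_months <= 48:
        return "C"
    return None


# Example: a child aged exactly 36 months belongs to group A,
# while a child one day older falls into group B.
print(assign_age_group(36), assign_age_group(36.03))
```

Keeping the upper bound inclusive for each group mirrors the "36 months, 1 day" wording, so no age between 30 and 48 months is left unassigned.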

Inclusion criteria

  • To be within the age limit of the study

  • To have typical attention, hearing, language, and mental ability as reported by their teachers and caregivers

  • To be within the high- and low-middle social class, in order to ensure consistency of parental background and adequate representation of the community

Exclusion criteria

  • Children outside the age range of the study

  • Children with an identified cognitive or hearing impairment

All children were subjected to the following protocol of assessment:

  I. Elementary diagnostic procedures

    Interview with the child and the caregiver to collect social, natal, and developmental data and to assess language of the child through a naturalistic context sampling

  II. Clinical diagnostic aids

    a. Stanford-Binet intelligence scale (5th edition) to provide mental age [21] was done to a sample of 10 children in each age group

    b. Language evaluation by modified PLS-4 test Arabic edition [22] to objectively ensure typical language development, was done if naturalistic language evaluation was not adequately informative

    c. Egyptian Monosyllabic Consonant Cluster Test (EMCCT) was designed to assess productions of consonant clusters in a single word context. The task contains 50 words which comprise word-final consonant cluster commonly used in the Egyptian language (see Additional file 1)

Test application

The EMCCT test was applied through the following four stage cuing hierarchy:

  1. A question, “What’s this?” was given;

  2. If the child did not respond, s/he was given a clue to help identify the word, such as telling the child the function of the object;

  3. If the child still did not respond, a binary choice was given with the target word first, for example, “Is it /ʔefl/ or /bæ:b/?”;

  4. If the name was still not identified, delayed imitation of the word was attempted. The child was informed about the name, for example, “This is /ʔefl/,” and was then asked, “What is this?”.
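The four-stage cuing hierarchy can be sketched as a loop that stops at the first cue producing a response. This is an illustrative sketch, not the authors' scoring software: the names `CUE_STAGES`, `elicit_word`, and the `respond` callback standing in for the child's behavior are all hypothetical.

```python
from typing import Callable, Optional, Tuple

# The four cues in the order in which they were offered.
CUE_STAGES = ("question", "functional clue", "binary choice", "delayed imitation")


def elicit_word(
    respond: Callable[[str], Optional[str]]
) -> Tuple[Optional[str], Optional[str]]:
    """Offer each cue in turn and stop at the first one that gets a response.

    `respond` is a hypothetical stand-in for the child: it receives the cue
    stage and returns the produced word, or None if the child does not respond.
    Returns (stage, response), or (None, None) if no cue succeeded.
    """
    for stage in CUE_STAGES:
        response = respond(stage)
        if response is not None:
            return stage, response
    return None, None


# Example: a simulated child who names the picture only at the binary choice,
# and who reduces the final cluster of /ʔefl/.
def simulated_child(stage: str) -> Optional[str]:
    return "ʔef" if stage == "binary choice" else None


print(elicit_word(simulated_child))
```

Recording the stage alongside the response keeps track of how much support each production required, which may be useful when responses are later transcribed.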

Average test time was 5 to 8 min. A digital recorder was used to record the responses of each child during testing.

All responses were phonetically transcribed on the score sheets (see Additional file 1). The test was administered in a quiet, well-lit, and well-ventilated room with no distracting elements.

Validation and reliability measures

Tests of validity

Face validity was assessed by three independent and experienced phoniatricians, who judged all words and word pictures of the Egyptian Monosyllabic Consonant Cluster Test (EMCCT) for relevance to the intended target. Some pictures were changed before the test application stage according to the judges’ opinions, in order to be clear and suitable for the younger age group.

Tests of reliability

The reliability of EMCCT test of consonant clusters was tested by inter-rater and test-retest reliability methods. Test-retest reliability was assessed by having the same examiner re-administer the test to the same child on two occasions under similar circumstances. For this procedure, 10 children were selected at random from each age group to undergo retesting. Inter-rater reliability was assessed by having two different raters code the same audio-recording of a single test administration to identify use of processes. This was done for all audio recordings. Agreements were counted upon the use/non-use of cluster reduction process for each word by every child. This is in addition to documentation of phonetic transcription given by each rater.

Study interventions

As cluster reduction is a very commonly occurring problem, the children under study who had not acquired consonant clusters received one of two different phonological therapy approaches, namely phonological contrast therapy (minimal opposition) or the auditory bombardment procedure. Therapy was provided in two sessions per week, 30 min each, over a period of 3 months. Children underwent objective assessment at baseline and again after 3 months.

Data management and analysis

Descriptive and inferential statistical procedures were used. The data were tabulated, coded, and then analyzed using the computer program SPSS (Statistical Package for the Social Sciences) version 16 to obtain the following:

Descriptive statistics

  1) Mean

  2) Standard deviation (± SD)

  3) Median and range

  4) Number and percent

Analytical statistics

The significance of difference between groups was tested using one of the following tests:

  1- Student’s t test was used to compare the means of two groups of numerical (parametric) data. For continuous non-parametric data, the Mann-Whitney U test was used

  2- Wilcoxon test was used for two values within the same group (pre and post)

  3- ANOVA (analysis of variance) was used to compare more than two groups of numerical (parametric) data, and the Kruskal-Wallis test was used for continuous non-parametric data

  4- Pearson correlation coefficient (r) test was used in correlating different parameters

  5- Inter-group comparison of categorical data was performed by using chi-square test (χ2-value)

  6- The agreement between raters was tested by Kappa agreement test

A P value < 0.05 was considered statistically significant.
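The choice among the comparison tests listed above can be summarized as a simple decision rule. The sketch below only encodes that selection logic and the significance threshold; it does not compute any statistic, and the names `choose_test`, `is_significant`, and `ALPHA` are hypothetical labels introduced for illustration.

```python
ALPHA = 0.05  # significance threshold stated in the text


def choose_test(
    n_groups: int,
    parametric: bool,
    paired: bool = False,
    categorical: bool = False,
) -> str:
    """Return the name of the comparison test described in the analysis plan.

    The ordering of the checks (categorical, then paired, then group count)
    is an assumption about how the rules combine.
    """
    if categorical:
        return "chi-square"
    if paired:
        return "Wilcoxon"  # pre versus post within the same group
    if n_groups == 2:
        return "Student's t" if parametric else "Mann-Whitney U"
    if n_groups > 2:
        return "ANOVA" if parametric else "Kruskal-Wallis"
    raise ValueError("at least two groups are needed for a comparison")


def is_significant(p_value: float) -> bool:
    """A P value below 0.05 is considered statistically significant."""
    return p_value < ALPHA


# Example: comparing a non-parametric score across the three age groups.
print(choose_test(n_groups=3, parametric=False))
```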

Results

Descriptive and comparative analysis of the studied groups

Demographic data

Children’s age ranged from 30 to 48 months. The sample was divided into 3 groups, as shown in Table 1:

  • Group A: 50 TD children in the age range 30–36 months (mean ± SD = 33.82 ± 1.42 months) including 28 males (56%) and 22 females (44%)

  • Group B: 50 TD children in the age range 36 months, 1 day to 42 months (mean ± SD = 39.56 ± 1.63 months) including 28 males (56%) and 22 females (44%)

  • Group C: 50 TD children in the age range 42 months, 1 day to 48 months (mean ± SD = 45.08 ± 1.65 months) including 28 males (56%) and 22 females (44%)

Table 1 Comparison of study groups regarding chronological age in months, gender, and residence

Age of suppression of cluster reduction

The age of process suppression was defined as the earliest age at which 90% of that age group achieved suppression of the process. The frequency of occurrence of cluster reduction decreased from 74% at age 2 years and 6 months–3 years to 46% at age 3 years–3 years and 6 months to 10% at age 3 years and 6 months–4 years as demonstrated in Table 2.

Table 2 Age of suppression of cluster reduction process among the 3 age groups
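The 90% criterion can be applied to the reported frequencies as a short worked example. The sketch assumes that each reported frequency is the percentage of the age group still showing cluster reduction, so that suppression is its complement; this reading of Table 2, and the names `REDUCTION_PERCENT` and `age_of_suppression`, are assumptions made for illustration.

```python
from typing import Dict, Optional

# Frequency of occurrence of cluster reduction (%) reported for each group,
# listed from youngest to oldest.
REDUCTION_PERCENT: Dict[str, int] = {"A": 74, "B": 46, "C": 10}

CRITERION_PERCENT = 90  # share of the age group that must show suppression


def age_of_suppression(
    reduction_percent: Dict[str, int],
    criterion: int = CRITERION_PERCENT,
) -> Optional[str]:
    """Return the earliest group in which suppression reaches the criterion.

    Groups must be supplied in ascending age order. Returns None if no
    group reaches the criterion.
    """
    for group, reduction in reduction_percent.items():
        if 100 - reduction >= criterion:
            return group
    return None


print(age_of_suppression(REDUCTION_PERCENT))  # -> C
```

Under this reading, suppression is 26%, 54%, and 90% in groups A, B, and C, so the criterion is first met in group C (42 months, 1 day to 48 months).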

Correlation between cluster reduction and chronological age in months

The cluster reduction process was gradually suppressed across age groups from younger to older children, in terms of both frequency and score of processes, as demonstrated in Table 3.

Table 3 Correlation between cluster reduction and chronological age in months

Statistical association between consonant cluster’s manner of articulation and degree of suppression of consonant cluster among the 3 age groups

The consonant clusters were divided into 4 groups according to the manner classification of each consonant in the cluster: obstruent/sonorant, obstruent/obstruent, sonorant/obstruent, and sonorant/sonorant.

The number of words of each consonant cluster type in the EMCCT is unequal (obstruent/sonorant = 18 words, obstruent/obstruent = 9 words, sonorant/obstruent = 19 words, and sonorant/sonorant = 4 words, out of the 50 words in the test; see Additional file 1).

The number of words that should contain clusters in each age group is 2500 (50 words in the test × 50 children in each age group). The total number of words produced with cluster reduction in each group was then calculated and divided according to the type of cluster. The whole group of words with cluster reduction, together with each group of reduced consonant clusters, were compared to the total of 2500 words in each age group.
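The token arithmetic described above can be checked with a few lines of code. The word counts per cluster type and the group size come from the text; the function names are hypothetical, and the count of 250 reduced tokens used in the example is an invented illustrative value, not a result from Table 4.

```python
from typing import Dict

# Number of EMCCT words of each cluster type, as stated in the text.
WORDS_PER_TYPE: Dict[str, int] = {
    "obstruent/sonorant": 18,
    "obstruent/obstruent": 9,
    "sonorant/obstruent": 19,
    "sonorant/sonorant": 4,
}
CHILDREN_PER_GROUP = 50


def expected_tokens(words_per_type: Dict[str, int], children: int) -> Dict[str, int]:
    """Cluster tokens expected per type in one age group (words x children)."""
    return {ctype: n_words * children for ctype, n_words in words_per_type.items()}


def reduction_percent(reduced_tokens: int, total_tokens: int) -> float:
    """Percentage of tokens produced with cluster reduction."""
    return 100.0 * reduced_tokens / total_tokens


tokens = expected_tokens(WORDS_PER_TYPE, CHILDREN_PER_GROUP)
total = sum(tokens.values())
print(total)                          # 2500 tokens per age group
print(tokens["obstruent/sonorant"])   # 900 of them are obstruent/sonorant
print(reduction_percent(250, total))  # hypothetical count of 250 -> 10.0
```

Because the cluster types are unequally represented, comparing each type against the common total of 2500 tokens, as done in the study, is not the same as comparing it against its own token count; the per-type counts are shown here only to make that difference visible.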

There was a highly significant statistical association between age group and reduction of obstruent + sonorant clusters (stops + sonorants and fricatives + sonorants) (p value < 0.001*). Both sub-groups still showed cluster reduction in the oldest children (group C), whereas all other consonant cluster sub-groups had been acquired by the oldest children, as shown in Table 4.

Table 4 Analysis of different types of consonant cluster reduction words across the studied groups

Validity and reliability of Egyptian Monosyllabic Consonant Cluster Test (EMCCT)

The reliability of EMCCT was tested by inter-rater reliability and test re-test reliability using Cohen’s Kappa. McHugh [23] suggested that “values of kappa ≤ 0 indicated no agreement, 0.01–0.20 indicated none to a slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement, and 0.81–1.00 almost perfect agreement”.
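For readers unfamiliar with the statistic, Cohen's kappa for two raters is (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each rater's category frequencies. The sketch below computes it for per-word use/non-use codes and applies the interpretation bands quoted above. The function names and the example ratings are illustrative assumptions; the study itself used SPSS.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater1: Sequence[Hashable], rater2: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters coding the same items."""
    if len(rater1) != len(rater2) or len(rater1) == 0:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater1)
    # Observed agreement: share of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of the raters' marginal proportions per code.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    expected = sum(counts1[c] * counts2[c] for c in set(counts1) | set(counts2)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical code throughout
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa: float) -> str:
    """Interpretation bands as quoted from McHugh [23]."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "none to slight agreement"
    if kappa <= 0.40:
        return "fair agreement"
    if kappa <= 0.60:
        return "moderate agreement"
    if kappa <= 0.80:
        return "substantial agreement"
    return "almost perfect agreement"


# Hypothetical codes for 10 words (1 = cluster reduction used, 0 = not used).
rater_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2), interpret_kappa(0.7))
```

In the hypothetical example the raters agree on 8 of 10 words (p_o = 0.8) and chance agreement is 0.5, giving a kappa of about 0.6.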

Inter-rater reliability

Across all groups combined, there was substantial significant agreement (kappa = 0.7, p < 0.001). There was also substantial significant agreement in group A and group B (kappa = 0.62 and 0.65, respectively; p < 0.001). In group C, there was almost perfect significant agreement (kappa = 0.89, p < 0.001), as shown in Table 5.

Table 5 Inter-rater reliability

Test re-test reliability

There was excellent significant reliability (r = 0.996, p < 0.001) across all groups combined, and also within groups A, B, and C (r = 0.996, 0.987, and 0.983, respectively; p < 0.001) (Table 6). Reliability coefficients were interpreted as follows:

  • 1: perfect reliability

  • ≥ 0.9: excellent reliability

  • ≥ 0.8 < 0.9: good reliability

  • ≥ 0.7 < 0.8: acceptable reliability

  • ≥ 0.6 < 0.7: questionable reliability

  • ≥ 0.5 < 0.6: poor reliability

  • < 0.5: unacceptable reliability

  • 0: no reliability

Table 6 Test re-test reliability
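The interpretation scale listed above can be expressed as a lookup function. This is a minimal sketch; the name `interpret_reliability` is hypothetical, and treating values at or below 0 as "no reliability" is an assumption, since the scale only defines the value 0.

```python
def interpret_reliability(r: float) -> str:
    """Map a reliability coefficient to the labels of the scale above."""
    if r == 1:
        return "perfect reliability"
    if r >= 0.9:
        return "excellent reliability"
    if r >= 0.8:
        return "good reliability"
    if r >= 0.7:
        return "acceptable reliability"
    if r >= 0.6:
        return "questionable reliability"
    if r >= 0.5:
        return "poor reliability"
    if r > 0:
        return "unacceptable reliability"
    return "no reliability"  # assumption: also used for negative values


# The coefficients reported for all groups combined and for groups A, B, and C.
for r in (0.996, 0.996, 0.987, 0.983):
    print(r, interpret_reliability(r))
```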

Comparison between minimal contrast and auditory bombardment therapies for remediation of “cluster reduction” process

  1. Minimal contrast (N = 30)

Scores improved significantly from a pre-therapy range of 6–45 cluster reduction words to a post-therapy range of 0–4 words, as shown in Table 7 (Fig. 1).

Table 7 Measuring the difference between pre- and post-therapy scores for minimal contrast therapy

Fig. 1 Minimal contrast regarding pre- and post-therapy score

  2. Auditory bombardment (N = 30)

Scores improved significantly from a pre-therapy range of 5–18 cluster reduction words to a post-therapy range of 0–4 words, as shown in Table 8 (Fig. 2).

Table 8 Measuring the difference between pre- and post-therapy scores for auditory bombardment therapy

Fig. 2 Auditory bombardment regarding pre- and post-therapy score

Total intervention duration and cumulative intervention intensity for each type of therapy

The application of evidence-based practice (EBP) takes into consideration the efficiency of the intervention intensity [10, 24], including the number of treatment doses, the treatment intensity, and treatment duration [25]. In this study, each of these parameters was documented for the two treatment approaches. Each child received treatment doses (i.e., the number of single word examples included per session). For minimal contrast therapy, the treatment dose was 4 words, while for auditory bombardment therapy, it was 10 words. The dose frequency (i.e., the number of times the treatment is administered) was once per day for 30 min, twice per week. Given that some children were absent for some days, the specific number of treatment days per child is reported. The overall treatment duration was 12 weeks, for a total intervention duration of up to 24 to 34 days.

This yielded a cumulative intervention intensity (dose number × dose frequency/day × total intervention duration in days) of mean = 556.00 for auditory bombardment and mean = 291.20 for minimal contrast therapy across children. There was no significant difference in total intervention duration between the two therapy techniques. However, cumulative intervention intensity differed highly significantly between them, being lower for minimal contrast therapy (p value < 0.001*), as shown in Table 9.

Table 9 Comparison between types of therapy regarding total intervention duration and cumulative intervention intensity
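The cumulative intervention intensity formula given above is a simple product, sketched below. The function name is hypothetical, and the input combinations are illustrative only: they pair the stated doses with the ends of the stated duration range and are not intended to reproduce the group means reported in Table 9, which depend on per-child values not given in the text.

```python
def cumulative_intervention_intensity(
    dose: int,
    dose_frequency_per_day: int,
    total_duration_days: int,
) -> int:
    """dose number x dose frequency per day x total intervention duration in days."""
    return dose * dose_frequency_per_day * total_duration_days


# Illustrative inputs (assumed combinations, not per-child data):
# auditory bombardment, 10 words per session, once per day, 24 treatment days
print(cumulative_intervention_intensity(10, 1, 24))  # 240
# minimal contrast, 4 words per session, once per day, 34 treatment days
print(cumulative_intervention_intensity(4, 1, 34))   # 136
```

Because the dose differs between the two therapies (10 versus 4 words), the cumulative intensity can differ between them even when the number of treatment days is similar.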

Discussion

Syllable structure processes are phonological processes that act to decrease the number of syllables in a word or to simplify the syllable structure. As stated by Saleh et al. [26], the child at a younger age relies mostly on syllable structure processes to simplify word shape in order to suit his/her articulatory capacity. The incidence of these processes gradually decreases as age increases, with a more evident tendency toward phoneme simplification through the processes of assimilation and substitution.

Roberts et al. [27] reported that the majority of phonological error patterns resolved rapidly between 2 years and 6 months–4 years. Consequently, the use of phonological processes gradually decreases across age groups from younger to older children. Cluster reduction is one of the syllable structure processes. Acquiring consonant clusters is considered to be more challenging than single consonant attainment, as consonant clusters take longer to acquire [28, 29]. Clusters in Colloquial Egyptian Arabic (CEA) consist of two consonants in the coda position of the word. Nouns that contain clusters in CEA are monosyllabic. By the process of cluster reduction, the number is reduced to one consonant instead of two.

The developmental criterion for the acquisition level of a phonological sound or structure is the percentage of subjects who must produce the sound/structure correctly in order for it to be considered developmentally stable or acquired. It is considered acquired when 90% of children in the examined age group can correctly utter the target consonant clusters [30]. The first objective of this study was to determine the age of suppression of cluster reduction, i.e., the age at which the child can produce consonant cluster words correctly. The age of suppression of the cluster reduction process in our study was found to be 3 years 6 months. This is 6 months earlier than was reported by Shriberg and Kwiatkowski [31], but it matches the results of Owaida [32] and Abou-Elsaad et al. [33], both of whom found that cluster reduction disappears between 3 and 4 years of age.

As seen in Table 2, the mean percentage of correct consonant clusters was 26.0% in children aged 2 years and 6 months–3 years. This figure rose to 54.0% in children aged 3 years–3 years and 6 months and reached 90.0% in children aged 3 years and 6 months–4 years. In a longitudinal study of 16 typically developing Australian-English-speaking children aged 2 years–2 years and 11 months by McLeod et al. [34], a mean percentage of 29.5% was reported for correct consonant cluster production. McLeod and Arciuli [35] reported a percentage of 88.1% in children aged 4 years–4 years and 11 months, and finally 94.5% in children aged 5–12 years.

As children learn to produce consonant clusters, a range of errors is possible. The most common error is cluster reduction, whereby two or three elements in the cluster are reduced to one or two [7]. In a study of 50 English-speaking children aged 2 years and 10 months–5 years and 2 months, Haelsig and Madison [8] reported that the frequency of occurrence of cluster reduction decreased from 30% at age 3 years to 10% by 4 years.

In our study, the frequency of occurrence of cluster reduction decreased from 74% at age 2 years and 6 months–3 years to 46% at age 3 years–3 years and 6 months to 10% at age 3 years and 6 months–4 years. Similarly, both Hodson and Paden [6] and Smit [7] reported that cluster reduction was rare in the production attempts of their 4-year-old participants with typical development, being one of the last phonological patterns to be eliminated.

Some consonant clusters are easier to master than others. Children typically master consonant clusters that consist of stop + liquid elements (e.g., /pl/) before fricative + liquid clusters (e.g., /sl/) [36,37,38]. For example, Powell and Elbert [39] noted that 75% of 4-year-old subjects were able to produce all stop + liquid clusters (except /gr/), but they could not produce any fricative + liquid clusters. This finding was supported by Powell [40], who studied 4- and 5-year-old children.

In the present study, it was noticed that consonant clusters of obstruents + sonorants (e.g., stops + sonorants /tn/, /tr/, /bl/, /dn/, /gl/, /gn/ and fricatives + sonorants /hr/, /ħr/, /sr/, /sr/, /ʕr/, /sm/, /fl/) were the ones that still had cluster reduction in group C, as seen in Table 4. A remarkable finding was that no other type of cluster combination had any cluster reduction in group C. The clusters that were most difficult for the children to master were those that started with a completely or partially closed vocal tract and shifted quickly to an open one. Even when the same consonant types were present in the reverse order (sonorant + obstruent), the children in group C could master them well.

McLeod et al. [34] stated that the most common word-final clusters produced by the 2-year-old subjects in their study contained nasals and are frequently found in English (e.g., /nd/, /nt/, /ŋk/). It can be noted here that the nasals, with an open vocal tract, were the first consonants in the acquired clusters. Likewise, Dyson [41] reported that the only word-final cluster used by over half of the 2- to 3-year-old subjects was /ts/; transitional clusters included /ps/ and /ns/. Again, the acquired clusters either presented a shift from an open to a closed oral cavity, or both consonants had an occlusive manner of articulation. These findings are consistent with the findings of the present study. From an articulatory perspective, it is easier to put the manner of articulation with a closed or narrow vocal tract at the end, not at the beginning, of a cluster. Clusters that had similar manners of articulation in both consonants of the cluster, either open or closed, were easy to acquire.

In languages other than English, word-final consonant clusters have been reported to be acquired before word initial clusters. For example, /nt/ was the first consonant cluster to be acquired by Mexican-Spanish children, as reported by Macken [42]. On the other hand, Powell [40] reported that the position in which the cluster occurred (i.e., word-initial versus word-final) was not a factor in the difficulty of the cluster for the 4- to 5-year-olds, but that it may be for younger children. Three element clusters were more difficult to produce than two element clusters.

Various interventions for phonological disorders in children have been developed and phonological contrast therapies are among the most widely used [24]. Such approaches include minimal [11], maximal [12, 13], and multiple opposition [14], as well as empty set [15]. Also, speech sound perception training is used to help a child to acquire a stable perceptual representation for the target phoneme or phonological structure, and auditory bombardment is a recommended procedure that targets speech perception of phonemes or phoneme combinations [12, 20].

Young [43] investigated the effects of phonological treatment on cluster reduction in children with articulatory errors. Participants displayed improved production accuracy within six or seven sessions. In 6-week follow-up sessions, participants demonstrated 60 to 100% accuracy on trained and untrained probes for targets. These results suggest that targeting consonant clusters using a phonological approach is effective.

Hoffman et al. [44] compared a whole language approach and minimal pairs approach to treating consonant cluster reduction in two 4-year-old brothers with noted phonological delays for three 50-min treatment sessions over 6 weeks. Post-treatment measures revealed that both participants improved in phonology and reduced consonant cluster reduction. However, the participant who was treated using the phonological approach had greater improvement in overall phonology and exhibited greater accuracy of consonant cluster productions than the child treated via the whole language approach.

In the current study, we compared pre- and post-therapy scores of 60 Egyptian Arabic-speaking children of typical language development, 30 receiving auditory bombardment and 30 receiving minimal contrast therapy. With minimal contrast therapy, participants generalized to untreated monosyllabic words during intervention and produced consonant clusters with 90–100% accuracy during treatment as well as in the maintenance follow-up session, which also resulted in an improvement in overall intelligibility. Additionally, production knowledge of /r/ in consonant clusters emerged in some participants. Minimal contrast therapy makes the child more aware of the phonemes within a syllable. These results provide additional evidence for the effectiveness of minimal contrast therapy in correcting consonant cluster errors, leading to generalization and maintenance for monosyllabic words.

Over the last 30 years, a number of studies have shown limited effectiveness of auditory bombardment for functional speech sound disorders, finding no significant difference between using auditory bombardment as an adjunct to conventional therapy and using conventional therapy alone [45,46,47].

The majority of the studies that addressed consonant cluster intervention indicated that treating consonant clusters requires about 6 weeks (3 to 9 h of therapy) to eliminate errors [43, 44, 48], although Baker and McLeod [49] reported that children may require a longer duration (7 to 9 weeks) to treat consonant clusters. The children in the current study acquired each target consonant cluster within 8 to 12 weeks for both minimal contrast therapy and auditory bombardment.

The variation of treatment effectiveness may be related to various parameters, including the number of treatment doses, the treatment intensity, and treatment duration [25].

In this study, the overall treatment duration was 12 weeks, for a total intervention duration of up to 24 to 34 days. This yielded a cumulative intervention intensity (dose number × dose frequency/day × total intervention duration in days) of mean = 556.00 for auditory bombardment and mean = 291.20 for minimal contrast therapy across children.

Plante et al. [46] reported that the treatment using auditory bombardment therapy was administered over a period of 5 weeks, for a total intervention duration of up to 25 days. This yielded a cumulative intervention intensity (dose number × dose frequency/day × total intervention duration in days) of 408 to 600 across children (M = 552). None of these treatment variables were significantly different between groups (two-tailed t test at p < .01).

In general, the intensity within research protocols for the conventional minimal pairs approach is 100 trials per session, with twice weekly sessions across approximately 18 sessions (if 1 h each) or 36 sessions (if 30 min each) for a total of 18 h, leading to a cumulative intervention intensity of 100 × 2 × 18–36 = 3600–7200 [50].

Our study shows that both therapy techniques are effective for cluster reduction remediation. There was no significant difference in total intervention duration between the two therapy techniques. However, cumulative intervention intensity differed highly significantly between them (p value < 0.001).

It is important to note that the influence of maturation of the subjects in this study on performance cannot be discounted. However, since improvement was observed for all children, it is unlikely that maturation alone was responsible. Most likely, there were positive effects from both maturation and the specific treatment procedures used.

Limitation

The present study provided valuable information about acquisition of consonant clusters and remediation of cluster reduction. Generalization to multisyllabic words or connected speech was not tested. Thus, further studies are needed to examine whether a phonological approach is effective for more complex speech stimuli.

Conclusion

The Egyptian Monosyllabic Consonant Cluster Test (EMCCT) is a valid and reliable assessment tool for identification of cluster reduction processes. Both minimal contrast and auditory bombardment approaches are applicable for cluster reduction remediation in Egyptian Arabic-speaking children.