Introduction

Background

Prosody comprises the melodic and rhythmic aspects of spoken language. It can be defined as “the supra-segmental features of speech that are conveyed by the parameters of fundamental frequency (F0), intensity, and duration” [1]. In combination, fundamental frequency, intensity, and duration form prosodic components, such as intonation, that serve various communicative functions. Intonation is the use of pitch in communication, and it is conveyed by pitch variation [1].

The perceptual prosodic cues are pitch, loudness, and duration. These cues are conveyed by their objective acoustic correlates, which are fundamental frequency (F0) expressed in Hz, intensity in dB, and the length of the acoustic intervals, respectively. Stress is realized as an increase in the three acoustic correlates of prosody during the production of a specific syllable, word, or phrase. Durational aspects of prosody are reflected in the rhythm and tempo of speech. Accordingly, prosodic components, such as rhythm, intonation, and stress, reflect the indispensable role of prosody in language, emotional expression, and communication [1, 2].

Despite recent progress in hearing aid and cochlear implant (CI) technology, the acoustic signal provided by these devices remains degraded. Cochlear implants allow the perception of more auditory information than is accessible through conventional hearing aids, an advantage that is reflected in the better language development and communication of CI users [3].

The literature reports that children with cochlear implants nowadays attain better scores in speech perception and production measures [4]. However, the acoustic signal provided by CIs remains inferior. The effects of the degraded signal are particularly evident in the difficulties encountered by CI users with temporal fine structure processing, with the perception of rapid variations in fundamental frequency and harmonics, and with the perception of intensity. Durational cues, on the other hand, are less affected, and the durational aspects of speech can still be adequately captured by CIs [2, 5, 6]. The deteriorated signal might hinder adequate speech comprehension and, in turn, the production of intelligible speech [7].

Children with CIs have difficulty discriminating between emotions, and they rely mainly on rhythmic cues and on following the speech in songs to perceive music, with significantly poorer performance than their normal-hearing peers [8,9,10].

The inadequate perception of music and emotions seen in CI recipients is explained by poor representation of F0. Encoding of F0 is limited by the number of electrodes of the CI and by the insertion depth, which might be limited in some electrode arrays, resulting in a lack of stimulation of the areas of the cochlea associated with lower frequencies [11]. Despite facing difficulties with pitch discrimination-based tasks, CI users are still able to successfully perform tasks with obvious F0-related cues, such as gender identification with typical male and female voices [11, 12].

Regarding linguistic prosody, perception of some aspects such as stress positions on syllables and words is difficult for CI users [13]. Research has extensively studied the speech of children with CIs, albeit focusing on the segmental aspects [14]. The suprasegmental aspects of speech of the hearing-impaired children with CIs have not been as thoroughly investigated, despite the evident role of prosody in language development and speech intelligibility [14].

Studies concerned with the expression of prosody in children with CIs have focused on specific features, and no comprehensive studies have covered the full range of prosodic features and components. In attempting to explain the inadequate use of prosody by children with CI, the literature suggests that inadequate prosodic expression might be attributed to the poorer perception of the fine spectral information that conveys the pitch and intensity of speech. Temporal cues reflecting duration, on the other hand, are believed to be better conveyed by CI devices [15, 16].

Prolonged exposure to a degraded speech signal leads CI users to resort to alternative methods for better perception of prosodic cues. CI recipients rely on the available durational cues and, to some extent, intensity cues to access meaning provided by prosodic cues [6].

Comparisons of the production of speech prosody by CI users and their normal-hearing peers show that CI children perform poorly on several prosodic aspects [17]. Regarding speech rate, CI subjects produce longer vowel durations, longer words, intra-syllabic pauses, and longer pauses between words and sentences. This results in an overall slower rate, which is characteristic of their speech [18, 19].

Acoustic analysis of pitch parameters in CI children shows higher values of F0, in addition to a restricted pitch range, which results in monotonous speech production. Other studies show excessive intonation and instability resulting from excessive pitch variations. Shallow declination of fundamental frequency at the end of utterances and a poor contrast between statements and questions are also reported [20,21,22,23,24]. Peng et al. (2008) [25] elicited the production of interrogative and declarative contrasts and showed that CI children made less distinction between the two types of utterances and produced inappropriate intonation contours.

Moreover, difficulties in loudness control are detected in prelingually deaf individuals with CI, such as inappropriate increases or decreases in loudness level and excessive amplitude variability [24, 26].

Several studies of F0, intensity, and duration of vowels produced by children with CI and normal-hearing children concluded that children with CI demonstrate a delay in the acquisition of the stress pattern of the language and produce a restricted pitch range. Additionally, CI children show less prominent lexical stress patterns compared to their normal-hearing peers [27]. Atypical stress patterns are attributed to inappropriate control of duration, intensity, and pitch parameters [4].

Lyxell et al. (2009) [17] examined the prosody production of 34 children with CI from the ages of 5 through 13 years and reported lower scores achieved by CI children than normal-hearing peers in the examined prosodic features on the word and phrasal level. Lenden and Flipsen (2007) [4] found significant deviations in the production of prominence and resonance by 6 children with CI in addition to problems with intensity, speech rate, and voice quality. Phrasing and pitch deviations were inconsistent.

Affect perception and production by CI individuals is of particular interest due to its social and communicative implications. All the previously mentioned prosodic features contribute to the appropriate expression of different emotions. Studies have shown that CI children find it difficult to make use of different intonational patterns to mimic different emotions [28]. Expression of affect in Arabic-speaking children with hearing impairment improved with prosodic training in a study by Sobhy et al. (2021), although that study was limited by the absence of normal-hearing controls [25].

Arabic prosody resembles that of other languages in the manner of stress production. Arabic stress is conveyed by increasing the intensity, duration, and pitch of the stressed syllable (the minimal Arabic prosodic unit). Rules regulate stress placement, making the stress pattern of Arabic words predictable from the type and number of syllables in the word, with the penultimate syllable being the most common position for word stress [29].

However, Arabic stress is distinguished from that of some languages, such as tonal languages, in that changing the word’s stress pattern does not change the word’s meaning or its morphological category [30, 31]. All Arabic dialects are intonational, exhibiting differences in meaning at the sentence level, i.e., the post-lexical level. In terms of speech rhythm, Arabic is a stress-timed language, where stressed syllables placed at regular intervals are responsible for creating the rhythmic beat [2].

Research on prosody has been predominantly concerned with prosody in English. There is a paucity of literature examining prosody in Arabic speakers. Furthermore, prosody production by Arabic-speaking CI children has received even less attention. To the best of our knowledge, there are, for example, only two studies of the production of suprasegmental features in Egyptian Arabic-speaking hearing-impaired children, and both were limited to examining the effect of a prosodic intervention on the prosodic production of children with hearing impairment, with no reference to a normal-hearing control group [25, 32]. Where methodology is concerned, there has been less reliance on objective assessment, with subjective measures prevailing in the assessment of the prosody of CI children.

Objectives

The current study aims at the following:

  1. Objectively analyzing the prosodic production of children with CI, thereby addressing limitations in the current literature by studying Egyptian Arabic-speaking CI children’s production of prosody using objective measures.

  2. Comparing objective parameters of prosody between hearing-impaired CI children and their normal-hearing peers.

On the basis of the earlier review, the following hypothesis was tested: Prosodic features differ significantly between CI children and their normal-hearing peers.

Methods

The research was conducted between September 2018 and July 2021 after approval by the ethics committee (IRB No: 00007555).

Study participants

Study participants were divided into two groups as follows:

  • Case group: Included 30 hearing-impaired CI verbal children aged from 6 to 10 years

  • Control group: Included 30 normal-hearing children within the same age range and normal language development

Children who met the following inclusion criteria were included in the study:

  1. School-age children from the age of 6 through 10 years with normal hearing and language development

  2. School-age CI children using auditory-verbal mode of communication with a sufficient aided response with their CIs (pure-tone threshold with CI less than 40 dB across all frequencies)

  3. CI children with expressive language abilities that enable them to produce sentence length of at least 5–6 words

Children with other disorders affecting prosody such as childhood apraxia of speech, dysarthria, or intellectual disability were excluded. Prior to the assessment, informed consent was taken from the parents of all the children participating in the study.

The sample size was calculated using two proportions power analysis in NCSS and PASS program, for a minimum sample size required to detect a difference of 0.5 that achieves 80% power with a target significance level at 5%. The sample size calculated was 30 participants per group.

Evaluation of prosody

Evaluation of prosodic skills was done using selected tasks from the Arabic Prosodic Assessment Tool (APAT) [33]. The objective analysis tasks selected were real-time pitch, generic syllabic rate, and intonation stimulability. The signals were recorded in an acoustics lab in the phoniatrics unit via a microphone placed at a distance of 10 cm from the participant, and both groups’ speech samples were analyzed using the real-time pitch software and the motor speech profile (advanced version) software. The intonation stimulability and generic syllabic rate protocols of the motor speech profile software were used to examine intonation and durational aspects of speech, respectively.

The assessment protocol proceeded as follows:

  1. Evaluation of pitch and energy using real-time pitch (Model 5121-Kay Pentax) as follows: The pitch range protocol required the participant to produce continuous phonation of the vowel /a/ over a period of 15 s, starting at a convenient pitch, gradually increasing to the highest pitch, and then gradually decreasing towards the lowest pitch that could be produced by the participant. By analysis of the recorded sample, the following parameters were obtained: mean frequency, mean fundamental frequency (Fo), pitch range, minimum frequency, maximum frequency, and standard deviation. Additionally, semitone range, semitones, mean Fo semitones, and standard Fo semitones were also obtained. Moreover, we also assessed mean energy dB, range, minimum energy, maximum energy, and standard deviation.

  2. Motor speech profile using advanced version (Model 5141-Kay Pentax) as follows:

    (A) Intonation stimulability protocol was used to assess intonation. Participants were required to imitate the production of a pre-recorded speech sample /Ɂentæ mesæfer Ɂelnæhærda wællæ bokra?/ (Are you traveling today or tomorrow?). Analysis and computation of the recorded speech sample yielded the following parameters: running speech F0 (rF0) in Hz, running speech pitch period (rT0) in ms, highest fundamental frequency (rFhi) in Hz, lowest fundamental frequency (rFlo) in Hz, standard deviation of F0 (rSTD) in Hz, frequency variability (rvf0) in %, and amplitude variability (rvAm) in %.

    (B) Generic syllabic rate protocol was used to assess duration and pause. Participants were required to repeat a sentence (composed of 11 syllables) at a comfortable level: /Ɂænæ rægæ3t bælædi Ɂembæreħ/ (I returned to my country yesterday). After recording the utterance, the clinician placed tags in appropriate positions to signal the beginning and the end of each syllable, followed by automatic computation to obtain the following parameters: average syllabic rate (SL rate) per second, average syllabic duration (SL sdur) in ms, average pause duration (SL pdur) in ms, percent speaking time (SL spk%), and percent pause time (SL pau%).
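The kinds of parameters listed in the protocol above can be illustrated with a minimal Python sketch. It is not the computation performed by the Kay Pentax software, whose exact formulas are not given here. The function names `pitch_summary` and `syllable_timing`, the input formats, and the definitions used (semitone range as 12·log2 of the maximum-to-minimum frequency ratio, frequency variability as standard deviation over mean, pauses as gaps between consecutive syllable tags) are assumptions based on common textbook definitions, and the example values are invented.

```python
import math
import statistics


def pitch_summary(f0_hz):
    """Summarize a voiced F0 contour in Hz (unvoiced frames already removed)."""
    lo, hi = min(f0_hz), max(f0_hz)
    mean = statistics.fmean(f0_hz)
    sd = statistics.stdev(f0_hz)
    return {
        "mean_hz": mean,
        "min_hz": lo,
        "max_hz": hi,
        "range_hz": hi - lo,
        "sd_hz": sd,
        # Assumed definition: 12 semitones per doubling of frequency.
        "semitone_range": 12 * math.log2(hi / lo),
        # Assumed definition: coefficient of variation, in percent.
        "variability_pct": 100 * sd / mean,
    }


def syllable_timing(tags):
    """Timing measures from (onset_s, offset_s) syllable tags in time order."""
    total = tags[-1][1] - tags[0][0]           # utterance duration in s
    speech = sum(off - on for on, off in tags)  # summed syllable durations
    # A pause is any positive gap between one syllable's end and the next start.
    pauses = [
        nxt_on - off
        for (_, off), (nxt_on, _) in zip(tags, tags[1:])
        if nxt_on > off
    ]
    return {
        "rate_per_s": len(tags) / total,
        "mean_syllable_ms": 1000 * speech / len(tags),
        "mean_pause_ms": 1000 * sum(pauses) / len(pauses) if pauses else 0.0,
        "speaking_pct": 100 * speech / total,
        "pause_pct": 100 * (total - speech) / total,
    }


# Invented example values, for illustration only.
example_f0 = [220.0, 247.0, 262.0, 294.0, 330.0, 440.0]
example_tags = [(0.0, 0.2), (0.25, 0.45), (0.5, 0.7), (0.8, 1.0)]

print(pitch_summary(example_f0)["semitone_range"])  # 12.0 (one octave)
print(syllable_timing(example_tags))
```

Expressing the pitch range in semitones rather than Hz makes ranges comparable across speakers with different mean F0, which is one reason both forms are usually reported.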

Statistical methodology

IBM SPSS software package version 24.0 was used for statistical analysis of the obtained data. Normally distributed quantitative data were described using mean and standard deviation. The independent-samples t-test was used to compare the two independent groups on normally distributed variables. Significance test results were quoted as two-tailed probabilities. The 5% level was considered for the significance of the results.
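As an illustration of the independent-samples t-test described above, the following Python sketch computes the test statistic from summary statistics. It is not the SPSS procedure itself. It assumes equal variances (the pooled Student form), uses a hypothetical helper named `pooled_t`, and hard-codes the two-tailed 5% critical value for 58 degrees of freedom from standard t tables. The inputs are the group age means and standard deviations reported in the Results, chosen only because they are the summary statistics available in the text.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's independent-samples t statistic from summary statistics.

    Assumes equal population variances (pooled variance estimate).
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Age in years: CI group (8.28 ± 1.39, n = 30) vs control group (7.97 ± 1.09, n = 30).
t_stat, df = pooled_t(8.28, 1.39, 30, 7.97, 1.09, 30)

# Two-tailed 5% critical value of t for df = 58 (assumption: taken from tables).
T_CRIT_DF58 = 2.002
significant = abs(t_stat) > T_CRIT_DF58

print(f"t({df}) = {t_stat:.2f}, significant at 5%: {significant}")
```

With these inputs the statistic is about 0.96, below the critical value, so this sketch would not flag an age difference between the groups at the 5% level.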

Results

Participant demographics

Data from 60 participants were analyzed in the current study: 30 CI children and 30 normal-hearing, age-matched peers in the age group of 6 to 10 years. The ages in the case group ranged from 6 years to 10 years and 4 months, with a mean of 8.28 ± 1.39 years, while the ages in the control group ranged from 5 years and 6 months to 10 years, with a mean of 7.97 ± 1.09 years. The case group comprised 15 males (50%) and 15 females (50%), and the control group comprised 14 males (46.7%) and 16 females (53.3%).

All CI children had prelingual hearing impairment and bilateral profound SNHL in unaided conditions. The duration of CI use ranged from 2 to 6 years. The mean age at cochlear implantation for the case group was 4.82 ± 1.24 years.

Descriptive analysis indicated matching between the two studied groups with respect to their demographic data and history.

Real-time pitch

Real-time pitch parameters are presented in Table 1. A statistically significant difference was found between both groups regarding the following pitch parameters: pitch range (p = 0.001); maximum frequency (p = 0.027); standard deviation of frequency (p = 0.001); and semitone range (p = 0.001). No statistically significant difference was found in mean frequency, fundamental frequency range, or minimum frequency.

Table 1 Comparison between the two groups regarding real-time pitch software

Regarding the assessed energy parameters, a statistically significant difference was found in mean energy (p = 0.001) and maximum energy (p = 0.001). No statistically significant difference was found between normal-hearing children and hearing-impaired children using CI with regard to energy range, minimum energy, and standard deviation.

Intonation stimulability

Intonation stimulability parameters are shown in Table 2. No statistically significant difference was detected between both groups in the assessed parameters, rF0 (p = 0.559), rT0 (p = 0.413), rFhi (p = 0.093), rFlo (p = 0.189), rSTD (p = 0.165), rvf0 (p = 0.331), and rvAm (p = 1.441).

Table 2 Comparison between the two groups regarding intonation stimulability protocol

Generic syllabic rate

Generic syllabic rate parameters are presented in Table 3. A statistically significant difference was found between normal-hearing and CI children in all of the assessed parameters of the generic syllabic rate task: SL rate (p = 0.01), SL sdur (p = 0.001), SL pdur (p = 0.0001), SL spk% (p = 0.02), and SL pau% (p = 0.022).

Table 3 Comparison between the two groups regarding generic syllabic rate

Discussion

The syntactic, pragmatic, affective, and indexical roles of prosody are undoubtedly important contributors to effective communication [34]. Our objective in this study was to compare the acoustic parameters of prosody in Arabic-speaking Egyptian children with CI and their normal hearing (NH), age-matched peers, to detect the specific objective measures that differentiate children with CIs from those with NH, and to detect acoustic parameters that could serve as rehabilitation targets for CI children to improve their speech prosody.

Previous studies have mainly focused on the comparison between normal hearing and CI children regarding prosody perception, with less focus on prosody production [35]. Moreover, compared to other languages, Arabic prosody analysis has not received significant attention.

The results of objective analysis in the real-time pitch task demonstrated a statistically significant difference between CI and NH children regarding pitch range, maximum frequency, standard deviation of frequency, and semitone range. The obtained results might be attributed to the poor perception of frequency-specific information with CIs in comparison with normal hearing. As a result, deficient speech input might have a negative impact on the adequate control of pitch production. Thus, poor pitch perception may hinder the adequate production of speech components that require adequate pitch modulation. Furthermore, children with CI might not be able to master the coordination between the auditory, somatosensory, and motor-control subsystems necessary for adequate production of prosody. This might be another consequence of the poor perception of frequency-specific information with CIs. These findings regarding pitch parameters are consistent with previous studies that demonstrated a restricted range of frequency in hearing-impaired children that resulted in monotonous speech. Inappropriate decline of F0 at the end of utterances, in addition to poor statement-question contrast, has also been reported [20,21,22,23,24]. Lee and Sim (2020) showed that children with CIs tended to produce statement and question utterances less distinctively than children with NH [36]. The difficulties in F0 alteration in these studies were attributed to poor breath control, fewer syllables per breath, or lower lung volumes. Pettinato et al. (2017) [27] reported that children with CI showed a narrower pitch range than NH children during the production of disyllabic sequences.

No statistically significant difference was detected between the two groups in any of the parameters of the intonation stimulability task. This might be attributed to the inconsistent sensitivity and specificity of the intonation stimulability task in the assessment protocol, which led to results that were not statistically significant. Similar results were obtained when this task was used in other studies, such as Sobhy et al. (2021) [32], which also showed no significant difference in the parameters measured by the intonation stimulability task, despite the presence of significant differences in pitch parameters measured by other tasks.

Concerning the energy parameters evaluated in the current study, a statistically significant difference was found in mean energy and maximum energy. No statistically significant difference was found between NH and CI children as regards energy range, minimum energy, and standard deviation. This could be explained by inadequate respiratory control, abnormal speech respiratory patterns, and inadequate control of laryngeal functions exhibited by children with hearing impairment. Additionally, inadequate auditory feedback contributes to poor and inappropriate control of loudness levels in children with CI. This is consistent with previous reports on loudness parameters in the voice of prelingual hearing-impaired children that included inappropriately high or low loudness levels and excessive amplitude variability [24, 26].

A statistically significant difference was found between NH and CI children in all of the assessed parameters of the generic syllabic rate task: SL rate, SL sdur, SL pdur, SL spk%, and SL pau%. The results of the current study thus demonstrated prominent differences between the studied groups with regard to duration parameters. This could be attributed to the inability of children with CI to maintain a suitable speech rate and rhythm, with a tendency to produce a significantly slower rate of speech as a result of the lengthening of phonemes and, in turn, syllables. Inadequate breath control also contributes to the presence of pauses between words and syllables. The speech of those with HI tends to be produced as disparate words, as opposed to the continuous flow of NH speakers; this results in an overall slower rate of speech, a longer duration of syllables, words, and sentences, and an inappropriate speech rhythm. This is in line with studies that demonstrated that CI subjects have a distinctly slower speech rate due to longer vowel duration, longer words, intra-syllabic pauses, and longer pauses between words and sentences [18, 19].

The speech of hearing-impaired children is reported in the literature to be prolonged, with longer vowel durations than in the speech of normal-hearing children. Inappropriate duration of speech conceivably leads to inappropriate speech intonation, because duration is one of the main features of prosody [37, 38]. This might have implications for speech intelligibility, as slower speech rates have previously been linked to poor speech intelligibility [18, 39].

A further explanation of the limited prosodic abilities in children with CIs is the period of early deprivation of auditory input prior to implantation which is critical for the timely development of adequate receptive and expressive speech and language skills [5, 23]. A considerable amount of research shows that newborns already have established considerable knowledge of prosody, and this is plausibly connected to their prenatal exposure to the native tongue [3]. The prenatal speech signal helps in forming the infants’ perception abilities as they perceive speech signals carrying information about vocalic segments of their native language. These signals convey information about pitch, intensity, and duration signaling prosody [40].

Conclusions

The prosody of speech in CI children is significantly different from the prosody of those with normal hearing in terms of pitch, loudness, and duration parameters. Objective assessment of prosody serves as a method of distinguishing the speech of CI children from that of children with normal hearing and could be used to specify rehabilitation goals to improve speech prosody. Objective measures of prosody could also be used to track improvement in speech prosody, or lack thereof, following speech rehabilitation. Based on our findings and the findings of previous studies, the intonation stimulability task might not be sufficiently sensitive to assess speech intonation adequately; it is preferable to use the dynamic pitch range profile and the speech rate parameters in the generic syllabic rate task when using the prosodic assessment protocol, as abnormalities in these measures reflect abnormal stress and intonation patterns.

Study limitation

The results of our study should be interpreted in light of some limitations. It would be useful for future research to compare objective parameters to subjective assessment measures. Additionally, it is recommended to assess the linguistic and phonological production skills of CI children and to correlate the results with their prosodic skills, given that prosody has an important role in language development and speech intelligibility. Moreover, the assessment of prosody in Arabic-speaking patients with other communication disorders, such as apraxia of speech and stuttering, is recommended to obtain a prosodic profile for patients with these disorders.