Amplitude envelope perception, phonology and prosodic sensitivity in children with developmental dyslexia
- Cite this article as:
- Goswami, U., Gerson, D. & Astruc, L. Read Writ (2010) 23: 995. doi:10.1007/s11145-009-9186-6
Here we explore relations between auditory perception of amplitude envelope structure, prosodic sensitivity, and phonological awareness in a sample of 56 typically-developing children and children with developmental dyslexia. We examine whether rise time sensitivity is linked to prosodic sensitivity, and whether prosodic sensitivity is linked to phonological awareness. Prosodic sensitivity was measured by two reiterant speech tasks modelled on Kitzen (2001). The children with developmental dyslexia were significantly impaired in the reiterant speech tasks and in the phonological awareness tasks (onset and rime awareness). There were significant predictive relations between basic auditory processing of amplitude envelope structure (in particular, rise time), prosodic sensitivity, phonological awareness, reading, and spelling. The auditory processing difficulties that characterise children with developmental dyslexia appear to impair their sensitivity to phrase-level prosodic cues such as metrical structure as well as to phonology, but in this study phonological and prosodic sensitivity made largely independent contributions to reading.
The phonological difficulties experienced by children with developmental dyslexia are considered a cognitive hallmark of this neurodevelopmental condition across languages (Snowling, 2000; Ziegler & Goswami, 2005). Recently, there has been renewed interest in the possibility that impaired perceptual processing of the auditory speech signal underlies the phonological deficit in developmental dyslexia (e.g., Goswami et al., 2002). Following extensive investigation of the possibility that impaired rapid auditory processing explains the phonological difficulties experienced by children with dyslexia (Boets, Ghesquiere, van Wieringen, & Wouters, 2007; McArthur & Bishop, 2001; Tallal, 1980, 2004), the experimental focus has been shifting to accurate perception of the speech envelope and of the slower amplitude-driven modulations that are important for speech intelligibility. The most consistently-impaired auditory parameter in developmental dyslexia has been found to be amplitude envelope onset (rise time) or its correlate, amplitude modulation depth (Corriveau, Pasquini, & Goswami, 2007; Goswami et al., 2002; Hämäläinen, Leppänen, Torppa, Muller, & Lyytinen, 2005; Hämäläinen, Salminen, & Leppänen, 2009; Lorenzi, Dumont, & Fullgrabe, 2000; Muneaux, Ziegler, Truc, Thomson, & Goswami, 2004; Pasquini, Corriveau, & Goswami, 2007; Rocheron, Lorenzi, Fullgrabe, & Dumont, 2002; Richardson, Thomson, Scott, & Goswami, 2004; Thomson & Goswami, 2008; Thomson, Fryer, Maltby, & Goswami, 2006).
Indeed, in a recent meta-analysis, Hämäläinen et al. (2009) reported that 100% of studies using rise time measures with dyslexic participants had found a relationship between rise time sensitivity and reading, with a median effect size (Cohen’s d) of 1.0.
Rise times in the speech envelope are correlated with the onsets of syllables, particularly stressed syllables. Difficulties in rise time perception (the rate of change of the amplitude envelope at onset) should therefore affect syllabic segmentation and accurate perception of the components of the syllable, for example syllable onset and nucleus (Greenberg, 1999, 2006; Greenberg, Carvey, Hitchcock, & Chang, 2003). When a syllable begins with a sonorant sound, such as /w/, it will have a more gradual onset and therefore a gentler rise time than a syllable that begins with a plosive sound, such as /b/. Syllables that begin with plosive sounds will have very rapid onsets and therefore sharp rise times. Rise time is also correlated with vowel onset. Therefore, impaired perceptual sensitivity to auditory parameters like rise time is likely to be associated with impaired phonological awareness of syllables and their onset-rime constituents (Goswami et al., 2002). We have recently demonstrated that impaired sensitivity to rise time is also associated with what classically has been termed phonetic discrimination. We found that children with developmental dyslexia were not impaired at discriminating a synthetic sound like /ba/ from a sound like /wa/ when the phonetic discrimination depended on the rate of formant frequency change. In fact, they were more sensitive than age-matched controls. However, when the synthetic ba-wa distinction instead depended on the rate of change of rise time, children with developmental dyslexia were significantly impaired at discriminating /ba/ from /wa/ (Goswami, Fosker, Huss, Fegan, & Szűcs, 2008). Therefore, rise time perception affects phonetic discrimination as well as the child’s awareness of larger phonological grain sizes like syllables and rimes.
Classically, rise time has been most closely associated with the perceptual experience of speech rhythm and stress (Hoequist, 1983; Scott, 1998). This suggests that impaired amplitude envelope perception should reduce sensitivity to speech prosody and rhythm (Corriveau et al., 2007). Prosody is a term used in linguistic theory to cover all aspects of grouping, rhythm and prominence, from sub-parts of the syllable up through the organisation of words in the phrase (Lehiste, 1970; Pierrehumbert, 2003). Indeed, many linguists propose that the units of prosodic organisation are arranged into a hierarchical structure, so that, for instance, syllables form feet (a strong syllable, and one or more weak syllables), feet form words, and words form different types of phrases (Nespor & Vogel, 1986). The prototypical foot is a structure with two syllables, a stressed syllable (strong) and an unstressed syllable (weak, e.g., “doctor”, “baby”). The distribution of stress in an utterance is governed by metrical “rules” (e.g., Hayes, 1995). At the level of the auditory signal, these aspects of grouping, rhythm and prominence are correlated with pauses (gaps) and with changes in fundamental frequency, duration and amplitude. Classical theories (e.g., Fry, 1954) accorded fundamental frequency the key role in stress perception, with duration and intensity (amplitude) playing secondary roles. Recent investigations using natural speech have shown that amplitude and duration cues apparently play a stronger role in prosodic prominence than fundamental frequency (Choi, Hasegawa-Johnson, & Cole, 2005; Greenberg, 1999; Kochanski, Grabe, Coleman, & Rosner, 2005).
Reduced sensitivity to prosodic and rhythmic cues could also affect the development of phonological awareness. For example, recent theories of developmental phonology have afforded a much greater role to prosodic sensitivity in explaining phonological development (Gerken, 1994; Pierrehumbert, 2003; Vihman & Croft, 2007). One recent theoretical perspective is that phonological development depends on the child storing language-specific phonotactic templates, or prosodies (e.g., CVCV, VCV), based on their specific experiences of adult input and their own babbling practices (Vihman & Croft, 2007). These templates are effectively high-dimensional spectro-temporal auditory patterns (described as ‘rich phonology’ in Port’s theory of language representation, see Port, 2007). Such high-dimensional auditory patterns would correspond to the speech envelopes for different phonotactic templates. Vihman and Croft argued that their “radical template phonology” was the same as Pierrehumbert’s (2003) notion of an early-acquired “prosodic structure”. Pierrehumbert (2003) adopted a bottom-up statistical approach to language acquisition, proposing a model based on the infant acquiring complex language-specific exemplars from the input that were stored in rich phonetic and prosodic detail. Pierrehumbert dismissed the classical notion of a universal inventory of phonemes or phones, according to which rapid changes in frequency and intensity (formants) are the acoustic correlates of phonemes (Blumstein & Stevens, 1981). She argued instead that phonetic perception is dependent on the prosodic context. At the auditory level, the prosodic context is amplitude envelope structure. Pierrehumbert proposed that as prosodic encoding is learned concurrently with segmental encoding, the resultant infant templates would be relatively coarse-grained (i.e., reflecting syllable onsets and rimes). 
Consistent with this theoretical position, experimental studies indeed show that infants are sensitive to both stress templates and segmental transitions (e.g., Curtin, Mintz, & Christiansen, 2005; Mattys & Jusczyk, 2001). Similarly, phonological awareness prior to literacy acquisition is based on the coarser grain sizes of syllables, onsets, and rimes (Ziegler & Goswami, 2005).
With respect to basic auditory processing of the amplitude envelope, the template perspective would suggest that the early acquisition of such “prosodies” should vary with the quality of rise time perception. The dyslexic child’s reduced sensitivity to the auditory structure of the amplitude envelope should therefore impair their prosodic sensitivity. Dyslexic children’s impaired phonological awareness of linguistic units like onsets and rimes may either accompany this impaired prosodic sensitivity, or be causally related to it, consistent with Pierrehumbert’s (2003) proposal for infancy. If prosodic phonology is further allowed a role in syntactic development, as proposed by Gerken and her colleagues (e.g., Gerken, 1994; Gerken & McGregor, 1998), then impaired basic auditory processing of amplitude envelope cues could also play a role in speech and language impairments (see Corriveau & Goswami, 2009; Corriveau et al., 2007, for relevant data).
To date, the contributions made by prosodic sensitivity to reading development have largely been explored from the perspective of reading fluency and reading comprehension rather than phonological awareness and decoding (e.g., Goetry, Wade-Woolley, Kolinsky, & Mousty, 2006; Schwanenflugel, Hamilton, Kuhn, Wisenbaker & Stahl, 2004; Wade-Woolley & Wood, 2006; Whalley & Hansen, 2006). For example, Miller and Schwanenflugel (2008) suggested that prosodic features in reading aloud such as appropriate phrasing, intonation and stress reflected the otherwise invisible process of reading comprehension. Dysfluent reading, marked by long pausing, was more likely related to word decoding difficulties. Whalley and Hansen (2006) also suggested that prosodic sensitivity should be important for reading comprehension. They pointed out that prosodic structure highlights important semantic and pragmatic information at the sentence level. Only a minority of studies have considered links between prosodic sensitivity, phonological awareness and decoding (Holliman, Wood, & Sheehy, 2008; Wood, 2006; Wood & Terrell, 1998). For example, Wood (2006) found that 9% of unique variance in rhyme awareness in a sample of 31 children aged 5–7 years was predicted by a metrical stress sensitivity task in which the “wrong” syllable was stressed (“soFA” instead of “SOfa”). Holliman et al. (2008) found that the same metrical stress task accounted for unique variance in reading ability (3.8%) in a new cohort of 5- and 6-year-olds, even when rhyme and phoneme awareness were both controlled using multiple regression techniques. Wood (2006) suggested that it might be easier to become aware of phonemes in stressed syllables. This would be consistent with an amplitude envelope perspective, as the rise times and the depth of amplitude modulation in stressed syllables will be greater than in unstressed syllables.
A different method for studying the role of prosodic sensitivity in reading comprehension is found in a study by Whalley and Hansen (2006). They used two tasks to assess prosodic sensitivity, drawn from Kitzen (2001). One was a compound noun task (in which a change in stress changed meaning, e.g., HIGHchair vs. HIGH CHAIR) and one was a “DeeDee” task (in which each syllable in a familiar phrase was replaced with the syllable “Dee”, discussed in detail below). Whalley and Hansen showed that prosodic sensitivity was indeed associated with reading comprehension, but also with reading accuracy, in a group of typically-developing 9-year-old children. The associations were significant even when phonological awareness was controlled using multiple regression techniques. Whalley and Hansen further demonstrated that prosodic sensitivity contributed 5% of unique variance to reading comprehension after phonological awareness and decoding accuracy were controlled (this relationship was found for the DeeDee task only; the compound noun task was not a significant predictor). Whalley and Hansen (2006) argued that the DeeDee task captured the prosodic skills that were important for reading comprehension.
Whalley and Hansen’s (2006) version of the DeeDee task was adapted from a ground-breaking Ph.D. dissertation by Kitzen (2001). Kitzen developed the reiterant speech technique (Nakatani & Schaffer, 1978) for use with dyslexic participants. In reiterant speech, each syllable in a word is converted into the same syllable (here DEE), thus removing most phonetic information while retaining the stress and rhythm patterns of the original words and phrases. Kitzen converted film and story titles into “DeeDees”, so that (for example) “Casablanca” became DEEdeeDEEdee (STRONG weak STRONG weak). Adolescent participants with dyslexia heard a tape-recorded DeeDee sequence while viewing three alternative (written) choices, for example “Casablanca”, “Omega Man” and “The Godfather”. Kitzen reported that the participants with dyslexia were significantly poorer in the DeeDee task than age-matched controls. Performance in the DeeDee measure was significantly associated with syllable and phoneme segmentation skills, word reading abilities and reading comprehension. In logistic regression analyses carried out to predict group membership (dyslexic vs. control), the DeeDee measure was a highly significant predictor of group status (along with syllable segmentation and rapid naming measures). All three measures together predicted group membership with 97% accuracy (phoneme segmentation was not a significant predictor). Interpretation of these findings is hampered by the fact that the DeeDee task involved written stimuli. Nevertheless, Kitzen (2001) argued that it was time to broaden dyslexia research from a focus on “phonemic phonology” to incorporate “prosodic phonology”.
As a first attempt at this, we set out here to use the reiterant speech task to explore associations between prosodic sensitivity, phonological awareness, and basic auditory processing in younger dyslexic children. We created two novel measures of prosodic sensitivity for use in the current study. Both used stimuli recorded by an adult female speaker of Southern British English and were digitised for computerised presentation to ensure consistency. In the first measure, based on celebrity names, the adult “spoke” the words or phrases in “DeeDees”, as in Whalley and Hansen (2006). The “DeeDees” therefore retained the metrical (phrase level) structure of the original words or phrases. The second measure removed this information by utilising four synthesised tokens, “DEE” and “dee” in initial and final position, that were combined in the appropriate strong–weak syllable sequence to reflect the chosen stimuli. This task was based on film and book titles, essentially providing a synthetic version of Whalley and Hansen’s task based on stress, so that “Harry Potter” became [DEE dee DEE dee]. Both measures were expected to tap children’s prosodic sensitivity, and to be more difficult for children with developmental dyslexia. In addition, we expected that the accuracy of children’s rise time perception would be associated with their performance in the DeeDee tasks. Based on our prior studies, we also expected that basic auditory processing of amplitude envelope structure would be related to phonological awareness. To examine this, we included a range of auditory processing measures. Finally, we expected that performance in the DeeDee tasks would be related to both phonological awareness and reading abilities.
Fifty-six children aged between 8 and 15 years participated in this study. These children had been taking part in a longitudinal study of developmental dyslexia (for group performance and concurrent relations with phonology and reading in phase 1, see Thomson & Goswami, 2008). Only children who had no diagnosed additional learning difficulties (e.g., dyspraxia, attention-deficit/hyperactivity disorder, autistic spectrum disorder, speech and language impairments), a nonverbal IQ above 80, and English as the first language spoken at home were included. All participants received a short hearing screen using an audiometer. Sounds were presented in both the left and right ear at a range of frequencies (250, 500, 1,000, 2,000, 4,000, 8,000 Hz), and all subjects were sensitive to sounds within the 20 dB HL range.
Table 1. Participant characteristics by group. Measures listed: Age in months; Reading BAS SS; Reading ability score; Spelling BAS SS; Spelling ability score; TOWRE SWE SS.
Thirty-six typically-developing control children were included from local schools. Of these, 21 were chronological age-matched controls (CA group; 7 male, 14 female; mean age 12;0, s.d. 14 months; retained from an original group of 23 children) and 15 were originally reading level matched controls (RL group; 9 male, 6 female; mean age 9;3, s.d. 20 months; retained from an original group of 21 children). The RL group was still matched to the children with dyslexia when raw scores on the tests of reading and spelling were compared, as there were no significant differences in raw scores by group. These scores are also shown in Table 1. The relatively small number of RL children retained and the large reading deficits shown by the children with dyslexia meant that in the sample as a whole, age was negatively correlated with reading standard score, r = −.31, p = .02. However, the BAS ability scores and the TOWRE raw scores were positively correlated with age (e.g., BAS reading ability and age, r = .35, p = .009). The ability scores rather than the standard scores are used in the analyses reported below; essentially the same pattern of results was found when standard scores were used in place of ability scores.
Standardized ability tests
All children had completed four subscales of the Wechsler Intelligence Scale for Children in an earlier phase of the study (WISC-III; Wechsler, 1992): Block Design, Picture Arrangement, Similarities and Vocabulary (these four scales yield a short-form IQ). Literacy and number skills were re-assessed at the current test point using the British Ability Scales Reading, Spelling, and Mathematics subtests (Elliott, Smith, & McCulloch, 1996), along with the real word and nonword subtests of the Test of Word Reading Efficiency (TOWRE; Torgesen, Wagner, & Rashotte, 1999). A measure of receptive vocabulary, the British Picture Vocabulary Scales, was also re-administered (Dunn, Dunn, Whetton, & Pintillie, 1982).
Phonological awareness measures
Two oddity tasks were administered, using digitised speech created from a native female speaker of standard Southern British English (rhyme oddity) and a native female speaker of Spanish (Spanish onset oddity). In each case, the children listened to sets of three words and had to select the nonrhyme (e.g., gap, nap, Jack) or the item with a different onset (e.g., bir, sal, boz). We included a Spanish phonological awareness task because we were concerned that the rhyme oddity measure (also administered when the study began) might show ceiling effects; however, this was not the case. The children listened to the words through headphones and their responses were recorded using a minidisc recorder. Three different orders of trial presentation were used, counterbalanced across children, and practice trials were always given.
Prosodic sensitivity (DeeDee) tasks
Children’s familiarity with the target stimuli was first ascertained. Together with the experimenter, the children looked at a booklet of pictures that represented the different names and phrases being used, and named those that they knew. Performance in the DeeDee tasks was then scored on the basis of prior familiarity with the target phrase (e.g., if a child did not name Harry Potter correctly, then his/her response on the Harry Potter trial was discounted for the purposes of analysis). The DeeDee tasks were delivered on the computer, with the child listening through headphones in a two alternative forced choice paradigm. The child saw the picture representing the target phrase, and then pressed a button to listen to two DeeDee phrases, approximately matched for overall number of syllables. We did not use exactly the same targets in the two tasks as this was not motivating for our relatively old participants. The phrases were presented consecutively with a separation of 2 s. One of the phrases matched the target picture, and the child’s task was to choose the DeeDee sequence that they thought matched the picture. Practice trials were always given prior to the commencement of the task, using novel targets. All trials used are shown in the Appendices.
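The familiarity-based scoring rule described above can be illustrated with a short sketch. This is not the scoring script used in the study: the function name, the dictionary keys and the example responses are all hypothetical, and the sketch assumes that the score is the percentage correct over only those trials whose target the child had named beforehand.

```python
def deedee_percent_correct(trials):
    """Return % correct over trials whose target the child named beforehand.

    Each trial is a dict with hypothetical keys:
      'familiar' - True if the child named the target picture in the pre-test
      'correct'  - True if the child chose the matching DeeDee sequence
    Returns None when no target was familiar, since no score is defined.
    """
    # Discount every trial whose target the child could not name.
    scored = [t for t in trials if t["familiar"]]
    if not scored:
        return None
    n_correct = sum(1 for t in scored if t["correct"])
    return 100.0 * n_correct / len(scored)


# Made-up responses for one child, for illustration only.
example_trials = [
    {"target": "Harry Potter", "familiar": False, "correct": True},
    {"target": "The Lion King", "familiar": True, "correct": True},
    {"target": "hypothetical title 3", "familiar": True, "correct": False},
    {"target": "hypothetical title 4", "familiar": True, "correct": True},
]
score = deedee_percent_correct(example_trials)
print(score)
```

In this made-up example the Harry Potter trial is discounted even though the response was correct, so the score is based on the three remaining trials.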
i. Famous names. This task was based on celebrities familiar to our young teen participants, such as David Beckham (a footballer) and Ant and Dec (popular television presenters of a children’s show). A female speaker (the second author) spoke these names in “Dee Dees”, retaining natural phrase-level prosodic patterning. Famous names were chosen to be similar to the film and book titles used in the second task. There were 9 trials in total (see Appendix A). Cronbach’s alpha for this task was 0.473.
ii. Film and book titles. This task adapted the stimuli used by Whalley and Hansen (2006) and Kitzen (2001) and used 18 trials. It was originally developed in collaboration with Jenny Thomson and Martina Huss (Thomson, Huss, & Goswami, 2006). However, in contrast to Whalley and Hansen (2006) and Kitzen (2001), the stimuli for this task were not produced by an adult “speaking in Dee Dees”. Instead, four synthesized Dee tokens (stressed and unstressed in initial vs. final position) were created that incorporated no cues to phrasal-level constituents, which were then used in the appropriate sequence for each phrase. Hence The Lion King became [dee DEE dee DEE]. Successful matching in this task was expected to be more demanding, as natural prosodic cues had been eliminated and stress was the only relevant parameter. This task will henceforth be referred to as the Films task (see Appendix B). Cronbach’s alpha for this task was 0.454.
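The reliability coefficients reported for the two DeeDee tasks are Cronbach's alpha values. As a minimal sketch of how such a coefficient is computed, the function below applies the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), to a children-by-trials matrix of 0/1 scores. The data are made up, and the sketch assumes complete data, which the familiarity-based exclusions in the study would not guarantee.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-child score lists (one entry per trial).

    Assumes every child has a score on every trial and that total scores
    are not all identical (otherwise the total variance is zero).
    """
    k = len(scores[0])                      # number of trials (items)
    items = list(zip(*scores))              # transpose: one tuple per trial
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Made-up 0/1 accuracy data: 4 children x 3 trials, for illustration only.
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 3))
```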
A battery of tasks was designed in order to measure different aspects of amplitude envelope structure. As rise time is a key aspect of our theory, three rise time measures were included. Two had been administered to this sample 2 years earlier (1 Rise, Rise ABABA), and a third was novel (Rise Duration Rove, created by Martina Huss). A non-rapid measure of sensitivity to frequency was also included, as fundamental frequency is important in prosodic perception. This measure utilized an ABABA format to enhance task sensitivity, following advice from an auditory expert (Brian Moore, see Thomson & Goswami, 2008). A measure of sensitivity to intensity was included as an auditory control task, and therefore also utilized an ABABA format. None of our prior studies have found simple intensity discrimination (a 2 interval forced choice task, 2IFC) to be impaired in developmental dyslexia. Vowel duration is also an important parameter of syllable structure, hence an ABABA duration measure was included (and sensitivity to nonsense syllable duration is a significant predictor of reading, see Richardson et al., 2004). For further information and rationale for these particular auditory tasks, see Thomson and Goswami (2008). Finally, a measure of rhythmic sensitivity was included (Tempi task). Speech rhythm is linked to prosody, and in prior studies (Corriveau & Goswami, 2009) we have found that tapping to a rhythmic beat is related to language development. Hence it was deemed of interest to see whether rhythmic perception of tone sequences would be linked to prosodic sensitivity as measured by the DeeDee tasks.
The presentation format used for all the auditory tasks was 2 interval forced choice (2IFC). The tasks were administered using a dinosaur software programme originally created by Dorothy Bishop, Oxford University. Children were introduced to a pair of cartoon dinosaurs. It was explained that each dinosaur would make a sound and that the child’s task was to decide which dinosaur’s sound had a particular characteristic. The child then participated in five practice trials in which they heard sound pairs and were asked to judge which dinosaur sound showed the target feature. As an integral part of the software programme, feedback on the accuracy of performance was given after every trial. During the practice period, feedback was accompanied by further verbal explanation and reinforcement by the researcher. The dinosaur programme is adaptive, and the More Virulent PEST (Parameter Estimation by Sequential Testing) procedure (Findlay, 1978) was used to determine how much and in what direction the stimulus level should be shifted as a result of the child’s previous performance. The maximum trial number was 40. At the end of each task, the programme yielded a threshold value indicating the smallest difference at which the participant could still discriminate between the two sounds with a 75% accuracy rate.
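To make the logic of adaptive threshold estimation concrete, the sketch below tracks a listener over 40 trials. It does not implement the More Virulent PEST rules used by the dinosaur programme. Instead it swaps in a simpler weighted up-down staircase, which also targets the 75% correct point: the level moves one step harder after a correct response and three steps easier after an error, so the track is balanced where 0.75 × 1 = 0.25 × 3. The function names, the level scale (continuum step index, larger meaning an easier discrimination), the deterministic simulated listener and the averaging window are all assumptions made for illustration.

```python
def weighted_up_down(respond, start_level, min_level=1, max_level=40,
                     n_trials=40, step=1):
    """Run a weighted up-down track; return a list of (level, correct) pairs.

    respond(level) must return True for a correct 2IFC response.
    Larger levels mean a larger stimulus difference (an easier trial).
    """
    level = start_level
    track = []
    for _ in range(n_trials):
        correct = respond(level)
        track.append((level, correct))
        if correct:
            level = max(min_level, level - step)        # make it harder
        else:
            level = min(max_level, level + 3 * step)    # make it easier
    return track


def estimate_threshold(track, last_n=8):
    """Average the levels visited over the final trials of the track."""
    levels = [level for level, _ in track[-last_n:]]
    return sum(levels) / len(levels)


def ideal_listener(level, true_threshold=10):
    """Simulated listener: always correct at or above a fixed threshold."""
    return level >= true_threshold


track = weighted_up_down(ideal_listener, start_level=40)
threshold = estimate_threshold(track)
print(threshold)
```

With this idealised listener the track descends to the true threshold and then cycles through levels 9, 12, 11 and 10, giving three correct responses in every four trials.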
i. Amplitude envelope onset discrimination task (1 Rise). In the one ramp rise time discrimination task, two 800 ms tones were presented. One of the tones in each pair was always the standard tone, which had a 300 ms linear rise time envelope, 450 ms steady state, and 50 ms linear fall time. The second tone varied in rise time along a continuum, with the sharpest rise time being 15 ms. Within each trial, each pair of sounds was presented with an inter-stimulus-interval of 500 ms. This task will be described as the “1 Rise” task.
ii. Amplitude rise time discrimination task (Rise ABABA). Two tone sequences were used comprising five 210 ms tones with an inter-stimulus interval of 100 ms. One sequence in the pair consisted of tones with a constant standard rise time of 15 ms (‘AAAAA’), whilst the other sequence would consist of five tones alternating between the standard rise time and a longer rise time (‘ABABA’). The task used a continuum of 60 stimuli which increased in rise time in equal intervals from 15 to 192 ms. Using the dinosaur programme described above, it was explained to children that each dinosaur would make a series of sounds and that their task was to decide which dinosaur made the sounds that went louder and quieter (this was the most noticeable perceptual change in the ABABA format). This task will be described as the “Rise ABABA” task.
iii. Roving amplitude envelope onset discrimination task (Rise Duration Rove). This was exactly like the 1 Rise task, except that the duration of each stimulus varied randomly across the experiment. This was done by randomly roving the duration of the steady state portion of the stimulus from 450 ms to 735 ms. If an amplitude envelope is always 800 ms long with a 50 ms fall time (as in the 1 Rise task), and the rise time is either 15 ms or 300 ms, then the steady state portion of the first stimulus will be 735 ms whereas for the second it will be 450 ms. Although perceptually both stimuli sound similar in length, it is possible that in the 1 Rise task, children could discriminate between the stimuli on the basis of the difference in steady state duration rather than on the basis of the difference in rise time. This could not occur here, as the duration of the steady state roved randomly throughout the experiment, whereas rise time always provided a consistent cue to discrimination. This task was a pure measure of rise time sensitivity. This task will be described as the “Rise Duration Rove” task.
iv. Intensity discrimination task (Intensity ABABA). This task similarly employed two tone sequences. In each sequence, five 200 ms sine tones were presented with 50 ms rise time, 50 ms fall time, and inter-stimuli intervals of 100 ms. In one sequence, the tones were all of constant intensity 75 dB (‘AAAAA’) whilst in the other sequence, alternate tones had reduced intensity (‘ABABA’). The task used a continuum of 40 stimuli which decreased in intensity at constant 1.7% steps from the standard 75 dB tone. It was explained to children that each dinosaur would make a series of sounds and that their job was to decide which dinosaur made the sounds that went louder and quieter. Note that while both the intensity task and the Rise ABABA task gave a perceptual experience of varying loudness, in the rise time task this percept arose because of the different rates of amplitude change at onset, while in the intensity task this percept arose because the sounds differed in intensity while maintaining a constant rate of amplitude change. Hence, if children with developmental dyslexia can discriminate changes in loudness in the Intensity ABABA task but not in the Rise ABABA task, this would strongly implicate rise time as the source of their difficulties (as was the case in Phase 1 of this study, see Thomson & Goswami, 2008).
v. Frequency discrimination task (Frequency ABABA). As in tasks (ii) and (iv), children were presented with two sequences of sounds in a 2IFC format. In each sequence, five 25 ms sine tones were presented with 10 ms rise time, 10 ms fall time and inter-stimuli intervals of 100 ms. In one sequence, the tones were all of constant frequency (600 Hz; ‘AAAAA’) whilst in the other sequence, alternate tones had higher frequency (‘ABABA’). The task used a continuum of 60 stimuli which increased in frequency at constant 2.6 Hz intervals from the standard 600 Hz tone. The task was introduced by explaining to children that each dinosaur would make a series of sounds and that their task was to decide which dinosaur made the sounds that went up and down.
vi. Duration discrimination task (Duration ABABA). Children were again presented with two sequences of sound. This time, in each sequence, five 160 ms sine tones were presented with 50 ms rise time, 50 ms fall time, and inter-stimuli intervals of 100 ms. In one sequence, the tones were all of constant duration (160 ms; ‘AAAAA’) whilst in the other sequence, alternate tones had a longer duration (‘ABABA’). A continuum of 40 stimuli was used which increased in 1.8 ms steps from the standard 160 ms tone. Children were instructed that each dinosaur would make a series of sounds and their task was to decide which dinosaur made the sounds that went shorter and longer.
vii. Receptive Rhythm: Tempi discrimination (Tempi). Using the dinosaur programme interface, this task required children to decide which sequence of sounds with a regular tempo had the larger inter-stimulus interval (ISI) and thus sounded slower. A continuum of 30 sound sequences was created and each sequence contained nine 40 ms sine tones. The standard tone sequence had ISIs of 210 ms and successive stimuli had ISIs increasing at 1 ms intervals (range 210–240 ms).
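The stimulus parameters given in tasks (i) to (vi) can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the stimulus-generation code used in the study. It assumes that each continuum includes the standard stimulus as its first member, and all function names are hypothetical. It reconstructs the envelope timing of the 1 Rise tones and the step sizes of the linearly spaced continua.

```python
def one_rise_envelope(t_ms, rise_ms, total_ms=800.0, fall_ms=50.0):
    """Amplitude (0 to 1) at time t_ms for a tone with linear on/off ramps."""
    if t_ms < 0 or t_ms > total_ms:
        return 0.0
    if t_ms < rise_ms:
        return t_ms / rise_ms                  # linear rise
    if t_ms <= total_ms - fall_ms:
        return 1.0                             # steady state
    return (total_ms - t_ms) / fall_ms         # linear fall


def steady_state_ms(rise_ms, total_ms=800.0, fall_ms=50.0):
    """Steady-state duration left over once rise and fall are removed."""
    return total_ms - rise_ms - fall_ms


def continuum_by_endpoints(first, last, n_stimuli):
    """n_stimuli equally spaced values from first to last, inclusive."""
    step = (last - first) / (n_stimuli - 1)
    return [first + i * step for i in range(n_stimuli)]


def continuum_by_step(first, step, n_stimuli):
    """n_stimuli values starting at first and increasing by a fixed step."""
    return [first + i * step for i in range(n_stimuli)]


# 1 Rise: the 300 ms standard leaves 450 ms of steady state, and the
# sharpest 15 ms rise leaves 735 ms, matching the values in task (iii).
standard_steady = steady_state_ms(300.0)
sharpest_steady = steady_state_ms(15.0)

# Rise ABABA: 60 stimuli from 15 to 192 ms implies a 3 ms step.
rise_continuum = continuum_by_endpoints(15.0, 192.0, 60)
rise_step = rise_continuum[1] - rise_continuum[0]

# Frequency ABABA: 60 stimuli in 2.6 Hz steps from the 600 Hz standard.
frequency_continuum = continuum_by_step(600.0, 2.6, 60)

# Duration ABABA: 40 stimuli in 1.8 ms steps from the 160 ms standard.
duration_continuum = continuum_by_step(160.0, 1.8, 40)

print(standard_steady, sharpest_steady, rise_step)
```

The steady-state values show why the Rise Duration Rove task was needed: in the fixed 800 ms format, a change in rise time necessarily changes the steady-state duration as well.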
Table 2. Mean performance by group in the experimental tasks. Measures listed: Famous names % correct; Film titles % correct; Oddity rhyme (max. = 20); Oddity onset Spanish (max. = 20); Rise Duration Rove.
For the auditory processing tasks, comparisons with the chronological age controls showed significant differences between groups for two of the rise time tasks, Rise ABABA and Rise Duration Rove (t = 2.02, p = .05 and t = 2.86, p = .007), and for the Frequency ABABA task (t = 3.47, p = .001). The Rise ABABA task had also shown a group difference 2 years earlier (previous mean group thresholds 16.1 and 8.1, respectively), as had the 1 Rise task (previous mean group thresholds 18.5 vs. 10.7, respectively). Sensitivity in the 1 Rise and Rise ABABA tasks had thus clearly developed considerably for both groups. The Frequency ABABA and Duration ABABA tasks had also shown group differences 2 years earlier (mean group thresholds 27.8 and 17.7 [Frequency] and 16.3 and 10.6 [Duration], respectively). While sensitivity had increased in both tasks for both groups, the relatively small increase for the control children in the Duration ABABA task meant that sensitivity to duration now appeared to be age-appropriate for the children with dyslexia. Finally, while both the Intensity ABABA task and the Rise ABABA task gave a perceptual experience of varying loudness, the children with developmental dyslexia were only impaired in the latter task, as in Thomson and Goswami (2008). This strongly implicates rise time as a major source of dyslexic children’s auditory perceptual difficulties, as in the rise time task the varying loudness percept depended on manipulating rise time while keeping amplitude constant, while in the intensity task it depended on actual differences in amplitude, keeping rise time constant (see Thomson & Goswami, 2008, for a figure of the stimuli).
[Table: Pearson correlations between the auditory measures, DeeDee tasks, phonological awareness, and literacy. Recoverable label: Rise D Rove.]
[Table: Pearson correlations between DeeDee performance, phonology, and literacy measures.]
As our hypothesis is that difficulties in perceiving the auditory structure of amplitude envelopes affect prosodic sensitivity as well as phonological awareness, it was of interest to explore the predictive relationships between the children’s auditory processing skills and their performance in the DeeDee tasks, phonological awareness tasks and reading once age and IQ were controlled. These relationships were explored using multiple regression procedures with the age-matched children (N = 40). Firstly, predictive relations between the different auditory measures, the DeeDee tasks, rhyme awareness (the native language phonological measure) and reading ability were explored using 3-step fixed entry multiple regression equations, in which age was always entered at Step 1, and IQ at Step 2. The IQ measure was an estimate of full-scale IQ and hence controlled for verbal as well as non-verbal IQ, a relatively stringent control. Secondly, we investigated whether prosodic sensitivity made an independent contribution to literacy from phonological awareness, using 4-step fixed entry multiple regression equations, in which age was always entered at Step 1, IQ at Step 2, and either a phonological (rhyme awareness or phoneme awareness [Spanish onset task]) or prosodic (Famous Names task) measure was entered at Step 3. Either a prosodic (Famous Names task) measure or a phonological (rhyme awareness or phoneme awareness) measure was then entered at Step 4.
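The fixed-entry procedure described above can be sketched as follows, assuming ordinary least squares with an intercept and reporting the change in R² as each predictor is entered. The function names are hypothetical and the data are randomly generated placeholders, not the study data; the sketch does not reproduce the statistical package or the significance tests used in the study.

```python
import random


def r_squared(predictors, outcome):
    """R^2 of an ordinary least squares fit of outcome on predictors plus an intercept."""
    n = len(outcome)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
           + [sum(r[a] * y for r, y in zip(rows, outcome))]
           for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda idx: abs(aug[idx][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[c])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in rows]
    mean_y = sum(outcome) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


def r2_changes(steps, outcome):
    """Enter predictors in the fixed order given; return the R^2 change at each step."""
    changes, entered, previous = [], [], 0.0
    for name, column in steps:
        entered.append(column)
        current = r_squared(entered, outcome)
        changes.append((name, current - previous))
        previous = current
    return changes


# Placeholder data for 40 hypothetical children (NOT the study data).
rng = random.Random(0)
n = 40
age = [rng.gauss(144, 6) for _ in range(n)]
iq = [rng.gauss(100, 15) for _ in range(n)]
rise = [rng.gauss(0, 1) for _ in range(n)]
reading = [0.1 * a + 0.2 * q - 3.0 * t + rng.gauss(0, 3)
           for a, q, t in zip(age, iq, rise)]

# Step 1: age, Step 2: IQ, Step 3: one auditory measure.
for name, change in r2_changes([("age", age), ("iq", iq), ("rise", rise)], reading):
    print(f"{name}: R2 change = {change:.3f}")
```

The R² change at the final step is the unique variance attributed to the last-entered measure. In a four-step version, swapping the order of the last two predictors shows how much variance each explains once the other is already in the equation.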
[Table: Stepwise regressions showing the unique variance in the DeeDee tasks, rhyme awareness and reading ability accounted for by the different auditory processing measures (standardized Beta and R² change). Recoverable row labels: 2. WISC IQ; 3. 1 Rise; 3. Rise ABABA; 3. Rise Duration Rove; 3. Frequency ABABA; 3. Duration ABABA; 3. Intensity ABABA.]
With respect to phonological awareness and reading, our previous studies with younger children suggest that the three rise time measures (1 Rise, Rise ABABA and Rise Duration Rove) should predict unique variance in both outcome measures. In prior studies, duration and frequency have also sometimes been predictors of individual differences in phonological awareness, whereas intensity and tempo discrimination have not. Table 5 shows that the significant predictors of performance in the rhyme oddity task in the current study were Rise Duration Rove, Frequency and Duration ABABA, and Tempi perception. The Rise ABABA measure just missed being a significant predictor of rhyme awareness (8% of unique variance, p = .051). Hence results with these older children for phonological awareness are in line with our prior studies (the exception, the 1 Rise task, showed low thresholds for both dyslexics and controls in this study).¹ The strongest predictor of rhyme awareness was the novel Rise Duration Rove measure, a pure measure of rise time discrimination, which accounted for 19% of unique variance in rhyming ability. This measure requires accurate rise time perception for successful performance, as the duration of the steady-state portion of the stimulus does not provide a complementary acoustic cue. Hence amplitude envelope onset measures, as well as sensitivity to frequency, duration and rhythm, are predictive of the quality of children’s lexical phonological representations. Regarding reading ability, there were only two significant auditory predictors, Rise Duration Rove and Frequency ABABA. These measures accounted for 20 and 17% of unique variance in reading, respectively.
[Table: Stepwise regressions showing the unique variance in reading, spelling and nonword reading (BAS ability and TOWRE raw scores) contributed by prosodic sensitivity compared to phonological awareness (standardized Beta and R² change). Recoverable row labels: 2. WISC IQ; 3. Famous DeeDee; 4. Famous DeeDee.]
[Table: Stepwise regressions showing the unique variance in reading, spelling and nonword reading (BAS ability and TOWRE raw scores) contributed by prosodic sensitivity compared to phonological awareness after controlling for rise time sensitivity (standardized Beta and R² change). Recoverable row labels: 2. WISC IQ; 3. Rise D Rove; 4. Famous DeeDee; 5. Famous DeeDee.]
In this study, we set out to explore the prosodic sensitivity of children with developmental dyslexia as measured by the DeeDee task and potential associations between prosodic sensitivity, basic auditory processing, phonological awareness and literacy development. Two novel DeeDee tasks, modelled on Kitzen (2001) and Whalley and Hansen (2006), were designed to measure prosodic sensitivity, one of which (the Famous Names task) preserved metrical or phrase-level prosodic structure. The other task (the Films task) removed all prosodic cues except stress. For the Famous Names task, the children had to abstract the prosodic organisation (syllable, foot, phrase) and match this representation to someone speaking in “DeeDees” that preserved syllable, foot-level and some phrase-level information. The Films task used less prosodically-rich stimuli. Children with developmental dyslexia found both of the DeeDee tasks more difficult than their age-matched controls, performing at an equivalent level to younger reading-level matched children. Both tasks were performed at above-chance level by all participant groups.
Multiple regression analyses indicated that each measure of prosodic sensitivity was predicted by basic auditory processing skills, but the nature of the association found depended on how the prosodic measure had been constructed. Performance in the Famous Names task, which preserved phrase-level prosodic structure, was significantly predicted by two of the three rise time measures and was almost significantly predicted by frequency discrimination (p = .054). This suggests that rise time is an important cue to prosodic patterning, as would be expected if it is an important cue for speech rhythm and stress. Classical accounts of stress perception accord key roles to fundamental frequency, amplitude and duration (Fry, 1954). Our data suggest that the Frequency ABABA task is a useful way of assessing the contribution of frequency discrimination to performance in the DeeDee tasks. However, the Duration ABABA task does not appear to be a useful way of assessing the contribution of duration discrimination to performance in DeeDee tasks. The failure of the CA control group to show an improvement in duration thresholds over a 2-year period (comparing their performance to that reported by Thomson & Goswami, 2008) could mean that the ABABA format is unsuited as a developmental measure of duration discrimination. For example, the relatively long tone sequences involved could place such a heavy processing load on short-term memory that this obscures individual differences in auditory sensitivity. For the Films task, only the measure of sensitivity to intensity was a significant predictor of successful performance. The 1 Rise measure was almost a significant predictor (p = .051). This suggests that stress perception was the key factor for successful performance in the Films task.
We were also interested in possible links between auditory processing, prosodic sensitivity and phonological awareness. A brief review of recent research in child phonology (e.g., Pierrehumbert, 2003; Vihman & Croft, 2007) was used to establish that children’s earliest phonological representations code both phonetic and prosodic information, with phonetic representation dependent to some extent on prosodic context (see also Greenberg, 1999, 2006, for a converging view from auditory science). Firstly, the multiple regression analyses established that phonological awareness of rhyme was significantly predicted by a number of the measures of amplitude envelope structure, notably Rise Duration Rove, Frequency ABABA and Duration ABABA. This replicates findings from previous studies using other measures of rise time, frequency and duration discrimination (e.g., Corriveau et al., 2007; Goswami et al., 2002; Richardson et al., 2004; Thomson & Goswami, 2008). The novel Rise Duration Rove measure in particular accounted for 19% of unique variance in rhyme awareness, and a measure of rhythmic perception (Tempi task) accounted for a significant 9% of unique variance. These findings suggest that rhythmic awareness and rhyme awareness are linked. The Rise ABABA measure contributed 8% of unique variance to rhyme awareness, just missing significance (p = .051).
However, the correlational analyses suggested that rhyme awareness and prosodic sensitivity as measured by the DeeDee tasks were not significantly associated with each other. Instead, the DeeDee measures were significantly associated with a measure of phonological awareness of onsets of unfamiliar lexical forms (Spanish words). This association was likely due to shared auditory processing demands, as the only significant predictor of performance in the Spanish onset oddity task was Rise Duration Rove, which accounted for 11% of unique variance. Therefore, the DeeDee tasks may not be the optimal measure for assessing the relationship between phonological awareness and prosodic sensitivity in children, at least when the children are relatively old (here, 12 years). More promising may be a measure developed by Wood and Terrell (1998), which used low-pass filtered speech. In their study, natural utterances were low-pass filtered, and 8-year-old children were asked to identify what was being said. The children were asked to match the filtered sentence to one of two natural utterances, one with the same metrical structure as the target filtered sentence, and one with a different metrical structure. Wood and Terrell found that task performance was significantly associated with reading development in their sample of good and poor readers (phonological awareness was not measured). Using a different measure of metrical sensitivity (the ‘soFA’ task), Wood (2006) reported that individual differences in recognising the intended referent predicted variability in rhyme detection and spelling development in children aged 5–7 years, but not in reading development, whereas Holliman et al. (2008) also found a relationship with reading development. It would be interesting to give the filtered speech task to older children, to see whether stronger associations between phonological awareness and prosodic sensitivity would be found than was the case in the current study.
We were also interested in the associations between prosodic sensitivity, phonological awareness and literacy acquisition. Two possibilities regarding how phonological awareness and prosodic sensitivity might be predictive of literacy acquisition were considered. The first was that the two types of sensitivity would account for largely shared variance in literacy outcomes. This is because according to the theories of phonological development proposed by Pierrehumbert (2003) and by Vihman and Croft (2007), both are developmentally inter-dependent in determining the quality of the child’s phonological representations. Alternatively, it could be that as children get older the two types of measure become relatively independent and therefore will each account for independent variance in literacy outcomes. Phonological awareness and prosodic sensitivity measure different aspects of phonological representational quality, which might diverge with age and reading experience (for example, it is possible to produce perfectly accurate phonological structures without the appropriate prosodic intonation, as in the “wooden” or “robotic” diction of some children with autism, Peppe, McCann, Gibbon, O’Hare, & Rutherford, 2007). Four-step multiple regression equations exploring whether prosodic sensitivity and phonological awareness made independent or overlapping contributions to progress in reading and spelling real words showed that both measures contributed unique variance, whatever their order of entry, when rhyme awareness was the phonological measure. This was not true for phoneme awareness, which appeared to share overlapping variance with prosodic sensitivity. This provides some support for Wood’s (2006) suggestion that it is easier to become aware of phonemes in stressed syllables (i.e., in the presence of stronger prosodic cues). For real word reading and spelling, rhyme awareness and prosodic sensitivity together explained up to 40% of the variance. When nonword decoding was considered, the two measures together explained up to 39% of the variance. These are large amounts of variance, and together with age and IQ, rhyme awareness and prosodic sensitivity accounted for between 51 and 62% of the total variance of these different aspects of literacy in the current sample.
Therefore, the auditory processing difficulties that characterise children with developmental dyslexia are indeed associated with both reduced sensitivity to prosodic structure and reduced phonological awareness. Both in turn are associated with impaired progress in literacy. Our basic hypothesis (e.g., Goswami et al., 2002; Richardson et al., 2004; Corriveau et al., 2007) has been that sensitivity to amplitude modulation and the structure of the amplitude envelope is important for setting up the phonological lexicon from birth. We have shown here that reduced sensitivity to the auditory structure of the amplitude envelope also has consequences for prosodic sensitivity, as well as for phonological awareness of rhyme. In addition, the findings for the Frequency ABABA task (developed for this longitudinal study) suggest that children’s discrimination of rises and falls in frequency (rather than rapid frequency detection, see Richardson et al., 2004) is also important for phonological awareness and reading.
These auditory findings fit well with the recent theories of phonological development discussed in the introduction (Pierrehumbert, 2003; Vihman & Croft, 2007). Information about speech prosody or rhythm is carried in particular by the onsets of amplitude envelopes or rise times, amplitude depth and fundamental frequency, and it is known that awareness of the basic rhythm type of the ambient language arises early in infancy (Mehler et al., 1988; de Boysson-Bardies, Sagart, & Durand, 1984). Rhythm awareness helps infants with syllabic segmentation. For example, according to Cutler’s “rhythmic segmentation hypothesis” (e.g., Cutler, 1996), listeners adopt the unit of metrical organisation prevalent in their language as a prelexical cue to word boundaries (e.g., the foot in English, the syllable in French or Spanish, the mora in Japanese). Rise time is important for rhythmic segmentation in all of these languages (Hoequist, 1983). Mehler and colleagues have argued that infants need to identify the metrical template of the ambient language (foot, syllable or mora) before they can develop a suitable segmentation strategy, and that rhythmic cues help them to do this (Mehler, Dupoux, Nazzi, & Dehaene-Lambertz, 1996). Reduced sensitivity to rise time should therefore impair the identification of these metrical templates, affecting both prosodic and phonetic encoding from birth onwards. The data from our study are consistent with this developmental perspective. In particular, the data from the Famous Names task support the importance of the identification of metrical structure for the development of phonological awareness (see also Wood, 2006). The impaired lexical phonological representations which are a consequence of this reduced perceptual sensitivity and impaired encoding then in turn impair the acquisition of literacy.
¹ The multiple regressions for the Spanish Onset measure are not shown as there was only one significant auditory predictor, Rise Duration Rove, B = −.342, R² change = 0.11.
We would like to thank the head teacher, teachers, children and parents of the schools who participated in this study. This research was supported by funding from the Economic and Social Research Council (ESRC), grant RES-000-23-0475, awarded to Usha Goswami. Requests for reprints should be addressed to Usha Goswami, Centre for Neuroscience in Education, 184 Hills Rd, Cambridge CB2 8PQ, UK.