Intact Spectral but Abnormal Temporal Processing of Auditory Stimuli in Autism
The perceptual pattern in autism has been related to either a specific localized processing deficit or a pathway-independent, complexity-specific anomaly. We examined auditory perception in autism using an auditory disembedding task that required spectral and temporal integration. Twenty-three children with high-functioning autism and 23 matched controls participated. Participants were presented with two-syllable words embedded in various auditory backgrounds (pink noise, moving ripple, amplitude-modulated pink noise, amplitude-modulated moving ripple) to assess speech-in-noise reception thresholds. The gain in signal perception in pink noise with temporal dips relative to pink noise without temporal dips was smaller in children with autism (p = 0.008). Thus, the autism group was less able to integrate auditory information present in temporal dips in background sound, supporting the complexity-specific perceptual account.
Keywords: Autism, Connectivity, Auditory perception, Speech-in-noise
Several studies have found abnormal low-level perceptual capabilities in autism in the visual (Bertone et al. 2005; Behrmann et al. 2006) and auditory domains (Samson et al. 2006). Atypical processing of low-level (i.e. early) perceptual information is, therefore, considered to be a characteristic feature of autism (Happe 1999). It is, however, not clear which processes give rise to the atypical perceptual processing. Two opposing hypotheses on perception in the visual domain have been formulated (Bertone et al. 2005): the pathway-specific hypothesis and the complexity-specific hypothesis. The distinction between these two theories has important conceptual consequences for the inferred organisation of the autistic brain. Essentially, the pathway-specific hypothesis states that perceptual deficits in autism can be traced back to deficits in specific cortical modules, whereas the complexity-specific hypothesis states that general integrative processes that are not bound to specific cortical modules are atypical in autism.
The finding that people with autism were less sensitive to global motion than to static visual stimuli inspired the pathway-specific hypothesis, which states that the autistic brain has a deficient dorsal (visual motion information processing) stream but an intact ventral (visual static information processing) stream (Blake et al. 2003; Spencer et al. 2000; Milne et al. 2002). However, a recent study found ventral stream deficits in autism as well (Bertone et al. 2005). Bertone et al. found that people with autism showed an enhanced sensitivity to the orientation of static luminance-defined stimuli that require V1 processing only, while their sensitivity to static texture-defined stimuli that require additional V2/V3 processing was diminished. They therefore concluded that, rather than the neural pathway, the amount of neural integrative processing required for the task (relative to the hierarchical posterior–anterior organisation of the visual cortex) predicted perceptual performance in autism.
Several authors have suggested that the complexity-specific hypothesis also applies to the auditory domain (Bertone et al. 2005; Samson et al. 2006; Mottron et al. 2006). Mottron and colleagues’ Enhanced Perceptual Functioning model (EPF) (Mottron et al. 2006) states that there is an inverse relation between increasing levels of neural complexity and the level of performance in low-level perceptual tasks in autism independent of the sensory domain, thus providing an explanation for both enhanced and diminished perceptual functioning in autism. More specifically, Samson et al. hypothesised that, based on the hierarchical neural organisation of the auditory cortex, spectro-temporal complexity of auditory stimuli may explain the autistic pattern of performance in the auditory domain, such that perception of simple low-level auditory stimuli will be enhanced in autism, while perception of complex low-level auditory stimuli will be spared or impaired (Samson et al. 2006).
Given the hypothesis of the EPF model, we sought to create an auditory task that required neural integration of stimuli in the hierarchical and system-wide tonotopic organisation of the auditory system. The hierarchical organisation is reflected by the fact that spectral and temporal cues are processed separately in early (subcortical) parts of the auditory pathway, and that these are progressively integrated from the midbrain inferior colliculus to primary (A1) and secondary (non-A1) auditory cortex (A2), where the spectral and temporal response characteristics are most complex and broadly tuned. The tonotopic (spectral) organisation refers to neighbouring neural assemblies responding to similar frequencies, such that orderly maps are formed with lowest frequency on one end and highest frequency on the other. This organisation naturally implies that short-range lateral connections mediate integration or segregation of spectral information from simple sounds such as pure tones. The processing of more complex sounds, such as noise or speech, involves larger neural assemblies (Scott and Johnsrude 2003) and, moreover, segregating different simultaneously presented sound sources requires reallocation of additional supporting neural processing resources (Pichora-Fuller et al. 1995). The auditory system also shows hemispheric lateralisation: the left auditory cortex is more involved in the perception of temporal information, whereas the right auditory cortex is committed to spectral processing (Robin et al. 1990; Zatorre 1997).
According to the hierarchical neural organisation of the auditory pathway, pitch identification of pure tones is the simplest task, requires the least neuro-integrative processing and is mediated more by A1 than by A2. The EPF model predicts that A1-mediated perceptual processing will be superior in autism. Indeed, superior low-order auditory perception has been reported in experimental paradigms involving pitch perception (Bonnel et al. 2003) and chord segmentation (Heaton 2003). However, few studies assessing complex low-level perceptual tasks in the auditory domain that require extensive neural integration have yet been conducted (Teder-Salejarvi et al. 2005; Alcantara et al. 2004). Alcantara et al. (2004) pioneered research on complex low-level auditory information processing, studying speech-in-noise perception in autism. Using several types of noise that contained either spectral dips, temporal dips or a mixture of both, they found significant differences between the control and autism groups in the ability to disembed speech from noise, which were mainly attributable to the temporal dips in the noise. These results are suggestive of diminished complex low-order auditory perception in autism, but two disadvantages make inferences on the predictions of the EPF model rather difficult. First, the noise stimuli were designed to mimic naturalistic speech; the temporal dips in the noise therefore varied from seconds to milliseconds, and were consequently not well controlled. Second, the whole-sentence material impeded differentiation between language-mediated top-down influences (higher linguistic ability in the control group may lead to better sentence recognition) and bottom-up perceptual effects as predicted by the EPF model.
To assess complex low-level auditory perceptual functioning in autism, we extended Alcantara and colleagues’ speech-in-noise perception task by (a) controlling for language-mediated top-down influences and (b) using well-defined background stimuli. In random dot kinematograms, isolated processing of single dots is not sufficient to perceive global motion since local information fragments must be integrated across time and space before a global motion direction can be discriminated. Similarly, we presented participants with single words and concurrent masking background stimuli in which the background stimuli were fragmented both temporally and spectrally to vary the neural demand needed to integrate information present in spectral and temporal dips. Based on the predictions of the EPF model, we hypothesised that, with increasing neuro-integrative demand, task performance in autism would decrease, such that enhanced speech perception in a simple low-level auditory task is met by normal or decreased complex low-level perceptual performance in the auditory domain. More specifically, we hypothesised that the participants with autism would show a preserved or reduced (rather than an enhanced) ability to integrate information present in temporal and spectral dips in the background sounds.
Methods and Materials
Demographic and descriptive variables (mean ± SD); p value (t or χ²)
Age (years): 14.00 ± 1.8; 13.84 ± 1.5
Full scale IQ: 96.52 ± 16.0; 102.91 ± 10.7
Verbal IQ: 99.26 ± 17.6; 103.30 ± 9.9
Performance IQ: 94.22 ± 14.3; 102.87 ± 15.2
ADI-R social: 19.52 ± 4.8
ADI-R non-verbal: 7.35 ± 2.7
ADI-R verbal: 13.30 ± 4.0
ADI-R stereotypy: 5.17 ± 2.0
ADI-R onset: 2.09 ± 0.99
Pink noise
Pink noise (or 1/f noise) is a variant of Gaussian white noise with a power spectrum that is proportional to the reciprocal of the frequency. In this way the acoustic energy is equally divided across the logarithmically organised frequency bands of the human auditory system, making it a general and effective mask for natural sound stimuli (Fig. 1c). From a neural perspective, disembedding speech from pink noise is relatively simple as these two stimuli have very different spectro-temporal features (the spectrogram of speech is very dynamic, both temporally and spectrally, whereas that of pink noise is not).
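The equal division of energy across logarithmic frequency bands follows from integrating a 1/f power spectrum over a band: the result depends only on the ratio of the band edges. The following is a minimal sketch of that arithmetic; the band edges and the unit scaling constant are illustrative assumptions, not values from the study.

```python
import math


def pink_band_power(f_lo, f_hi, scale=1.0):
    """Power of a 1/f spectrum integrated from f_lo to f_hi (Hz).

    The integral of scale / f over [f_lo, f_hi] equals
    scale * ln(f_hi / f_lo), so only the ratio of the edges matters.
    """
    return scale * math.log(f_hi / f_lo)


# Illustrative octave bands (assumed edges): 125-250 Hz, 250-500 Hz, ...
octave_edges = [125.0 * 2 ** k for k in range(7)]
octave_powers = [
    pink_band_power(lo, hi) for lo, hi in zip(octave_edges, octave_edges[1:])
]

for lo, hi, power in zip(octave_edges, octave_edges[1:], octave_powers):
    print(f"{lo:7.0f}-{hi:7.0f} Hz: {power:.4f}")
```

Every octave yields the same value (ln 2 times the scaling constant), whereas a flat white-noise spectrum would put twice as much power in each successive octave.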
Amplitude-modulated pink noise
The pink noise was amplitude-modulated with a 10 Hz sinusoidal function to introduce temporal masking dips (i.e. one dip every tenth of a second) that may help the listener to reconstruct the original speech information (Fig. 1d). Such dip listening is neurally more complex since separate pieces of information need to be stored and temporally integrated.
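As a rough illustration of how a 10 Hz sinusoidal envelope produces regularly spaced dips, the sketch below applies such an envelope to a noise carrier. Only the 10 Hz rate comes from the text; the full modulation depth, the 1 kHz sample rate and the white (rather than pink) Gaussian carrier are simplifying assumptions.

```python
import math
import random

MOD_RATE_HZ = 10.0  # modulation rate stated in the text


def envelope(t, rate_hz=MOD_RATE_HZ, depth=1.0):
    """Sinusoidal envelope between 1 - depth and 1 (depth=1.0 is assumed)."""
    return 1.0 - depth * 0.5 * (1.0 - math.cos(2.0 * math.pi * rate_hz * t))


def modulate(samples, sample_rate_hz, rate_hz=MOD_RATE_HZ, depth=1.0):
    """Multiply each sample by the envelope value at its time point."""
    return [
        s * envelope(i / sample_rate_hz, rate_hz, depth)
        for i, s in enumerate(samples)
    ]


rng = random.Random(0)
sample_rate_hz = 1000.0  # hypothetical, chosen only to keep the example small
carrier = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # white-noise stand-in
modulated = modulate(carrier, sample_rate_hz)

# Envelope minima (the masking dips) fall halfway through each 100 ms cycle.
dip_times = [(k + 0.5) / MOD_RATE_HZ for k in range(3)]
print(dip_times)
```

With these assumptions the masker is momentarily silent once per 100 ms cycle, which is the window in which speech fragments become audible.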
Moving ripple
The spectral power in the so-called ripple stimulus (Chi et al. 1999) is not evenly distributed across all frequencies, but its profile is modulated with a sinusoidal function in the temporal as well as the spectral dimension (simultaneously; Fig. 1e). In the spectral dimension, the modulation was 2 cycles/octave (ripple density) and in the temporal dimension, 3 Hz (ripple frequency). Ripple sounds are neurally harder to distinguish from speech than pink noise (or modulated pink noise) because their complicated temporal and spectral features resemble those of speech sounds more closely. As a result, ripple stimuli sound very dynamic and distracting and are difficult to separate from speech.
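A sinusoidal modulation in both time and log-frequency can be written as a single envelope function of time and position on the octave axis. The sketch below uses the ripple density (2 cycles/octave) and ripple frequency (3 Hz) given in the text; the base frequency, modulation depth and phase are illustrative assumptions, and the exact parameterisation used for the study stimuli may differ.

```python
import math

RIPPLE_DENSITY = 2.0    # cycles/octave, from the text
RIPPLE_FREQUENCY = 3.0  # Hz, from the text


def ripple_envelope(t, freq_hz, base_freq_hz=250.0, depth=0.9, phase=0.0):
    """Relative amplitude of the component at freq_hz at time t (seconds).

    base_freq_hz, depth and phase are assumed values for illustration.
    """
    x = math.log2(freq_hz / base_freq_hz)  # position on the octave axis
    arg = 2.0 * math.pi * (RIPPLE_FREQUENCY * t + RIPPLE_DENSITY * x) + phase
    return 1.0 + depth * math.sin(arg)


# The profile repeats every 1/3 s in time and every half octave in frequency.
print(ripple_envelope(0.2, 500.0))
print(ripple_envelope(0.2 + 1.0 / RIPPLE_FREQUENCY, 500.0))
print(ripple_envelope(0.2, 500.0 * 2 ** (1.0 / RIPPLE_DENSITY)))
```

A ripple sound would then be obtained by summing many closely spaced carriers, each weighted by this envelope; that synthesis step is omitted here.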
Amplitude-modulated moving ripple
The last background consists of a moving ripple that is amplitude-modulated with a 10 Hz sinusoidal function, introducing the same temporal dips as in the modulated pink noise. Compared with the modulated pink noise, the advantage of detecting speech features via the unmasked temporal dips is now countered by the interfering and complicated spectro-temporal pattern of the ripple sound.
After each presentation of an embedded word, the subjects immediately repeated what they heard. Using an adaptive procedure, the signal-to-noise ratio (SNR) was varied by decreasing or increasing the level of the background sound, first in 4 dB steps, then in 2 dB and finally in 1 dB steps, to define the 50% correct point on the SNR psychometric function. The 50% correct point defined the speech reception threshold (SRT) and lies on the steepest part of the stimulus response curve, in which perception rapidly changes from full perception to no comprehensible signal detection. Subjects were instructed to ignore the background sounds and focus on the words. They were furthermore encouraged to guess. No feedback was provided throughout the testing trials. All subjects were tested first in the standard +25 dB SNR condition. The order of testing of the background sounds was counterbalanced to control for learning effects. Testing was completed in a single 1.5-h session.
We used SPSS for Windows (Release 14.0) for statistical analysis. The significance tests were two-tailed and were evaluated at an alpha level of 0.05. Group differences on demographic variables were examined through independent-sample t-tests, and the SRTs were analysed by means of a two-by-two-by-two mixed model MANOVA design with repeated measures. ‘Group’ (autism vs. controls) was entered as a between-subject variable, and background complexity (pink noise vs. ripple sound) and temporal dips (present vs. absent) as within-subject variables. Additional univariate tests were run to find individual differences for each dependent variable.
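An adaptive track of this kind can be sketched as a one-up, one-down staircase, which converges on the 50% correct point. Only the 4, 2 and 1 dB step sizes and the 50% target come from the text. The starting SNR, the rule of shrinking the step at each of the first two reversals, the averaging of six final reversals and the simulated listener are all assumptions made for illustration.

```python
def run_staircase(respond, start_snr_db=25.0, step_sizes_db=(4.0, 2.0, 1.0),
                  n_final_reversals=6, max_trials=200):
    """Estimate the SRT with a one-up, one-down adaptive track.

    respond(snr_db) returns True when the word is repeated correctly.
    A correct response lowers the SNR (louder background); an error raises it.
    """
    snr_db = start_snr_db
    step_index = 0
    last_direction = None
    final_reversal_snrs = []
    for _ in range(max_trials):
        direction = -1 if respond(snr_db) else 1
        if last_direction is not None and direction != last_direction:
            if step_index < len(step_sizes_db) - 1:
                step_index += 1  # assumed rule: shrink the step at a reversal
            else:
                final_reversal_snrs.append(snr_db)
                if len(final_reversal_snrs) == n_final_reversals:
                    return sum(final_reversal_snrs) / n_final_reversals
        last_direction = direction
        snr_db += direction * step_sizes_db[step_index]
    raise RuntimeError("staircase did not converge within max_trials")


def ideal_listener(threshold_db):
    """Hypothetical noise-free listener: correct whenever SNR >= threshold."""
    return lambda snr_db: snr_db >= threshold_db


estimated_srt_db = run_staircase(ideal_listener(12.3))
print(estimated_srt_db)
```

With the noise-free listener the track ends up alternating between the two 1 dB grid points that bracket the true threshold, so the estimate lands within one final step of it. A real listener responds probabilistically, which is why several reversals are averaged.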
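For a within-subject factor with only two levels, a group-by-factor interaction is equivalent to comparing the groups on each participant's difference score (here, the SRT in the continuous background minus the SRT in the amplitude-modulated background). The sketch below illustrates that idea for one background type with a pooled-variance t statistic. It is not the SPSS procedure used in the study, and all data values are hypothetical.

```python
import math


def dip_gain(srt_continuous_db, srt_modulated_db):
    """Benefit (dB) obtained from temporal dips for one participant."""
    return srt_continuous_db - srt_modulated_db


def pooled_t(sample_a, sample_b):
    """Independent-samples t statistic with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    pooled_var = (ss_a + ss_b) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))


# Hypothetical per-participant SRTs in dB: (continuous, amplitude-modulated)
group_a = [(18.0, 12.0), (19.0, 12.0), (17.0, 12.0), (18.0, 12.0)]
group_b = [(19.0, 11.0), (20.0, 11.0), (19.0, 11.0), (18.0, 11.0)]

gains_a = [dip_gain(c, m) for c, m in group_a]
gains_b = [dip_gain(c, m) for c, m in group_b]
t_stat = pooled_t(gains_a, gains_b)
print(gains_a, gains_b, round(t_stat, 3))
```

The squared t statistic on these difference scores corresponds to the F ratio of the group-by-temporal-dips interaction in a two-group, two-level mixed design.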
Speech reception thresholds in continuous and amplitude-modulated background sounds (mean ± SD)
17.96 ± 2.1
11.82 ± 2.2
20.26 ± 2.0
15.20 ± 2.5
19.03 ± 1.6
10.87 ± 1.8
19.96 ± 1.9
15.62 ± 1.9
Summary of MANOVA
Group by noise complexity
Group by temporal dips
Noise complexity by temporal dips
Noise complexity by temporal dips by group
In the current study we examined auditory processing at increasing levels of neural complexity in adolescents with autism and normal controls matched on age, IQ and gender. The complexity-specific hypothesis of the EPF model predicts that, depending on the neuro-integrative demand needed to perform a task, perception of simple low-level stimuli will be enhanced in autism, while perception of complex low-level stimuli will be spared or impaired (Mottron et al. 2006). This was indeed what we found using a complex low-level auditory discrimination paradigm, as the gain in speech perception from amplitude-modulated pink noise (relative to continuous pink noise) was significantly smaller in subjects with autism than in controls. This finding suggests that subjects with autism have a diminished ability to integrate auditory information fragments present in temporal dips, analogous to a diminished ability to integrate the movements of dots in random dot kinematograms in the visual domain (Spencer et al. 2000). However, we found no difference in the ability to integrate auditory information fragments present in spectral dips in ripple sounds. This suggests that the more complex spectro-temporal properties of the ripple sounds interfere with normal temporal grouping of the interrupted speech signal (i.e. interrupt transient auditory memory formation), such that subjects with autism and controls resort to the same processing strategy.
Several studies had already found enhanced simple low-level visual processing (O’Riordan et al. 2001; Plaisted et al. 1998) and spared (Spencer et al. 2000; Blake et al. 2003) or diminished complex low-level visual processing (Milne et al. 2002). In the auditory domain, enhanced simple low-level processing had been reported for pitch discrimination and chord disembedding (Bonnel et al. 2003; Heaton 2003), in which perceptual performance depends on spectral processing. Using a complex low-level auditory task, Alcantara and colleagues also found diminished temporal but intact spectral processing in children with autism (Alcantara et al. 2004). Järvinen-Pasley and colleagues (Jarvinen-Pasley et al. 2008a) found, using the PEPS-C task (a computerised task that tests both prosody form perception and prosody function perception), that children with autism performed more poorly on affective intonation, chunking (phrasing) and long-sound discrimination. The results for chunking (in which a short silence between words is informative for sentence meaning) and long-sound discrimination (short-sound discrimination was unaffected) in particular suggest that temporal processing was reduced in the autism group as well. In another experiment by the same group (Jarvinen-Pasley et al. 2008b), pitch contours in sentences and in musical tones were perceived with greater accuracy in the autism group, suggesting that spectral processing was enhanced in the autism group. Below we will discuss the putative neural origin of the abnormal pattern of auditory processing in autism and the implications of our results for theories on abnormal connectivity.
The visual perceptual profile in autism (enhanced or preserved/diminished perceptual performance depending on stimulus complexity) is thought to be a consequence of abnormal cortical processing since pre-cortical structures (e.g. the parvocellular and magnocellular systems) are unaffected in autism (Bertone et al. 2005; Pellicano et al. 2005). Presumably, the auditory perceptual profile is a consequence of atypical cortical processing as well. Several neurophysiological studies found cortical processing anomalies of auditory stimuli in autism (Ceponiene et al. 2003; Gervais et al. 2004; Boddaert et al. 2004). Moreover, evidence suggestive of abnormal peripheral processing at the level of the brainstem has been reported as well (Khalfa et al. 2001b). However, the fact that we found no overall group differences but a significant task manipulation effect suggests that peripheral hearing is unimpaired and that speech-in-noise perception differences are attributable to atypical central auditory processing in autism.
The cortical origin of the perceptual pattern in autism may be explained by two partly overlapping hypotheses that are both gaining support: atypical neural connectivity and increased lateral inhibition. Although these hypotheses are still in need of empirical confirmation and the current study does not directly validate or falsify them, they may be relevant for the current findings. Atypical neural connectivity refers to the underfunctioning of integrative neural circuitry resulting in a deficient integration of information at neural and cognitive levels (Just et al. 2004). Reduced functional synchrony between cortical regions has been proposed as an explanation for the perceptual performance in autism because higher levels of performance on simple low-level patterns may be a consequence of analysis by a single or a few dedicated brain regions (Bertone et al. 2005). Perception of complex patterns, on the other hand, requires communication among multiple cortical regions, which, in case of reduced functional synchrony, would operate less efficiently. In visual perception, feedforward connections (from V1 to V2–V4) and feedback connections that amplify activity of neurons in lower order areas (from V2–V4 to V1) are needed to perceive complex patterns (Angelucci et al. 2002). Likewise, in the auditory system, communication between A1 and A2, and furthermore, feedforward and feedback communication between the auditory cortex and subcortical nuclei such as the inferior colliculus are needed for the perception of spectro-temporally complex sounds and their segregation from background noise (Khalfa et al. 2001a). Abnormal specialisation of neocortical processing centres may therefore lead to the atypical perceptual pattern observed in autism.
More speculatively, the current findings may be related to the functional units from which the cortex is constructed (i.e. minicolumns). Gustafsson proposed that stronger lateral connectivity between adjacent minicolumns could predict enhanced sensory discrimination of simple stimuli and impaired global perception (referred to as increased lateral inhibition (Gustafsson 1997)). Physiological support for his hypothesis has come from neuropathological studies showing an atypical distribution of interneurons in autism (Casanova et al. 2003; Casanova 2006). Essentially, increased lateral inhibition is an extrapolation of normal physiological processes that enhance resolution and contrast of perception. In the tonotopically organised auditory system, this would translate to better spectral discrimination. The temporal response properties, however, are mediated by more complex connectional networks that involve integrative processes such as transient auditory memory. In this light, our findings of impaired temporal processing do not directly support the hypothesis of increased lateral connectivity, but rather hint at a connectivity deficiency over longer range.
Differences in perceptual functioning between spectral and temporal processing in autism (i.e. a significant difference in the pink noise condition but not in the ripple condition) may also be explained by the functional specialisation of the cortex, in which spectral segregation is predominantly carried out by the right hemisphere and temporal segregation by the left hemisphere (Robin et al. 1990; Zatorre 1997). In the pink noise condition, the spectro-temporal features of the speech signal are relatively easily separated from the background noise (see Methods section). Our finding that the subjects with autism are less able to detect speech features when the task is manipulated in the temporal domain points to deficient left-hemisphere processing. This is in line with converging evidence from structural, ERP and MEG studies that point to other left hemisphere deficits in autism as well (Bruneau et al. 2003; Lepisto et al. 2006; Flagg et al. 2005; Wilson et al. 2006; Murias et al. 2007). Compared with the pink noise condition, the ripple signal separation task was extended significantly in the spectral domain (see Methods section), which required additional right hemisphere involvement. The higher detection thresholds in this condition suggest that resolving the spectral components was the main bottleneck in performing the task, and equally so in both subject groups.
Evidence from cortical auditory evoked potentials in autism, especially mismatch negativity (MMN) evidence, suggests that children with autism have difficulty encoding auditory information into transient memory (Bomba and Pang 2004; Seri et al. 1999; Jansson-Verkasalo et al. 2003; for a review see Groen et al. 2008). MMN reflects the neural processing that is required when an incoming auditory stimulus is processed against stimulus representations that are already stored in transient auditory memory. Thus, our finding that subjects with autism were less able to integrate information spanning over subsequent dips in the background noise may be explained, in part, by a difference in available transient auditory memory.
Thus, the complexity-specific hypothesis of diminished neuro-integrative functioning in the auditory domain may primarily be restricted to impaired processing of temporal characteristics of sounds, while spectral grouping may be relatively intact in autism. It should be noted that the current results should not be generalised to all children on the autism spectrum since only high-functioning children participated. In the future, other participant groups, such as children with dyslexia, could be included to determine whether these results are specific to autism. Hopefully, from studies such as the current one, which used psycho-acoustic measures to uncover abnormalities within different stages of early neural processing, we may gather a more complete picture of the perceptual pattern in autism and obtain a deeper understanding of its neural underpinning. The challenge ahead will be to disentangle the heterogeneous clinical phenotype of autism and its higher level neural correlates from their lower level perceptual counterparts.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- Achenbach, T. M. (1991). Manual for the child behavior checklist. Burlington: University of Vermont.
- Alcantara, J. I., Weisblatt, E. J., Moore, B. C., & Bolton, P. F. (2004). Speech-in-noise perception in high-functioning individuals with autism or Asperger’s syndrome. Journal of Child Psychology and Psychiatry and Allied Disciplines, 45, 1107–1114. doi:10.1111/j.1469-7610.2004.t01-1-00303.x.
- American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders. Washington: American Psychiatric Association.
- Bruneau, N., Bonnet-Brilhault, F., Gomot, M., Adrien, J. L., & Barthelemy, C. (2003). Cortical auditory processing and communication in children with autism: Electrophysiological/behavioral relations. International Journal of Psychophysiology, 51, 17–25. doi:10.1016/S0167-8760(03)00149-1.
- Ceponiene, R., Lepisto, T., Shestakova, A., et al. (2003). Speech-sound-selective auditory impairment in children with autism: They can perceive but do not attend. Proceedings of the National Academy of Sciences of the United States of America, 100, 5567–5572. doi:10.1073/pnas.0835631100.
- Jansson-Verkasalo, E., Ceponiene, R., Kielinen, M., Suominen, K., Jantti, V., Linna, S. L., et al. (2003). Deficient auditory processing in children with Asperger Syndrome, as indexed by event-related potentials. Neuroscience Letters, 338, 197–200. doi:10.1016/S0304-3940(02)01405-2.
- Lepisto, T., Silokallio, S., Nieminen-von Wendt, T., Alku, P., Naatanen, R., & Kujala, T. (2006). Auditory perception and attention as reflected by the brain event-related potentials in children with Asperger syndrome. Clinical Neurophysiology, 117, 2161–2171. doi:10.1016/j.clinph.2006.06.709.
- Lord, C., Rutter, M., & Le Couteur, A. (1994). Autism Diagnostic Interview-Revised: a revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. Journal of Autism and Developmental Disorders, 24, 659–685. doi:10.1007/BF02172145.
- Pellicano, E., Gibson, L., Maybery, M., Durkin, K., & Badcock, D. R. (2005). Abnormal global processing along the dorsal visual pathway in autism: A possible mechanism for weak visuospatial coherence? Neuropsychologia, 43, 1044–1053. doi:10.1016/j.neuropsychologia.2004.10.003.
- Plaisted, K., O’Riordan, M., & Baron-Cohen, S. (1998). Enhanced discrimination of novel, highly similar stimuli by adults with autism during a perceptual learning task. Journal of Child Psychology and Psychiatry and Allied Disciplines, 39, 765–775. doi:10.1017/S0021963098002601.
- Wechsler, D. (2002). Wechsler Intelligence Scale for Children. Editie NL Handleiding. London: Psychological Corporation.
- Zatorre, R. J. (1997). Cerebral correlates of human auditory processing: perception of speech and musical sounds. In J. Syka (Ed.), Acoustical signal processing in the central auditory system (pp. 453–468). New York: Plenum Press.