Brain Topography, Volume 24, Issue 3, pp 271–278

Neuroimaging Evidence for Top-Down Maturation of Selective Auditory Attention

Original Paper

DOI: 10.1007/s10548-011-0182-1

Cite this article as:
Herdman, A.T. Brain Topogr (2011) 24: 271. doi:10.1007/s10548-011-0182-1


This study investigated maturational differences of selective auditory attention effects on transient evoked responses and 40-Hz auditory steady-state responses (ASSRs) between children and adults. Magnetoencephalography (MEG) was recorded from children and adults performing a task where they attended to 40-Hz amplitude-modulated (AM) tones of 1,200 Hz while ignoring 40-Hz AM tones of 800 Hz. By using standard dipole-modeling procedures, the N1m of the transient evoked fields and the 40-Hz ASSRs were localized to secondary and primary auditory cortices, respectively. Source waveforms for the transient evoked fields and ASSRs were reconstructed at these locations and compared between attended and unattended tones. Source waveforms revealed that attention enhances the sustained negativity of the transient evoked responses in both adults and children around 250 and 400 ms. ASSRs were also found to be enhanced within this time range but only for adults. The results provide evidence for a limited role of attentional modulation of the 40-Hz ASSRs in children around the age of 12 years old. Because ASSRs are generated at a lower auditory processing stage than the transient auditory evoked responses, findings from the present study could indicate that the maturation of attention progresses in a top-to-bottom manner. These findings fit with the notion that, as a person gains sensory experience, selective gating of relevant from irrelevant information likely occurs at earlier and earlier processing levels in order to become more automatic and efficient.


Keywords: Auditory evoked responses; Auditory steady-state responses; Magnetoencephalography; Development


Children’s ability to selectively attend to relevant auditory information becomes increasingly more focused and accurate as they develop into adults (Doyle 1973; Sexton and Geffen 1979; Gomes et al. 2000). Considering this behavioural development alongside competing theories of late- versus early-stage selection processes of attention (Treisman 1969; Broadbent 1958; Posner 1980; Norman 1968), a question that arises is whether maturation of selective auditory attention shifts from late to early stages of selection as children develop auditory skills (i.e., top-down maturation). Auditory experience and learning could cause lower-level centers to become more efficient and better at filtering relevant from irrelevant sensory information in order to free up higher-level cognitive resources. But a top-down maturation of attention (late- to early-stage selection) along the auditory pathway has not been shown conclusively. The aim of the present study, therefore, was to use magnetoencephalography (MEG) to determine if lower (earlier) and upper (later) levels of auditory processing in children are controlled differentially by attention as compared to those in adults. MEG can temporally and spatially separate neural activity amongst the different parts of the auditory pathway, which made it a useful method for addressing this issue.

Many previous electroencephalography and MEG studies showed larger event-related potentials (ERPs) and fields (ERFs) when a person attends to relevant stimuli as opposed to ignoring irrelevant stimuli (e.g., Hillyard et al. 1973; Schwent et al. 1976; Hansen and Hillyard 1980; Hillyard and Kutas 1983; Ross et al. 2010; Näätänen et al. 1978; Näätänen and Michie 1979; Woldorff et al. 1993). One of the most consistently recorded attention-related ERPs/ERFs in adults occurs as a greater negativity to attended (or relevant) than unattended (or irrelevant) sounds (Hansen and Hillyard 1980). Negative differences (Nd; Hansen and Hillyard 1980) or processing negativities (PN; Näätänen et al. 1978) can be calculated by subtracting ERPs to the unattended sounds from the ERPs to the attended sounds. Depending on stimulus parameters and task demands, these difference negativities can persist for several hundreds of milliseconds (Hansen and Hillyard 1980; Näätänen et al. 1978). They can also be recorded in older children (Berman and Friedman 1995; D’Angiulli et al. 2008) and sometimes take on a positive response difference depending on the child’s age and type of task performed (Stevens et al. 2009). Although the functional meaning of these difference negativities is still unclear, Hillyard et al. (1973) proposed that attentional modulation of the N1 response could reflect an early-stage tuning of the perceptual filters to a relevant stimulus attribute or an information channel that needs to be selected over irrelevant stimulus attributes (i.e., perceptual noise). Later, Näätänen (1992) proposed that the early difference negativity is a superimposed response on the N1 that reflects comparison of the incoming stimulus information with an endogenous template required to perform the task. He also suggested that the late difference negativity reflects updating of the endogenous template for more efficient processing on subsequent trials (for review see Chap. 12 in Picton 2010).

Based on earlier studies of topographic analyses of the Nd components, neuroimaging studies using MEG have provided evidence to show that a main component of the Nd can be localized to the lateral aspect of the superior temporal plane, in the vicinity of the secondary auditory cortices (Ross et al. 2010; Woldorff et al. 1993). Other source components were localized outside the auditory cortices within frontal and parietal regions (Ross et al. 2010); however, further discussion is beyond the scope of this paper. What is important is that this provides evidence that attention can play a major role in modulating sensory processing within the auditory cortices. But how early in the auditory processing stream can attention modulate sensory input?

Auditory steady-state responses (ASSRs) are evoked responses that might help to determine the auditory level at which attention can modify sensory information. ASSRs are neural oscillations driven at the rate of stimulus presentation or rate of amplitude/frequency modulation (Galambos et al. 1981; Stapells et al. 1984). The 40-Hz ASSRs are known to have neural generators within the primary auditory cortices that are separable from the transient evoked responses originating in the secondary auditory cortices (Herdman et al. 2003; Pantev et al. 1993; Pantev et al. 1996). A few studies have found that 40-Hz ASSRs can be enhanced by attention (Lazzouni et al. 2010; Ross et al. 2004; Tiitinen et al. 1993) and possibly by auditory training (Bosnyak et al. 2004). Thus, ASSRs might provide a means to identify maturational changes of attentional control within the primary auditory cortices.

The present study investigated if maturation of selective attention occurs in a top-down fashion, whereby lower-level auditory areas are modulated by attention later in maturation. Dipole source waveforms of transient evoked responses and ASSRs were modeled from MEG recorded from children and adults performing a selective auditory attention task in order to determine if secondary and/or primary auditory cortices can be modified by attention.

Materials and Methods


Thirteen right-handed adults (age 22 ± 2.3 years; 7 females) and 12 children (age 12 ± 0.7 years; 6 females) volunteered for this study. All participants were screened for normal hearing using tympanometry and pure tone audiometry. To assess hearing sensitivity, 1,000-Hz tones were presented via insert earphones to the participants’ ears while they were seated in the magnetically shielded room. The background noise level in the room was 46 dB SPL, which likely caused a slight elevation in hearing thresholds. All participants’ thresholds were below 30 dB HL, which we considered normal because of the elevated environmental noise level. All participants reported normal cognitive and neurological functions and were drug free. Informed consent was signed by all participants or by their guardians/parents. Due to insufficient trials and MEG artefacts, three adults’ and two children’s datasets were rejected from further analyses, leaving 10 adults’ (five females) and 10 children’s (five females) datasets.

Stimuli and Task

The stimuli for the task were four tones presented binaurally in random order at a random stimulus onset asynchrony between 1,000 and 1,500 ms. The four tones were coded as relevant deviants, irrelevant deviants, relevant standards, and irrelevant standards. Relevant-deviant tones had a carrier frequency of 1,200 Hz, duration of 175 ms, and 10% probability of being presented (24 trials per block). Irrelevant-deviant tones had a carrier frequency of 800 Hz, duration of 175 ms, and 10% probability of being presented (24 trials per block). Relevant-standard tones had a carrier frequency of 1,200 Hz, duration of 500 ms, and 40% probability of being presented (96 trials per block). Irrelevant-standard tones had a carrier frequency of 800 Hz, duration of 500 ms, and 40% probability of being presented (96 trials per block). All stimuli were presented at 70 dB sound pressure level (SPL) and were 100% amplitude modulated at a rate of 40 Hz. The rise and fall times of AM stimuli were half a cycle of the modulation frequency of 40 Hz (i.e., 12.5 ms). Stimulus levels and spectra were measured from the acoustic output at the end of the insert earphone tubes using a sound level meter (Quest Technologies Inc., model QC-20).
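
As a rough illustration of the stimulus description above, the following Python sketch synthesizes a 100% amplitude-modulated tone. The sinusoidal envelope shape, its starting phase, the 48-kHz audio sampling rate, and the names `am_envelope` and `am_tone` are assumptions made for illustration; the paper does not specify them.

```python
import numpy as np

AUDIO_FS = 48000  # audio sampling rate in Hz (assumption, not from the paper)

def am_envelope(n_samples, mod_hz=40.0, fs=AUDIO_FS):
    """Raised-cosine envelope for 100% amplitude modulation.

    Starts at 0 and reaches its first peak after half a modulation
    cycle (12.5 ms at 40 Hz). The sinusoidal shape is an assumption.
    """
    t = np.arange(n_samples) / fs
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * mod_hz * t))

def am_tone(carrier_hz, duration_s, mod_hz=40.0, fs=AUDIO_FS):
    """Return an AM tone scaled to the range [-1, 1]."""
    n_samples = int(round(duration_s * fs))
    t = np.arange(n_samples) / fs
    return am_envelope(n_samples, mod_hz, fs) * np.sin(2.0 * np.pi * carrier_hz * t)

relevant_standard = am_tone(1200.0, 0.500)   # attended, long tone
irrelevant_deviant = am_tone(800.0, 0.175)   # ignored, short tone

# Half a cycle of the 40-Hz modulation, in milliseconds.
rise_time_ms = 1000.0 * 0.5 / 40.0  # 12.5
```

Because 175 ms and 500 ms are both whole numbers of 25-ms modulation cycles (7 and 20), an envelope of this form also returns to near zero at the end of each tone, so the fall time is likewise half a cycle under these assumptions.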

For the task, participants were asked to attend to the relevant high-pitch (1,200 Hz) tones, ignore the irrelevant low-pitch (800 Hz) tones, and press a button with their right index finger when they heard a high-pitch (1,200 Hz) relevant-deviant (target) tone of short duration (175 ms). Participants were given a brief practice session before MEG data were collected. During MEG recordings, participants performed four blocks of the task with brief 30–60 s breaks between blocks. This amounted to 96 relevant-deviant, 96 irrelevant-deviant, 384 relevant-standard, and 384 irrelevant-standard tones presented in random order.
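
The trial totals follow directly from the per-block counts and probabilities given under Stimuli and Task. A small Python check (variable names are illustrative only) makes the arithmetic explicit:

```python
# Per-block trial counts as stated in the stimulus description.
per_block = {
    "relevant_deviant": 24,
    "irrelevant_deviant": 24,
    "relevant_standard": 96,
    "irrelevant_standard": 96,
}
n_blocks = 4

block_size = sum(per_block.values())  # trials per block
probabilities = {name: n / block_size for name, n in per_block.items()}
totals = {name: n * n_blocks for name, n in per_block.items()}
total_trials = block_size * n_blocks
```

Under these counts each block contains 240 tones, which is consistent with the stated 10% and 40% presentation probabilities.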

Data Acquisition

Magnetoencephalography was recorded using a 151-sensor whole-head biomagnetometer system (CTF system, VSM Inc.) while participants were seated in a magnetically shielded room at the Down Syndrome Research Foundation in Burnaby, Canada. MEG data were continuously collected using first-order gradiometers at a sampling rate of 600 Hz. Magnetic noise recordings of the empty room were performed prior to experimental recordings and had noise floors below 10 femtoTesla between 1 and 100 Hz, with the exception of a large peak at 60 Hz that reflected line noise. Each participant’s head shape was collected by digitizing at least 1,000 points covering the surface of the participant’s scalp using a digitizing system (Polhemus Inc., Vermont, USA). This procedure took approximately 20 min. The participant’s head shape was then matched to the best-fitting head shape created from individual MRIs in a database in order to approximate the anatomical information on which dipole locations were overlaid. In addition, button responses to relevant-deviant tones were recorded from a fibre-optic button box during the MEG scanning. Reaction times and accuracy were calculated from these button presses for each participant.

Data Analyses

Because a main objective of this study was to understand differences in adults’ and children’s neural responses between attending to relevant-standard tones and ignoring irrelevant-standard tones, only MEG responses to the relevant- and irrelevant-standard tones were considered in this paper. All analysis procedures were performed identically on adult and child datasets. Stimulus trials that had MEG responses exceeding 2 picoTesla were automatically rejected from further processing. Event-related fields (ERFs) were calculated by averaging the third-order gradiometer signals across artifact-free trials, subtracting a prestimulus baseline (averaged fields between −250 and 0 ms), and then either low-pass filtering the signals at 30 Hz for the transient ERFs or band-pass filtering the signals between 30 and 50 Hz for the ASSRs.
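
The averaging steps described above can be sketched in Python with NumPy and SciPy. This is a minimal illustration, not the processing code used in the study: the function name `compute_erf`, the absolute-amplitude rejection criterion, and the zero-phase fourth-order Butterworth filter are all assumptions, since the paper gives only the rejection threshold, baseline window, and passbands.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 600.0  # MEG sampling rate in Hz (from Data Acquisition)

def compute_erf(epochs, times, btype, cutoff_hz, reject_tesla=2e-12, order=4):
    """Average epochs into an event-related field.

    epochs: array (n_trials, n_channels, n_samples), in tesla
    times:  array (n_samples,), in seconds, 0 = stimulus onset
    btype:  "lowpass" (transient ERFs) or "bandpass" (ASSRs)
    Returns the filtered average and the number of trials kept.
    """
    # Reject trials whose absolute field exceeds the threshold on any channel
    # (assumption: the paper does not say absolute vs. peak-to-peak).
    keep = np.abs(epochs).max(axis=(1, 2)) <= reject_tesla
    erf = epochs[keep].mean(axis=0)
    # Subtract the mean of the -250 to 0 ms prestimulus interval per channel.
    baseline = (times >= -0.250) & (times < 0.0)
    erf = erf - erf[:, baseline].mean(axis=1, keepdims=True)
    # Zero-phase Butterworth filter (filter family and order are assumptions).
    sos = butter(order, cutoff_hz, btype=btype, fs=FS, output="sos")
    return sosfiltfilt(sos, erf, axis=-1), int(keep.sum())

# Example calls for the two response types:
# transient_erf, n_kept = compute_erf(epochs, times, "lowpass", 30.0)
# assr_erf, n_kept = compute_erf(epochs, times, "bandpass", [30.0, 50.0])
```

The order of operations (average, baseline-correct, then filter) follows the description in the text; filtering the average rather than single trials gives the same result for a linear filter and is cheaper.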

Dipole modeling of transient evoked responses was performed by spatiotemporally fitting two unconstrained dipoles to 12 samples around the N1m peak of the transient auditory ERFs that were averaged across relevant-standard and irrelevant-standard tones. Similarly, for ASSR dipole modeling, two unconstrained dipoles were spatiotemporally fitted to two cycles of the ASSRs calculated as an average of four 50-ms bins spanning from 300 to 500 ms. Dipole fits for the transient (N1m) responses and ASSRs resulted in less than 15% unexplained residual variance for each participant. Three-way Analysis of Variance (ANOVA) tests across group (adult vs. child), evoked component (transient vs. ASSR), and hemisphere (left vs. right) were performed on the dipoles’ x, y, and z coordinates. Differences were considered statistically significant at P < 0.05 and post-hoc t-tests were performed for significant main effects and interactions of the ANOVA results.
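
Two of the quantities mentioned above lend themselves to a short Python sketch: the two-cycle ASSR average built from four 50-ms bins, and the residual variance of a fit. The function names are hypothetical, and the residual-variance formula shown is the conventional one (unexplained sum of squares over total sum of squares); the paper does not state which definition the fitting software used.

```python
import numpy as np

FS = 600.0  # MEG sampling rate in Hz (from Data Acquisition)

def average_assr_bins(waveform, times, start_s=0.300, bin_s=0.050, n_bins=4):
    """Average consecutive bins of a band-passed ASSR.

    waveform: array (n_channels, n_samples)
    times:    array (n_samples,), in seconds
    Each 50-ms bin holds two cycles of a 40-Hz response (30 samples at 600 Hz).
    """
    n = int(round(bin_s * FS))
    i0 = int(np.argmin(np.abs(times - start_s)))
    segment = waveform[:, i0:i0 + n * n_bins]
    bins = segment.reshape(waveform.shape[0], n_bins, n)
    return bins.mean(axis=1)

def residual_variance(measured, modeled):
    """Fraction of the measured field not explained by the dipole model."""
    return float(np.sum((measured - modeled) ** 2) / np.sum(measured ** 2))
```

Averaging bins that are an exact multiple of the modulation period keeps the phase-locked 40-Hz response intact while reducing activity that is not locked to the modulation. A fit would meet the criterion reported above when `residual_variance(...)` is below 0.15.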

Transient ERFs and 40-Hz ASSRs to relevant- and irrelevant-standard tones were signal-space projected onto the two fitted N1m dipoles and the two fitted ASSR dipoles, respectively, for each participant (Hämäläinen et al. 1993). Significance levels of source waveform differences between relevant- and irrelevant-standard tones for the transient evoked responses and for the Hilbert-transformed amplitude envelope of the ASSRs were calculated using a Student’s t-test for each group separately. Significant interactions between group (adult vs. child) and stimulus type (relevant vs. irrelevant) were also calculated for the transient source waveforms and ASSR source amplitude envelopes. Intervals with significant differences and interactions are reported at P < 0.05 and P < 0.01. Because conclusions from this study were partially based on a null finding of nonsignificant attentional effects on children’s ASSRs, a statistical power analysis was performed to estimate Cohen’s d and δ for matched samples (Howell 1997, pp. 216–220). The mean of the adults’ ASSR differences between relevant- and irrelevant-standard stimuli averaged across 250–500 ms was used as the expected sampled population difference and the mean of the standard deviation of children’s ASSR differences across 250–500 ms was used as the standard deviation of the sampled population. Thus, Cohen’s d and δ provide the basis for estimating the probability of not finding ASSR differences between relevant- and irrelevant-standard tones of the same size seen in adults given the variance in the children’s ASSR differences. By reversing the power calculations, possible effect sizes for children’s ASSRs that might be found were also estimated for powers of 0.3 (small), 0.5 (medium), and 0.8 (large) at P < 0.05.
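
The envelope extraction and point-by-point comparison described above might look as follows in Python with SciPy. This is a sketch under stated assumptions: the function names are hypothetical, the t-test is taken to be a paired (within-participant) test because the two conditions come from the same participants, and no correction for multiple comparisons is applied because none is mentioned in the text.

```python
import numpy as np
from scipy.signal import hilbert
from scipy.stats import ttest_rel

def amplitude_envelope(source_waveforms):
    """Hilbert amplitude envelope of band-passed ASSR source waveforms.

    source_waveforms: array (n_participants, n_samples)
    """
    return np.abs(hilbert(source_waveforms, axis=-1))

def attention_effect(relevant, irrelevant, alpha=0.05):
    """Paired t-test at every time sample within one group.

    relevant, irrelevant: arrays (n_participants, n_samples)
    Returns t values, P values, and a boolean mask of significant samples.
    """
    t_values, p_values = ttest_rel(relevant, irrelevant, axis=0)
    return t_values, p_values, p_values < alpha
```

Runs of consecutive `True` values in the returned mask would correspond to the significant intervals marked by filled boxes in the figures.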


Results

Behavioural Data

Child and adult participants accurately identified on average 76 ± 4% and 94 ± 1% (mean ± standard error of the mean) of the relevant-deviant (target) stimuli, respectively. This demonstrated that children and adults were paying attention during the task. Adults were significantly more accurate by 18% (t = 4.145; df = 1.18; P = 0.001). Button responses to relevant-deviant stimuli were 56 ms faster for adults (638 ± 14 ms) than for children (694 ± 13 ms; t = −2.991; df = 1.18; P = 0.008).

Source Locations

Dipoles for the transient (N1m) responses and 40-Hz ASSRs were located within the lateral and medial aspects of Heschl’s gyri, respectively (Fig. 1). The 40-Hz ASSR dipoles (circles; Fig. 1) were significantly more medial than the N1m dipoles (triangles) in the left hemisphere only, as indicated by post-hoc testing of the significant ANOVA interaction of evoked component by hemisphere (F = 11.364; df = 1.19; P = 0.037). There was no evidence of significant differences in dipole locations between adults and children for any other main effects or interactions of evoked components and hemisphere (P > 0.05).

Fig. 1

Grand-averaged dipole locations for the adult N1m (black triangles), child N1m (white triangles), adult 40-Hz ASSR (black circles), and child 40-Hz ASSR (white circles). Note the significant medial-to-lateral separation between N1m and ASSR dipoles for children and adults

Source Waveforms

Transient ERF source waveforms from bilateral secondary auditory cortices exhibited typical response morphology of a P1–N1–P2 complex followed by a sustained response for adults and children (Fig. 2). The P1–N1–P2 complex showed no appreciable differences between relevant- and irrelevant-standard tones; however, sustained responses were significantly larger to relevant- than irrelevant-standard tones around 250 and 400 ms for both adults and children. Children’s transient responses had a large negative peak (N2) following P2 that was not apparent in adults. This secondary negativity in children was significantly larger than the sustained responses in adults at around 250 ms (P < 0.05). There were no significant response interactions of groups (adults vs. child) by stimulus type (relevant vs. irrelevant) for the transient evoked responses.

Fig. 2

Transient evoked responses from sources located within left and right auditory cortices of adults (top) and children (bottom). Evoked responses to irrelevant-standard tones (green lines) were subtracted from relevant-standard tones (blue lines) to yield difference responses (red lines). Filled boxes plotted underneath the waveforms depict significant response differences between stimulus conditions (relevant vs. irrelevant; upper boxes) and significant response interactions for group (adult vs. child) by stimulus condition (relevant vs. irrelevant; lower boxes). Grey and black boxes denote intervals with significant differences at P-values <0.05 and <0.01, respectively (Color figure online)

Clearly discernible transient gamma-band responses (tGBRs) followed by 40-Hz ASSRs were found bilaterally in the primary auditory cortices of adults and children (Figs. 3, 4). Adults had larger tGBRs and ASSRs than children between 250 and 500 ms (P < 0.05). Consistent with the current study’s main hypothesis of top-down maturational growth of attention was the finding that amplitude envelopes of ASSRs were significantly larger to relevant- than irrelevant-standard tones in adults between 200 and 500 ms but not in children (upper filled boxes in each plot of Fig. 3). In accordance, there were significant ASSR interactions of group (adults vs. child) by stimulus type (relevant vs. irrelevant) within these same intervals (lower filled boxes in each plot of Fig. 3). Because there was no statistical evidence to support attention effects on children’s ASSR, a statistical power analysis was performed and revealed a Cohen’s d of 1.07 and a δ value of 3.10. This δ value translated into large powers of 0.87 and 0.7 for α levels of 0.05 and 0.01, respectively. Moreover, effect sizes for children’s ASSR differences having statistical powers of 0.3, 0.5, and 0.8 at P < 0.05 were estimated to be 0.309, 0.419, and 0.617 nAmp. These were 45, 61, and 90% of the adults’ ASSR differences that were used for the original estimate of the effect size. These results indicate that there was large statistical power to identify attentional effects on children’s ASSRs, even though such effects were not found in the present study. In summary, the results showed that attention enhanced transient-evoked responses and ASSR in adults but only transient-evoked responses in children.
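
The step from the δ value to the reported powers can be checked with a normal approximation, in which power is the probability that a standard normal variable exceeds the two-tailed critical value minus δ. The Python sketch below uses only the standard library; the function name is hypothetical, and the approximation is offered as a plausible reading of the calculation rather than a confirmed description of the procedure used.

```python
from statistics import NormalDist

def power_from_delta(delta, alpha):
    """Approximate two-tailed power for a noncentrality parameter delta.

    Uses the normal approximation and ignores the negligible
    contribution of the opposite tail.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(delta - z_crit)

power_05 = power_from_delta(3.10, 0.05)  # about 0.87
power_01 = power_from_delta(3.10, 0.01)  # about 0.70
```

Both values agree with the powers reported above to two decimal places.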

Fig. 3

Grand-averaged amplitude envelopes of tGBRs and 40-Hz ASSRs from sources located within left and right auditory cortices of adults (top) and children (bottom). Filled boxes plotted underneath the waveforms depict significant response differences between stimulus conditions (relevant vs. irrelevant; upper boxes) and significant response interactions for group (adult vs. child) by stimulus condition (relevant vs. irrelevant; lower boxes). Grey and black boxes denote intervals with significant differences at P-values <0.05 and <0.01, respectively

Fig. 4

Grand-averaged tGBRs and 40-Hz ASSRs from sources located within left and right auditory cortices of adults (top) and children (bottom)

Discussion


Sources for ASSRs and transient responses were found to be within the medial and lateral aspects of Heschl’s gyri, respectively. This is consistent with previous reports (e.g., Herdman et al. 2003; Pantev et al. 1996). Given such spatial dissociation for these evoked components between the primary and secondary auditory cortices, the effects of attention on these components could also be spatially separated. When considering the following discussion, however, it should be kept in mind that the transient evoked responses and 40-Hz ASSR reflect only part of the cortical processing that occurs within the auditory core and belt areas. Attention might affect other cortical processing in a different manner. Notwithstanding, results from the present study provide evidence for maturational differences in attentional modulation of auditory evoked responses.

The transient auditory evoked response results from both adults and children showed that directing attention to relevant-standard tones and away from irrelevant-standard tones enhanced the sustained responses between 200 and 500 ms. This was to be expected from the highly reliable finding that response negativities are greater to attended than unattended sounds (e.g., Hillyard et al. 1973; Näätänen and Michie 1979; Berman and Friedman 1995; D’Angiulli et al. 2008). These results demonstrated that the experimental procedures were sufficient to elicit attentional effects in both children and adults. The current results cannot rule out contributions from other extrasensory regions (Ross et al. 2010) because a whole-brain source analysis was not performed (e.g., event-related beamforming; Cheyne et al. 2006). However, the present study provides further evidence that attention-related difference negativities exist within the lateral aspect of Heschl’s gyrus (i.e., secondary auditory cortices) in adults (Ross et al. 2010; Woldorff et al. 1993) and children (present study). The transient responses showed different morphologies in that an N2 response was more prominent in children than adults. This could reflect maturational/developmental differences in response morphologies that are known to exist for auditory stimuli (for reviews see Picton and Taylor 2007; Wunderlich and Cone-Wesson 2006). However, the development of N2 under attentional demands is as yet unknown and requires further investigation.

The current study also provided evidence that attention could modulate the 40-Hz ASSRs in adults. This is consistent with previous findings of significantly larger 40-Hz ASSRs to attended than unattended sounds (Lazzouni et al. 2010; Ross et al. 2004; Tiitinen et al. 1993). Because 40-Hz ASSRs were shown to be generated from the medial aspect of Heschl’s gyri (i.e., primary auditory cortices; Herdman et al. 2003; Pantev et al. 1996; current study), attention appears to play a role in modifying cortical responses at this level in adults. The novel finding of the present study was that children’s 40-Hz ASSRs were clearly present (Fig. 4) but not modulated by attention as was the case in adults (Fig. 3). Moreover, children’s transient and sustained responses showed attentional effects and thus revealed that attention can modulate evoked responses in children, at least at the levels beyond the primary auditory cortex. This could indicate that the selective auditory attention network involving the primary auditory cortex is not yet fully matured by the age of 12 years old.

These findings might indicate that the maturation of attention proceeds in a top-down direction, whereby discrimination between relevant and irrelevant stimulus information is handed down to lower and lower processing stages in order to increase processing efficiency. This would also free up resources at higher processing stages (i.e., cognitive areas) in order to perform more complex functions, such as response selection and monitoring. A top-down maturation of attention also fits with previous findings that children with low socioeconomic status (SES) had significantly reduced attentional modulation of ERPs as compared to high-SES children (D’Angiulli et al. 2008; Stevens et al. 2009). D’Angiulli et al. (2008) also found that low-SES children had significant frontal theta response differences between attended and unattended sounds as compared to trivial theta response differences in high-SES children. One interpretation of these findings was that low-SES children recruited higher-level executive resources (i.e., frontal theta) to filter relevant from irrelevant auditory information, whereas high-SES children recruited lower-level auditory cortical resources to filter such information. Although much research is still needed in this area, these findings could indicate low-SES children have more immature selective auditory attention networks compared to their high-SES peers. Low-SES children might receive less experience in selectively attending to auditory information, such as in music lessons or parent–child reading sessions, which could delay maturation of the selective auditory attention network. However, further research is needed to determine this. Accordingly, having an attentional filter earlier in the sensory processing stream would allow the system to be more efficient and free up executive resources to perform other demanding functions. Thus, the lower in the system that attention can control information processing, the more automatic and efficient it likely can become at processing the wealth of information impinging on the sensory apparatus.

In summary, the current study provided evidence of attentional modulation of both children’s and adults’ transient sustained responses but only of adults’ 40-Hz ASSRs. These results indicate that selective auditory attention has yet to fully develop for processes underlying the 40-Hz ASSRs generated within the primary auditory cortices in children around the age of 12 years old. Moreover, these findings support a top-down mechanism for the maturation of attention, whereby development shifts the selection of auditory information (relevant versus irrelevant) from higher to lower processing levels along the auditory pathway in order to free up executive resources needed to perform other complex, demanding functions (e.g., response selection and monitoring).


Grants from the Natural Sciences and Engineering Research Council of Canada, Human Early Learning Partnership, and Michael Smith Foundation for Health Research supported this project.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. School of Audiology and Speech Sciences, Faculty of Medicine, University of British Columbia, Vancouver, Canada
