Background

Autism is defined by a cluster of symptoms in the domains of social communication and restricted or repetitive behaviors or interests (American Psychiatric Association, 2013). Hypo- as well as hyper-sensitivity to sensory information is a common feature of autism (American Psychiatric Association, 2013), with sensitivity to auditory stimuli commonly reported in the autism literature (Dunlop et al., 2016; Matsuzaki et al., 2012; Matsuzaki et al., 2014; for review see O'Connor, 2012; Williams et al., 2021b). Although auditory brain processes can be assessed using functional magnetic resonance imaging (for review see Talavage et al., 2014), electrophysiology methods such as electroencephalography (EEG) and magnetoencephalography (MEG) are preferred as they provide measures of brain activity in real time (Hari & Puce, 2017), a necessity given that changes in brain activity in response to auditory information occur on the time-scale of milliseconds (Poeppel, 2003; Poeppel et al., 2012). As auditory processing is crucial for language development (e.g., Benasich et al., 2014), understanding the neural maturation of the auditory system in both typically developing (TD) children and children with autism is of clinical importance.

Time-domain EEG recorded P50 and N100 evoked responses (and MEG analogs the M50 and M100) are commonly examined auditory neural measures. These evoked components are associated with early sound detection and feature encoding (Hari & Puce, 2017; Lee et al., 1984; Liégeois-Chauvel et al., 1994; Näätänen & Picton, 1987). The P/M50 is the dominant auditory evoked response in young children (for MEG review see Green et al., 2020), with the infant analog of the P/M50 observed even in neonates and in utero (see review by Chen et al., 2019). The P/M50 is generated in the left and right superior temporal gyrus (Huotilainen et al., 1998; Mäkelä et al., 1994; Pelizzone et al., 1987; Reite et al., 1988; Yoshiura et al., 1995; Yvert et al., 2001). P/M50 latency is between 85 and 95 ms in children 5 to 6 years old (Ceponiene et al., 2002; Wunderlich & Cone-Wesson, 2006), and decreases as a function of age until reaching adult-like latencies by late adolescence (Edgar et al., 2020; Paetau et al., 1995; Sharma et al., 2002).

By late adolescence the N100 (EEG) or M100 (MEG) response is the dominant response. In adults, the N/M100 occurs between 75 and 130 ms post-stimulus and is generated in left and right Heschl’s gyrus and the planum temporale (Hari, 1990; Lütkenhöner & Steinsträter, 1998). The N/M100 is often not observed in young children (Bruneau et al., 1997; Ceponiene et al., 1998; Edgar et al., 2014, 2020; Paetau et al., 1995; Ponton et al., 1996a, 1996b), and recent research shows large between-subject variability in the presence/absence of the M100 in children 6 to 12 years old (Edgar et al., 2020). By early adolescence, however, the N/M100 is consistently observed (Bruneau et al., 1997; Ponton et al., 2000, 2002; Satterfield et al., 1984), generally emerging earlier in the right than the left hemisphere given hemisphere differences in maturation rates (Edgar et al., 2014; Huotilainen et al., 1998; Kotecha et al., 2009; Parviainen et al., 2019; Port et al., 2016; Roberts et al., 2000).

Studies also examine auditory encoding neural processes by assessing activity within frequency bands as a function of time, often referred to as time–frequency measures. Time–frequency measures such as phase locking (PL) and total power (TP) reflect local neural circuit activity (Başar et al., 2001; Lachaux et al., 1999; Makeig, 1993; Pfurtscheller & Klimesch, 1991). PL is a normalized measure of trial-to-trial similarity in neural activity within a specific frequency band, measuring intra-individual variability. TP is the pre- to post-stimulus change in brain activity within a specific frequency band. To date, only a few pediatric studies have reported on associations between age and auditory time–frequency measures for pure tone stimuli (Edgar et al., 2015a; Edgar et al., 2015c; Port et al., 2019) or 40 Hz auditory steady-state response stimuli (Edgar et al., 2016).

An ever-growing pediatric literature identifies differences in auditory encoding measures (time-domain and time–frequency) between TD and autism (De Stefano et al., 2019; Edgar et al., 2015a, 2015c; Port et al., 2019; Roberts et al., 2010, 2013, 2019; Seymour et al., 2020; Stephen et al., 2017; Williams et al., 2021a). In a recent meta-analysis of EEG and MEG studies examining auditory evoked responses, Williams et al. (2021a) found evidence for earlier P/M50 and M100 latencies in TD versus autism. Studies of school-age children have also reported time–frequency differences between children with TD and children with autism, such as greater PL in TD than autism (Edgar et al., 2015a; Port et al., 2019). Auditory pre-stimulus activity has also been examined, with stronger pre-stimulus activity in autism than TD during an auditory encoding task interpreted as indicating increased baseline noise in autism (Edgar et al., 2015c).

Nearly all studies examining auditory encoding in autism are cross-sectional, with only a few small sample longitudinal studies (Port et al., 2016; Yoshimura et al., 2018). As detailed in position papers from our group (Edgar, 2020; Edgar et al., 2019, 2020), comparing groups can be a concern in pediatric brain imaging studies given potential group differences in brain maturation. In particular, given group differences in brain maturation rates, pediatric studies that average findings across a large enough age range will miss effects, and cross-sectional comparisons will be complicated. As examples of differential brain maturation in TD and autism, studies examining resting-state neural activity have found that the peak alpha frequency is associated with age in children with TD but not children with autism (Dickinson et al., 2018; Edgar et al., 2015b, 2019). With respect to brain structure, Lee et al. (2021) found that given TD and autism group differences in gray-matter maturation trajectories, the pattern of gray-matter group differences changed across time.

The present study examined TD and autism group differences in primary/secondary auditory cortex neural activity in children 6 to 9 years old at their first visit, which is an age range younger than the majority of previous studies in this area (for meta-analysis see Williams et al., 2021a). Auditory encoding measures were obtained at Time 1 and then ~18 months (Time 2) and ~36 months (Time 3) later. The longitudinal design allows for an examination of group differences in maturation rates. Given the possibility that TD and autism group differences in brain function are region-specific (e.g., auditory encoding group differences only in the right auditory cortex; Edgar et al., 2015c; Roberts et al., 2010), brain function was assessed in brain space. Study findings were expected to identify group difference measures specific to age, brain location, and brain process, thus helping to understand brain development in autism as well as inform the design of future clinical and treatment studies. Based on previous findings, the following group differences were hypothesized: (1) delayed M50 and M100 latencies in autism versus TD (Edgar et al., 2015a, 2015c; Roberts et al., 2013, 2019; Stephen et al., 2017; Williams et al., 2021a), (2) PL group differences in autism versus TD (De Stefano et al., 2019; Edgar et al., 2015a, 2015c; Rojas et al., 2008; Seymour et al., 2020), and (3) higher pre-stimulus activity in autism versus TD (Edgar et al., 2015c).

Methods

Participants

TD children and children with autism 6 to 9 years old were recruited (Time 1). Although the original study was designed so that the Time 1, Time 2, and Time 3 visits would occur ~ 18 months apart, due to a pause in data collection caused by COVID-19, some of the Time 2 and Time 3 participants came for their visits outside of the 18-month interval. TD and autism children were selected according to the following criteria: a) native English speakers, b) no history of traumatic brain injury or other neurological condition, c) no genetic syndromes with extremely high incidence of autism (e.g., Fragile X), d) no intellectual disability, e) no premature birth, f) no reported hearing loss (based on parent report, physician concerns, or hearing aid use), g) no structural MRI (sMRI) contraindications, and h) no neuroleptic/antipsychotic medication, guanfacine (for attentional deficit/hyperactivity disorder or ADHD), or selective serotonin reuptake inhibitor treatment at Time 1. Participants taking stimulant medications were required to withhold medication for 24 to 36 h prior to their visits.

At Time 1, recruited participants with autism had a prior diagnosis, made according to Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5; American Psychiatric Association, 2013) criteria by an expert clinician at the Children’s Hospital of Philadelphia (CHOP) or by autism specialists in the community. A targeted diagnostic study battery administered at Time 1 confirmed the original diagnosis with the Autism Diagnostic Observation Schedule-2 (ADOS-2; Lord et al., 2012) and parent report on the Social Communication Questionnaire (SCQ; Rutter et al., 2003). Dimensional indices of autistic behaviors were obtained by parent report on the Social Responsiveness Scale-2 (SRS-2; Constantino & Gruber, 2012).

Members of the TD group were evaluated by licensed psychologists who ruled out the presence of DSM-5 diagnoses based on clinical judgment, review of the child's medical history form, parent screening interview, results of parent ratings on standardized behavior questionnaires (Child Behavior Checklist, Behavior Rating Inventory of Executive Functioning), and cognitive testing results completed by the child during the Time 1 visit. TD-specific inclusion criteria included scoring below the cut-off for autism concern on the SCQ and SRS-2 parent questionnaires. Additional TD-inclusion criteria included no first-degree relatives with autism and no history of speech/language disorder, learning disability, ADHD, or psychiatric disorders.

To rule out intellectual disability for both groups, an estimated nonverbal intelligence quotient (eNVIQ; obtained from the Fluid Reasoning Index) or a verbal intelligence quotient (VIQ; obtained from the Verbal Comprehension Index) ≥ 70 on the Wechsler Intelligence Scale for Children—Fifth Edition (WISC-V; Wechsler, 2014) was required. See Table 1 for age, SRS-2, and eNVIQ scores for the Time 1 sample.

Table 1 Sample Demographics

Sample at Each Time Point

Group demographics and evaluable samples by group and time are listed in Table 1. At recruitment, 8 children with autism (~12%) were prescribed stimulant medications, with all children withholding stimulants for at least 24 h, but no more than 36 h, prior to testing throughout the study. Time 2 data were obtained, on average, 19 months (SD = 2) after the Time 1 exam. Time 3 data were obtained, on average, 21 months (SD = 3) after the Time 2 exam.

Reasons for unevaluable Time 1 imaging data included excessive motion, metal dental work, or the child being unable to tolerate the full auditory task. Reasons for attrition following Time 1 were as follows: missed Time 2 visit due to COVID-19 lab closure (N = 12), missed Time 3 visit due to COVID-19 lab closure (N = 3), a new exclusion diagnosis at Time 2 or Time 3 (N = 13), placement of braces between time points (N = 3), or the family no longer wishing to participate, primarily due to COVID-19 concerns (N = 28). New exclusion diagnoses at Time 2 or Time 3 included depression (N = 2), anxiety (N = 3), attention deficit disorder (N = 2), intermittent explosive disorder (N = 1), or learning disability (N = 2) in TD children, and seizure disorder in children with autism (N = 3). Diagnoses were made in the community and voluntarily disclosed to the study team during the phone screening prior to scheduling follow-up study visits.

Auditory Stimuli

Immediately prior to the MEG exam, tones were presented binaurally and incrementally until reaching auditory threshold for each ear to obtain individual thresholds (i.e., stepwise approach). During the thresholding process, study staff ensured that each child demonstrated an auditory evoked response. During the auditory encoding exam, tones of 500 Hz frequency and 300 ms duration were binaurally presented at 45 dB sensation level above threshold via ear-tip inserts (ER3A, Etymotic Research, IL, USA). The inter-trial interval varied between 600 and 2000 ms, with 520 trials collected over 14 minutes. Children were in the supine position during data collection and were instructed to ignore the sounds while watching a silent movie (e.g., Pixar Shorts).

MEG and sMRI Data Acquisition

MEG data were obtained in a magnetically shielded room, using a 275-channel MEG system (CTF MEG, Coquitlam, BC). Electro-oculogram (EOG) and electrocardiogram (ECG) were also obtained. After applying a band-pass filter (0.03 to 150 Hz), EOG, ECG, and MEG signals were digitized at 1,200 Hz with 3rd-order gradiometer environmental noise reduction. Structural magnetic resonance imaging provided T1-weighted, 3-D MP-RAGE anatomical images for source localization (3 T Siemens Prisma scanner).

Source Localization

MEG data were down-sampled to 500 Hz. Artifact correction was applied to remove eyeblink activity using BESA 6.1 (MEGIS Software GmbH, Gräfelfing, Germany). Non-eyeblink artifact trials were rejected by amplitude and gradient criteria (amplitude > 1200 fT/cm, gradients > 800 fT/cm/sample). Artifact-free trials (from -500 ms to +500 ms) were then averaged. As shown in Table 2, autistic children had fewer artifact-free trials than TD children at all time points (Time 1 = 1.5% fewer trials, Time 2 = 1.5% fewer trials, Time 3 = 4% fewer trials). t-tests revealed significant group differences in the number of artifact-free trials only at Time 3 (p < 0.001).
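The rejection step described above can be illustrated with a minimal sketch. This is not the BESA 6.1 implementation: it assumes that "amplitude" means the peak-to-peak range within an epoch and that "gradient" means the largest sample-to-sample change, and the function names and data layout (a list of channels, each a list of samples in fT/cm) are hypothetical.

```python
# Illustrative sketch only; thresholds come from the text, everything else
# (function names, data layout, definition of "amplitude") is an assumption.

AMP_LIMIT = 1200.0   # fT/cm, assumed to apply to the peak-to-peak range
GRAD_LIMIT = 800.0   # fT/cm per sample, largest sample-to-sample change


def is_artifact(epoch, amp_limit=AMP_LIMIT, grad_limit=GRAD_LIMIT):
    """Return True if any channel in the epoch violates either criterion."""
    for channel in epoch:
        if max(channel) - min(channel) > amp_limit:
            return True
        steps = (abs(b - a) for a, b in zip(channel, channel[1:]))
        if any(step > grad_limit for step in steps):
            return True
    return False


def average_clean_epochs(epochs):
    """Average the artifact-free epochs; return (evoked, number of trials kept)."""
    clean = [e for e in epochs if not is_artifact(e)]
    if not clean:
        return [], 0
    n_channels, n_samples = len(clean[0]), len(clean[0][0])
    evoked = [
        [sum(e[ch][s] for e in clean) / len(clean) for s in range(n_samples)]
        for ch in range(n_channels)
    ]
    return evoked, len(clean)
```

Counting the trials kept per child in this way gives the kind of per-group trial totals summarized in Table 2.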

Table 2 Number of trials by group and time point

Source localization for each subject was performed using their grand-average evoked response. For each session, MEG and sMRI data were co-registered using BESA 6.1. In cases where sMRI data were not evaluable due to motion or could not be collected due to scheduling conflicts (Time 1: TD = 3, autism = 4; Time 2: TD = 0, autism = 3; Time 3: TD = 0, autism = 5), the child’s MEG data were co-registered to an age-matched template (Richards et al., 2016).

Left and right M50 and M100 sources were examined. As the primary generator of the M50 and M100 is well modeled by a single dipole in left and right superior temporal gyrus (STG) and surrounding regions (see Edgar et al., 2003, 2014), source localization was performed using an anatomical constraint. In particular, each child’s left and right Heschl’s gyrus were visually identified and a regional dipole source was manually placed at the ‘center’ of each Heschl’s gyrus.

Dipole orientations were obtained in BESA 6.1 after applying a 2 Hz (24 dB/octave, zero-phase) to 55 Hz (48 dB/octave, zero-phase) band-pass filter to the sensor evoked data used in source localization. The orientation of the principal axis of each left- and right-hemisphere regional dipole source was rotated to the maximum of the left and right M50 and then left and right M100 response for each child to optimize the orientation of the standard regional sources.

Identification of M50 and M100

M50 and M100 identification was performed blind to age and group. M50 and M100 scoring for each subject was carried out separately by two authors. In cases of disagreement, the two raters met and discussed the waveform until consensus was reached. Presence of an M50 and M100 response was determined based on amplitude, latency, and hemisphere ingoing and outgoing flux topography. M50 was operationally defined as the first reversal in magnetic-field topography preceding M100 (or the first within a 35 to 125 ms range if M100 was not present). M100 was scored if the magnetic flux topography was characteristic of the M100 response (i.e., for M100 left hemisphere ingoing anterior, outgoing posterior), was preceded by M50 (i.e., flux topography opposite M100), was followed by M200 (i.e., flux topography same as M100), and had source strength greater than baseline. When an M50 or M100 response was observed, left and right M50 (35 to 125 ms) and M100 (80 to 195 ms) scoring windows were used to identify the signal maximum in each window, and the peak latency was recorded. For an extended discussion on scoring M50 and M100, see Edgar et al. (2020).
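The windowed peak-picking step can be sketched as follows. The sketch covers only the final step (finding the latency of the maximum within a scoring window) and not the topography-based judgment of whether a response is present, which in the study was made by two raters. The function name, the `polarity` argument (used here to express that M50 and M100 have opposite source orientation), and the synthetic waveform are assumptions for illustration.

```python
import math

# Scoring windows (ms) taken from the text.
M50_WINDOW = (35, 125)
M100_WINDOW = (80, 195)


def peak_latency(times_ms, strength, window, polarity=1):
    """Latency (ms) of the largest deflection of the given polarity in the window.

    Returns None if no samples fall inside the window.
    """
    lo, hi = window
    in_window = [
        (polarity * s, t) for t, s in zip(times_ms, strength) if lo <= t <= hi
    ]
    if not in_window:
        return None
    return max(in_window, key=lambda pair: pair[0])[1]


# Synthetic source waveform sampled every 2 ms (500 Hz): a positive deflection
# near 70 ms followed by a larger deflection of opposite sign near 120 ms.
times = list(range(-100, 302, 2))
wave = [
    math.exp(-((t - 70) / 15) ** 2) - 1.5 * math.exp(-((t - 120) / 20) ** 2)
    for t in times
]
m50_latency = peak_latency(times, wave, M50_WINDOW, polarity=1)
m100_latency = peak_latency(times, wave, M100_WINDOW, polarity=-1)
```

Because the two windows overlap (80 to 125 ms), a polarity or topography check of some kind is needed to keep the two components apart; the sign convention used here is one simple way to express that.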

Time–Frequency Measures

In the time–frequency domain, PL and TP were examined. The calculation of single-trial phase and magnitude for the left and right STG sources used a modification of Hoechstetter et al. (2004) procedures where for each participant, the derived source model was applied to the raw unfiltered data. Transformation from the time-domain to the time–frequency domain used complex demodulation procedures implemented in BESA 6.1 (Papp & Ktonas, 1977), using frequencies between 4 and 56 Hz, in steps of 2 Hz. Continuous data were analyzed relative to the tone onset every 25 ms, utilizing ± 39.4 ms and ± 2.83 Hz (full width at half maximum parameters) of contiguous data at each 25 ms step. Time–frequency measures were computed from -500 to 500 ms relative to stimulus onset. Total power was calculated by averaging the time–frequency spectra of each MEG epoch. When baseline power is subtracted, post-stimulus TP assesses the pre- to post-stimulus change in oscillatory activity. PL is a normalized measure assessing the trial-to-trial similarity of oscillatory activity (Lachaux et al., 1999). Baseline activity was also measured by averaging the time–frequency spectra of the baseline epoch.
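For readers less familiar with these measures, the sketch below shows one common way to compute PL and TP at a single time–frequency point, given one complex value per trial (for example, the output of complex demodulation). It is not the BESA 6.1 implementation; the function names are hypothetical, and PL is computed in the usual normalized form in which each trial contributes only its phase.

```python
import cmath
import math


def phase_locking(trial_values):
    """PL: length of the mean unit phase vector across trials (0 to 1)."""
    unit = [z / abs(z) for z in trial_values if abs(z) > 0]
    if not unit:
        return 0.0
    return abs(sum(unit)) / len(unit)


def total_power(trial_values):
    """TP: mean squared magnitude across trials."""
    return sum(abs(z) ** 2 for z in trial_values) / len(trial_values)


def power_change(post_values, baseline_values):
    """Baseline-subtracted TP: post-stimulus power minus baseline power."""
    return total_power(post_values) - total_power(baseline_values)


# Identical phase on every trial gives PL close to 1; phases spread evenly
# around the circle give PL close to 0.
aligned = [cmath.rect(2.0, 0.3) for _ in range(10)]
spread = [cmath.rect(1.0, k * math.pi / 2) for k in range(4)]
pl_aligned = phase_locking(aligned)
pl_spread = phase_locking(spread)
```

Because each trial is normalized to unit length before averaging, PL is insensitive to trial amplitude, whereas TP reflects magnitude regardless of phase consistency.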

Statistical Analyses

Independent samples t-tests examined group differences in age, eNVIQ, and SRS-2 upon entry into the study (Time 1). As an M100 response was not observed in most children (especially at Time 1 and 2 – see Section 3.1), the presence/absence of an M100 across hemisphere, time, and group was examined via a series of chi-square tests. For the remaining auditory encoding measures (M50, PL, and TP), linear mixed models were conducted with M50 latency, PL, or TP as the dependent variable, time, group, and hemisphere as fixed factors, and subject as a random factor. Bonferroni correction was used to adjust for multiple comparisons when assessing pairwise comparisons. The following paragraph details how the PL and TP dependent measures were obtained.

Initial time–frequency analyses used t-tests to examine activity at each even-number frequency between 4 and 56 Hz in 25 ms bins and from -500 to 500 ms from stimulus onset, resulting in 1,107 t-tests (27 frequencies × 41 time bins) for each hemisphere. The cluster size needed to obtain family-wise correction was computed using AFNI 3dClustSim (using the 3dttest++ option and setting the z-dimension to 0), computing a cluster-size threshold for a given pixel-wise p-value of p = 0.05 (https://afni.nimh.nih.gov/). For PL, given an initial p-value threshold of 0.05 for each time–frequency value, permutation testing (1000 iterations) showed that a cluster size (adjacent in time and/or frequency) of 35 at Time 1, 32 at Time 2, and 40 at Time 3 provided a family-wise error corrected p = 0.05. For TP, a cluster size of 89 at Time 1, 96 at Time 2, and 94 at Time 3 provided family-wise correction. Once significant time–frequency group-difference regions-of-interest (ROIs) were identified, an average PL or TP value within each ROI (using a fixed time and frequency range across visits) was obtained for each child. These were used for further statistical examination. Finally, exploratory analyses assessed associations between auditory brain measures and eNVIQ or SRS-2 scores.
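The grid arithmetic and the cluster-size criterion can be illustrated with a small sketch. It does not reproduce AFNI 3dClustSim or the permutation procedure that produced the thresholds; it only shows how the size of a supra-threshold cluster is counted on the 27 × 41 time–frequency grid once a threshold is known. The mask is synthetic, the function name is hypothetical, and adjacency is assumed here to mean sharing an edge in time or frequency (4-connectivity).

```python
from collections import deque

frequencies = list(range(4, 57, 2))      # 4 to 56 Hz in 2 Hz steps
time_bins = list(range(-500, 501, 25))   # -500 to 500 ms in 25 ms steps
n_tests = len(frequencies) * len(time_bins)


def largest_cluster(mask):
    """Size of the largest group of True cells connected by shared edges."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best


# Synthetic mask of cells with p < 0.05: a block at 10 to 20 Hz and
# 75 to 250 ms, plus one isolated cell that should not join the cluster.
mask = [
    [(10 <= f <= 20 and 75 <= t <= 250) or (f == 40 and t == -300)
     for t in time_bins]
    for f in frequencies
]
cluster_size = largest_cluster(mask)
passes_time1_pl_threshold = cluster_size >= 35  # Time 1 PL threshold from the text
```

With this connectivity assumption the synthetic block counts as one cluster, and the isolated cell is ignored because it has no supra-threshold neighbors.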

Given assumptions inherent to linear mixed models (necessary given missing data), all primary analyses were also conducted via ANOVAs with Huynh–Feldt correction (i.e., including only the participants with evaluable data at all three time points), with time and hemisphere as within-subject factors and group as a between-subject factor (25 TD and 25 autism had data from all 3 time points). All primary analyses were also rerun excluding the female participants (Time 1: TD N = 11, autism N = 9; Time 2: TD N = 6, autism N = 7; Time 3: TD N = 4, autism N = 2). For both the ANOVA analyses and the male-only analyses, the pattern of findings remained unchanged, and thus these supplemental analyses are not reported.

Results

As shown in Table 1, TD and autism did not differ on age (ps = 0.16—0.72) or eNVIQ (ps = 0.18—0.55; see Table 1) at any time point. At all times, lower SRS-2 scores were observed in TD than autism (ps < 0.001). In each group, t-tests showed that eNVIQ and SRS-2 scores did not differ between time points (TD: ps = 0.73—0.91; autism: ps = 0.64—0.97), indicating that although sample size decreased across time, the general clinical characteristics of the sample did not differ across visits. Chi-square analyses showed that the proportion of males to females remained the same for each group across time (ps = 0.40—0.82).

Presence or Absence of M100

M100 latency values by hemisphere are provided in Table 3. As most children did not show an M100 response, M100 latency group differences were not examined (see Edgar et al., 2020 for several examples on scoring the presence/absence of an M100). Uncorrected exploratory chi-square analyses were carried out examining differences in the presence/absence of M100 between group, time, and hemisphere. At Time 1, whereas no group difference was observed for the presence of left M100 (present: TD = 8%, autism = 14%, p = 0.33, phi = 0.09), a right M100 was more likely in children with autism than TD (TD = 20%, autism = 38%, p = 0.03, phi = 0.20). At Time 2 and Time 3, no group differences (ps ranging from 0.08—0.60) were observed in the presence of a left (Time 2: TD = 9%, autism = 23%; Time 3: TD = 42%, autism = 35%) or right M100 (Time 2: TD = 42%, autism = 50%; Time 3: TD = 58%, autism = 68%). Regarding time (collapsing across group), M100 responses were observed more often at Time 3 than Time 2 than Time 1 in both the left (Time 1: 11%, Time 2: 16%, Time 3: 39%, p < 0.01, phi = 0.28) and right hemisphere (Time 1 = 29%, Time 2 = 46%, Time 3 = 63%, p < 0.001, phi = 0.27). Regarding hemisphere (collapsing across group), the M100 response was more often observed in the right than left hemisphere at all time points (Time 1: left = 11%, right = 29%, p < 0.001, phi = 0.23; Time 2: left = 16%, right = 46%, p < 0.001, phi = 0.32; and Time 3: left = 39%, right = 63%, p = 0.01, phi = 0.24).
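As a worked illustration of the chi-square and phi values reported above, the sketch below computes both for a 2 × 2 table. The cell counts are not taken from the study: they are hypothetical values chosen to approximate the reported Time 1 right-hemisphere percentages (roughly 20% of 61 TD children and 38% of 66 children with autism). No continuity correction is applied, so the p-value need not match the reported one exactly.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (df = 1), p-value, and phi for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    chi2 = n * phi ** 2
    # For one degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, phi


# Hypothetical counts (rows: TD, autism; columns: M100 present, absent).
chi2, p_value, phi = chi_square_2x2(12, 49, 25, 41)
effect_size = abs(phi)
```

The sign of phi depends only on how the rows and columns are ordered, so the absolute value is the quantity comparable to the phi values in the text.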

M50 Time-Domain Findings

See Table 3 for M50 latency values (see also online supplement Fig. 1 for grand average M50 source waveforms). At Time 1, M50 latency did not differ between children with data at all three time points and those with just Time 1 data (ps = 0.68—0.76), indicating a lack of bias among subjects with incomplete data.

Table 3 Latency values (ms) in subjects with an observed M50 or M100

Linear mixed model analysis on the full sample (Time 1: TD N = 61, autism N = 66; Time 2: TD N = 43, autism N = 47; Time 3: TD N = 31, autism N = 31) revealed main effects of time (F(2, 447.93) = 56.27, p < 0.001, M50 ~ 10 ms earlier at Time 3 than Time 1) and hemisphere (F(1, 422.48) = 27.32, p < 0.001, M50 ~ 4 ms earlier in the right than left hemisphere). Neither the main effect of group nor any of the group interaction terms were significant (ps = 0.47—0.96).

eNVIQ and SRS-2 scores were not associated with M50 latency at any time point for either hemisphere (eNVIQ: ps = 0.12 to 0.77; SRS-2: ps = 0.27 to 0.94).

Time–Frequency Analyses

Due to insufficient observations of an M100 response at any time point, time–frequency measures were derived using the M50 source model (i.e., using the continuous raw data left and right M50 source timecourses).

Phase Locking

As shown in Fig. 1 (see online supplement Fig. 2 for raw time–frequency plots), PL statistic maps showed a post-stimulus Time 1 group difference from ~ 75 to 250 ms and ~ 10 to 20 Hz (family-wise corrected ROI highlighted; a cluster size of 35 needed for significance and a cluster size of 62 observed). No group-difference PL ROIs were observed at subsequent time points. Figure 1 shows the Time 1 group ROI means (with standard error bars) for each hemisphere and time (online supplement Fig. 3 displays this information using violin plots). Figure 2 shows the same information as Fig. 1 but as a function of group and time.

Fig. 1 Left panel: PL family-wise corrected statistical maps for the left and right hemisphere. ROIs are outlined with a black box. Right panel: PL mean and ± 2 standard error bars for each time and group (TD = blue, autism = red) and hemisphere (10 to 20 Hz activity averaged from 75 to 250 ms) with PL = 1 reflecting no phase variability and PL = 0 reflecting maximal phase variability across trials. Note: * p < 0.05, ** p < 0.01

Fig. 2 Left panel: PL mean and ± 2 standard error bars for each group and time and hemisphere. ROIs are outlined with a black box. Right panel: Pre-stimulus activity mean and ± 2 standard error bars for each group and time and hemisphere. Note: * p < 0.05, ** p < 0.01

Using the Time 1 ROI, PL values for each child were obtained for Time 2 and Time 3. A linear mixed model carried out on the full sample (Time 1: TD N = 61, autism N = 65; Time 2: TD N = 43, autism N = 46; Time 3: TD N = 31, autism N = 30) revealed a main effect of hemisphere (F(1, 409.39) = 42.33, p < 0.001, with PL 0.03 higher in the right than the left hemisphere when collapsing across time) and a group X time interaction (F(2, 444.41) = 7.59, p < 0.001). Simple-effect analysis of the interaction showed higher PL in autism than TD at Time 1 (p < 0.001) and Time 2 (p = 0.04) but not at Time 3 (p = 0.57). To directly assess Time 1 to Time 3 group PL maturation differences, the linear mixed model analysis was repeated eliminating the Time 2 data so that the interaction term considered only the change from Time 1 to Time 3. The group X time interaction was again significant (p < 0.001), with PL differences between TD and autism at Time 1 (p < 0.001) but not Time 3 (p = 0.34).

Given that M100 responses (when observed) occurred at approximately the same time interval as the Time 1 PL group-difference ROI (75 to 250 ms), exploratory analyses examined associations between the presence of M100 and the group-difference PL ROI values (see Table 4). Due to small sample sizes, presence/absence of M100 was not explored separately by group (see Table 1 in the online supplement for mean PL values by group and presence/absence of M100). At Time 1, whereas children with and without a left M100 did not have significantly different left PL values (p = 0.30, d = 0.36), children with a right M100 had greater PL than those without a right M100 (t(49.18) = -3.30, p = 0.002, d = 0.71). At Time 2, children with a left M100 had greater PL than those without a left M100 (t(86) = -3.95, p < 0.001, d = 0.99). Similar Time 2 results were observed for the right hemisphere (t(85) = -4.58, p < 0.001, d = 0.91). At Time 3, children with a left M100 had greater left PL than those without a left M100 (t(59) = -1.98, p = 0.05, d = 0.66). Similar results were observed for the right hemisphere (t(59) = -4.37, p < 0.001, d = 1.15).

Table 4 M50 PL values by group and presence/absence of M100 for Time 1 and Time 2

In the children with M100 responses, exploratory regressions evaluated associations between PL and M100 latency. Associations were significant at Time 1 in the right hemisphere (N = 37, R2 = 0.10, p = 0.05), Time 2 in the left hemisphere (N = 15, R2 = 0.43, p = 0.01), and at Time 3 in both the left (N = 24, R2 = 0.27, p = 0.01) and right hemisphere (N = 39, R2 = 0.11, p = 0.04).

Finally, Time 1 regressions (i.e., for entire sample) revealed a negative association between eNVIQ and right PL (p = 0.04) and positive associations between SRS-2 and left (p = 0.02) and right PL (p = 0.002). These associations, however, were driven by group differences in assessment scores as well as PL values, with no associations found for either group when run separately (TD and autism ps = 0.76 – 0.77). Time 2 and Time 3 analyses revealed no associations between eNVIQ or SRS-2 and left or right PL (ps = 0.16—0.92).

Total Power

Although groups did not differ on post-stimulus TP, a pre-stimulus group difference from -500 to -25 ms and 30 to 50 Hz was observed in the right hemisphere at Time 1 (see Fig. 3 family-wise corrected ROI highlighted; a cluster size of 89 needed for significance and a cluster size of 289 observed). Using the Time 1 ROI, pre-stimulus values for each child were obtained for Time 2 and Time 3. Figure 2 shows the pre-stimulus ROI group means (with standard error bars) for each hemisphere and time.

Fig. 3 Left panel: Pre-stimulus activity family-wise corrected statistical maps for the left and right hemisphere. ROIs are outlined with a black box. Right panel: Pre-stimulus activity mean and ± 2 standard error bars for each time and group (TD = blue, autism = red) and hemisphere (30 to 50 Hz activity averaged from -500 to -25 ms). Note: * p < 0.05, ** p < 0.01

A linear mixed model using the full sample (Time 1: TD N = 61, autism N = 66; Time 2: TD N = 43, autism N = 47; Time 3: TD N = 31, autism N = 30) revealed a time X hemisphere interaction (F(2, 414.82) = 4.35, p = 0.01) and a marginally significant group X hemisphere interaction (F(1, 414.80) = 3.05, p = 0.08). Simple-effect analyses showed higher pre-stimulus activity in the right than left hemisphere at Time 1 (p = 0.004) but not Time 2 or 3 (ps = 0.25—0.71), and higher pre-stimulus activity in autism than TD in the right hemisphere when collapsing across time (p = 0.03; see Fig. 3).

Exploratory regressions, carried out separately by hemisphere and time, revealed that pre-stimulus activity was not associated with the post-stimulus PL ROI values (R2s = 0.001 to 0.03, ps = 0.16 to 0.76), which suggests that the pre-stimulus activity and post-stimulus PL group differences reflect distinct auditory cortex findings.

Finally, Time 1 regressions (i.e., for entire sample) revealed a Time 1 positive association between SRS-2 and right pre-stimulus activity (p = 0.01). As with PL, this association was driven by group differences in assessment scores as well as pre-stimulus power values. Separate analyses for TD and autism showed no associations (ps = 0.29—0.86).

Discussion

Longitudinal findings showed group differences in auditory cortex maturation. Findings suggested a period of rapid auditory cortex maturation in autism, earlier than TD (prior to and surrounding the Time 1 exam), followed by a period of faster maturation in TD than in autism. As a result of these group maturation differences, group M100 (absence/presence) and PL differences were observed at Time 1 but not Time 3. In contrast, higher right-hemisphere pre-stimulus activity in autism than TD was found at all time points, indicating that this neural brain measure was stable (i.e., group effect not influenced by maturation).

The above pattern of findings can be used to generate hypotheses regarding brain development in autism. The Time 1 (6 to 9 years old) group differences in right M100 but not M50 are of particular interest. M50 and M100 maturation likely reflect changes in cortical gray matter, such as changes in synaptic efficiency (Eggermont & Salamy, 1988; Gomes et al., 2001) as well as maturational changes in the morphology of primary auditory cortex pyramidal cells (Elston & Fujita, 2014; Elston et al., 2010). Whereas development of auditory cortex deep layers (lower layer III to layer VI) occurs between 6 months and 5 years of age (Ponton et al., 1999), the superficial layers (upper layer III and layer II) continue to mature until about age 12 (Moore & Guan, 2001; Moore & Linthicum, 2007). Ponton and colleagues hypothesized that the 50 ms auditory response reflects, in part, recurrent activation in deep layers III and IV – the termination zone of thalamo-cortical pathways that are well developed by age 6 years (Ponton & Eggermont, 2001; Ponton et al., 1999, 2002). In the present study, observation of an M50 response in all children is consistent with early development of cortical layers III and IV in both groups, with the Time 1 M50 findings suggesting that the two groups had attained similar development of these neural circuits by 6 to 9 years. Of note, however, is that although the P/M50 response is observed in young children, the P/M50 response continues to mature throughout childhood, with adult-like P/M50 latencies not observed until late adolescence (Edgar et al., 2020; Paetau et al., 1995; Sharma et al., 2002).

In the present study, children with TD and autism did not differ on M50 latency. The recent Williams et al. (2021a) meta-analysis of fourteen P1/M50 studies found that group-difference effect sizes ranged from −0.72 to 1.14, with the meta-analytic model indicating a later P1/M50 latency in autism than TD. Williams et al. (2021a) reported that the published studies did not provide support either for or against an effect of age on P/M50 TD and autism group differences (based on a Bayes Factor Comparing Moderated Model to Baseline Model BF10 value of 0.45). Crucially, the mean age across the studies examined in Williams et al. was 12 years old (see Williams et al. Table 2), with most studies reporting on children older than children examined in this study, even at Time 3. The present study, which is to our knowledge the largest sample longitudinal study examining auditory encoding processes in TD and autism in a restricted age range, provides additional information in this area.

In contrast to M50, M100 is observed less often in young children (Edgar et al., 2014, 2020; Ponton et al., 1999, 2000). Present findings replicate previous work in this area, with M100 responses observed much more often at Time 3 than Time 1. The Time 1 finding that a right M100 was observed more often in autism than TD may indicate rapid development of M100 in autism. As it has been hypothesized that the slow development of M100 reflects greater synchronization in the afferent activity arriving at the synapses in layer II and upper layer III across childhood (Ponton & Eggermont, 2001; Ponton et al., 1999, 2000, 2002), perhaps there is altered early development of these pathways in autism. The PL findings – greater Time 1 and Time 2 PL in autism than TD – are consistent with this hypothesis (see details in following paragraph). Findings of M100 group differences only in the right hemisphere suggest that TD and autism group differences in the development of M100 auditory cortex processes may be lateralized. Lateralized right-hemisphere differences in autism have been reported in the auditory literature (Edgar et al., 2015a; Gage et al., 2003; for review see Lindell & Hudry, 2013; Oram Cardy et al., 2008; Roberts et al., 2010; for review see Yu & Wang, 2021). Alternatively, as the primary/secondary auditory cortex matures more rapidly in the right than left hemisphere (Edgar et al., 2014; Huotilainen et al., 1998; Kotecha et al., 2009; Parviainen et al., 2019; Port et al., 2016; Roberts et al., 2000), it is possible that left hemisphere M100 differences would also be observed as these children get older.

In the present study, the PL findings demonstrated that the pattern of TD and autism group differences changes across childhood, with an initial period of rapid auditory cortex maturation in autism, earlier than TD (prior to and surrounding the Time 1 exam), followed by a period of faster maturation in TD than autism (Time 1 to Time 3). As a result of group maturation differences, post-stimulus group differences were observed at Time 1 but not Time 3. A hypothesis is that the M100 and PL group differences can be understood as two ways of measuring a single phenomenon; in particular, greater right post-stimulus PL was observed in children with present versus absent M100 responses (this finding was not significant in the left hemisphere at Time 1, likely due to the limited number of observations of a left hemisphere M100). It is also of note that the PL ROI overlapped with the latency of the observed M100 responses, with analyses comparing PL and M100 latency showing right-hemisphere associations at Time 1, left-hemisphere associations at Time 2, and bilateral associations at Time 3.

Although there is a need for replication, present findings demonstrate the fleeting nature of some auditory cortex autism markers. The transient nature of these markers has important implications for interpreting findings across studies as well as planning studies. For example, a comparison of M100 and PL findings across studies needs to consider the age (and age range) of the respective samples. Additionally, age needs to be considered when selecting an auditory cortex neural measure as a marker of treatment response: the measure chosen must be appropriate for the age of the sample in the clinical trial (see Edgar et al., 2020).

The PL group-difference pattern observed in the present study is not unique to the auditory cortex. Indeed, TD and autism group differences in brain maturation have been reported for other brain measures. As an example, Edgar et al. (2019) explored the relationship between eyes-closed resting-state peak alpha frequency (PAF = the frequency showing the most resting-state activity) and age in 6- to 17-year-old male children with TD and autism. Results revealed a positive association between PAF and age in TD but not autism. As a result, PAF group differences were much more pronounced in the younger (6 to 10 years) than older children (10 to 17 years). Studies of cortical structure also provide examples of different patterns of brain maturation in TD than autism. For example, a recent large sample longitudinal study by Lee et al. (2021) reported TD and autism differences in gray-matter maturation. Different patterns of white-matter maturation in TD and children with autism have also been observed (Ouyang et al., 2016).

While development constrains the utility of some auditory measures in these groups, present and previous findings suggest a more stable auditory cortex difference between TD and autism: pre-stimulus activity. Here, higher right pre-stimulus activity was observed in autism than TD at Time 1, 2 and 3. As shown in Fig. 3, this group difference remained stable even though each group showed a reduction in pre-stimulus activity (30 to 50 Hz) from Time 1 to Time 3, suggesting that this measure is not temporally constrained within this sample. This pre-stimulus finding replicates Edgar et al. (2015c), where higher pre-stimulus activity in autism than TD was observed in older children. As detailed in Edgar et al. (2015c), higher baseline activity in autism versus TD suggests a fundamental signal-to-noise deficit in individuals with autism.

Finally, the earlier development of M100 PL in the right than left primary/secondary auditory cortex in both groups is consistent with the many studies indicating earlier maturation of the right than left auditory cortex in infants (Musacchia et al., 2013; Paetau et al., 1995) as well as both young children and adolescents (Albrecht et al., 2000; Edgar et al., 2020; Poulsen et al., 2009; Yoshimura et al., 2013). Such findings demonstrate that although the maturation of auditory cortex processes is different in TD and autism, a perhaps more fundamental feature of brain development is similar.

Limitations

There are several study limitations. First, the children with autism had NVIQ or VIQ scores ≥ 70, with future studies needed to determine whether the present findings are also observed in children with autism with lower IQ scores. Relatedly, although group differences in M50 latency were not observed in the present study, M50 latency group differences may be more prominent in children with autism with lower language and cognitive scores (Roberts et al., 2019). Second, the Time 2 and 3 samples were smaller than the Time 1 sample, in large part due to COVID-19. However, even with smaller Time 2 and 3 samples the group × time PL interaction term was significant. Furthermore, examination of the Time 2 and 3 TD and autism neural measures (Figs. 1 and 2) does not indicate the need for additional subjects to observe ‘missed’ weak group difference effects.

Conclusions

Study findings suggest a period of early rapid right auditory cortex maturation in autism (Time 1 findings) followed by a period of faster maturation in TD than autism (Time 1 to Time 3). Whereas M100 and PL differences were temporally constrained due to group differences in maturation rates, higher pre-stimulus activity in autism than TD at all time points indicated that this neural brain measure was less influenced by maturation. This higher pre-stimulus activity in autism than TD suggests a more ‘noisy’ brain in autism than TD throughout childhood. Longitudinal brain imaging studies examining children in their transition from young childhood to early adolescence are of interest to further understand the developmental changes to auditory encoding processes in autism.