Introduction

Steady state responses arise when periodic stimuli in the environment evoke stable, stereotyped activity in sensory networks. These responses are highly rhythmic, such that predictable voltage fluctuations that match the frequency of the periodic stimulus arise and vary little in phase and amplitude1. For example, a click train delivered at the rate of 20 clicks per second elicits a 20 Hz auditory steady state response (ASSR) in the electroencephalogram (EEG). The ASSR is variously known as the auditory steady-state evoked potential (SSEP), the envelope following response (EFR), and the amplitude-modulation following response (AMFR). Due in part to the ease of conducting non-invasive EEG recordings, the ASSR has emerged as a potential biomarker for altered neural function in schizophrenia, autism, bipolar disorder, attention-deficit/hyperactivity disorder, and Alzheimer’s disease2,3,4,5,6,7,8,9,10,11. One intriguing aspect of the ASSR is the relationship between the amplitude modulation (AM) frequency (i.e., the rate of fluctuations in the sound envelope) and the magnitude of the ASSR12,13,14. The largest ASSR in humans typically occurs in response to 40 Hz AM14,15, and while there is an overall reduction in the ASSR across multiple AM frequencies in patient populations, the reduction in the ASSR to 40 Hz AM is particularly pronounced in patients with schizophrenia3,16,17,18,19,20. The utility of the ASSR as a translational tool, however, is compromised in part due to uncertainties regarding the mechanisms that underlie its generation.

There have been several competing explanations for why 40 Hz AM is the best modulation frequency for the ASSR. An early account is that transient middle latency responses in A1 superimpose optimally at the 25 ms inter-stimulus interval corresponding to 40 Hz21,22,23,24. While this superposition account provides a good explanation of the enhanced ASSR to 40 Hz AM, it fails to predict differences in the amplitude of steady state responses to AM frequencies outside of 40 Hz25,26. An alternative oscillatory account is that local circuits within A1 generate an intrinsic 40 Hz rhythm, giving rise to spontaneous oscillations at that frequency27,28. On such an account, neural resonance at 40 Hz underlies the timing of both transient and rhythmic responses in A1. Accordingly, A1 exhibits both spontaneous and stimulus-induced gamma band oscillations near 40 Hz29,30,31,32. These oscillations have been shown to strongly influence the timing of A1 activity, and to align temporally to the onset of auditory stimuli33,34,35,36,37. Others have suggested that spontaneous oscillations and the mid-latency response are related activities, as the amplitude of spontaneous 40 Hz gamma oscillations prior to the onset of AM tones at 40 Hz is inversely related to the magnitude of the subsequent middle latency response36.

Regardless, the larger ASSR to 40 Hz AM is thought to arise in part from the natural or preferred timing of local cell activity at that frequency, and numerous studies have linked the best AM frequency to the time course and magnitude of cell recruitment. Of note, the best AM frequency differs across subcortical and cortical recording locations38, reflecting varying degrees of cell recruitment39,40, variations in the timing of cell responses in different brain regions38, and changes in auditory responses across developmental stages41,42,43,44. It is thus possible that the ASSR changes across AM frequencies because the ASSR captures a summation of responses from different populations of cells, each with differing intrinsic properties and tendencies for transient versus oscillatory responses.

The present study aims to provide insight into the temporal coordination of activity within primary auditory cortex underlying resonant responses in the EEG. The ASSR is observable in multiple species (including humans15,45,46,47, non-human primates48,49, harbour porpoises50, rats41,43,51,52, mice53,54, cats55, bats44, rabbits38,56, Mongolian gerbils39,57, and chinchillas58) and can be recorded using multiple methods38,52,59,60,61. Accordingly, we conducted simultaneous silicon probe and skull EEG recordings in rats as they listened to tones modulated in amplitude at frequencies of 20, 30, 40, 50, and 80 Hz (Fig. 1). From a skull screw placed over the rat primary auditory cortex (henceforth termed auditory EEG), we observed the largest ASSR for 40 Hz AM, replicating prior studies15,62. To investigate the cause, we asked first whether 40 Hz neuronal oscillations are larger in signals acquired from the auditory EEG. Finding that the instantaneous amplitude of the AM frequency is in fact largest during 20 Hz AM (Fig. 2B), we then compared the magnitude of the ASSR to baseline activity recorded in the absence of periodic sound stimuli. Whereas the magnitude of the ASSR elicited by 20 and 30 Hz AM did not significantly differ from that recorded during the baseline period, 40 Hz AM and above substantially enhanced the ASSR (Fig. 3).

Figure 1

Auditory stimulus administration and the auditory steady state response (ASSR). (A) Illustration of the approximate recording locations for the experiment. (B) Example histology of probe location in the rat primary auditory cortex, near the boundary of dorsal secondary auditory cortex (S1 = primary somatosensory cortex, AUD = secondary auditory cortex, A1 = primary auditory cortex). (C) Example of the tonal bursts used to elicit ASSRs. (D) Illustration of the 16-channel probe used to record local field potential (LFP) activity, with corresponding estimates of cortical depths for each electrode.

Figure 2

A resonant ASSR to 40 Hz AM in signals acquired from the auditory EEG. (A) Illustration demonstrating the calculation of the average instantaneous amplitude (IA, left, black dot indicates the average of a1 through a5) and the ASSR (right, black waveform indicates the average of x1 through x5, and black arrow indicates the peak of the average waveform), implemented for calculation of (B,C), respectively. (B) Average IA of the filtered auditory EEG signal (bandpass filtered ± 1 Hz surrounding the AM frequency) for each AM frequency condition, averaged across the entire 15-minute duration of the stimulus condition. Thick lines indicate mean values, and the height of black lines indicates 95% confidence intervals. Asterisks indicate significant differences in pairwise comparisons to the 40 Hz AM condition (p < 0.05). (C) Modulation rate transfer function depicting the average ASSR across AM frequency conditions. Thick lines indicate mean values, and the height of black lines indicates 95% confidence intervals. Asterisks indicate significant differences in pairwise comparisons to the 40 Hz AM condition (p < 0.05).

Figure 3

The ASSR during baseline (black) and sub-sampled auditory stimulus (colored) intervals. Each point on a line indicates the average ASSR across recording sessions, calculated separately for each recording electrode. Shaded areas surrounding each line indicate the standard error of the mean. Each plot corresponds to a different AM frequency condition: (A) 20 Hz (blue), (B) 30 Hz (red), (C) 40 Hz (yellow), (D) 50 Hz (purple), and (E) 80 Hz (green). Asterisks and horizontal black bars in each plot indicate the electrode locations with responses significantly different from baseline responses (p < 0.01). (F) Mean differences between auditory stimulus and baseline responses, shown for each stimulation condition and each physical recording location. Colors are consistent with those in (A–E).

Because EEG amplitude is sensitive to synchronous activity in cortical patches63, we then asked whether the ASSR was sensitive to the similarity of activity across A1 layers as reflected in the inter-site phase coherence of the LFP probes (Fig. 4A,B). Alternatively, the ASSR might be more related to the consistent timing of responses to each AM tone, as reflected in the inter-cycle phase coherence (Figs. 4C,D, 5). We demonstrate that temporal consistency of the response (that is, decreased latency variability) is strongly related to the magnitude of the ASSR in the EEG signal (Fig. 4), and that this relationship is more prominent at higher frequencies in the cortical LFP (Fig. 5). Finally, we model the auditory EEG signal from the cortical LFP activity, finding evidence for both superficial and deep generators (Fig. 6), and providing new insight into deep layer coordinators of the response.

Figure 4

Relationship between the ASSR calculated from the auditory EEG, inter-site phase coupling (ISPC), and inter-cycle phase coupling (ICPC). (A) Average ISPC values during each AM frequency condition. Thick lines indicate mean values, and the height of black lines indicates 95% confidence intervals. Asterisks indicate significant differences in pairwise comparisons to the 40 Hz AM condition (p < 0.05). (B) The ASSR from the auditory EEG as a function of the ISPC values, shown separately for each AM frequency condition. Each dot corresponds to a different recording session, and each panel indicates a different AM frequency condition. Corresponding R\(^{2}\) values for the lines of best fit are shown in the upper left. (C) Average ICPC values during each AM frequency condition, for signals acquired from the auditory EEG. Thick lines indicate mean values, and the height of black lines indicates 95% confidence intervals. Asterisks indicate significant differences in pairwise comparisons to the 40 Hz AM condition (p < 0.05). (D) ASSR from the auditory EEG as a function of ICPC values, shown separately for each AM frequency condition. Each dot corresponds to a different recording session, and each panel indicates a different AM frequency condition. Corresponding R\(^{2}\) values for the lines of best fit are shown in the upper left.

Figure 5

Relationship between the ASSR and ICPC for linear probe channels. (A) Average ICPC values during each AM frequency condition, for signals acquired from the most superficial channel (channel 1) on the linear probe. Thick lines indicate mean values, and the height of black lines indicates 95% confidence intervals. Asterisks indicate significant differences in pairwise comparisons to the 40 Hz AM condition (p < 0.05). (B) ASSR from the channel 1 electrode as a function of the ICPC values, shown separately for each AM frequency condition. Each dot corresponds to a different recording session, and each panel indicates a different AM frequency condition. Corresponding R\(^{2}\) values for the lines of best fit are shown in the upper left. (C,E) Same as in (A), for channels 8 and 16 of the linear probe, respectively. (D,F) Same as in (B), for channels 8 and 16 of the linear probe, respectively. (G) R\(^{2}\) for linear models describing relationships between ICPC values and the ASSR at each EEG recording location. Differently colored lines correspond to the different AM frequency conditions, with a black bar indicating the threshold for model significance (p < 0.05). (H) Same as in (G) for linear probe locations indicated along the x-axis.

Figure 6

Modeling results assessing filtered probe signal (filtered ± 1 Hz surrounding the amplitude modulation frequency) covariance with the filtered auditory EEG signal. (A) Histogram indicating the spatial location of the channels that were the most predictive (blue) and the second most predictive (red) of the filtered auditory EEG. (B) Example R\(^{2}\) values for separate single LFP channel models used to predict the filtered auditory EEG signal.

Results

Resonance of the ASSR

We constructed a modulation rate transfer function to depict changes in the ASSR across different frequencies of amplitude modulation (AM) (Fig. 2C). We first calculated the ASSR from skull screw electrodes placed over the rat primary auditory cortex (termed the auditory EEG) or frontal cortex (termed the frontal EEG). The ASSR changed dramatically across AM frequencies (Table S1). A maximum was observed for 40 Hz AM, and this was true for both the auditory (Fig. 2C) and frontal (Fig. S1) EEG. Individual pairwise comparisons with the 40 Hz AM condition revealed that 40 Hz AM elicited a significantly larger ASSR in the auditory EEG than 20, 30, and 80 Hz AM (Table S1).

This result is in contrast to a strikingly different trend in the average instantaneous amplitude (IA) observed from the auditory EEG electrode across AM frequencies, which decreased as the AM frequency increased (Fig. 2B, Fig. S2). Individual pairwise comparisons with the 40 Hz AM condition revealed that 40 Hz oscillations during 40 Hz AM were substantially smaller than 20 Hz and larger than 80 Hz oscillations during their respective conditions (Tukey’s Honestly Significant Difference Test, p < 0.05, Fig. 2B).
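The dissociation between oscillation amplitude and the ASSR can be illustrated with a toy simulation. The sketch below is not the analysis pipeline used in this study: it assumes an idealized sinusoidal response on every stimulus cycle, uses the magnitude of a single-cycle Fourier coefficient as a stand-in for the instantaneous amplitude of the band-pass filtered signal, and takes the ASSR as the peak of the cycle-averaged waveform (as depicted in Fig. 2A). All function names and parameter values are hypothetical.

```python
import cmath
import math
import random


def make_cycles(n_cycles, n_per_cycle, amplitude, phase_jitter_sd, rng):
    """Simulate one sinusoidal cycle per stimulus, each with a random phase offset."""
    cycles = []
    for _ in range(n_cycles):
        phase = rng.gauss(0.0, phase_jitter_sd)
        cycles.append([amplitude * math.cos(2 * math.pi * i / n_per_cycle + phase)
                       for i in range(n_per_cycle)])
    return cycles


def mean_cycle_amplitude(cycles):
    """Average per-cycle amplitude. Phase is discarded, so jitter does not reduce it."""
    total = 0.0
    for cycle in cycles:
        n = len(cycle)
        coeff = sum(x * cmath.exp(-2j * math.pi * i / n)
                    for i, x in enumerate(cycle)) / n
        total += 2 * abs(coeff)
    return total / len(cycles)


def assr_peak(cycles):
    """Peak of the cycle-averaged waveform. Misaligned cycles cancel in the average."""
    n = len(cycles[0])
    average = [sum(c[i] for c in cycles) / len(cycles) for i in range(n)]
    return max(average)


# Assumed toy conditions: a large but jittered response versus a small, consistent one.
rng = random.Random(0)
jittered = make_cycles(1000, 40, amplitude=2.0, phase_jitter_sd=1.5, rng=rng)
consistent = make_cycles(1000, 40, amplitude=1.0, phase_jitter_sd=0.2, rng=rng)
print("jittered:  ", mean_cycle_amplitude(jittered), assr_peak(jittered))
print("consistent:", mean_cycle_amplitude(consistent), assr_peak(consistent))
```

In this toy setting the jittered condition has the larger per-cycle amplitude but the smaller averaged peak, which is the qualitative pattern described for 20 Hz versus 40 Hz AM.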

To visually characterize the ASSR observed from the auditory EEG across AM frequencies, single-cycle (Fig. S3) and 1 s (Fig. S4) averages were calculated. A consistent response to each AM frequency could be observed in this manner, and this response appeared to be paced by the AM frequency. Power spectral density estimates of the averaged 1 s intervals revealed frequency components within the averaged signal (Fig. S4C,D), with the results largely mimicking the results observed in our modulation rate transfer function depiction of the ASSR (Fig. 2C). In particular, the strongest pacing by the auditory stimulus was observed during 40 Hz AM.

We next constructed modulation rate transfer functions for signals acquired from intracranial probe electrodes. The ASSR varied significantly across AM frequencies for every probe electrode location (Table S1). Individual pairwise comparisons with the 40 Hz AM condition revealed that the ASSR was significantly larger during 40 Hz AM than during 30 Hz or 80 Hz AM for all probe locations (Tukey’s Honestly Significant Difference Test, p < 0.05, Table S1). The ASSR for 50 Hz AM was significantly smaller than the ASSR for 40 Hz AM only for deep probe channels, and no intracranial probe locations showed any significant differences between the ASSR observed during 20 and 40 Hz AM (Tukey’s Honestly Significant Difference Test, p < 0.05, Table S1).

To reveal the extent to which the ASSR during auditory stimulus conditions differed from the ASSR observed during a baseline interval at the end of recording sessions, 15-min auditory stimulus intervals were subsampled to match the duration of 5-min baseline intervals. Magnitude comparisons indicated there were no significant increases in the ASSR from baseline during the 20 or 30 Hz AM conditions for any electrode (Fig. 3A,B, Table S2). During the 40 Hz AM condition, all EEG and probe channels except for channel 3 exhibited significant increases in the ASSR from baseline (Fig. 3C, Table S2). During the 50 Hz AM condition, the frontal EEG, the auditory EEG, and select superficial and deep electrodes exhibited significant increases in the ASSR from baseline (Fig. 3D, Table S2). Lastly, during the 80 Hz AM condition, the auditory EEG and all probe channels exhibited significant increases in the ASSR from baseline (Fig. 3E, Table S2). In contrast, only the auditory and frontal EEG electrodes exhibited significant increases in the instantaneous amplitude (IA) of oscillations compared to baseline. Specifically, the auditory EEG had significant increases in the IA of 20 Hz, 50 Hz, and 80 Hz oscillations during stimulus epochs compared to baseline (Fig. S5). The frontal EEG had significant IA increases for only 80 Hz oscillations during stimulus epochs compared to baseline (Fig. S5).
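The duration matching described above can be sketched as follows. How the 5-min subsample was drawn from each 15-min block is not specified in this section, so the sketch assumes a single randomly placed contiguous window. The function name and arguments are hypothetical.

```python
import random


def subsample_window(signal, fs_hz, target_s, rng):
    """Return a contiguous window of target_s seconds and its start index.

    Assumption: the window position is drawn uniformly at random; the
    original selection rule is not stated in the text.
    """
    n_target = int(round(target_s * fs_hz))
    if n_target > len(signal):
        raise ValueError("target window is longer than the signal")
    start = rng.randrange(0, len(signal) - n_target + 1)
    return signal[start:start + n_target], start


# Toy usage at 1 sample per second: a 15-min block reduced to 5 min.
block = list(range(15 * 60))
window, start = subsample_window(block, fs_hz=1, target_s=5 * 60, rng=random.Random(0))
print(len(window), start)
```

Matching durations keeps the number of averaged cycles comparable between stimulus and baseline intervals, so differences in the averaged waveform are not driven by differences in the amount of data.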

No significant influence of cross-layer synchrony on EEG response

A significant difference in inter-site phase coupling (ISPC) was observed across AM frequencies (repeated measures ANOVA, d.f. = 4, F = 48.375, p < 1.0 × 10\(^{-9}\)), largely reflecting reduced ISPC values in the 80 Hz AM condition (Fig. 4A). Individual pairwise comparisons with the 40 Hz AM condition revealed that there were no significant differences between ISPC values during the 40 Hz AM condition and ISPC values during 20 Hz, 30 Hz, or 50 Hz AM conditions (Tukey’s Honestly Significant Difference test, p > 0.05). Notably, ISPC values during 80 Hz AM were significantly smaller than ISPC values during 40 Hz AM (Tukey’s Honestly Significant Difference test, p < 0.05). To determine how ISPC values might covary with the ASSR calculated from the auditory EEG, a linear model was fit between these two measures for each recording session. R\(^{2}\) values suggest there were no significant relationships between the two values during any of the auditory stimulus conditions (t-statistic, p > 0.05, scatter and R\(^{2}\) values for auditory EEG in Fig. 4B, scatter and R\(^{2}\) values for the frontal EEG in Fig. S6).
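A minimal sketch of an inter-site phase coupling measure is given below. It uses a common definition (the length of the mean unit vector of phase differences between two channels), which may differ in detail from the computation used here. As a simplifying assumption, phase is estimated once per stimulus cycle from a single Fourier coefficient rather than sample by sample. All names are hypothetical.

```python
import cmath
import math
import random


def cycle_phases(signal, n_per_cycle):
    """Phase of each complete cycle, from one Fourier coefficient per cycle."""
    phases = []
    for start in range(0, len(signal) - n_per_cycle + 1, n_per_cycle):
        coeff = sum(signal[start + i] * cmath.exp(-2j * math.pi * i / n_per_cycle)
                    for i in range(n_per_cycle))
        phases.append(cmath.phase(coeff))
    return phases


def ispc(phases_a, phases_b):
    """1.0 when the phase lag between sites is constant, near 0 when it is random."""
    vectors = [cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b)]
    return abs(sum(vectors) / len(vectors))


def simulate_channel(cycle_offsets, n_per_cycle, lag=0.0):
    """Toy channel: one cosine cycle per stimulus with a given phase offset."""
    signal = []
    for offset in cycle_offsets:
        signal.extend(math.cos(2 * math.pi * i / n_per_cycle + offset + lag)
                      for i in range(n_per_cycle))
    return signal


# Two sites sharing the same cycle-to-cycle jitter stay coupled despite a fixed lag.
rng = random.Random(0)
shared = [rng.uniform(-math.pi, math.pi) for _ in range(500)]
site_a = simulate_channel(shared, 40)
site_b = simulate_channel(shared, 40, lag=0.8)
print(ispc(cycle_phases(site_a, 40), cycle_phases(site_b, 40)))
```

The toy example also shows why ISPC and the ASSR can dissociate: two sites can be tightly coupled to each other while their shared phase still varies from cycle to cycle.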

ASSR covaries with increased cycle phase consistency at high frequencies

The inter-cycle phase coupling (ICPC) changed significantly across AM frequencies for all EEG and probe electrodes (Fig. 4C, Table S3). Individual pairwise comparisons with the 40 Hz AM condition revealed that ICPC values for all recording locations were significantly higher during 40 Hz AM than during 20 or 30 Hz AM (Table S3). For select deep electrode channels, ICPC values during 40 Hz AM were significantly lower than ICPC values during 80 Hz AM (Table S3). For channel 15 only, ICPC values during 40 Hz AM were also significantly higher than ICPC values observed during 50 Hz AM (Table S3). To determine how ICPC values might covary with the ASSR calculated from the auditory EEG, a linear model was fit between these two measures for each recording session. For all AM frequencies, the ASSR calculated from the auditory EEG significantly covaried with ICPC values (Figs. 4D, 5G, Table S4). For the 40, 50, and 80 Hz AM conditions, the ASSR calculated from all electrodes significantly covaried with ICPC values (t-statistic, p < 0.05, Fig. 5B,D,F,H). Of note, for the 20 Hz AM condition, only the ASSR calculated from the auditory EEG significantly covaried with ICPC values (p < 0.05, Figs. 4D, 5B,D,F,H). For the 30 Hz AM condition, ICPC values significantly covaried with the ASSR for the auditory EEG and a subset of probe electrodes (electrodes 3, 6, 7, 8, 9, 10, 12, and 13) (p < 0.05, Fig. 5H).
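The link between latency variability and inter-cycle phase coupling can be sketched as follows. The code assumes a common definition of ICPC (the length of the mean unit phase vector across cycles) and a Gaussian latency jitter around an arbitrary 10 ms latency. These choices, and all names, are illustrative assumptions rather than the procedure used in this study.

```python
import cmath
import math
import random


def latency_to_phase(latency_s, freq_hz):
    """Convert a response latency into a phase angle (radians) at the AM frequency."""
    return 2 * math.pi * freq_hz * latency_s


def icpc(phases):
    """Length of the mean unit phase vector: 1 = identical phase on every cycle."""
    if not phases:
        raise ValueError("need at least one phase value")
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


def simulate_icpc(freq_hz, latency_sd_s, n_cycles, rng):
    """ICPC for cycles whose latencies scatter around 10 ms (assumed value)."""
    phases = [latency_to_phase(rng.gauss(0.010, latency_sd_s), freq_hz)
              for _ in range(n_cycles)]
    return icpc(phases)


rng = random.Random(0)
print("1 ms latency jitter:", simulate_icpc(40, 0.001, 2000, rng))
print("8 ms latency jitter:", simulate_icpc(40, 0.008, 2000, rng))
```

In this sketch, smaller latency jitter yields ICPC values closer to 1, which is the sense in which ICPC indexes temporal consistency.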

Modeling EEG signals from the LFP

We constructed linear models that use single LFPs to describe relationships between filtered LFP (filtered ± 1 Hz surrounding the amplitude modulation frequency of interest) and EEG signals. These models revealed that the most predictive channels were often the most superficial and deep channels (Fig. 6A). The second most predictive channel was not always the channel farthest from the most predictive channel, for either the auditory or frontal EEG (example shown in Fig. 6B). For the frontal EEG, the most predictive channels tended to be the deepest channels (Fig. S7).
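The single-channel modeling approach can be sketched as an ordinary least-squares fit of the EEG on each LFP channel in turn, followed by ranking the channels by R\(^{2}\). The sketch uses synthetic data and omits the ± 1 Hz filtering step. The function names and the toy signals are assumptions for illustration only.

```python
import math
import random


def r_squared(x, y):
    """R^2 of the least-squares line predicting y from x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / syy


def rank_channels(lfp_channels, eeg):
    """Fit one model per channel; return channel indices sorted by R^2, best first."""
    scores = [r_squared(channel, eeg) for channel in lfp_channels]
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return order, scores


# Synthetic example: channel 0 tracks the EEG closely, channel 1 is unrelated
# noise, and channel 2 is a noisy, polarity-inverted copy.
rng = random.Random(0)
eeg = [math.sin(2 * math.pi * i / 25) for i in range(400)]
channels = [
    [v + rng.gauss(0, 0.1) for v in eeg],
    [rng.gauss(0, 1.0) for _ in eeg],
    [-0.5 * v + rng.gauss(0, 0.5) for v in eeg],
]
order, scores = rank_channels(channels, eeg)
print(order, scores)
```

Because R\(^{2}\) ignores the sign of the slope, a polarity-inverted channel can still be predictive, which matters when superficial and deep channels carry currents of opposite polarity.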

Discussion

Our study probed the neurobiological underpinnings of the auditory steady-state response (ASSR), with a particular focus on the differential recruitment of networks within primary auditory cortex (A1) across changing modulation frequencies. We delivered blocks of tones modulated in amplitude at 20, 30, 40, 50, and 80 Hz, and confirmed that the largest ASSR was observed during the 40 Hz modulation rate condition (Fig. 2C). Critically, responses during 40 Hz AM were not larger in instantaneous amplitude (Fig. 2B, Fig. S5), but rather exhibited less latency variability than those elicited by stimuli at the other amplitude-modulation (AM) frequencies (Fig. 4C). Further, the size of the ASSR showed no relationship to inter-site phase coupling as might be expected if the large ASSR to 40 Hz AM was due to phase alignment across cortical layers (Fig. 4A,B). This suggests the ASSR reflects the recruitment of multiple oscillators across A1 that are optimally or more consistently engaged at 40 Hz AM, but not uniformly engaged at the same time. Results thus reveal that the size of the ASSR is strongly related to the temporal consistency of cortical network recruitment, and that such consistency depends on the modulation frequency.

Although it is well established that the size of the ASSR in EEG recordings is closely related to the temporal consistency of the response54,62,64,65,66, its relationship to the amplitude of endogenous oscillations has been less clear. The present study allowed us to decouple the impact of these two factors on the ASSR in the EEG recordings because 20 Hz AM elicited the largest instantaneous amplitude response (Fig. 2B), while exhibiting relatively low temporal consistency (Fig. 4C), and 40 Hz AM elicited a response with a relatively small instantaneous amplitude (Fig. 2B), while exhibiting the greatest level of temporal consistency. The largest ASSR calculated from the auditory EEG was elicited not by 20 Hz AM, but by 40 Hz AM (Fig. 2C). Indeed, 40 Hz AM tones had no impact on the average instantaneous amplitude of 40 Hz oscillations (Fig. S5), even as they elicited a large increase in the ASSR in the auditory EEG (Fig. 3). Of note, there is some evidence to suggest that recruitment into endogenous oscillations is incompatible with recruitment by the envelope of an auditory stimulus36,67, though both mechanisms may leverage the input preferences of neurons in local circuits (e.g., for specific modulation frequencies), or the natural timing of their responses. Our results highlight the fact that the mere presence of oscillatory activity at the modulation frequency need not imply that neuronal oscillations are consistently aligned with the stimulus envelope.

Critically, the relationship between temporal consistency and the ASSR measured in the auditory EEG is not ubiquitous, as it is most relevant for AM frequencies above 30 Hz. In the present study, the linear relationship between inter-cycle phase coherence (ICPC) and the ASSR was present at all EEG and LFP probe signals for 40, 50, and 80 Hz AM. In contrast, for 30 Hz AM, this relationship held only at a few select electrodes, and for 20 Hz AM, it was restricted to the auditory EEG (Fig. 5). This discrepancy, along with the relatively low ICPC levels in the LFP elicited by lower frequency AM, suggests differences in the nature of auditory network recruitment for sounds at AM frequencies at or below 30 Hz versus those higher in the gamma band (traditionally defined in the range of 30–150 Hz).

The frequency-dependent nature of the ASSR to periodic stimuli may be driven in part by the frequency-dependent engagement of specific neural generators. The AM frequencies used here would likely recruit activity from multiple generators, including prominent contributions from local cortical generators at frequencies below 50 Hz and the recruitment of subcortical generators at higher frequencies38,41,68,69,70,71. Consequently, the troughs of our modulation rate transfer functions (e.g., 30 Hz) may arise from a spatial misalignment of currents from these various ASSR generators38. However, the decreased temporal consistency of the response to 30 Hz AM (indexed by the low ICPC values reported here) suggests that the phase consistency of the generators is also more variable, perhaps due to inconsistent or uncoordinated recruitment of the generators. Likewise, the 40 Hz peak in the modulation transfer function may reflect greater synchrony among recruited populations.

This in turn raises the question of what leads to the enhanced temporal consistency of the neural responses underlying the large ASSR to 40 Hz AM. One salient possibility is that subpopulations of neurons in A1 exhibit resonance at 40 Hz, contributing to the generation of relatively larger scale, consistent oscillations given inputs near that frequency72. This would also lead to greater synchrony among recruited populations, which has been reported in previous studies at the best AM frequency38. In subcortical recordings of the ASSR, it has also been shown that the best AM frequency is tightly linked to the input preferences of recruited neurons and the time course of their responses38,73, with the best AM frequency often evolving over the course of development as the intrinsic properties of these neurons change41,42,43,44.

Taken together, this suggests a strong relationship between enhanced temporal consistency of responses for certain AM frequencies and the resonance profiles of recruited subpopulations. It is also possible that interactions between neurons in A1 have a stereotyped or optimal time course that is best paced by 40 Hz stimuli. In either case, the intrinsic properties of participating neurons give rise to ion currents and resulting fluctuations in the LFP with stereotyped timing74. Optimal pacing of the auditory network at 40 Hz may thus arise because this frequency is complementary to the time course of natural ionic fluctuations and interactions.

Our findings thus provide insight into potential underlying neural differences in patient populations with atypical ASSRs in EEG recordings2,3,4,5,6,7,8,9,10,11. Previous investigators have speculated that the ability to recruit auditory networks in gamma frequency activity is altered in these populations75, resulting in a lower amplitude ASSR76,77,78,79. While it may indeed be the case that low gamma oscillatory power contributes to a reduced ASSR to 40 Hz AM in these groups, here we find the temporal coordination of the 40 Hz AM response has a far greater impact on the ASSR than the ability to generate large 40 Hz oscillations. These results imply that deficits in patient groups may stem from intrinsic cellular or circuit changes that modify endogenous stereotyped network timing in this frequency range. Such changes could influence both the ability to generate gamma oscillations and the temporal consistency of responses to AM stimuli in the gamma frequency.

Gamma oscillations are often thought to be generated through interactions between excitatory and inhibitory neurons, with the frequency of the rhythm dictated by the time course of recovery from inhibition80,81. As A1 contains several inhibitory interneuron subtypes, spread across all cortical layers82, results of the present study align with the suggestion that interneuronal recruitment contributes to both the generation of endogenous gamma oscillations in A1 and the ASSR to 40 Hz AM76,77,78,79. For example, parvalbumin positive interneurons in layer VI send inhibitory projections across all cortical layers83, allowing them to impose temporally coordinated inhibitory constraints. Neuronal populations in layers V/VI, with their prominent projections to layer IV and thalamus, may play a key role in the recruitment and maintenance of the ASSR and the high temporal consistency of the response.

Accordingly, results here indicate that ion currents in deep (V/VI) layers of A1 are tightly correlated with the auditory EEG signal. These layers contain a diversity of neuron types with both thalamic and cortical targets84,85,86. Further, the neurons in the deep layers of A1 are in a prime anatomical position to exert a powerful coordinating influence on the rest of the network. A previous study noted that the ASSR is largest in the granular layer (layer IV)54, suggesting strong coordinating inputs in the region. Approximately half of the input to layer IV originates from within cortex, with prominent projections from layer VI87,88. Projections from layer VI also differ from thalamic projections to layer IV in that they uniquely activate metabotropic receptors and facilitate EPSPs in response to repetitive stimuli89,90.

Previous studies examining the cellular contributions to the ASSR and 40 Hz gamma oscillations have focused on the role of NMDA and GABA receptors. Blockade of either receptor type influences the amplitude of the ASSR to 40 Hz AM, with NMDA receptor blockade dynamically decreasing or enhancing the ASSR to 40 Hz AM over time64,65,91. Several previous studies have also implicated NMDA and GABA receptors in the generation of 40 Hz oscillations in auditory networks76,77,78,79,91. These studies hypothesize that interactions between excitatory and inhibitory neurons, or NMDA receptor regulation of GABAergic neuron activity, support intrinsic network gamma oscillations and thus the temporal consistency of responses in that range.

Lastly, our study provides further insight into a long-standing debate regarding the contributions of phasic middle latency responses or endogenous oscillations to the ASSR. This debate is a part of a larger controversy regarding the extent to which steady-state responses reflect event-related responses that are paced by external stimuli or the recruitment of oscillating networks of neurons, a distinction that bears on what these responses reveal about the underlying network. The results of our study suggest that both possibilities may be true at different frequencies. At lower frequencies (30 Hz and below), the network recruits more variable responses to individual stimuli in line with an event-related response. In contrast, AM sounds at higher frequencies (40 Hz and above) elicit a more consistent and stereotyped response resembling an oscillation, with the enhanced consistency at particular frequencies (40 Hz) resembling neural resonance. Moreover, recruitment of local cell populations through either mechanism could vary across different AM frequencies, resulting in the activation of distinct subpopulations38,39,40. Overall, results suggest auditory network recruitment by repetitive stimuli occurs in multiple forms, marked by sharp changes in the temporal consistency of the response as the frequencies pacing the network change. Our study reveals that this frequency-dependent shift in auditory network recruitment substantially contributes to the unique network state underlying the ASSR to 40 Hz AM.

Methods

Rats

All animal procedures were performed in accordance with University of California, San Diego Institutional Animal Care and Use Committee (IACUC) and Association for Assessment and Accreditation of Laboratory Animal Care International (AAALAC) guidelines. The experimental protocol was approved by the University of California, San Diego Institutional Animal Care and Use Committee (IACUC, protocol S16215). Subjects were three male and three female Long-Evans rats (Charles River Laboratories), between 6 and 18 months old. Rats were housed individually or with a single other rat, and maintained on a 12-h light/dark cycle. All neural recordings were performed during the light cycle. Rats received food and water ad libitum. At the end of the experiment, rats were euthanized with a pentobarbital sodium and phenytoin sodium solution (Euthasol, Virbac Corporation, Texas). The authors have complied with ARRIVE guidelines.

Surgery and histology

Each rat was surgically implanted with a 16-channel single shank silicon probe (NeuroNexus Technologies, Ann Arbor, MI; Qualia Labs, Dallas, TX). Probe shanks were 5 mm in length, and electrodes were spaced 100 \(\upmu\)m apart along the lowest 1550 \(\upmu\)m of the shank. A diagram outlining the electrode configuration of the silicon probes is shown in Fig. 1D. Probes were implanted in left primary auditory cortex near the boundary of dorsal secondary auditory cortex (A/P = − 4.0 mm; M/L = − 7.2 mm), and the probe was placed 1.65 mm into the brain to target all layers of primary auditory cortex (Fig. 1A,B). To record skull EEG, a skull screw was placed above right primary auditory cortex (A/P = − 4.0 mm; M/L = + 7.2 mm, termed auditory EEG). In four of the rats, an additional skull screw was placed above right frontal association cortex (A/P = 5.2 mm; M/L = − 2.0 mm, termed frontal EEG). A skull screw placed over left frontal association cortex served as a ground, while an additional skull screw placed over cerebellum served as a reference. Probe locations were visualized via a Nissl stain in 40 \(\upmu\)m coronal sections (Fig. 1B).

Neural recordings

Signals were amplified 20\(\times\) by a preamplifier and amplified again to 4000–6000\(\times\) (Plexon, Dallas, TX). Local field potential (LFP) and skull electroencephalography (EEG) signals were digitally isolated with a band-pass filter from 1 to 500 Hz and notch filtered to remove 60 Hz electrical noise. LFP and EEG channels were globally referenced to a skull screw above the cerebellum. Candidate movement artifacts were first identified as time points where the signal amplitude was more than 2.5\(\times\) the average amplitude for that recording session, and were then manually reviewed for further isolation. Five of the six rats underwent three recording sessions, and one rat underwent two recording sessions.
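The automated first pass of the artifact screening can be illustrated with a minimal Python sketch. It assumes that "amplitude" refers to the absolute value of the signal and that the average is the mean absolute value over the whole session; neither detail is specified in the text, and the function name is hypothetical. The subsequent manual review is not represented.

```python
import numpy as np

def flag_artifact_candidates(signal, factor=2.5):
    """Boolean mask of samples exceeding `factor` times the mean absolute amplitude.

    Assumption: amplitude is taken as |signal| and averaged over the full session.
    """
    amplitude = np.abs(np.asarray(signal, dtype=float))
    return amplitude > factor * amplitude.mean()
```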

Auditory steady state stimulus

Auditory stimuli were amplitude-modulated (AM) tones with a 100 kHz sampling frequency, 2 kHz carrier frequency, and 5 ms duration. This carrier frequency was chosen because it is detectable by both rats and humans across a wide range of ages when presented at 80 dB SPL, allowing for greater translatability of results92,93. Each tone was amplitude modulated with a trapezoidal envelope. The tones had a 25% rise and fall time (of the 5 ms duration), ensuring a short onset and offset of the stimuli to avoid auditory chirps. Each trial block consisted of a continuous 15 min of AM tones, presented at 20, 30, 40, 50, or 80 Hz. The order of blocks was randomized for each recording session. A schematic depicting example auditory stimuli is shown in Fig. 1C. Between each 15 min block, a silent interval of at least one minute was included to limit any residual effects of the stimuli presented during the previous block. Approximately 65–90% of trials within each block remained after the removal of movement artifacts (as described above), resulting in > 11,000 trials per block. During auditory stimulus presentation, rats were placed in an enclosed environment with speakers placed directly outside the environment. Rats were free to roam around the enclosure during recording sessions. Before each recording session, the volume of the stimulus inside the environment was verified to be 80 dB SPL. Following the delivery of auditory stimuli, a 5-min baseline was recorded at the end of each recording session.
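For illustration, a stimulus of this kind could be synthesized as in the following Python sketch. The sampling frequency, carrier frequency, and tone duration are taken from the text. The sketch assumes that the rise and the fall each occupy 25% of the tone, that the ramps are linear, and that the carrier starts at zero phase; these details, the function names, and the rounding of tone onsets to the nearest sample are assumptions. Calibration to 80 dB SPL depends on the playback hardware and is not represented.

```python
import numpy as np

FS = 100_000          # sampling frequency in Hz (from the text)
CARRIER_HZ = 2_000    # carrier frequency in Hz (from the text)
TONE_S = 0.005        # tone duration in s (from the text)
RAMP_FRACTION = 0.25  # assumed: rise and fall each take 25% of the tone

def trapezoid_tone():
    """One 5 ms tone with a trapezoidal amplitude envelope."""
    n = int(round(TONE_S * FS))
    n_ramp = int(round(RAMP_FRACTION * n))
    rise = np.linspace(0.0, 1.0, n_ramp, endpoint=False)
    envelope = np.ones(n)
    envelope[:n_ramp] = rise            # linear onset ramp
    envelope[n - n_ramp:] = rise[::-1]  # mirrored offset ramp
    t = np.arange(n) / FS
    return envelope * np.sin(2 * np.pi * CARRIER_HZ * t)

def tone_train(am_hz, duration_s):
    """Tones repeated at `am_hz` tones per second, with silence in between."""
    tone = trapezoid_tone()
    out = np.zeros(int(round(duration_s * FS)))
    n_tones = int(np.floor(duration_s * am_hz))
    onsets = np.round(np.arange(n_tones) * FS / am_hz).astype(int)
    for onset in onsets:
        if onset + len(tone) <= len(out):
            out[onset:onset + len(tone)] = tone
    return out
```

A 15-min block at 40 Hz would correspond to `tone_train(40, 15 * 60)` under these assumptions.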

Auditory steady state response

To quantify voltage responses elicited by the auditory stimulus, the LFP and EEG signals were first bandpass filtered using a 3rd order Butterworth filter in a narrow (± 1 Hz) range around the AM frequency of interest (e.g., for 20 Hz, the data were bandpass filtered from 19 to 21 Hz). The 15-min trials were then binned into one-cycle segments of the auditory stimulus and averaged across bins. The ASSR for a given amplitude-modulation (AM) frequency was then calculated as the maximum voltage of the resulting averaged waveform. Since the averaged waveform is approximately sinusoidal, its maximum value approximates its peak distance from the mean (0 mV). A schematic depicting calculation of the ASSR is shown in Fig. 2A (right). In addition, we assessed the average instantaneous amplitude (IA) by first calculating the absolute value of the complex Hilbert transform of the filtered signal for the entire 15-min interval. We then identified the peak value of the instantaneous amplitude for each trial (for each AM tone), and then averaged the peak values (Fig. 2A, left). The ASSR and IA were also calculated using broader filters applied to the LFP and EEG signals (Fig. S2; ± 5 Hz around the AM frequency, 10–90 Hz, or no filter). When comparing ASSR and IA values across baseline and auditory stimulus blocks, the 15-min auditory stimulus blocks were subsampled to match the duration of the 5-min baseline intervals. Of note, given higher noise levels or variability in baseline signals, the length of baseline recordings may limit the identification of robust but modest changes during auditory stimulus blocks.
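The two measures can be sketched in Python as follows. The filter order and the ± 1 Hz band follow the text; the use of zero-phase (forward-backward) filtering, the alignment of cycle boundaries to the start of the recording, and all function names are assumptions made for illustration. The synthetic signal at the end is a stand-in for recorded data.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def narrowband(signal, fs, am_hz, half_width=1.0, order=3):
    """Butterworth band-pass around am_hz (zero-phase filtering is an assumption)."""
    sos = butter(order, [am_hz - half_width, am_hz + half_width],
                 btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)

def cycle_segments(x, fs, am_hz):
    """Cut x into consecutive one-cycle segments, shape (n_cycles, samples_per_cycle)."""
    samples_per_cycle = int(round(fs / am_hz))
    n_cycles = int(np.floor(len(x) * am_hz / fs))
    starts = np.round(np.arange(n_cycles) * fs / am_hz).astype(int)
    starts = starts[starts + samples_per_cycle <= len(x)]
    return np.stack([x[s:s + samples_per_cycle] for s in starts])

def assr_and_ia(signal, fs, am_hz):
    filtered = narrowband(signal, fs, am_hz)
    # ASSR: maximum of the waveform averaged across one-cycle bins.
    assr = cycle_segments(filtered, fs, am_hz).mean(axis=0).max()
    # IA: magnitude of the analytic signal; peak per cycle, then mean of peaks.
    ia = np.abs(hilbert(filtered))
    mean_peak_ia = cycle_segments(ia, fs, am_hz).max(axis=1).mean()
    return assr, mean_peak_ia

# Synthetic example: a 40 Hz component of amplitude 2 buried in white noise.
rng = np.random.default_rng(0)
fs, am_hz = 1000, 40
t = np.arange(60 * fs) / fs
x = 2.0 * np.sin(2 * np.pi * am_hz * t) + 0.5 * rng.standard_normal(t.size)
assr, mean_peak_ia = assr_and_ia(x, fs, am_hz)
```

Because the cycle average cancels activity that is not phase-locked to the stimulus while the IA does not, the two measures would be expected to diverge when narrowband activity is present but inconsistent in phase.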

Current source density analysis

To identify current sources and sinks resulting from the auditory stimulus, we calculated the second order spatial derivative of narrowly filtered probe LFPs. For every spatial location of a probe electrode s at time t the current source density of the signal was calculated as:

$$\begin{aligned} CSD_{s,t} = \frac{\frac{LFP_{s+1,t} + LFP_{s+2,t}}{2}+\frac{LFP_{s-1,t}+LFP_{s-2,t}}{2} - 2LFP_{s,t}}{z^{2}}, \end{aligned}$$
(1)

where z represents the spacing between probe electrodes (0.1 mm).
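Equation (1) can be evaluated for all interior electrodes at once, as in the following Python sketch. The text does not specify how the two outermost electrodes at each end of the probe were handled, so the sketch marks them as undefined (NaN); that choice and the function name are assumptions.

```python
import numpy as np

def current_source_density(lfp, z_mm=0.1):
    """Apply Eq. (1) to an array of shape (n_electrodes, n_samples).

    Rows 0, 1 and the last two rows lack the required neighbours and are
    returned as NaN (an assumption; edge handling is not given in the text).
    """
    lfp = np.asarray(lfp, dtype=float)
    csd = np.full(lfp.shape, np.nan)
    upper = (lfp[3:-1] + lfp[4:]) / 2   # mean of electrodes s+1 and s+2
    lower = (lfp[1:-3] + lfp[:-4]) / 2  # mean of electrodes s-1 and s-2
    csd[2:-2] = (upper + lower - 2 * lfp[2:-2]) / z_mm ** 2
    return csd
```

As a check on the indexing, a voltage profile that varies linearly with depth yields a CSD of zero at every interior electrode, as expected for a second spatial derivative.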

Inter-site phase coupling

To determine the phase alignment of responses across probe recording locations, the inter-site phase coupling (ISPC) was calculated as:

$$\begin{aligned} ISPC_{t} = \bigg|\frac{\sum _{l = 1}^{16}e^{i\psi _{l,t}}}{16}\bigg|, \end{aligned}$$
(2)

where l represents the spatial location of each electrode of the probe, and \(\psi\) is the phase of a response, determined by narrowly filtering (± 1 Hz) the local field potential signal around an amplitude modulation frequency of interest and calculating the arctangent of the Hilbert transform. The ISPC value characterizes the phase consistency of responses across all probe electrodes. ISPC values are between 0 and 1, with 0 indicating no phase consistency across all probe electrodes, and 1 indicating perfect phase consistency across all probe electrodes. To assess the overall influence of a particular auditory stimulus upon the phase consistency of responses across electrodes, ISPC values were averaged for the duration of a stimulus condition, or 15-min trial.
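A minimal Python sketch of Eq. (2) is given below. It takes the phase time series of each electrode as input; these would be obtained from the narrowly filtered LFP as described above. The function name and array layout are assumptions.

```python
import numpy as np

def inter_site_phase_coupling(phases):
    """Eq. (2): phases has shape (n_electrodes, n_samples), in radians.

    Returns one ISPC value per time point: the length of the mean unit
    phase vector across electrodes.
    """
    return np.abs(np.exp(1j * np.asarray(phases, dtype=float)).mean(axis=0))
```

Averaging the returned array over time gives a single value per stimulus condition, corresponding to the averaging over the 15-min trial.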

Inter-cycle phase coupling

To characterize the temporal consistency of the responses acquired from a single electrode, the inter-cycle phase coupling (ICPC) was calculated as:

$$\begin{aligned} ICPC_{\tau } = \bigg|\frac{\sum _{c = 1}^{m}e^{i\psi _{c,\tau }}}{m}\bigg|, \end{aligned}$$
(3)

where \(\tau\) is a single time point within the period of a single auditory stimulus cycle, m is the number of cycles, and \(\psi\) is the phase of a response, determined by narrowly filtering (± 1 Hz) the local field potential signal around an AM frequency of interest and calculating the arctangent of the Hilbert transform. The ICPC value characterizes the phase consistency of the responses across presentation cycles. ICPC can take on values between 0 and 1, with 0 indicating no phase consistency across cycles, and 1 indicating perfect phase consistency across cycles. To assess the overall influence of a particular AM frequency upon the phase consistency of responses on an electrode, ICPC values were calculated for the duration of a 15-min trial. ICPC values were obtained for the different time points within an auditory stimulus cycle, and then averaged.
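Equation (3) and the subsequent averaging over \(\tau\) can be sketched in Python as follows. The sketch assumes an integer number of samples per stimulus cycle and that the first sample coincides with the start of a cycle; the function name is hypothetical.

```python
import numpy as np

def inter_cycle_phase_coupling(phase, samples_per_cycle):
    """Eq. (3) for one electrode.

    phase: 1-D array of instantaneous phase (radians).
    Returns (icpc_per_tau, mean_icpc), where icpc_per_tau has one value
    per time point within the stimulus cycle.
    """
    phase = np.asarray(phase, dtype=float)
    m = len(phase) // samples_per_cycle            # number of whole cycles
    cycles = phase[:m * samples_per_cycle].reshape(m, samples_per_cycle)
    icpc_per_tau = np.abs(np.exp(1j * cycles).mean(axis=0))
    return icpc_per_tau, icpc_per_tau.mean()
```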

Assessing phase consistency covariates of the ASSR

We performed simple linear regression to assess relationships between the ASSR and the spatial and temporal consistency of responses. To determine whether the phase consistency of responses across probe electrodes related to the ASSR acquired from EEG electrodes, linear models were constructed to relate ISPC values to the ASSR. To determine whether the temporal consistency of responses on a single electrode influenced the ASSR on the same electrode, linear models were constructed to relate average individual electrode ICPC values to the ASSR on the same electrode. These latter models were constructed for each electrode (16 probe electrodes, one frontal EEG electrode, and one auditory EEG electrode) and AM frequency condition separately. Model goodness of fit was assessed via the coefficient of determination (\(R^2\)).
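A simple linear regression with its coefficient of determination can be sketched in Python as below, assuming an ordinary least-squares fit with an intercept. The function name and the pairing of one predictor value (e.g., ISPC or ICPC) with one ASSR value per observation are assumptions for illustration.

```python
import numpy as np

def simple_linear_fit(x, y):
    """Least-squares line y = slope * x + intercept, plus R^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation
    return slope, intercept, 1.0 - ss_res / ss_tot
```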

Modeling EEG signals from the LFP

To evaluate the extent to which individual LFPs covaried with the EEG signals, and the extent to which the signal from each probe channel contains unique information about the variance of the EEG signals, linear models relating individual LFPs to EEG signals were constructed. Each model used the filtered LFP (narrowly filtered ± 1 Hz around a stimulus frequency of interest) from a single probe channel as its regressor. \(R^2\) values were used to compare model effectiveness.
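For single-regressor models of this kind, the \(R^2\) of an ordinary least-squares fit with an intercept equals the squared Pearson correlation between regressor and target, so the per-channel comparison can be sketched in Python as follows. The function name and array layout are assumptions, and the inputs are taken to be already narrowly filtered.

```python
import numpy as np

def per_channel_r2(lfp, eeg):
    """R^2 of a one-regressor linear model of the EEG for each probe channel.

    lfp: array of shape (n_channels, n_samples); eeg: array of shape (n_samples,).
    Uses R^2 = r^2, which holds for simple linear regression with an intercept.
    """
    lfp = np.asarray(lfp, dtype=float)
    eeg = np.asarray(eeg, dtype=float)
    return np.array([np.corrcoef(channel, eeg)[0, 1] ** 2 for channel in lfp])
```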