# Explaining event-related fields by a mechanistic model encapsulating the anatomical structure of auditory cortex

## Abstract

Event-related fields of the magnetoencephalogram are triggered by sensory stimuli and appear as a series of waves extending hundreds of milliseconds after stimulus onset. They reflect the processing of the stimulus in cortex and have a highly subject-specific morphology. However, we still have an incomplete picture of how event-related fields are generated, what the various waves signify, and why they are so subject-specific. Here, we focus on this problem through the lens of a computational model which describes auditory cortex in terms of interconnected cortical columns as part of hierarchically placed fields of the core, belt, and parabelt areas. We develop an analytical approach arriving at solutions to the system dynamics in terms of normal modes: damped harmonic oscillators emerging out of the coupled excitation and inhibition in the system. Each normal mode is a global feature which depends on the anatomical structure of the entire auditory cortex. Further, normal modes are fundamental dynamical building blocks, in that the activity of each cortical column represents a combination of all normal modes. This approach allows us to replicate a typical auditory event-related response as a weighted sum of the single-column activities. Our work offers an alternative to the view that the event-related field arises out of spatially discrete, local generators. Rather, there is only a single generator process distributed over the entire network of the auditory cortex. We present predictions for testing to what degree subject-specificity is due to cross-subject variations in dynamical parameters rather than in the cortical surface morphology.

## Keywords

Analytical solutions, Auditory cortex, Computational modeling, Event-related field, Event-related response, Magnetoencephalography, Normal modes

## 1 Introduction

The event-related potential (ERP) and field (ERF) measured with electroencephalography (EEG) and magnetoencephalography (MEG), respectively, appear as a series of waves triggered by a stimulus event. First described by Davis (1939), these waves are thought to represent stimulus-related activations which are stationary, time-locked to stimulus presentation, and buried in ongoing oscillations and other activity unrelated to stimulus processing. Thus, to cancel out the signal not associated with the stimulus, ERPs and ERFs are obtained through stimulus repetition and averaging of the single-trial EEG/MEG signals with respect to stimulus onset. The peaks and troughs of event-related responses function as landmarks as they can be identified in most subjects. Even so, the morphology of these responses varies greatly from subject to subject (see, for example, Atcherson et al. 2006; Dalebout and Robey 1997; Zacharias et al. 2011; Matysiak et al. 2013; König et al. 2015). Importantly, despite the straightforwardness of the method to extract ERPs and ERFs and decades of its use, and regardless of improvements in localization methods, we still have a poor understanding of how event-related responses are generated and what they signify.
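The averaging procedure described above can be illustrated with a minimal sketch. The waveform shape, noise level, and trial count below are hypothetical values chosen only for illustration, and the helper name `average_trials` is not taken from any MEG toolbox.

```python
import numpy as np

def average_trials(trials):
    """Average single-trial epochs (n_trials x n_samples) that are
    already aligned to stimulus onset."""
    return np.asarray(trials, dtype=float).mean(axis=0)

rng = np.random.default_rng(0)
t = np.arange(0.0, 0.4, 0.001)                  # 400-ms epoch in 1-ms steps (s)
evoked = -np.exp(-((t - 0.1) / 0.03) ** 2)      # hypothetical N1m-like deflection near 100 ms
n_trials = 400
noise_sd = 2.0                                  # activity unrelated to the stimulus
trials = evoked + rng.normal(0.0, noise_sd, size=(n_trials, t.size))

erf = average_trials(trials)
# Residual noise in the average shrinks roughly as noise_sd / sqrt(n_trials).
residual_sd = np.std(erf - evoked)
```

With uncorrelated noise, the time-locked deflection survives the averaging while the unrelated activity is attenuated by about the square root of the number of trials.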

In contrast, the general biophysics of EEG and MEG generation and the neural processes giving rise to currents in the brain contributing to these signals are well known (Sarvas 1987; Williamson and Kaufman 1981). EEG (Buzsáki et al. 2012; Einevoll et al. 2013; Mitzdorf 1985, 1994) and MEG signals (Hämäläinen et al. 1993; Okada et al. 1997) represent primarily a weighted sum of synchronized synaptic activities of pyramidal neural populations, whereas inhibitory neurons, with shorter dendrites and a symmetric dendritic structure, contribute to a closed field which does not show up in EEG and MEG. With pyramidal neurons being the predominant cell type in cortex, cortical columns are characterized by the apical dendrites of these cells running in parallel to each other and orthogonally to the cortical surface. The activity of *excitatory* synapses on these dendrites translates into electric current (cations \(\text {Ca}^{2+}\) and \(\text {Na}^{+}\)) flowing into the apical dendrites, then along the dendrites as the primary/lead current, and out through the passive leak channels into the extracellular space, where the resulting volume current completes the circuit. The primary current along many synchronously activated pyramidal cells gives rise to a magnetic field which is visible in MEG and whose strength depends on the orientation and distance of the primary current in relation to the sensor. Similarly, the extracellular sinks and sources separated along the axis of the dendrites contribute to an open electric field which can be picked up in EEG and local field potential (LFP) measurements. Traditionally, *inhibitory* synapses onto pyramidal cells were thought to contribute only minimally to EEG/MEG, with the reversal potentials of these synapses being close to the resting membrane potential (Bartos et al. 2007; Mitzdorf 1985). Accordingly, an activated inhibitory synapse leads to minimal cross-membrane currents and hence a minimal contribution to EEG and MEG. 
However, when pyramidal neurons are spiking, for example, when spontaneous activity occurs, the membrane potential is elevated and so inhibitory synapses can significantly contribute to EEG and MEG generation (Trevelyan 2009; Glickfeld et al. 2009; Bazelot et al. 2010).

The leap from biophysics to an understanding of the experimentally measured ERP and ERF waveforms is more difficult. A sensory stimulus (the event) sets off a series of neural activations propagating from the sensory organ to cortex. Cortical activations can be observed locally, in intracortical measurements, as increased spiking when, for example, the weak thalamocortical signal activates the local feedback circuits in cortical columns of the primary areas (Douglas et al. 1995), and this stimulus-evoked activation corresponds with the surface-recorded ERP (Shah et al. 2004). The auditory event-related response starts with small-amplitude, early-latency waves in the first 8 ms from stimulus onset; these are followed by mid-latency waves in the 8–40 ms range and, then, by large-amplitude, long-latency waves (e.g., Picton and Stuss 1980). In the passive recording condition, when the subject is not engaged in a task involving the stimuli, the most prominent waves of the auditory ERP are the long-latency P1, N1, and P2 responses, peaking at approximately 50, 100, and 200 ms, respectively. Their ERF counterparts are termed the P1m, N1m, and P2m.

Computational modeling can account for long-latency auditory ERPs purely in terms of interactions across cortical layers in primary auditory cortex (Wang and Knösche 2013). However, it seems unlikely that events in primary fields could represent the full intracortical counterpart of ERPs, which emerge as a superposition of activity across larger swathes of cortex. For example, in the case of auditory cortex (AC), anatomical studies in monkey show a hierarchical organization, with primary, core fields connecting to each other and to surrounding secondary, belt fields which, in turn, are connected with parabelt fields (Kaas and Hackett 2000; Hackett et al. 2014). There is physiological evidence to suggest that this hierarchical structure is reflected in feedforward activations progressing along the core–belt axis (Rauschecker 1997) and, hence, that cortical activations generating event-related responses should have temporal as well as spatial dynamics. This is supported by localization studies. Lütkenhöner and Steinsträter (1998), modeling the long-latency auditory ERF of a human subject with a single equivalent current dipole (ECD), found that the ECD location was non-stationary across the entire time course of the ERF: during the P1m, it lay on Heschl's gyrus (HG), from where it slid to the planum temporale (PT) during the N1m and shifted back to HG during the P2m. Inui et al. (2006) performed multi-dipole analysis of auditory ERFs in a 120-ms post-stimulus time window using six ECDs located in the AC, and found that activity propagates along a roughly medial-lateral axis from HG to the superior temporal gyrus (STG). This was interpreted in terms of core–belt–parabelt activation. Similar results were reported by Yvert et al. (2005), who used minimum current estimates of recordings from intracerebral electrodes in human AC. Activity started in HG and Heschl's sulcus (HS) at around 20 ms. 
The P1 time range (30–50 ms) was characterized by multiple areas becoming activated along medio-lateral and postero-anterior axes of propagation, successively involving HG and HS, PT, and STG. Subsequently, activity cycled back so that the rising slope of the N1 coincided with a similar series of activations as during the P1.

The above results point to event-related responses having both temporal and spatial dynamics, whereby foci of activity in cortex shift over time. This addition of a spatial dimension to event-related responses adds to the descriptive palette but as such gives no deeper insight into what is going on, although there have been a number of approaches for gaining such insight. Research in the 1970s and 1980s posited that the event-related response is the linear sum of separable components, each generated by a spatially defined generator which also has a well-defined information processing function, such as stimulus onset detection or change detection (for reviews, see Näätänen and Picton 1987; Näätänen 1992). However, it has proven difficult to perform component separation in a reliable way (Lütkenhöner 1998) and to map components to anatomical structure (May and Tiitinen 2010). This emphasis on localization of activity was later complemented by considerations on the connections between cortical areas. In the framework of dynamic causal modeling (DCM), the event-related response is considered to arise out of a network of a small number of nodes arranged in a hierarchical structure and each representing an extended cortical area such as the primary or secondary AC (Friston et al. 2003; David et al. 2006). Stimulation-specific modulations in the response then arise out of changes in the strengths of connections, classified as bottom-up, lateral, or top-down. Such changes have been interpreted in the framework of predictive coding, whereby cortex attempts to predict incoming stimuli and in so doing generates prediction signals via top-down inhibitory connections. When there is a mismatch between stimulus and prediction, excitatory bottom-up connections relay a prediction error signal. In this view, the N1(m) signifies excitatory activity carrying the prediction error from AC toward frontal areas. 
In contrast, the P2(m) is due to inhibitory, feedback activity carrying the top-down prediction information (Garrido et al. 2007, 2009).

It appears then that we have a range of mutually exclusive explanations for event-related responses. First, these can be understood as arising purely locally, as the result of intra-laminar dynamics within primary areas (Wang and Knösche 2013). Second, they can be modeled as being generated by a single source with a continually shifting location (Lütkenhöner and Steinsträter 1998). Third, they can be seen to represent the linear sum of activity of a limited number of component generators, each performing an independent information processing task (Näätänen and Picton 1987). Fourth, they might arise out of a limited number of cortical areas interacting with each other in the performance of predictive coding (Friston et al. 2003). The spatial resolutions of these explanations seem to lie at the extremes, ranging from the single column to treating entire areas as single nodes (see also Ritter et al. 2013). None of these explanations are designed to represent transformations occurring within AC, because the internal dynamics of AC as a distributed system are not included. For this purpose, a more mechanistic view on how AC processes and represents sound is needed. Such a view could be based on the structure of the AC in order to account for the spatial dynamics occurring within the temporal lobe, as described above.

Thus, the purpose of the current study is to plug the resolution gap by bringing the anatomical structure of the AC into the explanation of the auditory event-related response. As a starting point we use a previously developed model of AC (May and Tiitinen 2010, 2013; May et al. 2015), and we restrict ourselves to examining ERF generation. The original model is highly nonlinear, and we simplify it in order to make an analytical approach possible. We derive analytical solutions to the model so as to characterize the dynamics of AC signal processing in terms of basic elements, so-called normal modes. This allows us then to address the following questions: How do ERFs originate from these dynamical elements? How do these elements depend on the anatomical core–belt–parabelt structure of the AC? And how is the ERF signal modulated by the topography of the primary currents, that is, by their orientation and distance from the MEG sensor? This analysis, then, lets us explore the origin of the subject-specificity of event-related responses: Why do subjects have unique ERF morphology? Can this be fully accounted for in terms of individual curvature of AC and its modulating effect on the MEG? Or do subjects also have unique dynamics of the auditory cortex?

## 2 Model of auditory cortex

*u* and *v* representing the population of excitatory (pyramidal) neurons and inhibitory interneurons, respectively. The dynamics are determined by the following two sets of coupled nonlinear differential equations (May and Tiitinen 2013; May et al. 2015):

*g*. Thus, the spiking rates \(g[\mathbf u (t)]\) and \(g[\mathbf v (t)]\) are zero for values of \(\mathbf u (t)\) and \(\mathbf v (t)\) smaller than a constant threshold \(\theta \), and for values above this threshold they are monotonically increasing functions of the corresponding state variables, converging toward a saturation value of unity. The function *A*(*t*) is a time-varying matrix describing synaptic plasticity depending on pre-synaptic activity and governed by a differential equation of its own.
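A spiking-rate function with the properties just described (zero below the threshold, monotonically increasing above it, saturating at unity) can be sketched as follows. The hyperbolic-tangent form and the `slope` parameter are assumptions made for illustration; the text above specifies only the qualitative shape, not the functional form.

```python
import numpy as np

def spiking_rate(x, theta=0.0, slope=1.0):
    """Threshold-saturating rate function: 0 for x <= theta, rising
    monotonically toward 1 above theta.

    The tanh form and the slope parameter are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x > theta, np.tanh(slope * (x - theta)), 0.0)

# Just above threshold the function is approximately linear,
# g(x) ~ slope * (x - theta), which is the regime a linearized
# treatment of the model would rely on.
small = np.linspace(0.0, 0.05, 6)
linear_error = np.max(np.abs(spiking_rate(small) - small))
```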

Equations (1) and (2) represent the mean-field leaky integrator neuron (LIN) model of classical neurodynamics as formulated by Hopfield (1984) and Hopfield and Tank (1986). The LIN model is related to the Wilson and Cowan (1972) model, which employs similar first-order differential equations to describe the interaction between neural populations, and where the state variables represent the proportion of neurons firing. The Hopfield-and-Tank formulation is slightly closer to the biologically realistic compartmental model, as the state variable can be seen as an approximation of the membrane potential whose time derivative depends on the cross-membrane currents. While originally intended as a single-unit description, the LIN model can be used as a population description by assuming that the units in the population are identically and symmetrically connected with each other, and that they all receive the same external input. In this case the population units behave identically with each other, and the population can be described by the unit equation. Because the equations refer to cross-membrane currents (i.e., synaptic and leak currents), it becomes easier to motivate the calculation of the MEG signal, as is discussed below. The LIN formulation also has the advantage that it opens up an analytical approach to the system dynamics.

Central to the model is the anatomical structure of AC (Fig. 1a). The AC organization is similar across mammals in the sense that a hierarchical core–belt–parabelt structure can be identified, although the number of fields and their connectivity with each other is species-specific (Budinger et al. 2000; Budinger and Heil 2005; Baumann et al. 2013; Hackett et al. 2014). In general, core fields are characterized by on-responses to pure tones and their preferential connections with the tonotopically organized division of the auditory thalamus. They have extensive local connections with each other and with the surrounding belt fields. Belt fields are also tonotopically organized, albeit with a lesser spatial frequency resolution. There are strong local connections of belt fields with core fields and neighboring parabelt fields as well as connections with other cortical areas. In addition to dense connections with the ventral division of the medial geniculate body, belt fields also have pronounced connections with non-tonotopic parts of the auditory thalamus. Parabelt fields are non-tonotopic and isocortical, with lower cell density than the belt fields and have connections mainly with non-tonotopic auditory and non-auditory thalamic nuclei and remote cortical areas.

The model mimics this structure with its 240 columns (32 subcortical, 208 cortical) being distributed into one field representing the inferior colliculus (IC), one thalamic field, three core fields, eight belt fields, and two parabelt fields, with each field comprising 16 columns. We note that the IC and thalamus were not part of the original model (May and Tiitinen 2013; May et al. 2015). As shown in Fig. 1a, the fields of the model are connected according to the scheme found in the macaque (Kaas and Hackett 2000). The connections between IC and thalamus as well as between thalamus and the three core fields are purely one-to-one tonotopic. The connections between cortical fields are likewise tonotopic, with each column projecting to its tonotopic counterpart in the recipient field. In addition, the projecting column also connects with columns neighboring the tonotopic counterpart. This spread of connections, which is symmetric and partly stochastic, is described by a Gaussian distribution and explained in more detail in Appendix A1. Hence, all the connections in the model are tonotopically organized, including those in the parabelt, and this simplification is unlikely to reflect the actual anatomical organization of AC. However, we note that the stochasticity in the connection matrix allows for the columns to exhibit multi-peaked and/or broad tuning curves.
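The tonotopic connection scheme with a Gaussian spread can be sketched for a single pair of 16-column fields. This is only an illustration under stated assumptions: the peak weight, the spread `sigma`, the multiplicative `jitter`, and the function name `between_field_block` are all hypothetical, and the actual construction is the one given in Appendix A1.

```python
import numpy as np

def between_field_block(n_columns=16, peak=1.0, sigma=1.0, jitter=0.2, rng=None):
    """Connection block between two fields (rows: target, columns: source).

    Each column projects most strongly to its tonotopic counterpart, with a
    Gaussian fall-off to neighboring columns and a symmetric random jitter.
    All parameter values are hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = np.arange(n_columns)
    dist = idx[:, None] - idx[None, :]              # tonotopic distance between columns
    block = peak * np.exp(-dist**2 / (2.0 * sigma**2))
    noise = rng.uniform(1.0 - jitter, 1.0 + jitter, size=block.shape)
    noise = np.triu(noise) + np.triu(noise, 1).T    # mirror to keep the spread symmetric
    return block * noise

block = between_field_block(rng=np.random.default_rng(1))
```

Because the jitter differs between columns, the effective tuning of each target column varies in width and shape, which is one way to picture how stochasticity in the connection matrix can produce broad or irregular tuning curves.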

## 3 Modeling auditory cortex dynamics with normal modes

The previous modeling work in May et al. (2015) used numerical simulations of the nonlinear state equations, and accounted for the generation of the N1m and the mismatch response of the ERF. However, this approach gives only a snapshot of the system dynamics at the particular parameter settings chosen for the simulation. This way there is limited access to the relationship between the ERF on the one hand and the system parameters such as the synaptic weights and the anatomical organization on the other. Further, numerical simulations alone will not reveal why and when the peaks and troughs of the ERF occur. Here, we attempt to gain deeper insight into the dynamics of AC by taking the analytical approach to find solutions to the AC system dynamics by simplifying the description even further. In particular, we ignore synaptic plasticity, and we assume that the state variables inhabit the linear portion of the spiking-rate nonlinearity. We use this linearization of the spiking rate together with assumptions of symmetry of the weight matrices to decouple the two sets of state equations based on the standard approach of eigenvalue decomposition. The decoupled equations are then analytically solvable, and their solutions are referred to as normal modes of the system. We end up with a complete description of the system dynamics and the generated ERF in terms of the parameters of the AC. We note that synaptic plasticity will be addressed in future work and that its omission does not affect the validity of the current results.

The idea of cell populations operating in the quasi-linear range of the spiking-rate function was already used by Katznelson (1981) in his approach of decomposing cortical activity into spherical harmonics. Also, May and Tiitinen (2001) found that, with the assumption of a linear spiking rate, a pair of excitatory and inhibitory LINs can be described as a driven harmonic oscillator with damping. We therefore expect that linearization of the current AC model will likewise lead to oscillatory solutions, which can be considered to be the fundamental elements of cortical dynamics (Nunez 1995; Buzsáki and Draguhn 2004; Buzsáki 2006). These approximations gain some validity from the experimental observations of Allen et al. (1975) who found that neuronal responses behave linearly for a broad span of membrane potentials.
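The link between a linearized excitatory-inhibitory pair and a damped oscillator can be checked numerically. The sketch below assumes the pair obeys \(\tau_{\mathrm{m}}\dot{u} = -u + w_{ee}u - w_{ei}v\) and \(\tau_{\mathrm{m}}\dot{v} = -v + w_{ie}u - w_{ii}v\) with a unit-slope spiking rate; this form and its sign conventions are assumptions, and the recurrent excitatory weight `w_ee` is a hypothetical value. The remaining weights and the time constant follow the defaults in Table 1.

```python
import numpy as np

tau_m = 40.0                        # membrane time constant in ms (Table 1)
w_ee = 0.9                          # hypothetical recurrent excitation
w_ie, w_ei, w_ii = 1.0, 1.0, 0.2    # default values from Table 1

# Linearized dynamics d/dt [u, v] = A @ [u, v] (assumed form, see lead-in).
A = np.array([[w_ee - 1.0, -w_ei],
              [w_ie, -(1.0 + w_ii)]]) / tau_m

eigvals = np.linalg.eigvals(A)
is_oscillatory = bool(np.all(np.abs(eigvals.imag) > 0))   # complex-conjugate pair
damping_rate = -eigvals.real[0]                           # decay rate in 1/ms
angular_freq = abs(eigvals.imag[0])                       # rad/ms
freq_hz = 1000.0 * angular_freq / (2.0 * np.pi)
```

A complex-conjugate eigenvalue pair with negative real part means the pair responds to a perturbation with a decaying oscillation, which is the behavior of a damped harmonic oscillator.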

*N* degrees of freedom, and with the definitions

*N* degrees of freedom is transformed into a representation where there are *N* decoupled oscillators, each with a single degree of freedom, and where the coefficients are diagonalized as indicated by the subscript ‘d’ [see Eqs. (47) and (48) in Appendix A2]:

Each of the decoupled oscillators represents one normal mode with individual frequency and amplitude (Rayleigh 1945; Caughey 1960; Caughey and O’Kelly 1965). Normal modes are the basic elements of the decoupled system but they do not represent the dynamics of individual columns. Meaningful information on single-column dynamics as part of a network of columns can only be obtained after an inverse transformation whereby the normal modes are, in effect, coupled together.
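The decoupling step can be illustrated on a generic system of coupled, damped oscillators. This is a textbook sketch rather than the model equations themselves: it assumes a system of the form \(\ddot{\mathbf u} + 2\gamma \dot{\mathbf u} + K\mathbf u = 0\) with a symmetric coupling matrix \(K\), uniform damping \(\gamma\), and underdamped modes, and the function name and all numerical values are hypothetical.

```python
import numpy as np

def normal_mode_solution(K, gamma, u0, t):
    """Solve u'' + 2*gamma*u' + K u = 0 with u(0) = u0 and u'(0) = 0.

    K must be symmetric and every eigenvalue must exceed gamma**2
    (underdamped case). Returns (u, modes), each of shape (N, len(t)).
    """
    lam, Q = np.linalg.eigh(K)           # K = Q diag(lam) Q^T with orthonormal Q
    omega = np.sqrt(lam - gamma**2)      # damped angular frequency of each mode
    a = Q.T @ u0                         # initial state in normal-mode coordinates
    b = gamma * a / omega                # enforces zero initial velocity
    phase = np.outer(omega, t)
    modes = np.exp(-gamma * t) * (a[:, None] * np.cos(phase) + b[:, None] * np.sin(phase))
    u = Q @ modes                        # inverse transform mixes the modes back together
    return u, modes

K = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])         # hypothetical nearest-neighbor coupling
gamma = 0.1
u0 = np.array([1.0, 0.0, 0.0])           # only the first unit is displaced
t = np.linspace(0.0, 20.0, 2001)
u, modes = normal_mode_solution(K, gamma, u0, t)
```

Each row of `modes` is a single damped oscillation with its own frequency, whereas each row of `u` is a weighted mixture of all modes, so that even a unit which starts at rest becomes active through the coupling.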

We now have simple mathematical expressions for describing the fundamental dynamics of the excitatory and inhibitory cell populations. Equations (13) and (14) represent the normal modes, the individual building blocks of the dynamics of the auditory cortex which depend on anatomical structure. Figure 3 shows an example of decoupled and coupled state variables when the model is presented with a 50-ms stimulus targeting the excitatory population of column 8 of the IC (amplitude = 0.01 for corresponding elements of \(\mathbf I _{\mathrm{aff,e}}\)). In Fig. 3a, c, the \(2\times 240\) normal modes \(u_{\mathrm{d}}(t)\) and \(v_{\mathrm{d}}(t)\) are shown.

Note that the assignment of any normal mode to a particular location in the AC is not possible. Instead, one can consider how each normal mode contributes to the activity of each column and field. Figure 4 shows two examples of normal modes, the first with a high damping frequency \(\delta _{\mathrm{d}}\) (Fig. 4a) and the second with a low \(\delta _{\mathrm{d}}\) and a polarity opposite to that of the first (Fig. 4b). Figure 4c, d shows how these normal modes are mapped onto the structure of the AC (see Fig. 1) in terms of their contributions averaged over each cortical field. Note that the maps represent the mean contribution of each normal mode to the activities of the individual fields. Specifically, this occurs through the multiplication of the mixing matrix with the normal mode [see Eq. (18)]. Figure 4 represents the general observation that the AC mappings of high-frequency normal modes tend to have more structure than those of low-frequency normal modes.

## 4 A new framework for understanding ERF generation

*P* for the dynamical equations, with *P* comprising the elements of the various connection matrices, the time constant \(\tau _{\mathrm{m}}\), and the input–output function \(g(\cdot )\).

Table 1. Default dynamical and topographical parameter values used in the simulations

| Dynamical parameter set | Value | Topographical parameter set \(K_{i}\) | Value |
|---|---|---|---|
| IC recurrent connections | 0.09 | \(K_{\mathrm{1}}\) (feedforward) | \(-4\) |
| IC to thalamus connections | 0.015 | \(K_{\mathrm{1}}\) (feedback) | 20 |
| Thalamus recurrent connections | 0.09 | \(K_{\mathrm{1}}\) (within field) | \(-5\) |
| Thalamus to core connections | 0.015 | \(K_{\mathrm{2}}\) (within field) | 2 |
| \(W_{\mathrm{AC}}\) | See Table 2 | \(K_{\mathrm{3}}\) (within field) | 2 |
| \(W_{\mathrm{ie,d}}\) | 1 | | |
| \(W_{\mathrm{ei}}\) | 1 | | |
| \(W_{\mathrm{ii}}\) | 0.2 | | |
| \(\tau _{\mathrm{m}}\) | 40 ms | | |

*i* and *j* refer to post- and pre-synaptic populations, respectively. The matrices \(W^{+}_{{{\mathrm{AC}}}}\) and \(W^{-}_{{{\mathrm{AC}}}}\) represent the excitatory connections and lateral inhibition of \(W_{{\mathrm{AC}}}\), respectively.

### 4.1 Calculating the MEG signal

The second term of Eq. (19) is the contribution to the MEG of the inhibitory projections originating from within the column. The synaptic input is modulated by the matrix \(K_{{\mathrm{2}}}\), as represented in Fig. 5b. The third term of Eq. (19) accounts for the MEG contribution of lateral inhibition whereby \(W^{-}_{{\mathrm{AC}}}\) is modulated by the matrix \(K_{{\mathrm{3}}}\). \(K_{{\mathrm{2}}}\) and \(K_{{\mathrm{3}}}\) have the same structure and values (Table 1). Further, the polarity of the elements of \(K_{{\mathrm{2}}}\) and \(K_{{\mathrm{3}}}\) is the same as that of the elements of \(K_{{\mathrm{1}}}\) representing feedback connections. This polarity conveys the finding that inhibitory synapses tend to be located near the soma (Douglas et al. 2004; see, however, Kubota et al. 2016), and therefore their activation contributes to a current pointing downward in the apical dendrites of pyramidal neurons (Ahlfors and Wreh 2015). In general, inhibitory synapses contribute to the primary current if the pyramidal cell has an elevated membrane potential and is spiking (Trevelyan 2009; Glickfeld et al. 2009; Bazelot et al. 2010). In our case, we are assuming that the resting state represents sustained, spontaneous activity, which corresponds to an elevated membrane potential.

### 4.2 Simulations of ERFs

*P* and topographical parameters \(K_i\) are listed in Table 1. The parameter values reflect the finding that recurrent connections in cortex are an order of magnitude stronger than afferent and between-field connections (Douglas and Martin 2007; Douglas et al. 1995). However, the exact values, while representing a balance between excitation and inhibition, are arbitrary and were chosen on the basis of reproducing realistic-looking ERFs. The ERF was calculated using Eq. (19).

Figure 6a shows an example of an ERF from an MEG experiment using pure-tone stimulation (Matysiak et al. 2013). The waveform has a typical morphology and it shows the grand mean computed from the ERFs of several subjects. The blue curve in Fig. 6b represents the ERF waveform generated by the current linear model whose normal modes and coupled state variables are presented in Fig. 3. This simulation replicates the morphology shown for the experimental ERF in Fig. 6a: There is an initial P1m-like response peaking at 35 ms. The ERF then crosses polarity and builds up into a large-amplitude N1m-like response peaking at 115 ms. This then is followed by a shallow P2m response peaking at 260 ms. The green curve was generated in a simulation with the nonlinear version of the model. The linear and nonlinear models produce ERFs with very similar morphologies. The minute differences in the peak amplitudes are caused by the synaptic plasticity term of the nonlinear model.

Figure 6c shows the contributions to the ERF coming from excitation, intra-column inhibition, and lateral inhibition, as defined in Eq. (19). The contribution from excitation is characterized by deflections similar to the P1m, N1m, and P2m responses (Fig. 6b). In contrast, the contributions made via the inhibitory connections each comprise only a single N1m-like deflection of a small amplitude. Thus, the P1m and P2m responses are driven by excitation of the pyramidal cells.

To further examine the dependence of the ERF on the topography of the connections, we varied the \(K_{{{\mathrm{1}}}}\) weights of the contributions of the feedforward and feedback connections to the MEG signal. The results are shown in Fig. 7, where the thick blue line in (a) and the thick yellow line in (b) depict the waveform in Fig. 6b generated with the default parameter values (Table 1). In Fig. 7a, the contribution of the feedforward connections (light green elements in Fig. 5a; default value \(-\) 4) was altered from 0 (top blue line) to \(-\,30\) (bottom red line) in steps of 2 while keeping the other parameters constant. The largest effect is the emergence of the P1m and a marked monotonic increase in its peak amplitude as the feedforward contribution is increased. This is accompanied by an increase of the P1m peak latency from 26 to 54 ms. As the P1m becomes more and more substantial, it increasingly dwarfs the N1m, which is abolished at the largest contributions of the feedforward connections. A very different pattern emerged when the feedback contribution to the MEG signal (purple elements in Fig. 5a; default value 20) was increased from 0 (bottom blue line) to 30 (top red line) in steps of 2 (Fig. 7b). With zero feedback contribution, the N1m and the P2m were missing, and the ERF comprised a P1m response only. The N1m and P2m emerged only with the presence of feedback contribution. As this contribution was increased, the largest growth in amplitude was for the N1m. These results suggest that the P1m mainly reflects feedforward activation, whereas the N1m and P2m reflect feedback activation.

In Fig. 8a, the source structure of the ERF shown in Fig. 6b is revealed in terms of the individual contributions from the 13 cortical fields. The total contributions from the core, belt and parabelt are shown in Fig. 8b, along with the overall MEG response. In general, as one moves along the core–belt–parabelt axis, the responses decrease in magnitude and increase in latency. The P1m has its main source in the core, with the belt also contributing. Similarly, the N1m is largely generated in the core, but the belt contribution is now much larger. The core and belt have similar contributions to the P2m. Because of their delay, the parabelt responses contribute to the ERFs with deflections of the opposite polarity of those produced by the core and belt. However, these contributions are very shallow and broad. We note that none of the peaks and troughs of the ERF (e.g., the P1m, N1m, P2m) has a dedicated response generator in the sense that activity in any particular region of the model would account for the deflection. Rather, activity is occurring in all parts of the AC throughout the ERF, with the exception of the parabelt being in its resting state during the P1m. What is changing between the ERF deflections is the relative contribution of each area to the signal.

Each field and area might play a more fundamental role in ERF generation than that of providing a source for each deflection. Namely, our analytical results in Eqs. (13)–(17) show that the anatomical structure of the entire AC, encapsulated in \(W_{{\mathrm{AC}}}\), is part of the solution to the dynamical equations. Thus, for each field, the way it is connected to other fields, and even the local structure within the field should impact on the entire ERF. This should be the case even at the fringe of the model, in the parabelt, which otherwise provides only a weak direct source to the ERF. Figure 9 shows the results of the simulations testing this idea. Here, while keeping all other parameters constant, we introduced variations to the weight values of \(W_{{\mathrm{AC}}}\) representing the internal connections within and between the two parabelt fields (for details, see Appendix A1). These variations (Fig. 9a, b) had a minimal effect on the response produced by the parabelt (Fig. 9e) while, paradoxically, significantly altering the overall ERF (Fig. 9f). Figure 9c, d shows that the parabelt modification resulted in prominent changes in the core and belt contributions to the N1m. The end result in the ERF is a much broader N1m waveform, with a larger peak amplitude and latency, and an elimination of the P2m.

### 4.3 Separating dynamics from topography in ERF generation

With *P* and \(K_i\), we have separate parameter sets for the dynamical and topographical contributions to the MEG signal. The question therefore becomes whether there are aspects of the ERF which change when *P* is modulated but not when *K* is modulated, and vice versa. To this end, we examined the ERF under two conditions. In the first condition, the dynamical parameters *P* were kept constant and the topographical parameters embedded in the three *K*-matrices were varied. For this, the elements of the \(K_{{\mathrm{1}}}\) matrix were grouped into a \(15\times 15\) field matrix as depicted in Fig. 5a, and then each of the 88 nonzero elements of this field matrix was randomized separately by multiplying the default value (Table 1) with a random number from a distribution in the [0.5, 2] range. Similarly, the 13 nonzero elements of \(K_{{\mathrm{2}}}\) and \(K_{{\mathrm{3}}}\) were randomized separately using a random number from the same distribution. Figure 10a shows waveforms for 1000 such randomizations. In the second condition, the elements of the connection matrices found in *P* were randomized while the other parameters were left unchanged. For each simulation, each element of the diagonal matrices \(W_{{\mathrm{ie,d}}}\), \(W_{{\mathrm{ei}}}\), and \(W_{{\mathrm{ii}}}\) was generated separately by multiplying the default value with a random number in the [0.5, 2] range. Also, for each simulation, we generated a new stochastic version of \(W_{{\mathrm{AC}}}\) (see Appendix A1). Unstable solutions (see flowchart in Fig. 2) were excluded from further analysis. Figure 10f depicts 1000 waveforms produced this way, each one representing a stable solution.
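
The element-wise randomization described above can be sketched as follows. This is a minimal illustration, not the original simulation code: the text only states that the random factors lie in the [0.5, 2] range, so the uniform distribution used here is an assumption, and the function name `randomize_nonzero` and the toy matrix are hypothetical.

```python
import numpy as np

def randomize_nonzero(default, rng, low=0.5, high=2.0):
    """Scale each nonzero element by its own random factor in [low, high).

    Assumption: factors are drawn uniformly; the text only states the range.
    """
    default = np.asarray(default, dtype=float)
    factors = rng.uniform(low, high, size=default.shape)
    # Zero entries (absent connections) stay zero.
    return np.where(default != 0.0, default * factors, 0.0)

rng = np.random.default_rng(seed=0)

# Toy 3x3 stand-in for a field matrix; the values are illustrative only.
K_default = np.array([[1.0, 0.5, 0.0],
                      [0.5, 1.0, 0.2],
                      [0.0, 0.2, 1.0]])
K_random = randomize_nonzero(K_default, rng)
```

Calling the function once per simulation with the same `rng` would give an independent randomization each time, which is how an ensemble such as the 1000 waveforms in Fig. 10a could be assembled.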

The randomization of the topographical parameters \(K_i\) led to a scaling of the ERF, while its overall morphology was maintained (Fig. 10a). Similar scaling effects are visible in Fig. 7 where the contributions of the \(K_{{\mathrm{1}}}\) feedforward and feedback connections to the ERF are studied independently in a systematic way. In contrast, randomizing the dynamical parameters *P* resulted in a much larger diversity of the ERF waveform (Fig. 10f). To quantify these effects, we plotted the N1m-peak amplitude of the simulated waveforms against the corresponding peak latency in Fig. 10b, g. Except for the waveforms with the smallest peak amplitudes \(<0.5 \times 10^{-3}\), the N1m-peak latencies of the waveforms obtained from the randomization of the topographical parameters are nearly independent of the peak amplitude and cover a narrow range between about 110 ms and 130 ms (Fig. 10b). In contrast, when the dynamical parameters are randomized, the N1m-peak latencies span a much wider range from approximately 70 ms to 160 ms (Fig. 10g). Further, we observe a strong correlation between peak amplitude and peak latency.

The diversity in waveform morphology can further be expressed in terms of the Fourier frequency \(f_{{\mathrm{ERF}}}\) and the decay time \(\tau _{{\mathrm{ERF}}}\) of the waveforms. The Fourier frequencies \(f_{{\mathrm{ERF}}}\) shown in Fig. 10c, h were obtained through a standard fast Fourier transform (FFT) and represent the dominant frequency of the FFT analysis. As expected from Fig. 10a, variations in the topographical parameters resulted in a narrow distribution of \(f_{{\mathrm{ERF}}}\) around 3 Hz. By comparison, when the dynamical parameters were varied, the distribution was much broader and peaked at 4 Hz. Note that the increase in the distribution at lower values of \(f_{{\mathrm{ERF}}}\) is due to those broad MEG waveforms in Fig. 10f that do not reach baseline level even by \(t = 500\) ms. The time constant \(\tau _{{\mathrm{ERF}}}\) describes the temporal decay of each waveform, and it was determined by first calculating the envelope of the ERF through the application of the Hilbert transform to the data. In a second step, an exponential decay function was fitted to the transformed data in a time interval ranging from the peak value of the envelope, at around 100 ms, to 600 ms where the MEG signal had sunk back to its baseline level. For both topographical and dynamical variations, the distribution of \(\tau _{{\mathrm{ERF}}}\) was centered in the 60–70 ms range. However, the distribution of \(\tau _{{\mathrm{ERF}}}\) was much broader for the dynamical variations than the topographical ones, as shown in Fig. 10d, i.
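
The two descriptors can be estimated along the following lines. This is a sketch under stated assumptions rather than the analysis code used for Fig. 10: the function names are hypothetical, the analytic signal is computed with a plain FFT-based Hilbert transform, and the exponential fit is replaced by a log-linear least-squares fit, which is a simpler stand-in for a nonlinear fit. The test signal is a weakly damped sinusoid with known parameters, chosen for a clean check and not meant to look like an ERF.

```python
import numpy as np

def dominant_frequency(x, fs, n_fft=None):
    """Return the frequency (Hz) of the largest non-DC peak of the amplitude spectrum."""
    x = np.asarray(x, dtype=float)
    n = len(x) if n_fft is None else n_fft
    spectrum = np.abs(np.fft.rfft(x - x.mean(), n=n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs[1:][np.argmax(spectrum[1:])]

def hilbert_envelope(x):
    """Amplitude envelope as the magnitude of the analytic signal."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    h = np.zeros(n)          # spectral weights: keep DC, double positive frequencies
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))

def decay_time(t, env, t_start, t_end):
    """Estimate tau from env ~ A * exp(-t / tau) on [t_start, t_end] via a log-linear fit."""
    mask = (t >= t_start) & (t <= t_end) & (env > 0)
    slope, _intercept = np.polyfit(t[mask], np.log(env[mask]), 1)
    return -1.0 / slope

# Sanity check on a synthetic damped sinusoid with known frequency and decay time.
fs = 1000.0                                  # sampling rate in Hz
t = np.arange(0.0, 2.0, 1.0 / fs)            # time in seconds
x = np.exp(-t / 0.5) * np.sin(2 * np.pi * 10.0 * t)

f_est = dominant_frequency(x, fs)
tau_est = decay_time(t, hilbert_envelope(x), 0.2, 1.0)
```

With a 600 ms analysis window the raw FFT bin spacing is about 1.7 Hz. Zero-padding through `n_fft` interpolates the spectrum but does not add resolution.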

Further differences between the effects of topographical and dynamical variations become evident when \(f_{{\mathrm{ERF}}}\) is plotted against \(\tau _{{\mathrm{ERF}}}\). In the case of topographical variations, the tight distributions of these morphological descriptors depicted in Fig. 10c, d translate into a rather focal distribution in the \(f_{{\mathrm{ERF}}}\)–\(\tau _{{\mathrm{ERF}}}\) plane, as shown in Fig. 10e. Interestingly, the broader distributions of \(f_{{\mathrm{ERF}}}\) and \(\tau _{{\mathrm{ERF}}}\) associated with dynamical variations did not translate into an even or random distribution in the \(f_{{\mathrm{ERF}}}\)–\(\tau _{{\mathrm{ERF}}}\) plane. Instead, Fourier frequency and temporal decay showed a dependency on each other, with the distribution forming a distinct L shape, as is evident in Fig. 10j. Thus, there were two regions in the distribution: in the narrow range of \(\tau _{{\mathrm{ERF}}} = ({60} \pm 10)\) ms, the corresponding \(f_{{\mathrm{ERF}}}\) had a wide distribution extending from 2 to 8 Hz. Conversely, when \(f_{{\mathrm{ERF}}}\) was below 1 Hz, \(\tau _{{\mathrm{ERF}}}\) was distributed over a 70–200 ms range. Hence, there were no instances of slow temporal decay of the ERF waveform coupled with a high Fourier frequency.

Finally, we linked the variations in the ERF waveforms back to the parameters which characterize the normal modes. Figure 11a shows a subset of the ERFs shown in Fig. 10f covering a broad range of \(f_{{\mathrm{ERF}}}\). For each ERF, we plotted the damping frequency \(\delta _{\mathrm{d}}\) against the decay constant \(\gamma _{\mathrm{d}}\) of the 240 underlying normal modes in Fig. 11b. While there was little variation of \(\gamma _{\mathrm{d}}\) across the different ERFs, \(\delta _{\mathrm{d}}\) varied over a wide range not only in its absolute values but also in the dependence on \(\gamma _{\mathrm{d}}\). For ERFs with low \(f_{{\mathrm{ERF}}}\) (blue curves in Fig. 11a), \(\delta _{\mathrm{d}}\) has small values and shows a strong dependence on \(\gamma _{\mathrm{d}}\). As \(f_{{\mathrm{ERF}}}\) increases, so does \(\delta _{\mathrm{d}}\), and the dependence of \(\delta _{\mathrm{d}}\) on \(\gamma _{\mathrm{d}}\) becomes weaker.

In summary, these results predict that variations in the ERF waveform are specific to the type of parameter that is being varied. Specifically, variations in dynamical parameters lead to a much broader selection of waveforms than do changes in topographical parameters. These results, depicted in Fig. 10, serve as predictions for testing in ERF measurements. We have confirmed these findings using multiple default models with realistic-looking N1m–P2m responses.

## 5 Discussion

Here, we presented a mechanistic explanation of long-latency auditory ERFs by developing analytical solutions for an already existing nonlinear model of AC signal processing. The model is based on the idiosyncratic architecture of AC in which information flows in a distinctly serial manner along multiple parallel streams within a core–belt–parabelt structure. We derived analytical solutions of the coupled differential equations for the state variables of the excitatory and inhibitory cell populations by assuming that the response to the synaptic input is linear in a wide range of spiking rates, and by using symmetric connections between the cell populations. The result is a description of the system dynamics in terms of normal modes, that is, decoupled damped harmonic oscillators. The ERF response reflects these dynamics but it is modulated by a set of non-dynamical factors comprising the topography of the primary currents and the effects of the type of connection contributing to the primary current. We showed that the ERF response originates from a mixture of normal modes, and that these directly depend on the anatomical structure as expressed in the connection matrices. In our account, each peak and trough of the ERF is not due to dedicated response generators but, rather, arises out of the network properties of the entire AC. The model generates predictions for testing whether the large inter-subject variability of ERFs is due merely to subject-specific cortical topographies or whether it also reflects subject-specific cortical dynamics.

### 5.1 The link between anatomy, dynamics, and ERFs

The current work accounts for auditory ERFs by decomposing them into a set of normal modes. Each normal mode is a solution to the equations for a driven damped harmonic oscillator [Eqs. (13) and (14)], and falls into one of three types: overdamped, critically damped, or underdamped. Further, a normal mode is defined by its amplitude [Eq. (15) or (16)] as well as by two physical terms, the decay constant \(\gamma _{\mathrm{d}}\) and the damping frequency \(\delta _{\mathrm{d}}\) [Eq. (17)]. These parameters are, in turn, functions of the set of dynamical parameters we denote by *P*, which includes all the connection matrices. With each normal mode depending directly on the entire set of connection patterns and connection strengths of the system, the decomposition of the ERF into normal modes anchors the ERF waveform directly to the anatomical structure of AC. Thus, modifying the anatomical structure can change subtle aspects of the ERF, such as the amplitudes and latencies of individual peaks and troughs. However, anatomical structure also determines what the mixture of normal modes is in terms of their type, and it is therefore reflected in the gross aspects of the ERF, that is, whether certain peaks and troughs appear at all.
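
For orientation, the textbook form of a driven damped harmonic oscillator and its three regimes can be written as follows. This is a generic sketch and not a reproduction of Eqs. (13)–(17): the natural frequency \(\omega_0\), the drive \(f(t)\), the constants \(A\) and \(B\), and the identification of \(\gamma\) and \(\delta\) with \(\gamma _{\mathrm{d}}\) and \(\delta _{\mathrm{d}}\) are assumptions made for illustration.

\[
\ddot{x}(t) + 2\gamma \,\dot{x}(t) + \omega_0^2\, x(t) = f(t), \qquad \delta = \sqrt{\omega_0^2 - \gamma^2}
\]

\[
x_{\mathrm{hom}}(t) =
\begin{cases}
e^{-\gamma t}\left[A\cos (\delta t) + B\sin (\delta t)\right], & \omega_0 > \gamma \quad \text{(underdamped)}\\
e^{-\gamma t}\left(A + Bt\right), & \omega_0 = \gamma \quad \text{(critically damped)}\\
e^{-\gamma t}\left[A\cosh (|\delta | t) + B\sinh (|\delta | t)\right], & \omega_0 < \gamma \quad \text{(overdamped)}
\end{cases}
\]

In this generic form only the underdamped regime oscillates, and \(\gamma = 0\) yields an undamped oscillation at \(\omega_0\).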

We also see how the activity of each individual column depends not just on the synaptic input to the column but, rather, it directly reflects the entire anatomical structure of the AC. The connection matrix \(W_{{\mathrm{AC}}}\) (Fig. 1b) plays a special role in the model. It consists of all the short- and long-range connections, including those which relay lateral inhibition, and thus encapsulates the anatomical structure of AC. Thus, for any specific pattern of connections and set of connection strengths, \(\widetilde{W}_{\mathrm{AC}}\) will have a specific set of eigenvalues and eigenvectors. For a given set of \(W_{{{\mathrm{ei}}}}\), \(W_{{{\mathrm{ii}}}}\) and \(W_{{{\mathrm{ie,d}}}}\) matrices, the eigenvalues of \(\widetilde{W}_{\mathrm{AC}}\) define the distribution of frequencies \(\delta _{\mathrm{d}}\) of the normal modes [Eq. (17)]. Further, the eigenvectors regulate how the input is distributed among the normal modes [see Eqs. (45) and (46) in Appendix A2]. Using the eigenvectors gathered in the matrix \(\varUpsilon \) to couple the normal modes [Eq. (18)] gives expression to the state variables \(\mathbf u (t)\) and \(\mathbf v (t)\). Consequently, the state variable of any single column is a representation of all the normal modes, which themselves are functions of the structure of the AC network. Introducing variations in \(\widetilde{W}_{\mathrm{AC}}\) leads to changes in its eigenvalues and eigenvectors and, thus, in the dynamics of the system both on the single-column level and in terms of the ERF. We generated multiple \(\widetilde{W}_{\mathrm{AC}}\) matrices, and observed that \(\delta _{\mathrm{d}}\) depended strongly on the connection strengths while the distribution of \(\gamma _{\mathrm{d}}\) was little affected (Fig. 11). ERFs with a single peak were produced by systems with a relatively wide distribution of low-valued \(\delta _{\mathrm{d}}\) which showed a strong dependence on \(\gamma _{\mathrm{d}}\). Multi-peaked ERFs were generated by systems with high-valued \(\delta _{\mathrm{d}}\) packed into a narrow range, with little dependence on \(\gamma _{\mathrm{d}}\).
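
The role of the eigen-decomposition can be illustrated with a toy example. The sketch below is not the model itself: the 4-node chain, its weights, and the variable names are hypothetical stand-ins for \(\widetilde{W}_{\mathrm{AC}}\) and \(\varUpsilon \). It only shows two generic properties of a symmetric connection matrix: its eigenvectors diagonalize it, and a change to one connection at the far end of the chain shifts every eigenvalue.

```python
import numpy as np

# Toy symmetric connection matrix for a 4-node chain (illustrative values only).
W = np.array([[0.0, 0.6, 0.0, 0.0],
              [0.6, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.4],
              [0.0, 0.0, 0.4, 0.0]])

# For a symmetric matrix, eigh returns real eigenvalues (ascending)
# and orthonormal eigenvectors as the columns of Upsilon.
eigvals, Upsilon = np.linalg.eigh(W)

# The eigenvector matrix diagonalizes W; this is what decouples the modes.
diagonalized = Upsilon.T @ W @ Upsilon

# Mode amplitudes q map back to node activity via u = Upsilon @ q, so the
# activity of node 0 mixes every mode that has a nonzero weight in row 0.
weights_node0 = Upsilon[0, :]

# Change only the connection between the last two nodes of the chain.
W_mod = W.copy()
W_mod[2, 3] = W_mod[3, 2] = 0.8
eigvals_mod, Upsilon_mod = np.linalg.eigh(W_mod)

# Difference between the sorted spectra before and after the local change.
shift = eigvals_mod - eigvals
```

In this toy chain, node 0 receives a contribution from all four modes, and all four eigenvalues move when the distant connection is changed, which mirrors the argument that a local structural change has global dynamical consequences.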

The normal modes are dynamic units which cannot be localized to any particular single location in AC. Each one can be thought of as being spread over the whole AC in a unique fashion, contributing to the activity of the cortical columns with varying strengths and polarities. This spread is accessible in the analytical approach, allowing one to map the mean contribution that each normal mode makes to each cortical field (Fig. 4). The resulting anatomical maps of the normal modes tended to show that high damping frequencies were associated with increased spatial structure. Though not shown explicitly here, one upshot of this is that the early part of the ERF is generated by normal modes with large variations across cortical fields, resulting in lower field-to-field correlations in their activity. In contrast, the late part of the ERF is dominated by normal modes with a uniform effect over the fields, resulting in higher inter-field correlations. This view opens up the possibility to consider how the activities of different fields and columns are coupled to each other via the normal modes in dynamic connectivity maps. The full implications of these observations and the resulting predictions will be addressed elsewhere.

The MEG signal arises out of the cortical primary currents, which are driven by the system dynamics, and this signal is modulated by topographical factors influencing the orientation of the current. One of these factors is whether the connection driving the primary current represents feedforward or feedback input to the cortical column (Ahlfors et al. 2015). As part of the dynamical parameters *P*, our model included both feedforward and feedback AC connections in the connection matrix \(W_{{\mathrm{AC}}}\) (and the corresponding \(\widetilde{W}_{\mathrm{AC}}\)). The differential contributions of these two kinds of connections to the MEG signal were approximated through the use of the topographical \(K_{{\mathrm{1}}}\) matrix. By systematically varying the size of these contributions (while keeping the dynamics fixed), we found that the P1m reflects primarily feedforward activation and that feedback activations drive the N1m and P2m (Fig. 7). Due to the fixed dynamics, this investigation did not address how the feedback connections contribute to the dynamics of AC. A natural way to address this question would be to modify the actual feedback connections in \(W_{{\mathrm{AC}}}\) (and in \(\widetilde{W}_{\mathrm{AC}}\) respectively). However, our analytical approach does not allow this because of the requirement of symmetric connection matrices, and therefore numerical simulations would be required. We note that the role of feedback connections in neural processing is, in general, an open question, with earlier accounts labeling them weak and modulatory (Crick and Koch 1998; Sherman and Guillery 2011) and the predictive-coding framework requiring them generally to be functionally inhibitory (Bastos et al. 2012). In our model, feedback connections were symmetric with feedforward connections (Felleman and Essen 1991), as well as excitatory and of the driving kind (Covic and Sherman 2011). Leaving this to be addressed elsewhere, we suspect that feedback connections contribute to a larger \(\tau _{{\mathrm{ERF}}}\) and/or a lower \(f_{{\mathrm{ERF}}}\). That is, they might act as a memory mechanism by keeping the signal circulating in the AC for longer. This, in turn, might be beneficial for enhancing the signal-to-noise ratio in auditory processing, or for allowing the build-up of synaptic depression, which might be instrumental for representing the temporal structure of sound (May and Tiitinen 2013; May et al. 2015; Westö et al. 2016).

### 5.2 A novel approach for ERF generation

#### 5.2.1 State of the art: ECD source localization and its variations

In MEG research, the ERF waveform is usually treated as a linear combination of the activity of spatially distributed sources in the brain, and the task becomes one of localizing and modeling the activity of each source. Accurate localization of MEG sources, however, suffers from the ill-posed inverse problem. As solutions to this problem, numerous approaches have been developed, including discrete and distributed source models (Mosher et al. 1992; Scherg 1990; Scherg and Berg 1996), along with several variants of beamformers, a spatial filtering technique often applied in the analysis of brain oscillations (see, for example, Darvas et al. 2004; Hillebrand and Barnes 2005; Wendel et al. 2009).

Discrete and distributed source models use time-varying ECDs as the simplest physiologically meaningful source model. The mathematical concept of the ECD is a point-like source, and it is an abstraction which is justified in those cases where the spatial extension of the activated brain region is small compared to its distance to the MEG sensors. For example, it is common practice in experiments with simple auditory stimuli to use a single ECD per hemisphere to explain the measured magnetic field distribution describing the N1m waveform as the result of a best match between forward and inverse solution. The multi-dipole model is often used when the brain activation can be described by a small number of stationary focal sources, which is commonly the case in simple sensory experiments. To determine an adequate number of sources, a conservative and rather subjective approach is to gradually increase the number of sources on condition that for each source a distinct contribution to the measured magnetic field pattern is verifiable, i.e. that the sources do not model noise. More exacting approaches use advanced classification algorithms, such as the “recursively applied and projected multiple signal classification” method (RAP-MUSIC; Mosher and Leahy 1999). In case of spatially extended brain activation, the concept of discrete sources is often replaced by distributed source models estimating simultaneously strengths and directions of dipoles located on a grid of hundreds or even thousands of brain locations (see, for example, Dale and Sereno 1993; Hämäläinen and Ilmoniemi 1994; Dale et al. 2000; Pascual-Marqui 2002).

Taken together, the ECD as source model is a simplification which makes the source localization problem mathematically tractable. However, there are numerous arbitrary choices that the researcher has to make. Notably, these include an a priori assumption on the number of sources in discrete source analysis, and constraints such as regularization parameters in distributed source models. Therefore, source localization carries with it unavoidable ambiguities: Has the correct number of underlying neural sources been assumed? Have these been reliably and correctly separated from each other, in particular when two or more sources are close to each other in space and in time (Lütkenhöner and Steinsträter 1998)? Thus, precise source localization based on trial-averaged ERFs is non-trivial not only due to the inverse problem per se, but also due to the unknown number of sources and their separability.

#### 5.2.2 An alternative view on ERF generators

With conventional source modeling, the temptation is to understand an ERF generator in terms of a spatially and temporally constrained local process giving rise to a “component” of the ERF (Näätänen and Picton 1987; Näätänen 1992)—in effect equating sources (i.e., the primary currents) with generators (i.e., the neural tissue with the processes generating the primary currents). Thus, for example, the P1m generators are those cortical areas which are active during the peak of the P1m, and the objective of source modeling is to localize these generators. Conversely, each cortical area might be considered a generator of a component, and the challenge for source localization is to separate these generators out from each other so that the component structure of the event-related response can be identified. It follows that if the sources identified for the P1m are found to be different than those active during the N1m peak, the conclusion can be drawn that the P1m and the N1m have at least partially different (though possibly overlapping) generators. In this vein, cortical activation can be seen as signal propagation as successive generators become active, as is evident in source modeling assuming a single ECD (Lütkenhöner and Steinsträter 1998), multiple dipoles (Inui et al. 2006) and distributed sources (Yvert et al. 2005).

Our model opens up an alternative view on ERF generation. Rather than considering the ERF to be the linear sum of multiple spatially discrete sources, it becomes a combination of multiple normal modes. Here, the normal modes themselves and the way they are coupled are determined by the anatomical structure of AC and by other, dynamical parameters. This approach still approximates the ERF-generating system as a set of discrete sources—cortical columns—but it lays emphasis on the way these are connected to each other and to the dynamics of this connected system. In this way, the system can be described on three distinct levels: that of physiological and anatomical quantities, that of the normal modes, and that of the primary currents which are determined by the normal modes.

What, then, is an ERF generator in this normal mode view? In our simulations, the AC has well-defined sources of activity—those columns activated by the stimulus—and each field and area has a well-defined contribution to the ERF. Also, there is a serial progression of activation along the core–belt–parabelt axis (Figs. 3, 8), which fits in with experimental observations (Inui et al. 2006; Yvert et al. 2005; Guéguin et al. 2007). As such, these results add nothing to the conventional view of ERF generation: in our simulations, the main generators of the N1m response are clearly the core and belt areas; with the parabelt contributing very little, it seems clear that it cannot be counted as an N1m generator. However, this view is countered by the consideration that both single-column activity and the ERF represent the combination of multiple normal modes and that each normal mode is a function of the connection patterns and strengths of the entire AC system. Thus, the local connections in the parabelt, and so the parabelt fields themselves, are an intimate part of activity generation in the core and belt. This, in turn, means that one cannot consider individual columns, fields, or areas as separable ERF generators. The parabelt is just as much an N1m generator as the core and, similarly, the core is just as much a P2m generator as the parabelt.

The above principle of ERF generation is demonstrated in Fig. 9 which shows the effects of modifying the local connections within the parabelt fields. These modifications lead to modest changes in the parabelt response itself. Importantly, they entail significant changes in the activity of the core and belt. This finding is all the more intriguing since the contributions of the two parabelt fields to the overall ERF are significantly smaller than those of the core and belt (see Fig. 8b). Thus, while the parabelt does not function as a source of the N1m, it is clearly an important part of the N1m generator.

On a more fundamental level, there is an ongoing debate about the generation of event-related responses (de Munck and Bijma 2010; Sauseng et al. 2007; Telenczuk et al. 2010; Turi et al. 2012; Yeung et al. 2004). In the classical signal-plus-noise (SPN) model, the trial-averaged MEG response is treated as the superposition of a stationary stimulus-evoked signal and zero-mean Gaussian noise. In this view, an ERF is a time-locked phasic burst which is uncorrelated with the ongoing rhythmic activity (Arieli et al. 1996; Dawson 1954; Mäkinen et al. 2005; Mazaheri and Jensen 2006; Shah et al. 2004). The phase-reset model proposed by Sayers et al. (1974) (see also Makeig et al. 2002; Hanslmayr et al. 2007) provides an opposing view according to which stimulus-evoked responses are generated by partial stimulus-induced phase synchronization of the rhythmic background activity. The most recent model for ERF generation is the baseline-shift model introduced by Nikulin et al. (2007, 2010). This model is based on the asymmetric modulation of the amplitude of spontaneous alpha-band oscillations, although de Munck and Bijma (2010) argue that it could be viewed as a special case of the SPN model. The current results of our work do not as such contribute to this debate because we did not include oscillatory background activity in the simulations. However, while beyond the scope of the current study, such oscillatory activity would be easy enough to include in the model. The oscillator nature of the model implies that feeding the model with noise should already be sufficient to generate ongoing oscillations. Alternatively, the analytical solutions themselves show that the resting state of the model would be a limit cycle if the decay constant \(\gamma _{\mathrm{d}}\) for one or more normal modes in Eqs. (13) and (14) is zero. A stimulus acting as an outside push to the individual harmonic oscillators represented by the normal modes could have a multitude of effects, depending on the amplitude of the ongoing oscillations, and this could be approached analytically by considering the unit impulse response of our model. Further, as our model operates on multiple spatial resolutions, from the single-column to the aggregated MEG signal, it might be useful for approaching the question of whether phasic responses and ongoing oscillations are generated by the same or different neural populations (Sauseng et al. 2007).
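
The averaging logic behind the SPN model can be made concrete with a short simulation. This is a minimal sketch under the SPN assumptions only: the evoked signal is a hypothetical damped sinusoid, the noise is independent zero-mean Gaussian noise, and all numerical values are illustrative rather than taken from the model.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
fs = 1000.0                                   # sampling rate in Hz
t = np.arange(0.0, 0.6, 1.0 / fs)             # time in seconds

# Hypothetical stationary evoked signal (illustrative only).
signal = np.exp(-t / 0.065) * np.sin(2 * np.pi * 4.0 * t)

n_trials, noise_sd = 400, 1.0
# SPN assumption: each trial is the same signal plus independent zero-mean Gaussian noise.
trials = signal + rng.normal(0.0, noise_sd, size=(n_trials, t.size))

# Averaging across trials keeps the time-locked signal and attenuates the noise.
average = trials.mean(axis=0)
residual_sd = np.std(average - signal)        # noise left in the average
expected_sd = noise_sd / np.sqrt(n_trials)    # theoretical noise level after averaging
```

Under these assumptions the residual noise in the average falls off as the square root of the number of trials, which is why the stationary, time-locked part dominates the trial-averaged response.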

### 5.3 Subject-specificity of the event-related waveforms

Event-related responses are characterized by large between-subject variability. Although rarely an object of study, this variability is evident to any researcher using ERPs and ERFs (Luck 2014). It is also given expression in, for example, the large standard deviations of peak amplitudes and latencies in test–retest studies in which the reliability and reproducibility of event-related responses have been investigated in various subject populations (e.g., Michalewski et al. 1986; Kileny and Kripal 1987; Segalowitz and Barnes 1993; Dalebout and Robey 1997; Atcherson et al. 2006). These studies tend to show that between-subject variability is contrasted by the responses staying stable for a given subject across different measurement sessions and over long periods.

There are potentially two sources of the variability of event-related responses across subjects: the anatomical topography of the cortical surface and the dynamics of AC. Subject-specific topography is well-documented, with human subjects having large differences in the pattern and number of convolutions on the supratemporal plane (Yvert et al. 2005; Moerel et al. 2014). However, the question of subject-specific dynamics of AC has, to our knowledge, not been approached before. The current AC model allowed us to address this issue in simulations via separable sets of parameters: the *P*-parameters governing the system dynamics, and the \(K_i\)-matrices capturing the topographical properties influencing the MEG signal. We introduced random variations to these parameter sets and found that variations of the dynamical parameters have a much stronger effect on the waveform than variations of the topographical parameters. Specifically, *K*-randomizations mainly affect the N1m-peak amplitude while keeping intact the morphology of the waveform of the entire ERF (Fig. 10a). In contrast, *P*-randomizations result in a much larger variety of waveform morphologies (Fig. 10f), with N1m-peak amplitudes and latencies varying in a considerably larger range (Fig. 10g). Consequently, changes in the dynamical parameters entail a much broader FFT frequency spectrum, and, likewise, a broader distribution of the time constant \(\tau _{{\mathrm{ERF}}}\) (Fig. 10h, i).

These results provide predictions for straightforward testing in populations of subjects: Characterizing single-subject ERFs in terms of frequency spectrum and \(\tau _{{\mathrm{ERF}}}\), how are these estimates distributed over the population? If the distribution is narrowly focused, this would indicate that subjects have similar AC dynamics and that the subject-specificity of the ERF is due to topographical variations only. In contrast, subject-specific dynamics would be indicated by a wider distribution of frequency spectrum and \(\tau _{{\mathrm{ERF}}}\), especially if this distribution is structured as in Fig. 10j. Previous research has pointed to a large variation in the N1/N1m peak latency (e.g., Michalewski et al. 1986; Kileny and Kripal 1987; Segalowitz and Barnes 1993; Dalebout and Robey 1997; Atcherson et al. 2006), and this would fit with the results of the current simulations utilizing dynamical parameter randomizations (Fig. 10g). Thus, we would expect to see evidence supporting the presence of subject-specific AC dynamics.

While our results suggest that the separation of dynamical and topographical effects might be possible on the population level, is there hope for such a separation when looking at single-subject data? This is a challenge since (random) combinations of *P*- and *K*-parameters lead to a wealth of waveforms. One way forward might be through systematic investigations of how waveform properties depend on parameters. This might enable us to identify major causal relationships between dynamical and topographical parameters on the one hand and the resulting waveforms on the other. One compelling example is our finding that the feedforward projections of the \(K_{{\mathrm{1}}}\)-matrix are crucial for the generation of the P1m response.

The current model is based on the architecture of the monkey AC (Kaas and Hackett 2000; Hackett et al. 2014), as a comparable map of the organization of human AC is still missing (Nourski et al. 2014; Leaver and Rauschecker 2016). Might our approach offer a method for fitting anatomical organization of the AC to the ERF in humans? This would present an inverse problem quite different from the source localization one. Instead of using ECDs, the solution would be expressed in terms of normal modes and the underlying architecture coupled with the subject-specific cortical surface. Even in the presence of subject-specific ERF waveforms, this approach might become possible with the constraint that the coarse architecture in terms of cortical fields and their interconnections is a shared feature across human subjects. The first steps in this investigation will require computational studies on the effect of architecture on the ERF waveforms. Specifically, can species-specific event-related responses be explained by a species-specific constellation of fields in auditory cortex? Further, might the distributions of ERF descriptors such as those presented in Fig. 10 be used to decode anatomical structure on the population level?

## Notes

### Acknowledgements

We are grateful for the support of André Brechmann, Michael Brosch, Peter Heil, Torsten Stöter, and Matthias Wolfrum. This research was supported by the Deutsche Forschungsgemeinschaft (He 1721/10-1, He 1721/10-2, and SFB TR31, A4) and by the European Union’s Horizon 2020 research and innovation programme (Grant Agreement 763959). Further, we also acknowledge the support by an Alexander von Humboldt Polish Honorary Research Scholarship by the Foundation for Polish Science.

## References

- Ahlfors SP, Wreh C (2015) Modelling the effect of dendritic input location on MEG and EEG source dipoles. Med Biol Eng Comput 53(9):879–887Google Scholar
- Ahlfors SP, Jones SR, Ahveninen J, Hämäläinen MS, Belliveau JW, Bar M (2015) Direction of magnetoencephalography sources associated with feedback and feedforward contributions in a visual object recognition task. Neurosci Lett 585:149–154Google Scholar
- Allen GI, Korn H, Oshima T (1975) The mode of synaptic linkage in the cerebro-ponto-cerebellar pathway of the cat. I. Responses in the brachium pontis. Exp Brain Res 24:1–14
- Arieli A, Sterkin A, Grinvald A, Aertsen A (1996) Dynamics of ongoing activity: explanation of the large variability in evoked cortical responses. Science 273:1868–1871
- Atcherson SR, Gould HJ, Pousson MA, Prout TM (2006) Long-term stability of N1 sources using low-resolution electromagnetic tomography. Brain Topogr 19(1/2):11–20
- Bartos M, Vida I, Jonas P (2007) Synaptic mechanisms of synchronized gamma oscillations in inhibitory interneuron networks. Nat Rev Neurosci 8:45–56
- Bastos AM, Usrey WM, Adams RA, Mangun GR, Fries P, Friston KJ (2012) Canonical microcircuits for predictive coding. Neuron 76:695–711
- Baumann S, Petkov CI, Griffiths TD (2013) A unified framework for the organization of the primate auditory cortex. Front Syst Neurosci 7:11
- Bazelot M, Dinocourt C, Cohen I, Miles R (2010) Unitary inhibitory field potentials in the CA3 region of rat hippocampus. J Physiol 588:2077–2090
- Brosch M, Schreiner CE (1997) Time course of forward masking tuning curves in cat primary auditory cortex. J Neurophysiol 77(2):923–943
- Brosch M, Scheich H (2008) Tone-sequence analysis in the auditory cortex of awake macaque monkeys. Exp Brain Res 184:349–361
- Brosch M, Schulz A, Scheich H (1999) Processing of sound sequences in macaque auditory cortex: response enhancement. J Neurophysiol 82(3):1542–1559
- Budinger E, Heil P (2005) Anatomy of the auditory cortex. In: Greenberg S, Ainsworth WA (eds) Listening to speech. Lawrence Erlbaum Associates, Mahwah, pp 91–113
- Budinger E, Heil P, Scheich H (2000) Functional organization of auditory cortex in the Mongolian gerbil (*Meriones unguiculatus*). III. Anatomical subdivisions and corticocortical connections. Eur J Neurosci 12:2425–2451
- Buzsáki G (2006) Rhythms of the brain. Oxford University Press, Oxford
- Buzsáki G, Anastassiou CA, Koch C (2012) The origin of extracellular fields and currents: EEG, ECoG, LFP and spikes. Nat Rev Neurosci 13:407–420
- Buzsáki G, Draguhn A (2004) Neural oscillations in cortical networks. Science 304:1926–1929
- Caughey TK (1960) Classical normal modes in damped linear dynamic systems. J Appl Mech 27:269–271
- Caughey TK, O’Kelly MEJ (1965) Classical normal modes in damped linear dynamic systems. J Appl Mech 32:583–588
- Covic EN, Sherman SM (2011) Synaptic properties of connections between the primary and secondary auditory cortices in mice. Cereb Cortex 21:2425–2441
- Crick F, Koch C (1998) Constraints on cortical and thalamic projections: the no-strong-loops hypothesis. Nature 391:245–250
- Dale A, Sereno M (1993) Improved localization of cortical activity by combining EEG and MEG with MRI cortical surface reconstruction: a linear approach. J Cogn Neurosci 5:162–176
- Dale A, Liu A, Fischl B, Buckner R (2000) Dynamic statistical parametric mapping: combining fMRI and MEG for high-resolution imaging of cortical activity. Neuron 26:55–67
- Dalebout SD, Robey RR (1997) Comparison of the intersubject and intrasubject variability of exogenous and endogenous auditory evoked potentials. J Am Acad Audiol 8(2):342–354
- Darvas F, Pantazis D, Kucukaltun-Yildirim E, Leahy RM (2004) Mapping human brain function with MEG and EEG: methods and validation. NeuroImage 23:S289–S299
- David O, Kiebel SJ, Harrison LM, Mattout J, Kilner JM, Friston KJ (2006) Dynamic causal modeling of evoked responses in EEG and MEG. NeuroImage 30:1255–1272
- Davis PA (1939) Effects of acoustic stimuli on the waking human brain. J Neurophysiol 2(6):494–499
- Dawson GD (1954) A summation technique for the detection of small evoked potentials. Electroencephalogr Clin Neurophysiol 6:65–84
- de Munck JC, Bijma F (2010) How are evoked responses generated? The need for a unified mathematical framework. Clin Neurophysiol 121:127–129
- Douglas RJ, Koch C, Mahowald M, Martin KAC, Suarez HH (1995) Recurrent excitation in neocortical circuits. Science 269:981–985
- Douglas R, Markram H, Martin K (2004) Neocortex. In: Shepherd GM (ed) The synaptic organization of the brain. Oxford University Press, New York, pp 499–558
- Douglas RJ, Martin KAC (2007) Mapping the matrix: the ways of neocortex. Neuron 56:226–238
- Einevoll GT, Kayser C, Logothetis NK, Panzeri S (2013) Modelling and analysis of local field potentials for studying the function of cortical circuits. Nat Rev Neurosci 14:770–785
- Felleman DJ, Van Essen DC (1991) Distributed hierarchical processing in the primate cerebral cortex. Cereb Cortex 1:1–47
- Fowles GR, Cassiday GL (2005) Analytical mechanics, 7th edn. Thomson Brooks/Cole, Belmont
- Friston KJ, Harrison L, Penny W (2003) Dynamic causal modelling. NeuroImage 19:1273–1302
- Garrido MI, Kilner JM, Kiebel SJ, Stephan KE, Friston KJ (2007) Dynamic causal modelling of evoked potentials: a reproducibility study. NeuroImage 36:571–580
- Garrido MI, Kilner JM, Kiebel SJ, Friston KJ (2009) Dynamic causal modeling of the response to frequency deviants. J Neurophysiol 101:2620–2631
- Glickfeld LL, Roberts JD, Somogyi P, Scanziani M (2009) Interneurons hyperpolarize pyramidal cells along their entire somatodendritic axis. Nat Neurosci 12(1):21–23
- Guéguin M, Bouquin Jeannès RL, Faucon G, Chauvel P, Liégeois-Chauvel C (2007) Evidence of functional connectivity between auditory cortical areas revealed by amplitude modulation sound processing. Cereb Cortex 17:304–313
- Hackett TA, de la Mothe LA, Camalier CR, Falchier A, Lakatos P, Kajikawa Y, Schroeder CE (2014) Feedforward and feedback projections of caudal belt and parabelt areas of auditory cortex: refining the hierarchical model. Front Neurosci 8:72
- Hämäläinen M, Ilmoniemi R (1994) Interpreting magnetic fields of the brain: minimum norm estimates. Med Biol Eng Comput 32:35–42
- Hämäläinen M, Hari R, Ilmoniemi RJ, Knuutila J, Lounasmaa OV (1993) Magnetoencephalography: theory, instrumentation, and applications to non-invasive studies of the working human brain. Rev Mod Phys 65:413–497
- Hanslmayr S, Klimesch W, Sauseng P, Gruber W, Doppelmayr M, Freunberger R, Pechersdorfer T, Birbaumer N (2007) Alpha phase reset contributes to the generation of ERPs. NeuroImage 17:1–8
- Hillebrand A, Barnes GR (2005) Beamformer analysis of MEG data. Int Rev Neurobiol 68:149–171
- Hines M, Carnevale NT (2001) NEURON: a tool for neuroscientists. The Neuroscientist 7(2):123–135
- Hopfield JJ (1984) Neurons with graded response have collective computational properties like those of two-state neurons. Proc Natl Acad Sci USA 81:3088–3092
- Hopfield JJ, Tank DW (1986) Computing with neural circuits: a model. Science 233:625–633
- Inui K, Okamoto H, Miki K, Gunji A, Kakigi R (2006) Serial and parallel processing in the human auditory cortex: a magnetoencephalographic study. Cereb Cortex 16:18–30
- Kaas JH, Hackett TA (2000) Subdivisions of auditory cortex and processing streams in primates. Proc Natl Acad Sci USA 97:11793–11799
- Katznelson RD (1981) Normal modes of the brain: neuroanatomical basis and a physiological theoretical model. In: Nunez PL (ed) Electric fields of the brain: the neurophysics of EEG. Oxford University Press, Oxford, pp 401–442
- Kileny PR, Kripal JP (1987) Test-retest variability of auditory event-related potentials. Ear Hear 8:110–114
- König R, Matysiak A, Kordecki W, Sielużycki C, Zacharias N, Heil P (2015) Averaging auditory evoked magnetoencephalographic and electroencephalographic responses: a critical discussion. Eur J Neurosci 41:631–640
- Kubota Y, Karube F, Nomura M, Kawaguchi Y (2016) The density of cortical inhibitory synapses. Front Neural Circuits 10:27
- Larson E, Billimoria CP, Sen K (2009) A biologically plausible computational model for auditory object recognition. J Neurophysiol 101(1):323–331
- Leaver AM, Rauschecker JP (2016) Functional topography of human auditory cortex. J Neurosci 36:1416–1428
- Loebel A, Nelken I, Tsodyks M (2007) Processing of sound by population spikes in a model of primary auditory cortex. Front Neurosci 1:197–209
- Luck SJ (2014) An introduction to the event-related potential technique, 2nd edn. The MIT Press, Cambridge
- Lütkenhöner B (1998) Dipole separability in a neuromagnetic source analysis. IEEE Trans Biomed Eng 45(5):572–581
- Lütkenhöner B, Steinsträter O (1998) High-precision neuromagnetic study of the functional organization of the human auditory cortex. Audiol Neuro-Otol 3:191–213
- Makeig S, Westerfield M, Jung TP, Enghoff S, Townsend J, Courchesne E, Sejnowski TJ (2002) Dynamic brain sources of visual evoked responses. Science 295:690–694
- Mäkinen V, Tiitinen H, May P (2005) Auditory event-related responses are generated independently of ongoing brain activity. NeuroImage 24:961–968
- Matysiak A, Kordecki W, Sielużycki C, Zacharias N, Heil P, König R (2013) Variance stabilization for computing and comparing grand mean waveforms in MEG and EEG. Psychophysiology 50:627–639
- May PJC (2002) Do EEG and MEG measure dynamically different properties of neural activity? In: Nowak H, Haueisen J, Giesler F, Huonker R (eds) Proceedings of the 13th international conference on biomagnetism. International Congress Series. VDE Verlag GmbH, Berlin, pp 709–711
- May P, Tiitinen H (2001) Human cortical processing of auditory events over time. NeuroReport 12(3):573–577
- May P, Tiitinen H, Ilmoniemi RJ, Nyman G, Taylor JG, Näätänen R (1999) Frequency change detection in human auditory cortex. J Comput Neurosci 6:99–120
- May PJC, Tiitinen H (2010) Mismatch negativity (MMN), the deviance-elicited auditory deflection, explained. Psychophysiology 47:66–122
- May PJC, Tiitinen H (2013) Temporal binding of sound emerges out of anatomical structure and synaptic dynamics of auditory cortex. Front Comput Neurosci 7:152
- May PJC, Tiitinen H, Westö J (2015) Computational modelling suggests that temporal integration results from synaptic adaptation in auditory cortex. Eur J Neurosci 41:615–630
- Mazaheri A, Jensen O (2006) Posterior alpha activity is not phase-reset by visual stimuli. Proc Natl Acad Sci USA 103:2948–2952
- Michalewski HJ, Prasher DK, Starr A (1986) Latency variability and temporal interrelationships of the auditory event-related potentials (N1, P2, N2, and P3) in normal subjects. Electroencephalogr Clin Neurophysiol 65:59–71
- Mitzdorf U (1985) Current source-density method and application in cat cerebral cortex: investigation of evoked potentials and EEG phenomena. Physiol Rev 65(1):37–100
- Mitzdorf U (1994) Properties of cortical generators of event-related potentials. Pharmacopsychiatry 27:49–51
- Moerel M, De Martino F, Formisano E (2014) An anatomical and functional topography of human auditory cortical areas. Front Neurosci 8:225
- Mosher JC, Leahy RM (1999) Source localization using recursively applied and projected (RAP) MUSIC. IEEE Trans Signal Process 47:332–340
- Mosher JC, Lewis PS, Leahy RM (1992) Multiple dipole modeling and localization from spatio-temporal MEG data. IEEE Trans Biomed Eng 39:541–557
- Näätänen R (1992) Attention and brain function. Lawrence Erlbaum Associates, Hillsdale
- Näätänen R, Picton T (1987) The N1 wave of the human electric and magnetic response to sound: a review and an analysis of component structure. Psychophysiology 24:375–425
- Nikulin VV, Linkenkaer-Hansen K, Nolte G, Lemm S, Müller KR, Ilmoniemi RJ, Curio G (2007) A novel mechanism for evoked responses in the human brain. Eur J Neurosci 25:3146–3154
- Nikulin VV, Linkenkaer-Hansen K, Nolte G, Curio G (2010) Non-zero mean and asymmetry of neuronal oscillations have different implications for evoked responses. Clin Neurophysiol 121:186–193
- Noto M, Nishikawa J, Tateno T (2016) An analysis of nonlinear dynamics underlying neural activity related to auditory induction in the rat auditory cortex. Neuroscience 318:58–83
- Nourski KV, Steinschneider M, McMurray B, Kovach CK, Oya H, Kawasaki H, Howard MA III (2014) Functional organization of human auditory cortex: investigation of response latencies through direct recordings. NeuroImage 101:598–609
- Nunez PL (1995) Neocortical dynamics and human EEG rhythms. Oxford University Press, New York
- Okada YC, Wu J, Kyohou S (1997) Genesis of MEG signals in a mammalian CNS structure. Electroencephalogr Clin Neurophysiol 103:474–485
- Pascual-Marqui R (2002) Standardized low resolution brain electromagnetic tomography (sLORETA): technical details. Methods Find Exp Clin Pharmacol 24(Suppl D):5–12
- Picton T, Stuss DT (1980) The component structure of the human event-related potentials. Progr Brain Res 54:17–49
- Rauschecker JP (1997) Processing of complex sounds in the auditory cortex of cat, monkey and man. Acta Oto Laryngol 117:34–38
- Rayleigh L (1945) The theory of sound, vol 1. Dover Publications Inc., New York
- Ritter P, Schirner M, McIntosh AR, Jirsa VK (2013) The virtual brain integrates computational modeling and multimodal neuroimaging. Brain Connect 3(2):121–145
- Sarvas J (1987) Basic mathematical and electromagnetic properties of the biomagnetic inverse problem. Phys Med Biol 32:11–22
- Sauseng P, Klimesch W, Gruber WR, Hanslmayr S, Freunberger R, Doppelmayr M (2007) Are event-related potential components generated by phase resetting of brain oscillations? A critical discussion. Neuroscience 146:1435–1444
- Sayers B, Beagley HA, Henshall WR (1974) The mechanism of auditory evoked EEG responses. Nature 247:481–483
- Scherg M (1990) Fundamentals of dipole source potential analysis. In: Grandori F, Hoke M, Romani GL (eds) Evoked magnetic fields and electric potentials. Vol. 6 of Advances in Audiology. Karger, Basel, pp 40–69
- Scherg M, Berg P (1996) New concepts of brain source imaging and localization. Electroencephalogr Clin Neurophysiol 46:127–137
- Segalowitz SJ, Barnes KL (1993) The reliability of ERP components in the auditory oddball paradigm. Psychophysiology 30:451–459
- Shah AS, Bressler SL, Knuth KH, Ding M, Mehta AD, Ulbert I, Schroeder CE (2004) Neural dynamics and the fundamental mechanisms of event-related brain potentials. Cereb Cortex 14:476–483
- Sherman SM, Guillery RW (2011) Distinct functions for direct and transthalamic corticocortical connections. J Neurophysiol 106:1068–1077
- Telenczuk B, Nikulin VV, Curio G (2010) Role of neuronal synchrony in the generation of evoked EEG/MEG responses. J Neurophysiol 104:3557–3567
- Trevelyan AJ (2009) The direct relationship between inhibitory currents and local field potentials. J Neurosci 29(48):15299–15307
- Turi G, Gotthardt S, Singer W, Vuong TA, Munk M, Wibral M (2012) Quantifying additive evoked contributions to the event-related potential. NeuroImage 59:2607–2624
- Ulanovsky N, Las L, Nelken I (2003) Processing of low-probability sounds by cortical neurons. Nat Neurosci 6(4):391–398
- Ulanovsky N, Las L, Farkas D, Nelken I (2004) Multiple time scales of adaptation in auditory cortex neurons. J Neurosci 24:10440–10453
- Wang P, Knösche TR (2013) A realistic neural mass model of the cortex with laminar-specific connections and synaptic plasticity: evaluation with auditory habituation. PLoS ONE 8:1–17
- Wendel K, Väisänen O, Malmivuo J, Gencer NG, Vanrumste B, Durka P, Magjarević R, Supek S, Pascu ML, Fontenelle H, Grave de Peralta Menendez R (2009) EEG/MEG source imaging: methods, challenges, and open issues. Comput Intell Neurosci 2009, Article ID 656092, 12 pages
- Westö J, May PJC, Tiitinen H (2016) Memory stacking in hierarchical networks. Neural Comput 28:327–353
- Williamson SJ, Kaufman L (1981) Biomagnetism. J Magn Magn Mater 22:129–201
- Wilson H, Cowan J (1972) Excitatory and inhibitory interactions in localized populations of model neurons. Biophys J 12:1–24
- Yarden TS, Nelken I (2017) Stimulus-specific adaptation in a recurrent network model of primary auditory cortex. PLoS Comput Biol 13:e1005437
- Yeung N, Bogacz R, Holroyd CB, Cohen JD (2004) Detection of synchronized oscillations in the electroencephalogram: an evaluation of methods. Psychophysiology 41:822–832
- Yvert B, Fischer C, Bertrand O, Pernier J (2005) Localization of human supratemporal auditory areas from intracerebral auditory evoked potentials using distributed source models. NeuroImage 28:140–153
- Zacharias N, Sielużycki C, König R, Kordecki W, Heil P (2011) The M100 component of evoked magnetic fields differs by scaling factors: Implications for signal averaging. Psychophysiology 48(8):1069–1082

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.