# Bayesian Modelling of Induced Responses and Neuronal Rhythms

## Abstract

Neural rhythms or oscillations are ubiquitous in neuroimaging data. These spectral responses have been linked to several cognitive processes, including working memory, attention, perceptual binding and neuronal coordination. In this paper, we show how Bayesian methods can be used to finesse the ill-posed problem of reconstructing, and explaining, oscillatory responses. We offer an overview of recent developments in this field, focusing on (i) the use of MEG data and empirical Bayes to build hierarchical models for group analyses, and the identification of important sources of inter-subject variability, and (ii) the construction of novel dynamic causal models of intralaminar recordings to explain layer-specific activity. We hope to show that electrophysiological measurements contain much more spatial information than is often thought: on the one hand, the dynamic causal modelling of non-invasive (low spatial resolution) electrophysiology can afford sub-millimetre (hyper-acute) resolution that is limited only by the (spatial) complexity of the underlying (dynamic causal) forward model. On the other hand, invasive microelectrode recordings (that penetrate different cortical layers) can reveal laminar-specific responses and elucidate hierarchical message passing and information processing within and between cortical regions at a macroscopic scale. In short, the careful and biophysically grounded modelling of sparse data enables one to characterise the neuronal architectures generating oscillations in remarkable detail.

### Keywords

Dynamic causal modelling · Intersubject variability · Connectivity · Microelectrodes · Laminar responses · Compartmental models · Hierarchical Bayesian models

## Introduction

Neural rhythms have been associated with a variety of cognitive functions, including working memory (Pesaran et al. 2002; Siegel et al. 2009), visual attention (Buschman and Miller 2007; Fries 2009; Kornblith et al. 2015; Womelsdorf et al. 2006), cortical representations (Buzsáki and Chrobak 1995; Schoffelen et al. 2005), feature binding (Tallon-Baudry et al. 1996) and information propagation in feedforward/feedback directions in cortical hierarchies (Bastos et al. 2012; Friston et al. 2015). Oscillatory activity is also thought to be the signature of aberrant neuronal processing in psychiatric diseases (Uhlhaas and Singer 2012), such as autism (Dickinson et al. 2015) or schizophrenia (Gonzalez-Burgos and Lewis 2008) and can be used to disclose mechanisms underlying intersubject variability (Pinotsis et al. 2013). Gamma band responses, in particular, have been shown to reflect various input attributes, like the size of visual objects (Pinotsis et al., to appear; Perry et al. 2013), luminance (Swettenham et al. 2013) and contrast (Pinotsis et al. 2014; Ray and Maunsell 2010; Roberts et al. 2013). Here, we consider powerful tools from Bayesian inference to illustrate the wealth of information about brain function that neural rhythms and electrophysiological responses afford. In this setting, Bayesian deconvolution and empirical Bayes are used to finesse the ill-posed problem of reconstructing and explaining electromagnetic sources and oscillatory responses. Under this approach, neural activity is described by probability densities parameterized by physiological or anatomical (lead field) parameters and hyper-parameters that embody assumptions about random effects. This description provides a generative model of how underlying signals are caused, which can be used to optimise the model—and its (physiological) parameters. 
This approach calls on a combination of forward and backward modelling that involves simulating predicted responses (using biologically plausible anatomical models) and Bayesian inversion to estimate cortical structure and function. We will showcase two applications of Variational Bayes to extract information from neuroimaging data, see also (Pinotsis and Friston 2014a): First, we will illustrate the richness of non-invasive recordings by reviewing recent studies that use parametric empirical Bayes (PEB) to characterise intersubject variability—in cortical function—using non-invasive electrophysiology. Second, we will preview a new study that uses laminar data and Bayesian model comparison to analyse oscillatory recordings obtained from the prefrontal cortex during a delayed saccade task. These complementary examples show how biologically informed modelling of electrophysiological measurements can, on the one hand, allow questions about microcircuitry to be answered using macroscopic (non-invasive) data, while on the other hand microscopic (invasive) data can be used to inform hypotheses about neuronal interactions at a macroscopic scale.

## Hierarchical Bayesian Models and the Analysis of Neuroimaging Data

In this section, we summarize some theoretical results and show how (i) parametric empirical Bayes (PEB) can be used to quantify group effects in multi-subject studies—by optimizing hierarchical Bayesian models and (ii) how Bayesian model comparison allows us to reconcile formally distinct (compartmental and mean field) models and construct DCMs of laminar probe data.

The hierarchical model used below can be written as \(y_{i} = \varGamma_{i} (\theta^{(1)} ) + \varepsilon^{(1)}\) and \(\theta^{(1)} = \varGamma (\theta^{(2)} ) + \varepsilon^{(2)}\) (Eq. 2), where \(y_{i}\) are the *i*-th subject's responses, \(\varGamma_{i} (\theta^{(1)} )\) represents the (differential equation or dynamic causal) model that generates these responses with parameters \(\theta^{(1)}\), and \(\varGamma (\theta^{(2)} )\) is the between-subject (second level) model that describes intersubject variability in the parameters of the first level model. The second level maps second to first level parameters (e.g., group means to subject-specific parameters), and \(\varepsilon^{(1)} ,\varepsilon^{(2)}\) represent random effects at each level (e.g., observation noise and intersubject variability). Below, we combine these second level models with models of brain activity that make predictions about the dynamics of coupled excitatory and inhibitory populations. In these applications, \(\varGamma_{i} (\theta^{(1)} )\) captures biophysical behaviours (see e.g. Deco et al. 2008; Pinotsis and Friston 2014a) that are caused by key architectures and (synaptic) connectivity parameters of interest. Bayesian procedures allow us to identify the form of hierarchical models and estimate their (hidden) parameters using observed responses and Variational Bayesian inference (Friston et al. 2007, 2008).

In the context of non-invasive electrophysiology, the hierarchical model (2) poses the difficult inversion problem of finding neural source estimates in the context of intersubject variability. This involves (i) partitioning the covariance of observed data into observation error and components that can be explained in terms of neuronal responses, which themselves entail components due to second (between-subject) level variability; and (ii) exploiting differential equation models to provide anatomical and physiological constraints on the explanation for first level (within-subject) responses, usually in terms of (synaptic) connectivity estimates. The second level covariance components specify whether the parameters of the dynamical model at the first level are random or fixed effects, while the dynamical models provide predictions of the dynamics in source and sensor space, which depend upon cortical anatomy and physiology.

In summary, hierarchical or empirical Bayesian modelling of the sort implied by Eq. (2) allows us to perform efficient source reconstruction and obtain connectivity estimates by replacing phenomenological constraints (e.g., based on autoregressive modelling and temporal smoothness considerations) with spatiotemporal constraints based on models of neuronal activity. This can be thought of as an alternative to autoregressive models, which model statistical dependencies among measured signals, as opposed to the neuronal processes generating measurements. In dynamic causal modelling, one uses a forward or generative model of distributed processing to estimate the (coupling) parameters of that model. Inference then proceeds assuming nonlinear within-subject effects and linear between-subject effects. This allows one to distinguish among competing hypotheses about the mechanisms and architectures generating the data and the nature of group effects in multi-subject studies (Friston et al. 2015, 2016; Pinotsis et al., to appear).

Here, \(y\) denotes the data obtained from all subjects (indexed by *i*) and the generative model \(\varGamma_{i}\) is a function of model parameters at the first or within-subject level: \(\theta^{(1)}\). These parameterize the connectivity architecture mediating responses, the observation function \(\varphi \subset \theta^{(1)}\) and the spectra of the inputs and channel noise, \(\{ \alpha_{n} ,\alpha_{u} ,\beta_{n} ,\beta_{u} \} \subset \theta^{(1)}\). Gaussian assumptions about sampling errors \(\varepsilon^{(1)}\) provide the likelihood model at the first (within-subject) level: \(p(y_{i} |\theta^{(1)} )\). To explain intersubject variability, this model is supplemented with a mapping from group means to subject-specific estimates: \(\theta^{(1)} = (X \otimes I)\theta^{(2)} + \varepsilon^{(2)}\), where \(\varepsilon^{(2)}\) are random effects (at the between-subject level) and *X* is a design matrix containing between-subject explanatory variables. Below, using Bayesian model reduction, we adjudicate among competing hypotheses about the intrinsic connections that show intersubject variability. Effectively, this involves comparing the evidence for random effects models with and without (combinations of) between-subject effects on (combinations of) connectivity parameters.
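As a concrete (toy) illustration of this second level mapping, the sketch below simulates \(\theta^{(1)} = (X \otimes I)\theta^{(2)} + \varepsilon^{(2)}\) for a hypothetical group; the numbers of subjects, parameters and covariates are arbitrary, and the explicit Kronecker form is checked against the equivalent matrix product:

```python
import numpy as np

rng = np.random.default_rng(0)

n_subjects, n_params = 8, 3          # hypothetical sizes
n_covariates = 2                     # group mean + one phenotypic score

# Between-subject design matrix X: a column of ones (group mean)
# plus one mean-centred covariate (e.g. a phenotypic proxy).
X = np.column_stack([np.ones(n_subjects),
                     rng.standard_normal(n_subjects)])

# Second level parameters: one row per covariate, one column per
# first level (connectivity) parameter.
theta2 = rng.standard_normal((n_covariates, n_params))

# theta1 = (X kron I) theta2 + eps2, written without forming the
# Kronecker product: each subject's parameter vector is a
# covariate-weighted mixture of the second level rows.
eps2 = 0.1 * rng.standard_normal((n_subjects, n_params))
theta1 = X @ theta2 + eps2

# The explicit Kronecker form gives the same subject-wise parameters.
theta1_kron = (np.kron(X, np.eye(n_params)) @ theta2.reshape(-1)
               ).reshape(n_subjects, n_params) + eps2
assert np.allclose(theta1, theta1_kron)
```

Each column of *X* contributes one second level parameter per first level parameter, which is exactly the model space that Bayesian model reduction searches over.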

The key thing about this free energy is that it can be evaluated (using BMR) without optimising the first level posterior. This means the second level parameters (e.g., group means) can be optimised or estimated, for any given model of priors, without re-inverting the model at the first level. Technically, the inversion of the hierarchical or empirical Bayesian model only requires the posterior density from the inversion of each subject's DCM. In short, the use of BMR allows one to make inferences at the group level without having to re-estimate subject-specific parameters; see (Friston et al. 2015, 2016) for details and a study of the robustness of this scheme, and (Litvak et al., in press) for a reproducibility study using independent data under formally distinct models. Finally, after obtaining optimized second level estimates, these can be used as empirical priors to recursively optimize densities over parameters at the first level. This last step is not necessary, but can finesse the local minima problem inherent in nonlinear (dynamic causal) modelling at the first level and allows one to estimate subject- or trial-specific parameters when a subset of subjects (or trials) provides more informative data than others (e.g. because of differences in lead fields); see (Friston et al. 2016).
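The logic of BMR can be demonstrated in a toy linear-Gaussian setting (not a DCM), where the log evidence is analytic: the change in free energy under reduced priors, computed from the full posterior alone, matches the evidence obtained by refitting under the reduced priors. All model sizes and prior values below are illustrative:

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

rng = np.random.default_rng(1)
n, d = 12, 3
A = rng.standard_normal((n, d))          # hypothetical design matrix
sigma2 = 0.5                             # observation noise variance
y = A @ np.array([1.0, 0.5, 0.0]) + np.sqrt(sigma2) * rng.standard_normal(n)

def evidence_and_posterior(prior_mean, prior_cov):
    """Analytic log evidence and Gaussian posterior for y = A@theta + noise."""
    P0 = np.linalg.inv(prior_cov)
    P = P0 + A.T @ A / sigma2            # posterior precision
    mu = np.linalg.solve(P, P0 @ prior_mean + A.T @ y / sigma2)
    logev = mvn.logpdf(y, A @ prior_mean,
                       sigma2 * np.eye(n) + A @ prior_cov @ A.T)
    return logev, mu, np.linalg.inv(P)

# Full model: broad priors on all three parameters.
eta, C = np.zeros(d), np.eye(d)
F_full, mu, Sigma = evidence_and_posterior(eta, C)

# Reduced model: third parameter shrunk to (almost) zero a priori.
eta_r, C_r = np.zeros(d), np.diag([1.0, 1.0, 1e-6])
F_red, _, _ = evidence_and_posterior(eta_r, C_r)

def bmr_delta_F(mu, Sigma, eta, C, eta_r, C_r):
    """Change in log evidence under reduced priors, evaluated from the
    FULL posterior alone (Bayesian model reduction)."""
    P, P0, P0r = (np.linalg.inv(M) for M in (Sigma, C, C_r))
    Pr = P + P0r - P0                    # reduced posterior precision
    mur = np.linalg.solve(Pr, P @ mu + P0r @ eta_r - P0 @ eta)
    sld = lambda M: np.linalg.slogdet(M)[1]
    dF = 0.5 * (sld(P) + sld(P0r) - sld(P0) - sld(Pr))
    dF += 0.5 * (mur @ Pr @ mur - mu @ P @ mu
                 - eta_r @ P0r @ eta_r + eta @ P0 @ eta)
    return dF

# BMR reproduces the refitted evidence without touching the data again.
assert np.isclose(F_full + bmr_delta_F(mu, Sigma, eta, C, eta_r, C_r), F_red)
```

The same identity is what lets second level (group) models be scored from pre-computed subject-level posteriors.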

Our aim, in this second study, is to make inferences using data collected with laminar probes (see below). The first level generative model makes predictions of layer-specific responses, where we eschew the difficult problem of inverting detailed (compartmental) models (due to model complexity and conditional dependencies) by using simulated data obtained with a microscopic (compartmental) model \(m_{CM}\) to inform a (mean field) neural mass model \(m_{MF}\) of empirical data. In other words, we consider the joint optimization of compartmental and neural mass models, assuming they are homologous (i.e., they explain the same underlying cortical function and structure; see Pinotsis et al., under review, for more details). Effectively, we use the conditional densities obtained after fitting simulated data as empirical priors for subsequent analyses of empirical data. In what follows, we illustrate this approach after first describing a study of individual differences in gamma oscillations.

## Neural Models and Their Inversion with Variational Bayes

Population models come in different flavours (for a review, see Deco et al. 2008; Moran et al. 2015), with some cardinal distinctions; namely, the distinction between *convolution* and *conductance* dynamics, the distinction between *neural mass* and *mean field* formulations, and the distinction between *point sources* and *neural field* models. The first distinction pertains to the dynamics or equations of motion within a single population. Convolution models formulate synaptic dynamics in terms of a (linear) convolution operator, whereas conductance based models consider the (non-linear) coupling between conductance and voltage. The second distinction concerns how the behaviour of a neuronal population or ensemble of neurons is summarised: neural mass models describe an ensemble with its mean, i.e., a *point probability mass* over state space. This contrasts with mean field approaches that model the ensemble density, where different ensemble densities are coupled through their expectations and covariances; in other words, these models include a nonlinearity that follows from the interaction between first and second order moments. This extra realism allows them to reproduce faster population dynamics; for example, somatosensory evoked potentials (Marreiros et al. 2010; Pinotsis et al. 2013b). Finally, there is a distinction between models of populations as point sources (c.f., equivalent current dipoles) and models that have an explicit spatial domain over (cortical) manifolds, which call on neural fields. Neural field models are defined in terms of (integro-)differential equations that describe cortical dynamics in terms of (spatially) distributed sources sending afferent connections, conduction delays and lumped synaptic time constants (Pinotsis and Friston 2014b). These equations prescribe the activity in neuronal populations occupying bounded manifolds (patches) in different layers that lie beneath the cortical surface. 
In summary, field and mass models offer a coarse-grained description of spatiotemporal dynamics of brain sources in terms of smooth (analytic) connectivity matrices that also depend on time (and perhaps space).
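To make the convolution formulation concrete, here is a deliberately minimal two-population (excitatory and inhibitory) mass model, in which each mean depolarisation obeys the second order dynamics of a synaptic convolution kernel driven by sigmoid firing rates. The rate constants, gains and sigmoid slope are illustrative placeholders, not the priors of any model reviewed here:

```python
import numpy as np

def sigmoid(v, r=0.54, eta=0.0):
    # Population firing rate as a sigmoid of mean depolarisation
    return 1.0 / (1.0 + np.exp(-r * (v - eta)))

# Convolution-style neural mass: each population's depolarisation is the
# synaptic kernel h(t) = H*kappa*t*exp(-kappa*t) convolved with its
# presynaptic input, written as a pair of first-order ODEs per population.
def step(state, u, dt, kappa_e=1/4, kappa_i=1/16, H=4.0,
         g_ei=8.0, g_ie=8.0):
    ve, ze, vi, zi = state
    fe, fi = sigmoid(ve), sigmoid(vi)
    dze = kappa_e * H * (u - g_ie * fi) - 2*kappa_e*ze - kappa_e**2 * ve
    dzi = kappa_i * H * (g_ei * fe)     - 2*kappa_i*zi - kappa_i**2 * vi
    return state + dt * np.array([ze, dze, zi, dzi])

dt, T = 0.1, 1000.0                  # ms (arbitrary units throughout)
t = np.arange(0, T, dt)
state = np.zeros(4)
trace = np.empty_like(t)
for k, _ in enumerate(t):
    state = step(state, u=2.0, dt=dt)   # constant exogenous drive
    trace[k] = state[0]                 # excitatory depolarisation
```

Depending on the gains, such a loop settles to a fixed point or produces rhythmic transients; in DCM these same kernels are linearised to predict spectral responses.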

Compartmental models, on the other hand, operate at the single cell level. They yield precise descriptions of the anatomy, morphology and biophysical properties of the neurons that constitute populations. These models provide detailed descriptions of intracellular (longitudinal) currents within the long apical dendrites of synchronized cortical pyramidal cells, see e.g. (Bazhenov et al. 2002; Einevoll 2014; Krupa et al. 2008; Lindén et al. 2010; Ramirez-Villegas et al. 2015; Roth and Häusser 2001; Santaniello et al. 2015). These models embody the laminar structure of a cortical column and can characterize the cellular and circuit level processes that are measured with multielectrode arrays, MEG or electrocorticography. They provide characterizations of neuronal morphology and of how neurons are grouped together to form spatially extended networks with well-behaved intrinsic (inter- and intra-laminar) connectivity. In the analysis of microelectrode data below, we employ a compartmental model that was originally used to explain somatosensory evoked responses measured with MEG during a tactile stimulation paradigm (Jones et al. 2007), and its neural mass analogue. Having specified a particular generative or forward model of observed physiological responses, the next step is to estimate the evidence and parameters of competing models; usually using dynamic causal modelling (DCM).

DCM offers a framework for the inversion of state space models using a Variational Bayesian algorithm known as Variational Laplace. This is based on the optimization of a cost function called variational free energy, \({\mathcal{F}}\). This provides a bound on the model log-evidence that, under Gaussian assumptions about the posterior density and random effects, acquires a simple form; see (Friston et al. 2007) for details. A standard model inversion corresponds to the case when \({\mathcal{F}}\) is given by Eq. (3), while the empirical Bayesian approach used here considers the case where \({\mathcal{F}}\) is defined at the first and second levels of a hierarchical model (within and between subject, respectively) and is optimized with respect to first and second level posteriors (Eq. 3). Crucially, this optimization is computationally efficient, because the second level free energy receives a contribution from the first level that can be computed easily for any (reduced) priors, given the (pre-computed) posterior under full priors: see Friston et al. (2015, 2016) and subsequent applications for more details.
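The sense in which \({\mathcal{F}}\) bounds the log evidence can be checked in a conjugate (linear-Gaussian) toy example, where the bound is tight: writing \({\mathcal{F}}\) as accuracy minus complexity and plugging in the exact Gaussian posterior recovers the analytic log evidence. The model below is a stand-in for illustration only, not a DCM:

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

rng = np.random.default_rng(2)
n, d = 10, 2
A = rng.standard_normal((n, d))      # hypothetical observation model
sigma2 = 0.25
y = A @ np.array([1.0, -0.5]) + np.sqrt(sigma2) * rng.standard_normal(n)

# Gaussian prior p(theta) = N(0, C) and exact Gaussian posterior q(theta).
C = np.eye(d)
P = np.linalg.inv(C) + A.T @ A / sigma2
Sigma = np.linalg.inv(P)
mu = Sigma @ (A.T @ y) / sigma2

# Accuracy: expected log likelihood under q.
accuracy = (mvn.logpdf(y, A @ mu, sigma2 * np.eye(n))
            - 0.5 * np.trace(A.T @ A @ Sigma) / sigma2)

# Complexity: KL divergence between posterior q and prior p (Gaussians).
ld = lambda M: np.linalg.slogdet(M)[1]
complexity = 0.5 * (np.trace(np.linalg.solve(C, Sigma))
                    + mu @ np.linalg.solve(C, mu) - d + ld(C) - ld(Sigma))

F = accuracy - complexity

# For this conjugate model the bound is tight: F equals the log evidence.
log_evidence = mvn.logpdf(y, np.zeros(n),
                          sigma2 * np.eye(n) + A @ C @ A.T)
assert np.isclose(F, log_evidence)
```

In Variational Laplace the posterior is only approximately Gaussian, so \({\mathcal{F}}\) is a lower bound rather than an equality; the accuracy/complexity decomposition is the same.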

Model comparison then rests on the relative log evidence or log Bayes factor: by convention, a difference of about three or more is taken to mean that model \(m_{i}\) is better than \(m_{j}\); or, more exactly, that there is strong evidence for the *i*-th model relative to the *j*-th model.

## Explaining Intersubject Variability in Gamma Responses Using Neural Fields

The model used here is a neural field model of coupled excitatory and inhibitory populations, with connectivity kernels between populations *l* and *m* (Pinotsis et al. 2014). Its predictions of the observed (cross) spectral responses at the *q*-th sensor follow from the likelihood model, where \(\dag\) denotes the conjugate transpose matrix and \(Q = [q_{1} ,q_{2} ,q_{3} ,q_{4} ]\) is a vector of coefficients that weights the contributions of each neuronal population to the observed MEG signal. Here, \(g_{u} (\omega )\) is a spatiotemporal representation of fluctuations or inputs driving induced responses, which we assume to be a mixture of white and pink temporal components. The population weights are based on anatomical properties and the lead field configuration of each population (e.g. inhibitory neurons do not generate a large dipole), where each electrode or sensor has its own sensitivity profile, reflecting the topographic structure of the underlying cortical source.

Neural field model parameters

| Parameter | Physiological interpretation | Prior mean |
|---|---|---|
| \(\kappa_{1} ,\kappa_{2} ,\kappa_{3} ,\kappa_{4}\) | Postsynaptic rate constants | 1/2, 1/35, 1/35, 1/2 (ms\(^{-1}\)) |
| \(\alpha_{11} ,\alpha_{14} ,\alpha_{12} ,\alpha_{22} ,\alpha_{21} ,\alpha_{23} ,\alpha_{33} ,\alpha_{41} ,\alpha_{32} ,\alpha_{44}\) | Amplitude of intrinsic connectivity kernels (×10…) | 108, 45, 1.8, 9, 162, 18, 45, 36, 18, 9 (a.u.) |
| \(c_{ab}\) | Spatial decay of connectivity kernels | 0.6 (\(a \ne b\)), 2 (\(a = b\)) (mm\(^{-1}\)) |
| \(r,\eta\) | Parameters of the postsynaptic firing rate function | 0.54, 0 (mV) |
|  | Conduction speed | 0.3 m/s |
| \(\phi\) | Dispersion of the lead field | \(\sqrt 2 /16\) (mm) |
| \(q_{1} ,q_{2} ,q_{3} ,q_{4}\) | Neuronal contribution weights | 0.2, 0, 0.2, 0.6 |
| \(a_{u} ,a_{n}\) | Exogenous white input, channel-specific white noise (log-scale) | 0, 0 |
| \(\beta_{u} ,\beta_{n}\) | Exogenous pink input, channel-specific pink noise (log-scale) | 0, 0 |

Below, we use the likelihood model given by Eq. (5) and PEB to study intersubject variability in (stimulus-locked) oscillations recorded with MEG during a visual perception paradigm (Perry et al. 2013). Technical details of this analysis can be found in (Pinotsis et al., to appear). Here, our focus is on explaining these results from the vantage point of hierarchical Bayesian inference (see above). We used cross spectral densities as data features, taken from responses observed while subjects looked at stationary, vertically oriented bars. These spectral responses showed sustained activity in the 30–80 Hz range that varied across individuals with stimulus size: the gamma-band response either increased approximately linearly (monotonically) with stimulus size or saturated with increasing size, akin to surround suppression. So what are the key mechanisms that could explain these individual differences?
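For readers unfamiliar with these data features, cross spectral densities can be estimated with standard Welch-type estimators. The two synthetic channels below, sharing a 50 Hz component as a stand-in for a gamma-band MEG source, are purely illustrative:

```python
import numpy as np
from scipy.signal import csd, welch

rng = np.random.default_rng(3)
fs, T = 600.0, 10.0                    # hypothetical sampling rate / duration
t = np.arange(0, T, 1/fs)

# Two synthetic 'sensors' sharing a gamma-band (50 Hz) component
# buried in independent noise.
shared = np.sin(2*np.pi*50*t)
ch1 = shared + rng.standard_normal(t.size)
ch2 = 0.8*shared + rng.standard_normal(t.size)

f, Sxy = csd(ch1, ch2, fs=fs, nperseg=1024)   # complex cross-spectrum
_, Sxx = welch(ch1, fs=fs, nperseg=1024)      # auto-spectra
_, Syy = welch(ch2, fs=fs, nperseg=1024)

# Magnitude-squared coherence: the normalised cross spectral density.
coherence = np.abs(Sxy)**2 / (Sxx * Syy)

# Peak coherent frequency within the gamma band (30-80 Hz).
band = (f >= 30) & (f <= 80)
peak_f = f[band][np.argmax(coherence[band])]
```

In the analyses above, the (complex) cross spectra themselves are the data features the generative model must predict, with coherence and phase delays implicit in them.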

In the second level model, *X* and *W* are design matrices describing group (between-subject) and within-subject effects respectively. In our application, we assume (for simplicity) \(W = I\) and consider three proxies to describe phenotypic variations between subjects; namely, the change in amplitude of gamma responses with increasing stimulus size, the peak frequency over all stimuli and the amplitude of gamma responses averaged over stimuli (Perry et al. 2013). These proxies or phenotypes enter the design matrix *X*, creating the model space depicted in Fig. 3. First, we fit the (first level) model to individual subject data, as in a standard DCM approach. Then, we use Bayesian model reduction to invert the hierarchical model (2). The Kronecker tensor product with the identity matrix, \(X \otimes I\), means that we have a second level parameter for every second level (phenotypic) variable and every first level (connectivity) parameter. This means one can identify the combination of connectivity parameters and phenotypic variables that best explains intersubject variability; the model space corresponds to all combinations of between-subject effects. Having defined the hierarchical model, we can establish the significance of any group effects using Bayesian model reduction over (second level) models; see Fig. 3.

In short, this example shows how non-invasive (macroscopic) data can be used to make inferences at a microcircuitry level. In this instance, we have taken the opportunity to highlight inferences in the setting of hierarchical or empirical Bayesian models that accommodate intersubject variability. Our conclusion is that intersubject variability in visually induced gamma responses is best explained by differences in the intrinsic (laminar) connectivity to and from inhibitory interneurons.

## Modelling Layer-Specific Activity Using Neural Mass Models

The first of the above models is a well-known conductance based (microscopic) model (Bush and Sejnowski 1993). In this model, neurons and their constituent parts (axonal arbours, soma etc.) are considered as cylindrical conductors (segments) and transmembrane potentials are given by aggregates of Ohmic currents. These currents flow across the compartment, forming an RC circuit and obey Kirchhoff’s law. \(L_{im}\) are lead field coefficients for each compartment and sensor, \(A_{m} ,l_{m}\) are the cross-sectional area and the length of compartment *m* (projected in a direction perpendicular to apical dendrites). \(\rho_{m}\), \(c_{m}^{{}}\) are the axial resistivity and membrane capacitance and \(J_{m} (t)\) is the longitudinal current density. This model yields detailed descriptions of intracellular longitudinal currents—within the long apical dendrites of synchronized cortical pyramidal cells—that follow from cable theory. Neuronal populations are modelled as spatially organised networks with the soma of principal cells in supragranular and infragranular layers. This model captures the laminar structure of cortical columns and can characterize the cellular and circuit level processes that are measured with multi-electrode arrays or MEG. It also provides a model of neuronal morphology and how neurons are grouped together to form spatially extended networks, with precise connectivity.
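The following sketch caricatures the key quantity in such models, the longitudinal (axial) current between compartments of an RC cable, using a hypothetical two-compartment neuron. The parameters and the single lead field weight are illustrative and are not those of the Bush and Sejnowski (1993) or Jones et al. (2007) models:

```python
import numpy as np

# Hypothetical two-compartment (soma + apical dendrite) RC sketch of the
# longitudinal current that dominates the measured dipole.
dt, T = 0.01, 50.0                    # ms
t = np.arange(0, T, dt)

cm = 1.0        # membrane capacitance (arbitrary units)
gl = 0.1        # leak conductance
ga = 0.5        # axial (coupling) conductance between compartments

v = np.zeros(2)                       # [soma, dendrite] potentials
I_axial = np.empty_like(t)

for k, tk in enumerate(t):
    # Brief synaptic drive to the distal (dendritic) compartment.
    I_syn = np.array([0.0, 1.0 if 5 < tk < 25 else 0.0])
    I_ax = ga * (v[1] - v[0])         # longitudinal current, dendrite -> soma
    dv = (-gl * v + I_syn + np.array([I_ax, -I_ax])) / cm
    v += dt * dv                      # forward Euler step
    I_axial[k] = I_ax

# Dipole/LFP proxy: lead-field-weighted longitudinal current. A single
# weight here stands in for the sum over compartments of L_m * A_m * J_m.
lfp = 0.2 * I_axial
```

In a full compartmental model this sum runs over hundreds of segments per cell, and it is the alignment of apical dendrites that makes the population dipole measurable.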

The second model considered above is the neural mass variant of the Bush and Sejnowski (1993) model; see Fig. 2. The crucial difference between the two models in Eq. (6) is that the latter operates at the mesoscale and cannot describe microscopic effects like dendritic delays or back propagation. However, by fitting responses generated by its homologous microscopic model (using DCM), we obtain a prior distribution of neural mass model parameters that can faithfully explain responses recorded with laminar probes; see also Pinotsis et al., under review. In other words, we can establish a mapping between detailed compartmental models based upon conductances and simpler neural mass models based upon (implicit) synaptic convolutions. This mapping uses exactly the same inference machinery used to analyse empirical data but, in this instance, we are fitting neural mass models to responses that are generated by detailed compartmental models.
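The posteriors-as-empirical-priors strategy can be sketched in a conjugate linear-Gaussian setting: a first fit to data simulated from a (stand-in) detailed model yields a posterior that then serves as the prior for fitting the empirical data. Everything below (design matrices, noise levels) is hypothetical:

```python
import numpy as np

def gaussian_update(prior_mean, prior_cov, A, y, sigma2):
    """Conjugate Bayesian update for a linear-Gaussian observation model."""
    P = np.linalg.inv(prior_cov) + A.T @ A / sigma2
    cov = np.linalg.inv(P)
    mean = cov @ (np.linalg.solve(prior_cov, prior_mean) + A.T @ y / sigma2)
    return mean, cov

rng = np.random.default_rng(4)
d, n = 2, 30
theta_true = np.array([2.0, -1.0])
A1, A2 = rng.standard_normal((n, d)), rng.standard_normal((n, d))

# Stage 1: fit the simpler model to data SIMULATED from the detailed
# model (here, noisy linear observations stand in for both models).
y_sim = A1 @ theta_true + 0.5 * rng.standard_normal(n)
m1, C1 = gaussian_update(np.zeros(d), 10.0 * np.eye(d), A1, y_sim, 0.25)

# Stage 2: the stage-1 posterior becomes the empirical prior for the
# fit to 'empirical' data.
y_emp = A2 @ theta_true + 0.5 * rng.standard_normal(n)
m2, C2 = gaussian_update(m1, C1, A2, y_emp, 0.25)

# The empirical prior shrinks posterior uncertainty relative to fitting
# the empirical data under the original broad prior alone.
m_broad, C_broad = gaussian_update(np.zeros(d), 10.0 * np.eye(d),
                                   A2, y_emp, 0.25)
assert np.trace(C2) < np.trace(C_broad)
```

In the laminar application, stage 1 corresponds to fitting \(m_{MF}\) to data generated by \(m_{CM}\), and stage 2 to inverting \(m_{MF}\) on laminar probe recordings under those empirical priors.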

## A Working Memory Task and Experimental Data

Recordings were obtained from a monkey performing a memory guided saccade task. The monkey was trained to fixate on a central white dot during the 250 ms presentation of a red dot (cue) in the periphery of the animal's vision. This cue was presented at one of six potential locations evenly spread on an annulus 10° from the fixation point. After the cue, dots appeared at all of the six locations while the monkey maintained central fixation over a two second memory delay. Then, the central fixation dot turned purple and the peripheral stimuli disappeared. This told the monkey to make a direct saccade to the remembered location of the red cue dot to receive a juice reward. We recorded local field potentials from a 24 channel multi-contact laminar electrode implanted within prefrontal cortex.

## Dynamic Causal Modelling of Laminar Probe Data

After equipping the mass model of Fig. 2 with priors that are consistent with compartmental models (see above), we inverted the empirical responses induced during the delay period in the working memory task. We then used Bayesian model comparison to test whether the model could successfully identify the layer (superficial vs deep) from which we recorded responses.

This sort of validation is potentially important as laminar probes offer unbiased estimates of laminar specific activity—and the hierarchical architecture of extrinsic (between-source) connections rests primarily on laminar specific connectivity. Laminar probes are therefore an exciting technique that allows us to measure brain responses at an unprecedented resolution. When combined with dynamic causal modelling, these responses could be used to address several important questions that we review in the Conclusions below.

## Conclusion

We have reviewed two recent advances in hierarchical or empirical Bayesian modelling that enable us to deal with the ill-posed (inverse) problem of source reconstruction and disclose processes that generate neural rhythms—and mediate the propagation of information within and between cortical sources. In the first illustration, we showed that non-invasive recordings contain rich spatial information, despite the low resolution of M/EEG; while in the second, we attempted to reconcile models operating at the microscopic and mesoscopic scale—and show that DCM can correctly assign superficial and deep cortical dynamics to laminar-specific responses. This intralaminar DCM affords the same computational efficiency and advantages as all other models in DCM, but can exploit microscopic (laminar-specific) data that embody effects like antidromic currents and back-propagation.

Dynamic causal modelling of electrophysiological responses obtained with laminar probes is in a position to address several neurobiological questions: one of the key reasons to use laminar probes is that they can provide direct evidence that distinct cortical layers are involved in particular oscillations and computations. Previous studies of visual cortex suggest that oscillatory activity in the gamma and alpha bands is segregated by layer. Neurons in deep layers (layers 5 and 6) show spike-field coherence in the alpha band, while superficial layer (layers 2 and 3) neurons show spike-field coherence in the gamma band (Buffalo et al. 2011). A question of outstanding importance is whether this laminar segregation is preserved in prefrontal cortex, which is involved in top-down control of sensory cortex (Miller and Cohen 2001).

Superficial and deep cortical layers also tend to have distinct cortical targets. For example, superficial-layer neurons form the strongest source of cortico-cortical feedforward projections, while deep-layer neurons contribute predominantly to cortico-cortical feedback (Markov et al. 2013). Recently, it was shown that the laminar connectivity pattern of a particular inter-areal (extrinsic) connection predicts how inter-areal oscillatory activity is coordinated between the areas: when a given connection is dominated by superficial-layer projection neurons (characteristic of feedforward connectivity), gamma and theta oscillations predominate. On the other hand, when a reciprocal connection is dominated by deep-layer projection neurons (characteristic of feedback connectivity), beta oscillations appear to mediate neuronal communication (Bastos et al. 2015). These results suggest that the precise laminar pattern of extrinsic connectivity profoundly shapes inter-areal communication, and the frequencies over which it occurs.

Therefore, it appears that the functional role of oscillations is shaped both by cortical layer and inter-areal connection types, as reviewed above. This provides an important motivation for using multi-laminar probes to examine cortical activity during cognitive tasks. An equally important motivation, from our perspective, is to interrogate the canonical microcircuit hypothesis, which predicts that neurons in distinct cortical layers contribute to distinct computations (Bastos et al. 2012; Friston and Kiebel 2009; Friston et al. 2015). In particular, it has been hypothesized that superficial layer neurons can encode prediction error, while deep layer neurons encode expectations that are used to generate descending (feedback) predictions. The hierarchical message passing of prediction errors and predictions is thought to be a crucial part of predictive coding under the Bayesian brain hypothesis (Friston and Kiebel 2009; Rao and Ballard 1999; Summerfield et al. 2008). Therefore, multilaminar data may provide the critical test for these hypotheses: the modelling of these data could establish whether neuronal activities (spikes and LFPs) from different cortical layers are indeed involved in distinct computations implied by predictive coding. In parallel, these data can be used to inform and nuance laminar-resolved dynamic causal models of the sort we entertain here. In turn, more advanced models will provide more precise descriptions of laminar-resolved activity, allowing more mechanistic questions to be asked about the role of specific neuronal populations, intrinsic connectivity and their neuromodulators in cognition.

## Acknowledgments

This work was funded by the Wellcome Trust Grant No 088130/Z/09/Z, NIMH R37MH087027 and The MIT Picower Innovation Fund.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.