
Learning spiking neuronal networks with artificial neural networks: neural oscillations


Abstract

First-principles-based modeling has been extremely successful in providing crucial insights and predictions for complex biological functions and phenomena. However, such models can be hard to build and expensive to simulate for complex living systems. On the other hand, modern data-driven methods thrive at modeling many types of high-dimensional and noisy data. Still, the training and interpretation of these data-driven models remain challenging. Here, we combine the two types of methods to model stochastic neuronal network oscillations. Specifically, we develop a class of artificial neural networks that provide faithful surrogates of the high-dimensional, nonlinear oscillatory dynamics produced by a spiking neuronal network model. Furthermore, when the training data set is enlarged within a range of parameter choices, the artificial neural networks become generalizable to these parameters, covering cases in distinctly different dynamical regimes. Altogether, our work opens a new avenue for modeling complex neuronal network dynamics with artificial neural networks.


Notes

  1. The different criteria for MFE initiation in Algorithm 1 aim to ensure robust capture of MFEs, compensating for the lack of network-state information within each timestep of the tau-leaping simulations.

References

  • Aggarwal CC et al (2018) Neural networks and deep learning, vol 10. Springer, Cham, p 3
  • AlQuraishi M, Sorger PK (2021) Differentiable biology: using deep learning for biophysics-based and data-driven modeling of molecular mechanisms. Nat Methods 18(10):1169–1180
  • Henrie JA, Shapley R (2005) LFP power spectra in V1 cortex: the graded effect of stimulus contrast. J Neurophysiol 94(1):479–490
  • Azouz R, Gray CM (2000) Dynamic spike threshold reveals a mechanism for synaptic coincidence detection in cortical neurons in vivo. Proc Natl Acad Sci 97(14):8110–8115
  • Azouz R, Gray CM (2003) Adaptive coincidence detection and dynamic gain control in visual cortical neurons in vivo. Neuron 37:513–523
  • Barron AR (1994) Approximation and estimation bounds for artificial neural networks. Mach Learn 14(1):115–133
  • Bauer M et al (2006) Tactile spatial attention enhances gamma-band activity in somatosensory cortex and reduces low-frequency activity in parieto-occipital areas. J Neurosci 26(2):490–501
  • Bauer EP, Paz R, Paré D (2007) Gamma oscillations coordinate amygdalo-rhinal interactions during learning. J Neurosci 27(35):9369–9379
  • Börgers C, Kopell N (2003) Synchronization in networks of excitatory and inhibitory neurons with sparse, random connectivity. Neural Comput 15(3):509–538
  • Bressloff PC (1994) Dynamics of compartmental model recurrent neural networks. Phys Rev E 50(3):2308
  • Brosch M, Budinger E, Scheich H (2002) Stimulus-related gamma oscillations in primate auditory cortex. J Neurophysiol 87(6):2715–2725
  • Brunel N, Hakim V (1999) Fast global oscillations in networks of integrate-and-fire neurons with low firing rates. Neural Comput 11(7):1621–1671
  • Buice MA, Cowan JD (2007) Field-theoretic approach to fluctuation effects in neural networks. Phys Rev E 75(5):051919
  • Buschman TJ, Miller EK (2007) Top-down versus bottom-up control of attention in the prefrontal and posterior parietal cortices. Science 315:1860–1862
  • Cai D et al (2006) Kinetic theory for neuronal network dynamics. Commun Math Sci 4(1):97–127
  • Cai Y et al (2021) Model reduction captures stochastic gamma oscillations on low-dimensional manifolds. Front Comput Neurosci 15:74
  • Chariker L, Young L-S (2015) Emergent spike patterns in neuronal populations. J Comput Neurosci 38(1):203–220
  • Chariker L, Shapley R, Young L-S (2016) Orientation selectivity from very sparse LGN inputs in a comprehensive model of macaque V1 cortex. J Neurosci 36(49):12368–12384
  • Chariker L, Shapley R, Young L-S (2018) Rhythm and synchrony in a cortical network model. J Neurosci 38(40):8621–8634
  • Chon KH, Cohen RJ (1997) Linear and nonlinear ARMA model parameter estimation using an artificial neural network. IEEE Trans Biomed Eng 44(3):168–174
  • Koch C (1999) Biophysics of computation. Oxford University Press, Oxford
  • Csicsvari J et al (2003) Mechanisms of gamma oscillations in the hippocampus of the behaving rat. Neuron 37:311–322
  • Başar E (2013) A review of gamma oscillations in healthy subjects and in cognitive impairment. Int J Psychophysiol 90(2):99–117. https://doi.org/10.1016/j.ijpsycho.2013.07.005
  • Frien A et al (2000) Fast oscillations display sharper orientation tuning than slower components of the same recordings in striate cortex of the awake monkey. Eur J Neurosci 12(4):1453–1465
  • Fries P et al (2001) Modulation of oscillatory neuronal synchronization by selective visual attention. Science 291:1560–1563
  • Fries P et al (2008) The effects of visual stimulation and selective visual attention on rhythmic neuronal synchronization in macaque area V4. J Neurosci 28(18):4823–4835
  • Gerstner W et al (2014) Neuronal dynamics: from single neurons to networks and models of cognition. Cambridge University Press, Cambridge
  • Ghosh-Dastidar S, Adeli H (2009) Spiking neural networks. Int J Neural Syst 19(4):295–308
  • Goodfellow IJ, Shlens J, Szegedy C (2014) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572
  • Hasenauer J et al (2015) Data-driven modelling of biological multi-scale processes. J Coupled Syst Multiscale Dyn 3(2):101–121
  • He K et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
  • Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
  • Hodgkin AL, Huxley AF (1952) A quantitative description of membrane current and its application to conduction and excitation in nerve. J Physiol 117(4):500
  • Jack RE, Crivelli C, Wheatley T (2018) Data-driven methods to diversify knowledge of human psychology. Trends Cogn Sci 22(1):1–5
  • Janes KA, Yaffe MB (2006) Data-driven modelling of signal-transduction networks. Nat Rev Mol Cell Biol 7(11):820–828
  • Krystal JH et al (2017) Impaired tuning of neural ensembles and the pathophysiology of schizophrenia: a translational and computational neuroscience perspective. Biol Psychiatry 81(10):874–885
  • Li Z et al (2020) Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895
  • Li H et al (2020) NETT: solving inverse problems with deep neural networks. Inverse Probl 36(6):065005
  • Li Y, Xu H (2019) Stochastic neural field model: multiple firing events and correlations. J Math Biol 79(4):1169–1204
  • Li Y, Chariker L, Young L-S (2019) How well do reduced models capture the dynamics in models of interacting neurons? J Math Biol 78(1):83–115
  • Liu J, Newsome WT (2006) Local field potential in cortical area MT: stimulus tuning and behavioral correlations. J Neurosci 26(30):7779–7790
  • Lu L et al (2021) Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat Mach Intell 3(3):218–229
  • Mably AJ, Colgin LL (2018) Gamma oscillations in cognitive disorders. Curr Opin Neurobiol 52:182–187
  • Kovachki N, Lanthaler S, Mishra S (2021) On universal approximation and error bounds for Fourier neural operators. J Mach Learn Res 22:1–76
  • Nobukawa S, Nishimura H, Yamanishi T (2017) Chaotic resonance in typical routes to chaos in the Izhikevich neuron model. Sci Rep 7(1):1–9
  • Pesaran B et al (2002) Temporal structure in neuronal activity during working memory in macaque parietal cortex. Nat Neurosci 5(8):805–811
  • Medendorp WP et al (2007) Oscillatory activity in human parietal and occipital cortex shows hemispheric lateralization and memory effects in a delayed double-step saccade task. Cereb Cortex 17(10):2364–2374
  • Ponulak F, Kasinski A (2011) Introduction to spiking neural networks: information processing, learning and applications. Acta Neurobiol Exp 71(4):409–433
  • Popescu AT, Popa D, Paré D (2009) Coherent gamma oscillations couple the amygdala and striatum during learning. Nat Neurosci 12(6):801–807
  • Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707
  • Rangan AV, Young L-S (2013) Emergent dynamics in a model of visual cortex. J Comput Neurosci 35(2):155–167
  • Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117
  • Shorten C, Khoshgoftaar TM (2019) A survey on image data augmentation for deep learning. J Big Data 6(1):1–48
  • Solle D et al (2017) Between the poles of data-driven and mechanistic modeling for process operation. Chem Ing Tech 89(5):542–561
  • Tao L et al (2006) Orientation selectivity in visual cortex by fluctuation-controlled criticality. Proc Natl Acad Sci 103(34):12911–12916
  • Traub RD et al (2005) Single-column thalamocortical network model exhibiting gamma oscillations, sleep spindles, and epileptogenic bursts. J Neurophysiol 93(4):2194–2232
  • van der Meer MAA, Redish AD (2009) Low and high gamma oscillations in rat ventral striatum have distinct relationships to behavior, reward, and spiking activity on a learned spatial decision task. Front Integr Neurosci 3:9
  • van Wingerden M et al (2010) Learning-associated gamma-band phase-locking of action-outcome selective neurons in orbitofrontal cortex. J Neurosci 30(30):10025–10038
  • Wang S, Wang H, Perdikaris P (2021) Learning the solution operator of parametric partial differential equations with physics-informed DeepONets. Sci Adv 7(40):eabi8605
  • Whittington MA et al (2000) Inhibition-based rhythms: experimental and mathematical observations on network dynamics. Int J Psychophysiol 38(3):315–336
  • Wilson HR, Cowan JD (1972) Excitatory and inhibitory interactions in localized populations of model neurons. Biophys J 12(1):1–24
  • Womelsdorf T et al (2007) Modulation of neuronal interactions through neuronal synchronization. Science 316:1609–1612
  • Womelsdorf T et al (2012) Orientation selectivity and noise correlation in awake monkey area V1 are modulated by the gamma cycle. Proc Natl Acad Sci 109(11):4302–4307
  • Wu T et al (2022) Multi-band oscillations emerge from a simple spiking network. Chaos 33:043121
  • Xiao Z-C, Lin KK (2022) Multilevel Monte Carlo for cortical circuit models. J Comput Neurosci 50(1):9–15
  • Xiao Z-C, Lin KK, Young L-S (2021) A data-informed mean-field approach to mapping of cortical parameter landscapes. PLoS Comput Biol 17(12):e1009718
  • Yuan X et al (2019) Adversarial examples: attacks and defenses for deep learning. IEEE Trans Neural Netw Learn Syst 30(9):2805–2824
  • Zhang J et al (2014) A coarse-grained framework for spiking neuronal networks: between homogeneity and synchrony. J Comput Neurosci 37(1):81–104
  • Zhang J et al (2014) Distribution of correlated spiking events in a population-based approach for integrate-and-fire networks. J Comput Neurosci 36:279–295
  • Zhang JW, Rangan AV (2015) A reduction for spiking integrate-and-fire network dynamics ranging from homogeneity to synchrony. J Comput Neurosci 38:355–404
  • Zhang Y, Young L-S (2020) DNN-assisted statistical analysis of a model of local cortical circuits. Sci Rep 10(1):1–16


Acknowledgements

This work was partially supported by the National Science and Technology Innovation 2030 Major Program through grant 2022ZD0204600 (R.Z., Z.W., T.W., L.T.), and by the Natural Science Foundation of China through grants 31771147 (R.Z., Z.W., T.W., L.T.) and 91232715 (L.T.). Z.X. is supported by the Courant Institute of Mathematical Sciences through a Courant Instructorship. Y.L. is supported by NSF grants DMS-1813246 and DMS-2108628.

Funding

Directorate for Mathematical and Physical Sciences (2108628, 1813246), National Natural Science Foundation of China (31771147, 91232715), National Science and Technology Innovation 2030 Major Program (2022ZD0204600).

Author information


Corresponding authors

Correspondence to Louis Tao, Zhuo-Cheng Xiao or Yao Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix


1.1 Tau-leaping and SSA algorithms

The simulations of the SNN dynamics are carried out by two algorithms: tau-leaping and the stochastic simulation algorithm (SSA). The key difference is that the tau-leaping method processes the events occurring within a time step \(\tau \) in bulk, while SSA simulates the evolution event by event. Of the two, tau-leaping can be faster (with a properly chosen \(\tau \)), while SSA is usually more precise. Here we illustrate both algorithms with a Markov jump process as an example.

Algorithms. Consider \(X(t)=\{x_1(t),x_2(t),\ldots , x_N(t)\}\), where X(t) takes values in a discrete state space

$$\begin{aligned} S=\{s_1,s_2,\ldots ,s_M\}\subset \mathbb {R}^N. \end{aligned}$$

The transition from state X to state \(s_i\) at time t is denoted by \(T_{s_i}^t(X)\) and occurs after an exponentially distributed waiting time with rate \(\lambda _{s_i\leftarrow X}\). Here, \(s_i\in S(X)\), the set of states adjacent to X with non-zero transition rates. For simplicity, we assume \(\lambda _{s_i\leftarrow X}\) does not explicitly depend on t except through X(t).

Tau-leaping only considers X(t) on a time grid \(t = jh\), for \(j = 0,1,\ldots ,T/h\), assuming that at most one state transition occurs within each step:

$$\begin{aligned}&P(X^{(j+1)h} = s_i) \\&\qquad ={\left\{ \begin{array}{ll} h\lambda _{s_i\leftarrow X^{jh}} &{} s_i\in S(X^{jh}), \\ 1-h\sum _{s_k\in S(X^{jh})}\lambda _{s_k\leftarrow X^{jh}} &{} s_i = X^{jh}, \\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

On the other hand, SSA accounts for this simulation problem as:

$$\begin{aligned} X(T) = T_{X_k}^{t_k}\circ T_{X_{k-1}}^{t_{k-1}}\circ \ldots \circ T_{X_1}^{t_1}(X(0)), \end{aligned}$$

i.e., starting from \(X(0)\), X transitions to \(X_1, X_2,\ldots , X_k = X(T)\) at times \(0<t_1< t_2<\ldots< t_k <T \).

For \(t_\ell<t<t_{\ell +1}\), we sample the transition time from \(\text {Exp}(\sum _{s_i\in S(X(t))} \lambda _{s_i\leftarrow X(t)})\). That is, for independent, exponentially distributed random variables

$$\begin{aligned} \tau _i\sim \text {Exp}(\lambda _{s_i\leftarrow X(t)}), \end{aligned}$$

we have

$$\begin{aligned} t_{\ell +1} - t_\ell = \min _{s_i\in S(X(t))}\tau _i\sim \text {Exp}\left( \sum _{s_i\in S(X(t))} \lambda _{s_i\leftarrow X(t)}\right) . \end{aligned}$$

Therefore, in each step of an SSA simulation, the system state evolves forward by an exponentially distributed random time, whose rate is the sum of the rates of all the exponential “clocks”. The exact state \(s_i\) to which the transition takes place is then chosen randomly, with probability proportional to the corresponding transition rate.
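
As a concrete illustration of the two schemes above, the following Python sketch simulates a generic Markov jump process whose transition rates are supplied by a user-defined function; the function names and the toy birth-death example are ours, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def ssa(x0, rates, T):
    """Exact, event-by-event simulation of a Markov jump process.

    `rates(x)` returns a dict {next_state: rate} of the states reachable from `x`.
    """
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        r = rates(x)
        total = sum(r.values())
        if total == 0:
            break
        t += rng.exponential(1.0 / total)            # waiting time ~ Exp(sum of all rates)
        if t >= T:
            break
        states, w = list(r.keys()), np.array(list(r.values()))
        x = states[rng.choice(len(states), p=w / total)]  # target chosen proportionally to its rate
        path.append((t, x))
    return path

def tau_leaping(x0, rates, T, h):
    """Approximate simulation on the grid t = j*h, with at most one jump per step."""
    x, path = x0, [(0.0, x0)]
    for j in range(int(T / h)):
        r = rates(x)
        states, w = list(r.keys()), np.array(list(r.values()))
        p_jump = h * w                               # per-step jump probabilities
        p_stay = 1.0 - p_jump.sum()                  # valid as long as h * sum(rates) <= 1
        k = rng.choice(len(states) + 1, p=np.append(p_jump, p_stay))
        if k < len(states):
            x = states[k]
            path.append(((j + 1) * h, x))
    return path

# toy birth-death chain: x -> x+1 at rate 2, x -> x-1 at rate 0.5*x
birth_death = lambda x: {x + 1: 2.0, x - 1: 0.5 * x}
print(ssa(0, birth_death, T=10.0)[-1])
print(tau_leaping(0, birth_death, T=10.0, h=0.01)[-1])
```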

Implementation on spiking networks. We note that X(t) changes when

  1. neuron i receives an external input (\(v_i\) increases by 1, possibly entering \(\mathcal {R}\));

  2. neuron i receives a spike (\(H^E_i\) or \(H^I_i\) increases by 1);

  3. a pending spike takes effect on neuron i (\(v_i\) increases or decreases according to the synaptic strength);

  4. neuron i exits the refractory state (\(v_i\) goes from \(\mathcal {R}\) to 0).

The corresponding transition rates are either given directly (\(\lambda ^E\) and \(\lambda ^I\)) or are the inverses of the physiological time scales (\(\tau ^{E}\), \(\tau ^{I}\), and \(\tau ^{\mathcal {R}}\)). In an SSA simulation, when a state transition elicits a spike in a neuron, the synaptic outputs generated by this spike are immediately added to the pools of the corresponding types of effects, and the neuron enters the refractory state. In a tau-leaping simulation, by contrast, the spikes are recorded but the synaptic outputs are processed in bulk at the end of each time step. Therefore, all events within the same time step are uncorrelated.

1.2 The coarse-graining mapping

Here we give the definition of the coarse-graining mapping \(\mathcal {C}\) in Eq. 6. For any \(\omega \in \varvec{\Omega }\) written as

$$\begin{aligned} \omega =&(V_1,\ldots ,V_{N_E},V_{N_E+1},\ldots ,V_{N_E+N_I},\\&H^E_1,\ldots ,H^E_{N_E},H^E_{N_E+1},\ldots ,H^E_{N_E+N_I},\\&H^I_1,\ldots ,H^I_{N_E},H^I_{N_E+1},\ldots ,H^I_{N_E+N_I}), \end{aligned}$$

we define

$$\begin{aligned} \mathcal {C}(\omega )=\tilde{\omega }=(&n^E_1, n^E_2, \ldots , n^E_{22}, n^E_R, n^I_1, n^I_2, \ldots , n^I_{22}, n^I_R,\\&H^{EE}, H^{EI}, H^{IE}, H^{II}), \end{aligned}$$

where,

$$\begin{aligned} n^E_i&=\sum _{j=1}^{N_E}{} {\textbf {1}}_{\varGamma _i}(V_j),\quad \text {for }i=1,\ldots ,22;\\ n^E_R&=\sum _{j=1}^{N_E}{} {\textbf {1}}_{\{\mathcal {R}\}}(V_j);\\ n^I_i&=\sum _{j=N_E+1}^{N_E+N_I}{} {\textbf {1}}_{\varGamma _i}(V_j),\quad \text {for }i=1,\ldots ,22;\\ n^I_R&=\sum _{j=N_E+1}^{N_E+N_I}{} {\textbf {1}}_{\{\mathcal {R}\}}(V_j); \end{aligned}$$

and

$$\begin{aligned} H^{EE}&=\sum _{j=1}^{N_E}H_j^E;\qquad H^{IE}=\sum _{j=N_E+1}^{N_E+N_I}H_j^E;\\ H^{EI}&=\sum _{j=1}^{N_E}H_j^I;\qquad H^{II}=\sum _{j=N_E+1}^{N_E+N_I}H_j^I. \end{aligned}$$

Here, \({\textbf {1}}_{\text {A}}(a)\) is the indicator function of a set A, i.e., \({\textbf {1}}_{\text {A}}(a) = 1\) for all \(a\in \text {A}\) and \({\textbf {1}}_{\text {A}}(a) = 0\) otherwise. \(\varGamma _i\) is a subset of the state space for the membrane potential, given by

$$\begin{aligned} \varGamma _i = [-15+5i, -10+5i)\cap \varGamma . \end{aligned}$$
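
A minimal NumPy sketch of the coarse-graining map above is given next; it assumes membrane potentials are stored in a single array with excitatory neurons first, with the refractory state \(\mathcal {R}\) encoded as np.nan (our convention, not the paper's), and pending-spike counts stored per neuron.

```python
import numpy as np

def coarse_grain(V, HE, HI, NE):
    """Coarse-graining map C: omega -> omega_tilde.

    V  : membrane potentials of all neurons (E first, then I); np.nan encodes
         the refractory state R (our convention).
    HE : pending excitatory spike counts H^E_j, one entry per neuron.
    HI : pending inhibitory spike counts H^I_j, one entry per neuron.
    NE : number of excitatory neurons.
    """
    edges = np.arange(-10, 105, 5)            # Gamma_i = [-15+5i, -10+5i), i = 1..22
    VE, VI = V[:NE], V[NE:]
    nE, _ = np.histogram(VE[~np.isnan(VE)], bins=edges)
    nI, _ = np.histogram(VI[~np.isnan(VI)], bins=edges)
    nER = int(np.isnan(VE).sum())             # excitatory neurons in refractory
    nIR = int(np.isnan(VI).sum())             # inhibitory neurons in refractory
    HEE, HIE = HE[:NE].sum(), HE[NE:].sum()   # pending E-spikes onto E / onto I
    HEI, HII = HI[:NE].sum(), HI[NE:].sum()   # pending I-spikes onto E / onto I
    return np.concatenate([nE, [nER], nI, [nIR], [HEE, HEI, HIE, HII]])
```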

1.3 Pre-processing surrogate data: discrete cosine transform

Here we explain how the discrete cosine transform (DCT) is used in the pre-processing. For an input probability-mass vector

$$\begin{aligned} \varvec{p} = (n_1, n_2, \cdots , n_{22}), \end{aligned}$$

its DCT output \(\mathcal {F}_c(\varvec{p}) = (c_1,c_2,\ldots , c_{22})\) is given by

$$\begin{aligned} c_k = \sqrt{\frac{2}{22}}\sum _{l=1}^{22} \frac{n_l}{\sqrt{1+\delta _{k1}}}\cos \left( \frac{\pi }{44}(2l-1)(k-1)\right) , \end{aligned}$$
(9)

where \(\delta _{k1}\) is the Kronecker delta, so that only the first coefficient \(c_1\) is rescaled by \(1/\sqrt{2}\). The iDCT mapping \(\mathcal {F}^{-1}_c\) is defined as the inverse function of \(\mathcal {F}_c\).
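
Up to the 1-based indexing, the transform in Eq. (9) matches the orthonormal DCT-II, so in practice it and its inverse can be obtained from a standard library routine; a short sketch using scipy (our choice of library, not stated in the paper) is:

```python
import numpy as np
from scipy.fft import dct, idct

def F_c(p):
    """Orthonormal DCT-II of a 22-bin probability-mass vector (Eq. 9)."""
    return dct(np.asarray(p, dtype=float), type=2, norm='ortho')

def F_c_inv(c):
    """Inverse DCT, mapping coefficients back to a voltage profile."""
    return idct(np.asarray(c, dtype=float), type=2, norm='ortho')

p = np.random.dirichlet(np.ones(22))     # a random 22-bin probability-mass vector
assert np.allclose(F_c_inv(F_c(p)), p)   # the round trip recovers the profile
```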

1.4 The linear formula for firing rates

When preparing the parameter-generic training set, we use simple linear formulas to estimate the firing rates of the E and I neurons (\(f_E\) and \(f_I\); see Li and Xu 2019; Li et al. 2019). We take \(\theta \in {\Theta }\) for the synaptic coupling strengths, while the other constants are the same as in Table 1:

$$\begin{aligned} f_E&=\frac{\lambda ^E(M+C^{II})-\lambda ^I C^{EI}}{(M-C^{EE})(M+C^{II})+C^{EI}C^{IE}},\\ f_I&=\frac{\lambda ^I(M-C^{EE})+\lambda ^E C^{IE}}{(M-C^{EE})(M+C^{II})+C^{EI}C^{IE}}, \end{aligned}$$

where

$$\begin{aligned}&C^{EE}=N^EP^{EE}S^{EE},\ C^{IE}=N^EP^{IE}S^{IE}\\&C^{EI}=N^IP^{EI}S^{EI},\ C^{II}=N^IP^{II}S^{II}. \end{aligned}$$
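
The sketch below evaluates these formulas directly; the default constants (network sizes, connection probabilities, external rates, and the threshold M) are illustrative placeholders rather than the values of Table 1, and the inhibitory couplings are entered as magnitudes, which may differ from the paper's sign convention.

```python
def linear_rates(SEE, SIE, SEI, SII,
                 NE=300, NI=100, PEE=0.15, PIE=0.5, PEI=0.5, PII=0.4,
                 lamE=3000.0, lamI=3000.0, M=100.0):
    """Linear estimates of the E/I firing rates f_E and f_I (placeholder constants)."""
    CEE, CIE = NE * PEE * SEE, NE * PIE * SIE
    CEI, CII = NI * PEI * abs(SEI), NI * PII * abs(SII)  # inhibitory couplings as magnitudes (assumption)
    den = (M - CEE) * (M + CII) + CEI * CIE
    fE = (lamE * (M + CII) - lamI * CEI) / den
    fI = (lamI * (M - CEE) + lamE * CIE) / den
    return fE, fI

# e.g. the parameter set theta = (4, 3, -2.2, -2) used repeatedly in the figures
print(linear_rates(4.0, 3.0, -2.2, -2.0))
```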

1.5 The deep network architecture

In general, artificial neural networks (ANNs) are networks of interconnected computational units. Many different architectures are possible for ANNs; in this paper, we adopt the feedforward architecture, which is one of the simplest (Fig. 7).

Fig. 7

Diagram of a feedforward ANN

A feedforward ANN has a layered structure, in which units in the \(i\)-th layer drive the \((i+1)\)-th layer through a weight matrix \(\varvec{W}_i\) and a bias vector \(b_i\). Computation proceeds from one layer to the next. The first, “input” layer takes an input vector x and sends \(\varvec{W}_1x+b_1\) to the first “hidden” layer; the first hidden layer then sends \(\varvec{W}_2 f(\varvec{W}_1x+b_1)+b_2\) to the next layer, and so on, until the last, “output” layer produces an output vector y. In this paper, we implemented a feedforward ANN with four layers containing 512, 512, 512, and 128 neurons, respectively. We chose the Leaky ReLU function with a default negative slope of 0.01 as the activation function \(f(\cdot )\).

Feedforward ANNs are trained with the back-propagation (BP) algorithm. Let \(\mathcal{N}\mathcal{N}(x)\) denote the prediction of the ANN with input x, and \(L(\cdot )\) the loss function. For each entry (x, y) in the training data, we minimize the loss \(L(y-\mathcal{N}\mathcal{N}(x))\) by following the gradients with respect to each component of \(W_i\) and \(b_i\). The gradients are first computed for the last layer’s \(W_n\) and \(b_n\), then “propagated back” to adjust the \(W_i\) and \(b_i\) of the earlier layers. We chose the mean-square error as our loss function, i.e., \(L(\cdot )=||\cdot ||_{L^2}^2\).
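
The paper does not state which deep-learning framework was used; the following PyTorch sketch shows one way to realize the described architecture and a single training step, with the input/output dimensions, the final linear read-out, and the optimizer chosen by us for illustration.

```python
import torch
import torch.nn as nn

class FeedforwardSurrogate(nn.Module):
    """Hidden layers of 512, 512, 512, and 128 units, Leaky ReLU (slope 0.01)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        sizes = [in_dim, 512, 512, 512, 128]
        layers = []
        for a, b in zip(sizes[:-1], sizes[1:]):
            layers += [nn.Linear(a, b), nn.LeakyReLU(0.01)]
        layers.append(nn.Linear(128, out_dim))   # linear read-out (our assumption)
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

# one back-propagation step with the mean-square error loss
model = FeedforwardSurrogate(in_dim=44, out_dim=46)    # dimensions are placeholders
opt = torch.optim.Adam(model.parameters(), lr=1e-3)    # optimizer choice is ours
loss_fn = nn.MSELoss()

x = torch.randn(64, 44)       # a dummy batch of pre-MFE inputs
y = torch.randn(64, 46)       # the corresponding post-MFE targets
opt.zero_grad()
loss = loss_fn(model(x), y)   # L(y - NN(x)) = mean-square error
loss.backward()               # gradients propagated back through the layers
opt.step()
```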

Fig. 8

Left: pre-, post-, and predicted MFE profiles without DCT + iDCT; Middle: pre-, post-, and predicted MFE profiles with DCT + iDCT; Right: pre-, post-, and predicted MFE profiles with network parameters as additional inputs of the ANN. (Left and Middle: \(S^{EE}\), \(S^{IE}\), \(S^{EI}\), \(S^{II}\) = 4, 3, -2.2, -2; Right: 3.82, 3.24, -2.05, -1.87.)

1.6 Pre-processing in ANN predictions

Here we provide more examples of the ANN predictions of the voltage profiles. We compare how ANNs predict post-MFE voltage distributions \(\varvec{p}^E\) and \(\varvec{p}^I\) in three different settings in Fig. 8. In each panel divided by red lines, the left column gives an example of pre-MFE voltage distributions, while the right column compares the corresponding post-MFE voltage distributions collected from ANN predictions (red) vs. SSA simulations. Results from the ANN without pre-processing, the ANN with pre-processing, and the parameter-generic ANN are depicted in the left, middle, and right panels, respectively.

1.7 Principal components of voltage distributions

The voltage distribution vectors of the form below are used to plot the distributions in the phase space shown in the middle panels of Figs. 5, 6, 9D, 10D, 11D, 12D, 13, 14, and 15:

$$\begin{aligned} (n^E_1, n^E_2, \ldots , n^E_{22}, n^E_R, n^I_1, n^I_2, \ldots , n^I_{22}, n^I_R). \end{aligned}$$

The vectors from the training set (colored blue in the figures) are used to generate the basis of the phase space via the svd function in numpy.linalg in Python. The first two rows of \(V^\top \) are the first two principal components (PCs) of the space. The scores of the vectors from the training set and of the approximated results are the dot products of these vectors with the normalized PCs.

The ksdensity function in MATLAB (with default settings) is used to estimate the kernel-smoothing density of the profile distribution from the data points generated above. The contours mark each tenth of the maximal height of the distribution (with a 0.1% offset so that the peak itself is displayed).
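
A sketch of this projection, assuming the training-set voltage-distribution vectors are stacked row-wise in a NumPy array (variable names are ours); following the text, no mean-centering is applied before the SVD.

```python
import numpy as np

def pc_scores(train_vecs, query_vecs, n_pc=2):
    """Score voltage-distribution vectors against the training set's leading PCs.

    The rows of V^T returned by the SVD are already unit-norm, so they serve
    directly as the normalized principal components.
    """
    _, _, Vt = np.linalg.svd(np.asarray(train_vecs), full_matrices=False)
    pcs = Vt[:n_pc]                        # first rows of V^T = leading PCs
    return np.asarray(query_vecs) @ pcs.T  # scores = dot products with the PCs

train = np.random.rand(1000, 46)   # dummy (n^E_1..n^E_R, n^I_1..n^I_R) vectors
print(pc_scores(train, train[:3]))
```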

Fig. 9

DNN predictions and surrogate dynamics in an ER random network with 400 neurons. A. Mapping \(\widehat{F}^{\theta }_1\) in the ER network for \(\theta = (S^{EE}, S^{IE}, S^{EI}, S^{II})= (4, 3, -2.2, -2)\). Left: a pre-MFE \(\varvec{p}^E(v)\) and \(\varvec{p}^I(v)\); Right: post-MFE \(\varvec{p}^E(v)\) and \(\varvec{p}^I(v)\) produced by the ANN (blue) vs. spiking network simulations (orange). B. Comparison of E and I spike numbers during MFEs, ANN predictions vs. SSA simulations; the distributions are depicted by contours at each tenth of the maximum of the kernel-smoothing density estimate. C. Example of pre- and post-MFE voltage distributions \(\varvec{p}^E\) and \(\varvec{p}^I\) in the surrogate dynamics. D. Distribution of the first two principal components of \(\varvec{p}^E\) and \(\varvec{p}^I\), depicted by tenth-of-maximum contours. E. Raster plots of the simulated surrogate dynamics and the real dynamics starting from the same initial profiles (color figure online)

Fig. 10

DNN predictions and surrogate dynamics in a 400-neuron random network with log-normal degree distribution. A-E are in parallel to Fig. 9

Fig. 11

DNN predictions and surrogate dynamics in an ER random network with 4000 neurons. A-E are in parallel to Fig. 9, except that \(\theta = (S^{EE}, S^{IE}, S^{EI}, S^{II})= (0.4, 0.3, -0.22, -0.2)\)

Fig. 12

DNN predictions and surrogate dynamics in a 4000-neuron random network with log-normal degree distribution. A-E are in parallel to Fig. 11

Fig. 13

Surrogate dynamics produced by the parameter-generic MFE mapping \(\widehat{F}_1\) in two fixed networks with 400 neurons. Left: ER network; Right: network with log-normal degree distribution. A-B: Examples of pre- and post-MFE voltage distributions \(\varvec{p}^E\) and \(\varvec{p}^I\) in the surrogate dynamics. C-D: Distributions of the first two principal components of \(\varvec{p}^E\) and \(\varvec{p}^I\), depicted by tenth-of-maximum contours. E-F: Raster plots of the simulated surrogate dynamics and the real dynamics starting from the same initial profiles

Fig. 14

Surrogate dynamics produced by parameter-generic MFE mapping \(\widehat{F}_1\) in two fixed networks with 4000 neurons. Left: ER network; Right: Network with log-normal degree distribution. A-F are in parallel to Fig. 13

Fig. 15

Surrogate dynamics produced by the parameter-generic MFE mapping \(\widehat{F}_1\) for parameter sets lying outside the sampled 4D cube. Left: \((S^{EE},S^{IE},S^{EI},S^{II}) = (5, 3, -2.2, -2)\); Right: \((S^{EE},S^{IE},S^{EI},S^{II}) = (4, 4, -2.2, -2)\). A-F are in parallel to Fig. 13

Fig. 16

Fixed random graphs used for SNN simulation. ER: Erdős–Rényi random graph. LN: random graph with log-normal degree distribution. A light-colored block at (i, j) represents a directed synaptic connection from i to j. The last quarter of the neurons are inhibitory, while the others are excitatory. This figure is compressed to reduce the file size

Fig. 17

Distributions of the magnitude of MFEs (number of spikes) from the sampled parameter sets and from the specific parameter set \((S^{EE},S^{IE},S^{EI},S^{II}) = (4, 3, -2.2, -2)\)

Fig. 18

Distributions of the magnitude of MFEs (number of spikes) from the enlarged set of initial profiles and a single trajectory

Fig. 19

Firing rates of networks with parameter sets that can produce accepted MFEs for training. (n=3000)

1.8 Consistent results from fixed random networks

Here we test the capability of our method on fixed network architectures. We select two types of random graphs: 1. the Erdős–Rényi random graph (ER), and 2. random graphs with a log-normal degree distribution (LN). In both types of graphs, the average edge density is consistent with the connection probabilities P in Table 1, and their adjacency matrices are shown in Fig. 16. The four types of edges in the LN random graphs are sampled with a standard deviation of 0.2 in the logarithm, subject to constraints on the mean degrees. Results are obtained for random graphs of both sizes, 400 and 4000 neurons. For each network, MFEs are captured from the original network simulations and from simulations with the enlarged set of initial profiles (Fig. 18). The trained parameter-specific MFE mappings \(\widehat{F}^{\theta }_1\) are used to produce predictions of post-MFE states and surrogate network dynamics (Figs. 9, 10, 11, 12). MFEs are further captured in simulations with various parameter sets sampled from the 4D cube \(\varvec{\Theta }\). The trained parameter-generic \(\widehat{F}_1\) is used to produce predictions and surrogate dynamics, shown in Figs. 13 and 14. As seen in Figs. 9, 10, 11, 12, 13 and 14, the performance of the ANN surrogates is consistent with the case in which postsynaptic connections are decided on-the-fly.
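
The sketch below shows one way the two types of fixed connectivity could be drawn. The ER block follows the standard recipe; the log-normal construction (sampling in-degrees in log-space with a standard deviation of 0.2 and matching the mean to \(N\cdot P\), then picking presynaptic partners uniformly at random) is one plausible reading of the description above, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def er_block(n_pre, n_post, p):
    """Erdős–Rényi-style directed block: each edge present independently with probability p."""
    return (rng.random((n_pre, n_post)) < p).astype(int)

def lognormal_block(n_pre, n_post, p, sigma_log=0.2):
    """Directed block whose in-degrees are log-normal with log-std 0.2 and mean n_pre * p."""
    mean_deg = n_pre * p
    mu = np.log(mean_deg) - 0.5 * sigma_log**2     # so that E[degree] = mean_deg
    A = np.zeros((n_pre, n_post), dtype=int)
    for j in range(n_post):
        k = int(np.clip(round(rng.lognormal(mu, sigma_log)), 0, n_pre))
        A[rng.choice(n_pre, size=k, replace=False), j] = 1   # k random presynaptic partners
    return A

# e.g. an E-to-E block for 300 excitatory neurons with connection probability 0.15 (placeholders)
A_ee = lognormal_block(300, 300, 0.15)
```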

1.9 Varying the synaptic coupling strengths generates a broad range of firing rates and magnitudes of MFEs

The training of the parameter-generic MFE mapping \(\widehat{F}_1\) requires MFEs from simulations with a variety of parameter sets \(\theta \). As introduced in Sect. 4.3, the sets sampled from the 4D cube \(\varvec{\Theta }\) are first filtered by the estimated firing rates computed with the linear formula. A large fraction (about 80%) of the sets passing this step generate MFEs that can be captured and accepted by our algorithm. These sets produce a wide range of firing rates (Fig. 19), and the simulated firing rates generally match the linear formula. The major rejected region occurs at high \(S^{EE}\), \(S^{IE}\) and low \(S^{II}\), \(S^{EI}\); it neighbors the accepted region with high firing rates and lies near the singular region of the linear formula.

The magnitudes of MFEs (number of spikes) from the various parameter sets show a much wider distribution than those from the original parameter set (Fig. 17).

1.10 Extrapolating network dynamics out of \(\varvec{\Theta }\) with parameter-generic MFE mapping \(\widehat{F}_1\)

To test the extrapolation ability of our method, we generate surrogate dynamics for networks with \(\theta \)'s outside of \(\varvec{\Theta }\) using the parameter-generic MFE mapping \(\widehat{F}_1\). The two \(\theta \)'s are \((S^{EE},S^{IE},S^{EI},S^{II}) = (5, 3, -2.2, -2)\) and \((S^{EE},S^{IE},S^{EI},S^{II}) = (4, 4, -2.2, -2)\).

The parameter-generic MFE mapping \(\widehat{F}_1\), trained with MFEs in \(\varvec{\Theta }\), can still capture the neuronal oscillations and predict post-MFE states in the two networks (Fig. 15). The behaviors of the networks under these two parameter sets are also reproduced: the strong recurrent excitation of the former \(\theta \) makes MFEs prone to concatenation, while the less synchronized character of the latter \(\theta \) makes MFEs harder to trigger or identify. This result demonstrates the robustness and extrapolation capability of our method; future work can focus on improving the precision of the predictions in detail (Figs. 18, 19).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, R., Wang, Z., Wu, T. et al. Learning spiking neuronal networks with artificial neural networks: neural oscillations. J. Math. Biol. 88, 65 (2024). https://doi.org/10.1007/s00285-024-02081-0

