Computational Models of Timing Mechanisms in the Cerebellar Granular Layer
A long-standing question in neuroscience is how the brain controls movement that requires precisely timed muscle activations. Studies using Pavlovian delay eyeblink conditioning provide valuable insight into this question. In delay eyeblink conditioning, which is believed to involve the cerebellum, a subject learns an interstimulus interval (ISI) between the onsets of a conditioned stimulus (CS) such as a tone and an unconditioned stimulus (US) such as an airpuff to the eye. After a conditioning phase, the subject’s eyes automatically close or blink when the ISI has passed after CS onset. This timing information is thought to be represented in some way in the cerebellum. Several computational models of the cerebellum have been proposed to explain the mechanisms of time representation, and they commonly point to the granular layer network. This article reviews these computational models and discusses the possible computational power of the cerebellum.
Keywords: Cerebellum; Time; Delay eyeblink conditioning; Neural network models; Recurrent network; Granular layer
How do Purkinje cells learn the timing of the cessation of firing? Here, we assume that the CS in delay eyeblink conditioning evokes temporally constant neural activity in the pontine nuclei (PN) and does not contain any temporal information. Therefore, our question is how the ISI from the CS onset to the US onset is represented in the cerebellar cortex. A working hypothesis is that any time measured from the CS onset is represented by the sequential activation of granule cells or granule-cell populations: there should be a one-to-one correspondence between the passage of time (POT) from the CS onset and the temporal activation pattern of granule cells. A specific ISI is determined by the cessation of firing of some Purkinje cells that do not receive inputs from the granule cells that are active at that time. On the basis of this hypothesis, the mechanism by which Purkinje cells learn the time to stop firing is explained as illustrated in Fig. 1b. At the onset of CS presentation, a sequence of active granule cells or granule-cell populations starts. Let us assume that the active granule cell or the population of active granule cells at the US onset is uniquely determined. At the US onset, the activity of the inferior olive (IO) conveyed by the climbing fiber (CF) induces strong depolarization in a Purkinje cell that simultaneously receives signals from the active granule cell/cell population through parallel fibers (PFs). The conjunctive excitation of PF and CF induces long-term depression (LTD) in those PF–Purkinje cell synapses. Only the PF–Purkinje cell synapses activated by the granule cell/cell population at the US onset are depressed, and the other synapses are unaffected.
Because the active granule cell/cell population changes gradually with time, the net excitatory drive to the Purkinje cell starts to decrease in advance of the US onset and reaches its minimum at the US onset, so that the cell begins to cease firing slightly earlier than the US onset. Therefore, the most important aspect of computational modeling is how the granular layer generates sequential activities of granule cells/cell populations without recurrence.
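The learning scheme described above can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed models: the Gaussian activity profiles, the number of cells, the multiplicative LTD rule, and all function and parameter names are assumptions chosen only to show how depressing the synapses active at the US onset produces a minimum of the Purkinje cell drive at that time.

```python
import math

N_CELLS = 50      # hypothetical number of granule cells
DURATION = 1.0    # CS duration in seconds (assumed)
WIDTH = 0.05      # temporal width of each cell's activity (assumed)

def granule_activity(t):
    """Activity of every hypothetical granule cell at time t after CS onset.

    Each cell is assumed to peak at its own preferred time, so the active
    population changes gradually as time passes.
    """
    peaks = [DURATION * i / (N_CELLS - 1) for i in range(N_CELLS)]
    return [math.exp(-((t - p) ** 2) / (2 * WIDTH ** 2)) for p in peaks]

def train_ltd(weights, us_time, n_trials=20, rate=0.2):
    """Depress each PF-Purkinje weight in proportion to PF activity at US onset."""
    pf_at_us = granule_activity(us_time)
    for _ in range(n_trials):
        weights = [w * (1.0 - rate * a) for w, a in zip(weights, pf_at_us)]
    return weights

def purkinje_drive(weights, t):
    """Net excitatory drive to the Purkinje cell at time t."""
    return sum(w * a for w, a in zip(weights, granule_activity(t)))

us_time = 0.5
weights = train_ltd([1.0] * N_CELLS, us_time)
times = [i / 100 for i in range(101)]
drive = [purkinje_drive(weights, t) for t in times]
pause_time = times[drive.index(min(drive))]
print("time of minimum drive:", pause_time)
```

Because neighboring cells overlap in time, the drive in this sketch already declines before `us_time`, which mirrors the anticipatory onset of the pause described in the text.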
We have classified the current computational models of POT representation in the cerebellar granular layer into four types: (1) delay line [7, 8, 9, 10], (2) spectral timing, (3) oscillator [12, 13], and (4) random projection [14, 15, 16, 17]. In the following section, we will review and evaluate each type of model separately.
Models of the Cerebellar Granular Layer for POT Representation
Delay Line Model
Spectral Timing Model
The spectral timing model proposed by Bullock et al. is based on the assumption of a wide distribution of membrane time constants for different granule cells. This assumption leads to various delays for granule cells to be activated. Once a granule cell becomes active, the cell is inhibited by a companion Golgi cell as long as the CS signal is sustained. Consequently, the granule cell becomes active only once during the CS presentation, which results in a spectrum of activity peaks of granule cells throughout the CS–US interval (Fig. 2b). In the spectral timing model, to generate a variety of spectral activity patterns, the membrane time constant of granule cells has to vary widely up to a few seconds to cover the range of ISIs (2–3 s) representable in delay eyeblink conditioning, which seems biologically implausible. In addition, there is no experimental evidence that a precise, one-to-one connectivity pattern exists between granule cells and Golgi cells.
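To make the time-constant argument concrete, the sketch below treats each granule cell as a simple leaky integrator driven by a constant CS. This is a deliberate simplification and not the equations of Bullock et al.; the threshold value, the list of time constants, and the function names are assumptions for illustration only.

```python
import math

def activation_latency(tau, threshold=0.5):
    """Time at which a leaky integrator with time constant tau reaches threshold.

    With a constant input, x(t) = 1 - exp(-t / tau), so the threshold is
    crossed at t = -tau * ln(1 - threshold).
    """
    return -tau * math.log(1.0 - threshold)

def required_tau(isi, threshold=0.5):
    """Time constant needed for a cell to become active exactly at the ISI."""
    return -isi / math.log(1.0 - threshold)

# A hypothetical spectrum of time constants from 10 ms to about 5 s.
taus = [0.01 * 2 ** k for k in range(10)]
latencies = [activation_latency(tau) for tau in taus]

tau_for_3s = required_tau(3.0)
print("time constant needed for a 3 s ISI: %.2f s" % tau_for_3s)
```

Under these assumptions the activation latency scales linearly with the time constant, so covering ISIs of 2–3 s requires time constants of several seconds, which is the implausibility noted above.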
The above-mentioned models are based on the sequential activation of individual granule cells. In contrast, the following models rely on the sequential activation of granule-cell populations. In the oscillator model (Fig. 2c), granule cells are assumed to behave as oscillators with different frequencies and phases in response to a CS. Although these individual cells become active repeatedly during the CS presentation, the summed activity over granule cells within some cluster can be localized at a certain time; this follows from Fourier analysis, in which a temporally localized function can be expressed as a sum of sinusoidal components with various frequencies and phases. Thus, a given population of co-active cells can appear only once during the CS presentation. Such an active granule-cell population may be found at each time step, which constructs a sequence of populations of active granule cells instead of the sequential activation of individual granule cells.
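The Fourier argument can be checked numerically with a small sketch. The frequencies (1 to 20 Hz), the cosine waveform, and the target time `t0` are assumptions chosen for illustration and are not parameters of any reviewed model.

```python
import math

def population_activity(t, t0, freqs):
    """Mean activity of cosine oscillators whose phases align at time t0."""
    return sum(math.cos(2 * math.pi * f * (t - t0)) for f in freqs) / len(freqs)

freqs = [1.0 + k for k in range(20)]      # hypothetical oscillators, 1..20 Hz
times = [i / 1000 for i in range(1000)]   # 0 .. 0.999 s in 1 ms steps
t0 = 0.4                                  # time to be encoded (assumed)

summed = [population_activity(t, t0, freqs) for t in times]
peak_time = times[summed.index(max(summed))]
print("summed activity peaks at", peak_time, "s")
```

Each oscillator is active many times within the one-second window, yet the summed activity has a single dominant peak at `t0`. Because the slowest oscillator in this sketch runs at 1 Hz, the pattern would repeat after 1 s, so the lowest frequency bounds the longest interval that can be represented unambiguously.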
An oscillator model called the “phase encoder model” developed by Fujita was originally proposed as a model for the adaptation of the vestibulo-ocular reflex. In this model, individual granule cells become oscillators with a fundamental frequency specified by MF signals representing the head rotation velocity and with different time lags (phases). This may be possible, as shown in Fig. 2c, when granule cells oscillate with a slow frequency. Thus, granule cells become active one by one, resulting in the sequential activation. The time lags are developed by the randomness of MF-granule cell synaptic weights and the temporal integration of the feedback inhibition of granule cells via a Golgi cell.
Gluck et al. extended Fujita’s idea by describing granule cells as oscillators with different frequencies and time lags in response to temporally constant CS signals, and applied their model to delay eyeblink conditioning, but did not explain how granule cells come to behave as oscillators with frequencies spanning a broad range. They also assumed that PF–Purkinje cell synaptic weights are complex-valued to represent the information on time lag. More recently, Garenne and Chauvet suggested that MF-granule cell synapses show a wide range of efficacy, enabling granule cells to fire from 1 to 100 Hz. In oscillator models, the lowest firing rate among granule cells determines the longest representable ISI: their model represents the POT up to 1 s. To generate a variety of time lags, their model assumed a wide distribution of MF-granule synaptic efficacies. However, if the synaptic efficacy changes, different sequences of active granule-cell populations can be generated for the same mossy fiber signal, which in turn leads to unstable time representation. Indeed, a recent experimental study showed that MF-granule cell synapses are highly plastic. Thus, their model is unlikely to represent the POT robustly.
Random Projection Model
Buonomano and Mauk have proposed a POT representation model using a sequence of populations of active granule cells in a different way from that described in the oscillator model, and the same group later elaborated on their model. In this model (Fig. 2d), granule cells repeatedly undergo random transitions between active and inactive states during the CS presentation. Because of the dynamics of a granular layer network with random recurrent connections, the same population of active granule cells does not appear more than once. This enables a population of active granule cells to encode a unique time after the CS onset. Therefore, POT is represented by a sequence of populations of active granule cells. Buonomano and Mauk argued for the first time that this dynamic, aperiodic activity pattern of granule cells emerges from the granule–Golgi–granule negative feedback with settings of realistic network structure and parameters.
The essence of our model is to project a population of active granule cells to another population of active granule cells by the matrix transformation, where the matrix elements represent effective inhibitory connection weights via Golgi cells (Fig. 2d). This “random projection”¹ is repeated during CS presentation, and thereby a random sequence of populations of active granule cells is generated. Hereafter, we call this type of model the random projection model. The assumed random recurrent connections are biologically supported by the presence of a recurrent inhibitory network of granule cells via Golgi cells. Furthermore, the generation of the random sequence is reproducible across trials: when the same CS is presented, the same sequence is generated.
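The repeated projection described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published model: the rectified update rule, the exponentially decaying inhibition trace, the connection probability, and every parameter value and name are hypothetical, and the parameters are not tuned to guarantee a non-repeating sequence.

```python
import math
import random

def simulate(n_cells=60, n_steps=40, seed=0, tau=10.0, strength=2.0, p_connect=0.5):
    """Return the set of active granule cells at each time step.

    w[i][j] is the assumed effective inhibition from cell j onto cell i via
    Golgi cells; a fixed seed stands for a fixed network, so the same CS
    always drives the same wiring.
    """
    rng = random.Random(seed)
    w = [[strength / n_cells if rng.random() < p_connect else 0.0
          for _ in range(n_cells)] for _ in range(n_cells)]
    decay = math.exp(-1.0 / tau)
    trace = [0.0] * n_cells          # temporally integrated past activity
    history = []
    for _ in range(n_steps):
        # Project the integrated activity through the random inhibitory matrix.
        inhibition = [sum(w_ij * u for w_ij, u in zip(row, trace)) for row in w]
        # Constant CS drive of 1.0 minus inhibition, rectified at zero.
        z = [max(0.0, 1.0 - h) for h in inhibition]
        trace = [decay * u + z_i for u, z_i in zip(trace, z)]
        history.append(frozenset(i for i, z_i in enumerate(z) if z_i > 0.0))
    return history

trial_a = simulate()
trial_b = simulate()
print("same sequence on both trials:", trial_a == trial_b)
print("active cells at steps 0..4:", [len(s) for s in trial_a[:5]])
```

Because the network and the CS drive are fixed, two trials produce the same sequence of active populations, which corresponds to the reproducibility property stated in the text.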
The Marr–Albus–Ito theory advocates that granule cells act as a spatial pattern discriminator that represents MF signals with a sparse code [24, 25, 26]. This idea has been challenged by a theoretical study in which a large-scale granular layer model exhibited memory capacity much smaller than expected. Nevertheless, the random projection model not only supports the hypothesis of spatial discrimination by granule cells in the Marr–Albus–Ito theory but also creates a novel concept of spatiotemporal discrimination as an information processing capability of the cerebellum.
Granular Layer Models for Other Functions
So far, we have introduced granular layer models that transform temporally constant MF input signals into a sequence of active granule cells or granule-cell populations that represents POT. Here, we briefly introduce granular layer models for other functions.
Three models have been proposed, in which the granular layer works as decorrelation filters that perform principal component analysis, independent component analysis, and maximization of information transfer between MFs and granule cells. The decorrelation filter extracts a few meaningful components from a number of noisy and redundant signals. Such decorrelation filter models acquire sparse representation of MF signals, as envisioned by Marr and Albus. These models generate time-varying granule cell activity in response to time-varying MF signals, but they do not generate time-varying granule cell activity in response to constant MF signals. Therefore, as long as we assume that the CS does not contain any temporal information, these models may not be able to account for delay eyeblink conditioning.
Related to this issue, Freeman and Muckler have reported that neurons in the pontine nuclei sending MF inputs to the cerebellum exhibit three different temporal discharge patterns. Phasic neurons elicit spikes transiently in response to the CS onset. Such phasic responses may have a function of resetting the temporal sequences of granule cells’ activities. Sustained and late neurons elicit spikes tonically during the CS presentation. Sustained neurons keep their firing rates relatively constant, whereas late neurons increase their firing rates gradually. Thus, it is possible that MF inputs as a whole convey temporally modulated signals. However, the representation of POT requires a precise and robust relationship between time and the temporal discharge patterns. Therefore, the sustained and late discharge patterns of the pontine neurons should have fine temporal structure and be reproducible across different trials of CS presentation. Although Figs. 4, 5, and 6 in their paper illustrate temporal modulation in the firing rate of sustained and late neurons during the CS presentation, it is unclear whether the temporal modulation is reproducible across trials or simply irreproducible noise. On the other hand, Aitkin and Boyd have also reported that some neurons in cats’ lateral pontine nuclei elicit spikes transiently in response to the onset of the acoustic stimulation and some other neurons elicit spikes tonically during the tone presentation. The tonic discharge gradually decreases with time, and exhibits little temporal modulation. This study suggests that the fine temporal modulation in discharge patterns, if any, is just noise. Thus, the presence of three types of neurons in the pontine nuclei may not contribute to the POT representation, but further experiments are needed to clarify the possibility that pontine neurons represent POT during CS presentation.
Another issue is that the decorrelation filter models require learning of connection weights between granule and Golgi cells for realizing sparse coding. The learning changes the activity pattern of these cells gradually across repeated trials. On the other hand, a recent experimental study demonstrated that the activity pattern of Golgi cells does not change across 300 trials of saccade adaptation, suggesting that learning for sparse coding may not take place in the granular layer.
Extra Granular Layer Models for POT Representation
We also briefly introduce some other models that propose possible mechanisms of POT representation outside the granular layer. Fiala et al. have proposed a Purkinje cell model based on the slow process of intracellular signal transduction mediated by the metabotropic glutamate receptor (mGluR). The variation in the number of mGluRs expressed on the dendrites for different Purkinje cells results in different latencies in the elevation of intracellular Ca2+ concentration for those cells, producing a spectrum of intracellular Ca2+ transients across different Purkinje cells. The transient increase of Ca2+ activates Ca2+-dependent K+ currents, and the channel conductance is increased by PF–CF pairing. Thereby, after repeated pairings of stimulation to PFs and CFs, a particular Purkinje cell learns to pause at a particular ISI. Fiala et al. assumed that the amount of expressed mGluRs was constant. On the other hand, Steuber and Willshaw assumed that the amount of mGluRs adaptively changes by learning. PF–CF pairing stimulation adjusted Ca2+ response latency to match a particular ISI, which causes the timed pause of the Purkinje cell. In contrast, Schreurs and colleagues have reported in a series of experiments [37, 38, 39] that K+ currents, including the Ca2+-dependent K+ current, decreased during delay eyeblink conditioning, which resulted in an increase in the excitability of the Purkinje cell. To address this contradiction, Hong and Optican have recently proposed another Purkinje cell model coupled with stellate cells. This model is composed of many pairs of a Purkinje cell and a stellate cell. Both Purkinje and stellate cells increase their excitability by PF–CF pairing stimulation. However, the increase of stellate cells’ excitability is faster and stronger than that of Purkinje cells, and thereby inhibition from stellate cells to Purkinje cells comes to dominate excitation from PFs, resulting in the timed pause.
This model is another version of spectral timing models: in this model, the temporal spectrum is generated by many pairs of Purkinje and stellate cells, whereas in Bullock and Grossberg’s model, the spectrum is generated by many pairs of granule and Golgi cells.
Comparison of Random Projection Model with Experiments
We have computationally demonstrated that the involvement of NMDAR-mediated EPSPs with a long decay time is an important requisite for granule cells to exhibit randomly repetitive transition between burst and silent states during persistent sensory stimulation. To our knowledge, there have not yet been enough in vivo granule cell recording studies to provide experimental evidence supporting the random projection model. Unfortunately, some studies [41, 42, 43, 44] used ketamine for anesthesia, which is a blocker of NMDARs (and references therein). Two studies [41, 42] used very brief sensory stimulation lasting 50 ms. In delay eyeblink conditioning, the ISI must be longer than 100 ms to generate robust CRs (cf. Fig. 1D of ), suggesting that the sensory stimulation should be at least twice as long to observe spatiotemporal dynamics of granule cell activity. We also confirmed that, with a brief CS shorter than 100 ms, our cerebellar model failed to learn to elicit anticipatory CRs (unpublished observation).
Svensson and Ivarsson have reported that short-lasting CSs elicited CRs after acquisition training with sustained CSs, suggesting that animals somehow learned to bridge the off-stimulus trace interval as well as the conditioning. Their experimental paradigm seems to correspond to trace eyeblink conditioning rather than delay eyeblink conditioning; in trace conditioning, the hippocampus and/or the prefrontal cortex are thought to play a more important role than the cerebellum. However, the hippocampus and the prefrontal cortex could not have contributed because their animals were decerebrated. These authors interpreted their findings by hypothesizing that the cerebellar cortex contains the neural substrate for keeping some information on CS signals even after the cessation of the CS presentation. They justified the hypothesis by the observation of Larson-Prior et al. that granule cells in slice preparations sustained firing activities in response to MF stimulation hundreds of milliseconds after the offset of the stimulation. Kotani et al. have also investigated trace conditioning in decerebrate guinea pigs. They reported that even decerebrate animals can acquire and express eyeblink conditioning in a trace paradigm. They argued that the feedback loop from the interpositus nuclei to the pontine nuclei retains the activity during the stimulus-free trace interval. These two studies suggest that decerebrate animals can elicit CRs in response to transient CSs, at least after the training in a delay paradigm. Therefore, they challenge the classical hypothesis on the neural substrates of trace eyeblink conditioning, in which the prefrontal cortex and/or hippocampus sustains the CS information during the off-stimulus period and transmits the activity to the pontine nuclei.
Based on these studies, it is expected that if animals are trained in both delay and trace paradigms alternately session by session, they should exhibit intact expression of CRs in both paradigms after sufficient training, even if their hippocampus or prefrontal cortex is inactivated. However, Kalmbach et al. have performed such experiments and obtained results contrary to this expectation: when the prefrontal cortex was inactivated after the conditioning, CRs were disrupted in the trace paradigm, whereas CRs were intact in the delay paradigm. This suggests that the prefrontal cortex plays a role in bridging the stimulus-free trace interval, thereby supporting the classical hypothesis. Therefore, we may still be far from complete identification of neural substrates for delay and trace conditioning.
Regarding the functional role of MF signals, there is another experiment performed by Jörntell and Ekerot using awake cats. The top-left and bottom-right panels of Fig. 3c in their paper show an irregular temporal modulation of granule cell activities during stimulation, which seems to support our random projection model. However, the authors suggested that granule cells simply work to enhance the signal-to-noise ratio, because the temporal patterns of granule cell spike activity appeared to follow the activity in the presynaptic mossy fibers. This interpretation may derive from their use of PSTHs with a large bin size. Therefore, to clarify whether the random projection model is biologically plausible, single-unit recordings of cerebellar granule cells in awake animals should be performed with a higher temporal resolution during persistent sensory stimulation.
Notably, some experimental studies have explored the activity of Golgi cells rather than granule cells (and references therein) to elucidate the granular layer dynamics. These studies are based on the idea that the spatiotemporal activity pattern of granule cells is shaped by time-varying inhibition from Golgi cells.
A New Trend in Neural Computation
The random projection model belongs to the class of liquid-state machines or echo-state networks, which have recently been proposed as a new paradigm of modern neural computation (see [56, 57] for recent work). A liquid-state machine consists of a large random recurrent network called a reservoir that maps input signals into a higher dimensional space, and a set of neurons called readouts that receive inputs from reservoir neurons to extract time-varying information. A learning rule adopted by the simple perceptron model is applied to supervised learning of the readouts. The network architecture and the learning rule of the liquid-state machine are much simpler than those of conventional recurrent networks, yet the computational power is shown to be versatile. A major advantage of the liquid-state machine over conventional recurrent networks is its fast learning. Furthermore, there are a variety of real-world applications using liquid-state machines, including speech perception, robot control, and financial prediction, which indicates the strong computational power of liquid-state machines. Our random projection model is mathematically equivalent to the liquid-state machine when the granular layer, Purkinje cells, and LTD of PF–Purkinje cell synapses are replaced with the reservoir, readouts, and the learning rule, respectively. This suggests that the cerebellar cortex, in so far as it can be represented by the random projection model, possesses a versatile computational power.
Owing to its huge computational power, the random projection model is capable of constructing internal models that are believed to exist in the cerebellum. Therefore, we believe that the random projection model is a good candidate computational model of the cerebellum.
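The reservoir and readout division of labor can be sketched as follows. This is a generic illustration and not the authors' implementation: the tanh reservoir, the Gaussian recurrent weights, the batch delta (least-mean-squares) rule used as a stand-in for the perceptron-style rule, and all names and parameter values are assumptions. How well the readout isolates the target time depends on the reservoir dynamics, which are not tuned here.

```python
import math
import random

def run_reservoir(n_units=30, n_steps=50, seed=1):
    """States of a fixed random recurrent network (reservoir) under constant input."""
    rng = random.Random(seed)
    w = [[rng.gauss(0.0, 1.5 / math.sqrt(n_units)) for _ in range(n_units)]
         for _ in range(n_units)]
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]   # constant CS drive
    x = [0.0] * n_units
    states = []
    for _ in range(n_steps):
        x = [math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b)
             for row, b in zip(w, w_in)]
        states.append(x)
    return states

def readout(weights, x):
    return sum(w * s for w, s in zip(weights, x))

def mse(weights, states, targets):
    return sum((readout(weights, x) - y) ** 2
               for x, y in zip(states, targets)) / len(states)

def train_readout(states, targets, rate=0.01, epochs=200):
    """Batch delta rule on the readout only; reservoir weights stay fixed."""
    weights = [0.0] * len(states[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(states, targets):
            err = readout(weights, x) - y
            for i, s in enumerate(x):
                grad[i] += err * s / len(states)
        weights = [w - rate * g for w, g in zip(weights, grad)]
    return weights

states = run_reservoir()
targets = [1.0 if t == 30 else 0.0 for t in range(len(states))]  # respond at step 30
loss_before = mse([0.0] * len(states[0]), states, targets)
weights = train_readout(states, targets)
loss_after = mse(weights, states, targets)
print("readout error before/after training:", loss_before, loss_after)
```

In the correspondence stated in the text, `run_reservoir` plays the role of the granular layer, `readout` that of a Purkinje cell, and `train_readout` that of plasticity at PF–Purkinje cell synapses; only the readout weights are modified, which is why learning in this class of networks is fast.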
In this article, we reviewed four types of computational models of the cerebellar granular layer for POT representation: delay line, spectral timing, oscillator, and random projection. We also briefly introduced granular layer models for other functions and extra granular layer models for POT representation. We pointed out the similarity between the random projection model and the liquid-state machine, thereby suggesting that the random projection model possesses a versatile computational power. Furthermore, we discussed which experiments are needed to clarify whether the random projection model is feasible for POT representation in the cerebellum.
Models presented in this article have been implemented in C and C++ independently of the original research. The source code is available at the Cerebellar Platform. Simulation of these models can be carried out online at the Cerebellar Simulator.
¹ Random projection is the name of an algorithm that maps a vector to another vector via a random matrix. Random projection is usually used to reduce dimensionality in data, whereas this term is used here to map a population of active granule cells to another population without changing the dimension.
We thank Dr. Soichi Nagao at RIKEN Brain Science Institute for his comments on a draft of this manuscript. We thank Mr. Takeru Honda at The University of Electro-Communications for his assistance in writing simulation programs. A part of this study was supported by Neuroinformatics Japan Center.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- 1. Ito M (1984) The cerebellum and neuronal control. Raven, New York
- 5. Hesslow G, Yeo CH (2002) The functional anatomy of skeletal conditioning. In: Moore JW (ed) A neuroscientist’s guide to classical conditioning. Springer, New York, pp 86–146
- 12. Gluck MA, Reifsnider ES, Thompson RF (1990) Adaptive signal processing and the cerebellum: models of classical conditioning and VOR adaptation. In: Gluck MA, Rumelhart DE (eds) Neuroscience and connectionist theory. Erlbaum, Hillsdale, New Jersey, pp 131–186
- 22. Vempala SS (2004) The random projection method, vol 65. Am Math Soc
- 24. Marr D (1969) A theory of cerebellar cortex. J Physiol (Lond) 202:437–470
- 38. Freeman JH, Shi T, Schreurs BG (1998) Pairing-specific long-term depression prevented by blockade of PKC or intracellular Ca2+. NeuroReport 9:2237–2241
- 56. Jaeger H, Maass W, Principe J (eds) (2007) Echo state networks and liquid state machines. Neural Netw 20
- 58. Jaeger H (2002) Tutorial on training recurrent neural networks, covering BPPT, RTRL, EKF and the echo state network approach. GMD report 159
- 61. Neural Forecasting Competition 3. http://www.neural-forecasting-competition.com/NN3/
- 63. Cerebellar platform. http://platform.cerebellum.neuroinf.jp/.
- 64. Cerebellar simulator. http://capsule.brain.riken.jp/~cerebellum/model.cgi.