Supervised learning with decision margins in pools of spiking neurons
Learning to categorise sensory inputs by generalising from a few examples whose category is precisely known is a crucial step for the brain to produce appropriate behavioural responses. At the neuronal level, this may be performed by adaptation of synaptic weights under the influence of a training signal, in order to group spiking patterns impinging on the neuron. Here we describe a framework that allows spiking neurons to perform such “supervised learning”, using principles similar to the Support Vector Machine, a well-established and robust classifier. Using a hinge-loss error function, we show that requesting a margin similar to that of the SVM improves performance on linearly non-separable problems. Moreover, we show that using pools of neurons to discriminate categories can also increase the performance by sharing the load among neurons.
To make sense of the world, animals must distinguish the sensory input patterns that characterize different objects or situations. In some cases, specific sensory patterns have innate behavioural associations, such as the species-typical meanings of animal vocalizations, for example growls and whines (Altenmüller et al., 2013). In other cases, however, these associations must be learned. In the laboratory, pairing an initially neutral conditioned stimulus such as a tone, light, or odour with an aversive unconditioned stimulus such as a foot shock leads an animal to respond to the conditioned stimulus as it does to the unconditioned stimulus; such learning is believed to depend on synaptic plasticity in the amygdala (Pape & Pare, 2010). Repeated performance of an action in a given circumstance leads to the formation of stimulus–response associations or habits, which are believed to develop through synaptic plasticity in the dorsal striatum (Yin & Knowlton, 2006). Importantly, learning of stimulus categories does not require any explicit behaviour, reward or punishment. For example, newborn female Belding’s ground squirrels learn the odours of their siblings simply by their presence in the nest during early life; this association allows later identification of kin during adulthood (Holmes, 1986).
In statistics and machine learning, association of input patterns with desired categories, as specified by a training signal, is referred to as supervised learning. This form of learning should be distinguished from reinforcement learning, in which learning is governed by a reward rather than an explicit training signal; and unsupervised learning, in which representations are found based on structure in the input, without any explicit training signal. A classical algorithm for supervised learning is the Perceptron learning rule (Rosenblatt, 1958), which trains a single artificial neuron to linearly weight its inputs such that category is predicted by whether the weighted sum exceeds a fixed threshold. The Support Vector Machine (SVM) improves on perceptron performance by using a margin (a gap between the training boundaries for different classes), as well as through other innovations such as the introduction of nonlinearities through a kernel function (Cortes & Vapnik, 1995).
A number of learning rules have been suggested by which spiking neurons might perform tasks analogous to supervised learning (Bohte et al., 2002; Florian, 2007; Pfister et al., 2006; Ponulak & Kasiński, 2010; Xu et al., 2013; Legenstein et al., 2005). Recently, concepts of the Perceptron were extended to spiking neurons in a framework called the “Tempotron”, in which an error signal is used to adjust synapses that were strongly active when the neuron was close to its threshold (Florian, 2012; Gütig & Sompolinsky, 2006; Gütig & Sompolinsky, 2009), producing 1 or 0 spikes according to the desired category. In the present work, we describe an adaptation of the SVM to spiking neurons, whose margin allows for the training of more general firing rate modulations than 0/1 spike. We found that a moderate training margin increases the learning speed of single neurons in linearly separable tasks, and increases their performance in linearly non-separable tasks. To further improve learning of linearly non-separable problems, we considered an extension in which neurons work in pools trained simultaneously (Urbanczik & Senn, 2009), whose combined activity forms the network’s response to a pattern. We found that this indeed improved performance, as the training signal, although global, nevertheless allowed different neurons to learn different receptive fields.
2 Material and methods
In all simulations, we used a conductance-based integrate-and-fire neuron model with a membrane time constant τm = 20 ms, a leak conductance gL = 10 nS, and a resting membrane potential Vrest = −70 mV. Spikes were generated when the membrane potential Vm reached the threshold Vthresh = −50 mV. To model the shape of the action potential, the voltage was set to 20 mV after threshold crossing, and then decayed linearly during a refractory period of duration τwidth = 5 ms to the reset value Vreset = −55 mV, following which an exponentially decaying depolarizing current of initial magnitude 50 pA and time constant τdep = 40 ms was applied (as in Clopath et al., 2010; Yger & Harris, 2013). We used this scheme with a high reset voltage and an afterdepolarization (ADP), rather than the more common low reset value, as it provides a better match to intracellular recordings in vitro and in vivo. Synaptic connections were modelled as transient conductance changes with instantaneous rise followed by exponential decay. Synaptic connections were excitatory only (synaptic weights were clipped when they attempted to cross zero), with a time constant τexc = 5 ms and a reversal potential Eexc = 0 mV.
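To make the neuron model concrete, the following is a minimal forward-Euler sketch in plain Python, using the parameter values quoted above. It is not the NEST model used for the simulations. The function name `simulate_neuron`, the use of a single shared weight for all inputs, and the choice to set the depolarizing current to exactly 50 pA at the end of each refractory period (rather than summing it across spikes) are illustrative assumptions.

```python
def simulate_neuron(input_spike_times_ms, weight_nS, t_stop_ms=100.0, dt_ms=0.1):
    """Forward-Euler sketch of a conductance-based integrate-and-fire neuron.

    Returns the list of output spike times in ms. Hypothetical helper,
    not the NEST implementation used in the paper.
    """
    # Parameter values quoted in the text.
    tau_m, g_leak = 20.0, 10.0                        # ms, nS
    v_rest, v_thresh, v_reset = -70.0, -50.0, -55.0   # mV
    v_peak, tau_width = 20.0, 5.0                     # mV, ms
    adp_amp, tau_dep = 50.0, 40.0                     # pA, ms
    tau_exc, e_exc = 5.0, 0.0                         # ms, mV
    capacitance = tau_m * g_leak                      # ms * nS = pF (200 pF)

    n_steps = int(round(t_stop_ms / dt_ms))
    refractory_steps = int(round(tau_width / dt_ms))

    # Count presynaptic spikes arriving in each time bin.
    arrivals = [0] * n_steps
    for t in input_spike_times_ms:
        if 0.0 <= t < t_stop_ms:
            arrivals[min(int(t / dt_ms), n_steps - 1)] += 1

    v, g_exc, i_adp = v_rest, 0.0, 0.0
    steps_left = 0            # remaining steps of the spike shape
    output_spikes = []
    for step in range(n_steps):
        g_exc += weight_nS * arrivals[step]           # instantaneous rise
        if steps_left > 0:
            # Linear decay from the 20 mV peak to the reset value.
            steps_left -= 1
            v = v_reset + (v_peak - v_reset) * steps_left / refractory_steps
            if steps_left == 0:
                i_adp = adp_amp                       # assumption: ADP is reset, not summed
        else:
            # nS * mV = pA, and pA / pF = mV per ms.
            current = g_leak * (v_rest - v) + g_exc * (e_exc - v) + i_adp
            v += dt_ms * current / capacitance
            if v >= v_thresh:
                output_spikes.append((step + 1) * dt_ms)
                v = v_peak
                steps_left = refractory_steps
        g_exc -= dt_ms * g_exc / tau_exc              # exponential synaptic decay
        i_adp -= dt_ms * i_adp / tau_dep              # exponential ADP decay
    return output_spikes
```

With no input the sketch stays at rest and emits no spikes, and because each spike occupies a 5 ms refractory period, at most 20 spikes can be emitted in a 100 ms trial.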
To reduce the time of the simulations, we used only 10 input neurons. For each input pattern, the firing rate of each input neuron is independently drawn from a uniform distribution between 0 and 1 Hz. The rate pattern is then normalised such that the total input rate is 10,000 Hz, comparable to the physiological regime in which neurons operate (assuming an average of 10,000 incoming synapses at 1 Hz). Every time a pattern is presented, the rate pattern is transformed into a novel 100 ms spiking pattern via the realization of ten independent and homogeneous Poisson processes.
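The input generation described above can be sketched as follows. This is an illustrative reconstruction using only the Python standard library, not the code used for the simulations; the function names and the use of exponential inter-spike intervals to realise each homogeneous Poisson process are assumptions.

```python
import random


def make_rate_pattern(n_inputs=10, total_rate_hz=10_000.0, rng=random):
    """Draw one rate per input uniformly in [0, 1] Hz, then rescale the
    pattern so that the rates sum to total_rate_hz."""
    raw = [rng.uniform(0.0, 1.0) for _ in range(n_inputs)]
    scale = total_rate_hz / sum(raw)
    return [r * scale for r in raw]


def poisson_spike_train(rate_hz, duration_ms=100.0, rng=random):
    """Homogeneous Poisson process, realised by summing exponentially
    distributed inter-spike intervals until the trial ends."""
    spikes = []
    if rate_hz <= 0.0:
        return spikes
    rate_per_ms = rate_hz / 1000.0
    t = rng.expovariate(rate_per_ms)
    while t < duration_ms:
        spikes.append(t)
        t += rng.expovariate(rate_per_ms)
    return spikes


def realise_pattern(rates_hz, duration_ms=100.0, rng=random):
    """One fresh spiking realisation of a rate pattern: an independent
    spike train (list of spike times in ms) per input neuron."""
    return [poisson_spike_train(r, duration_ms, rng) for r in rates_hz]


# Usage: the same rate pattern gives a different spike pattern on every call.
rng = random.Random(0)
rates = make_rate_pattern(rng=rng)
trial_1 = realise_pattern(rates, rng=rng)
trial_2 = realise_pattern(rates, rng=rng)
```

Because the total rate is fixed at 10,000 Hz, each 100 ms presentation contains about 1,000 input spikes on average, whatever the pattern.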
We derived the learning rule from approximate gradient descent on the Support Vector Machine cost function (see Fig. 1 panel C). This cost function E for a neuronal pool on a given trial is a function of the summed number of spikes Npool emitted by all the neurons within the pool during that trial, and of the category of the input pattern presented during that trial. This function depends on two parameters, the learning thresholds θ+ and θ−. If the input pattern belongs to the same category as the pool, then the pool should respond to it by emitting at least θ+ spikes. If this is the case then the cost for the pool is 0, otherwise it is equal to the number of missing spikes. The cost function is thus a rectified linear function of the pool’s number of spikes, with parameter θ+. If the input pattern is not of the same category as the pool, then the pool should respond to it by emitting no more than θ− spikes. If this is the case then the cost for the pool is 0, otherwise it is equal to the number of superfluous spikes. The cost function is thus a rectified linear function of the pool’s number of spikes, with parameter θ−.
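As a concrete reading of this description, the per-trial cost of a pool can be written as a pair of hinge functions. The sketch below is illustrative: the function and argument names are hypothetical, and the cost is taken to be zero when the spike count exactly equals the relevant threshold, which is what a rectified linear function gives.

```python
def pool_cost(n_pool, is_target, theta_plus, theta_minus):
    """Hinge-style cost of one pool on one trial.

    n_pool      -- summed spike count of all neurons in the pool
    is_target   -- True if the pattern belongs to the pool's own category
    theta_plus  -- learning threshold for the pool's own (+) patterns
    theta_minus -- learning threshold for the other (-) patterns
    """
    if is_target:
        # Missing spikes below theta_plus are penalised one for one.
        return max(0, theta_plus - n_pool)
    # Superfluous spikes above theta_minus are penalised one for one.
    return max(0, n_pool - theta_minus)


# Usage with thresholds (theta_minus, theta_plus) = (2, 6), i.e. a margin of 4 spikes.
print(pool_cost(3, True, 6, 2))   # 3 spikes missing on a + pattern
print(pool_cost(5, False, 6, 2))  # 3 superfluous spikes on a - pattern
print(pool_cost(7, True, 6, 2))   # requirement met, cost is 0
```

The gap between the two thresholds plays the role of the SVM margin: a larger gap asks for a larger difference between the responses to the two categories.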
This approximation captures well the relationship between firing rate and voltage in a typical trial generated with the input statistics of the classification task (Supplementary Figure 1).
Ignoring the reset mechanism and the non-linearity due to the spike, this would be exact for a current based neuron, but this is only an approximation for the conductance-based neuron which we implement, estimating the average membrane time constant with the average conductance received during the presentation of a single pattern (Gütig & Sompolinsky, 2009).
We impose the constraint that weights that attempt to become negative are clipped to zero, since we are using only excitatory synapses. In addition, we add a constraint that a neuron that doesn’t fire cannot reduce its incoming synaptic weights.
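These two constraints can be sketched as a post-processing step applied to a proposed weight change. The function name and the argument layout are hypothetical; only the two constraints themselves come from the text.

```python
def apply_constrained_update(weights, deltas, n_spikes):
    """Apply proposed weight changes under the two constraints.

    weights  -- current synaptic weights of one neuron (all >= 0)
    deltas   -- proposed changes, one per synapse
    n_spikes -- number of spikes the neuron emitted during the trial
    """
    updated = []
    for w, dw in zip(weights, deltas):
        # A neuron that did not fire cannot reduce its incoming weights.
        if n_spikes == 0 and dw < 0.0:
            dw = 0.0
        # Excitatory-only synapses: clip at zero instead of changing sign.
        updated.append(max(0.0, w + dw))
    return updated


# Usage: a depressing update on a neuron that fired, then on a silent one.
print(apply_constrained_update([0.5, 0.1], [-0.2, -0.3], n_spikes=3))
print(apply_constrained_update([0.5, 0.1], [-0.2, -0.3], n_spikes=0))
```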
Simulations of the spiking neurons were performed using a customised version of the NEST simulator (Diesmann & Gewaltig, 2007) and the PyNN interface (Davison et al., 2009), with a fixed time step of 0.1 ms.
Support vector machine
For Figs. 5 and 10, the linear Support Vector Machine of the Python scikit-learn toolkit (Pedregosa & Varoquaux, 2011) was trained on Poisson spike counts drawn from the same patterns that were used to train the neuronal pools. For each pattern number, the cost parameter (termed c) was chosen so as to optimise the SVM performance. This yielded the same optimal cost parameter c = 10⁻⁶ for all pattern numbers. In Fig. 5, performance for a lower and a higher value of the cost parameter is also shown.
We studied a learning algorithm for spiking neurons to perform supervised learning, based on the support vector machine (SVM) cost function. The network that was used for the task is shown in Fig. 1a. Working in a rate-based framework, we defined each input pattern by a set of mean rates of each of the input neurons during that pattern, which is transformed into a 100 ms spiking pattern via a homogeneous Poisson process generated anew each time a pattern is presented (see Fig. 1a left, and Material and Methods). The input patterns are normalised random rate vectors (randomly assigned to the two categories A and B, see Material and Methods). Fig. 1a is a schematic illustration of the learning task addressed by the spiking neurons, and of the generation process of the input spike trains from the input patterns of the two categories. All neurons in the two readout pools received connections from all input neurons. There are no lateral connections between the readout neurons (Fig. 1b). Each pool is assigned one category of inputs to which it must respond, its positive (+) patterns; the other patterns become the pool’s negative (−) patterns. Pool A’s + patterns are thus the A patterns, while its − patterns are the B patterns. Classification is assessed as correct if, in response to an A pattern, the summed number of spikes from pool A, NA, is greater than the summed number of spikes from pool B, NB (and vice versa).
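The readout criterion can be sketched in a few lines. The names below are hypothetical; note that, following the strict inequality in the criterion, a tie between the two pools is never counted as correct.

```python
def classify(n_a, n_b):
    """Return the category whose pool emitted more spikes, or None on a tie."""
    if n_a > n_b:
        return "A"
    if n_b > n_a:
        return "B"
    return None  # tie: matches neither category, so it is scored as an error


def fraction_correct(trials):
    """trials is a list of (true_category, n_a, n_b) tuples."""
    correct = sum(1 for label, n_a, n_b in trials if classify(n_a, n_b) == label)
    return correct / len(trials)


# Usage: three trials, of which the tied one is scored as incorrect.
trials = [("A", 9, 3), ("B", 2, 7), ("A", 4, 4)]
print(fraction_correct(trials))
```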
In our learning rule, each synapse thus accumulates an eligibility trace over the course of a trial. At the end of each trial, if the neuron receives an error signal, the eligibility trace is transformed into a synaptic change, the sign of which is dictated by the error signal. This defines a 3-factor learning rule: if there is no error signal, no plasticity occurs; if the error signal is positive, the rule is Hebbian (inputs that make the neuron fire are potentiated); but if the error signal is negative, the rule is anti-Hebbian (inputs that make the neuron fire are depressed). Therefore, unlike purely Hebbian or STDP rules that require homeostasis to ensure stability (Abbott & Nelson, 2000; Clopath et al., 2010; Yger & Harris, 2013), this rule is intrinsically stable. Note that such a notion of eligibility traces has already been proposed in the case of reinforcement learning with a delayed error signal (Izhikevich, 2007; Legenstein et al., 2008).
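The end-of-trial step of this three-factor rule can be sketched as below. How the eligibility trace is accumulated during the trial is not reproduced here, so the sketch takes the traces as given numbers; the function name, the encoding of the error signal as +1, 0 or −1, and the learning rate value are illustrative assumptions.

```python
def end_of_trial_update(weights, eligibility, error_signal, learning_rate=0.01):
    """Convert per-synapse eligibility traces into weight changes.

    weights      -- current synaptic weights of one neuron
    eligibility  -- eligibility trace accumulated by each synapse in the trial
    error_signal -- +1 (too few spikes), -1 (too many spikes) or 0 (no error)
    """
    if error_signal == 0:
        return list(weights)          # no error signal, no plasticity
    # +1 gives a Hebbian change, -1 an anti-Hebbian change.
    return [w + learning_rate * error_signal * e
            for w, e in zip(weights, eligibility)]


# Usage: the same traces potentiate or depress depending on the error sign.
w = [1.0, 2.0]
trace = [0.5, 0.0]
print(end_of_trial_update(w, trace, +1, learning_rate=0.1))
print(end_of_trial_update(w, trace, -1, learning_rate=0.1))
print(end_of_trial_update(w, trace, 0, learning_rate=0.1))
```

A synapse with a zero trace is left untouched whatever the error signal, which is how the rule restricts changes to inputs that contributed to the neuron's firing.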
The fact that optimal performance could be obtained in the 12 pattern case with a margin of 0 was at first surprising. For example, when trained with equal thresholds (θ−, θ+) = (4, 4), if each pool emitted exactly 4 spikes to every pattern, they would receive no error signal during training, yet their classification performance would be 0 %. To investigate how good performance could be obtained without a margin in a close-to-linearly separable situation, we plotted a histogram of spike count outputs (see Fig. 3c). Note that the spike count distribution of each category is broad and bell-shaped, even after learning; this reflects the random distribution of the multiple patterns in each class. High classification without a margin occurred because the centres of the distributions are widely separated. This can be characterised by the difference between the mean spike count in response to target patterns and the mean spike count in response to null patterns, which we will refer to as spike count modulation. We suggest that modulation occurs because the hinge cost function causes plasticity anytime the response exceeds the learning threshold, and the broadness of the spike count distributions for each class causes the centres of the spike count histograms to move apart, resulting in spike count modulation even when no margin was requested. For 24 patterns however, this separation did not occur, suggesting that in a highly nonlinearly separable problem, spike count modulation only occurs when a margin is explicitly requested.
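Spike count modulation, as defined above, is simply a difference of two means. A minimal sketch with a hypothetical function name and made-up example counts (not data from the simulations):

```python
from statistics import mean


def spike_count_modulation(target_counts, null_counts):
    """Mean spike count to target patterns minus mean count to null patterns."""
    return mean(target_counts) - mean(null_counts)


# Made-up example counts, for illustration only.
print(spike_count_modulation([6, 8, 7], [2, 3, 1]))

# The degenerate case discussed above: with thresholds (4, 4), a pool that
# always emits exactly 4 spikes receives no error signal, and its modulation is 0.
print(spike_count_modulation([4, 4, 4], [4, 4, 4]))
```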
We next asked how requesting a margin affected performance in the two cases. Figs. 3c, d and e show histograms, for various margins, of the spike counts emitted by each neuron in response to its + patterns (full lines) and in response to its − patterns (dashed lines). As the margin is increased, the spike distributions move further apart, allowing better separation in the case of 24 patterns (green). For 12 patterns however (red), because separation already occurred without a margin, little gain was derived from the margin, and indeed performance actually decreased in the case of an 8-spike margin, likely due to the broadening of the response distribution for − patterns. We speculate this may occur because, in order to respond very strongly to + patterns, the neurons cannot avoid also producing strong responses to at least some − patterns.
In this study, we presented a learning rule which allows multineuron pools to learn in a supervised way to increase their firing rate in response to a certain set of inputs but not to another set. We combined an approach similar to the Tempotron (Gütig & Sompolinsky, 2006; Gütig & Sompolinsky, 2009) for the synaptic update with concepts from the Support Vector Machine literature (Cortes & Vapnik, 1995). We found that a moderate training margin increases the learning speed of single neurons in linearly separable tasks, and increases their performance in linearly non-separable tasks. Although we did not assess the performance of the original Tempotron rule on our task, we found that, using a (0,1) threshold, a rule similar to the “voltage convolution” implementation of the Tempotron rule (Gütig & Sompolinsky, 2006) produced worse performance on our task. We note however that the learning task originally used to test the Tempotron consisted of detecting reliable spatiotemporal patterns, whereas our task consists of discriminating Poisson spike trains that can vary from one repeat to the next. This may explain the poor performance of the (0,1) rule on our task relative to some of the original applications in the Tempotron paper.
The performance of single neurons was bounded by the linear SVM performance, but performance could be increased by training neurons in pools with a single, global training signal. Although the neurons in a given pool received the same error signal derived from the pool’s number of spikes, they were nevertheless able to spontaneously select different features, thus classifying linearly non-separable inputs.
In models of unsupervised learning, lateral or recurrent inhibition is often used to force neurons to develop different receptive fields (Clopath et al., 2010; Masquelier et al., 2009; Yger & Harris, 2013). In the present case, recurrent inhibition was not necessary for neurons to evolve different receptive fields. Since our model has no feedforward inhibition, we normalised the rate patterns such that each pattern had the same global rate (otherwise, a pool would not be able to simultaneously respond with a high number of spikes to patterns of low input rate and with a low number of spikes to patterns of high input rate, and would therefore misclassify many patterns). Adding divisive feedforward inhibition to the model might allow it to extend to the classification of non-normalised rate patterns. In the present model, synaptic weights were not allowed to become negative. Such a constraint typically reduces the capacity of perceptrons to learn rate-based inputs (see for example Amit et al., 1989; Gardner, 1988; Legenstein & Maass, 2007). This loss in capacity could be compensated for in part by adding subtractive feedforward inhibition to our model.
Could an analogous rule be implemented in the brain? The rule requires two steps: first, an eligibility trace is constructed based on pre-synaptic input occurring shortly prior to or during postsynaptic depolarization; and second, this is consolidated into a change in synaptic strength by a later-arriving training signal. Molecular mechanisms that could underlie the eligibility trace are well described, such as the multiple phosphorylation cascades that occur downstream of calcium influx via the NMDA receptor (Sweatt, 2009). But how might a training signal be conveyed? In the case of reinforcement learning, dopamine has been suggested as a training signal, and dopamine has indeed been implicated in the consolidation of eligibility traces (Kentros et al., 2004). A role for eligibility traces in reinforcement learning has been modelled previously (Izhikevich, 2007; Legenstein et al., 2008; El Boustani et al., 2012). A global reinforcement signal, however, cannot instruct different neuronal populations with different target signals. A more flexible, higher-dimensional training signal might instead be conveyed by glutamatergic inputs. In the cerebellum, for example, climbing fibre inputs provide strong inputs that generate complex-spike bursts which are believed to constitute a training signal (Eccles et al., 1967; Marr, 1969; Raymond et al., 1996). A second example consists of auditory fear conditioning, in which a conditioned reflex is established by the coincidence of signals conveying a conditioned stimulus (a tone) with a stronger unconditioned stimulus (a shock), by potentially glutamatergic inputs onto the amygdala (Pape & Pare, 2010). Understanding how spiking neurons may perform supervised learning at a computational level may lead to better understanding of such neuronal circuits.
This work was supported by the Wellcome Trust (095668) and EPSRC (I005102, K015141).
Conflict of interest
- Abbott, L. F., & Nelson, S. B. (2000). Synaptic plasticity: taming the beast. Nature Neuroscience, 3(Suppl.), 1178–1183.
- Amit, D. J., Campbell, C., & Wong, K. Y. M. (1989). The interaction space of neural networks with sign-constrained synapses. Journal of Physics A: Mathematical and General, 22(21), 4687.
- Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning. Retrieved from http://link.springer.com/article/10.1007/BF00994018
- Davison, A. P., Brüderle, D., Eppler, J., Kremkow, J., Muller, E., … Yger, P. (2009). PyNN: a common interface for neuronal network simulators. Frontiers in NeuroInformatics …, 2, 11. doi: 10.3389/neuro.11.011.2008
- Eccles, J. C., Itō, M., & Szentágothai, J. (1967). The cerebellum as a neuronal machine (p. 335). Retrieved from http://books.google.fr/books/about/The_cerebellum_as_a_neuronal_machine.html?id=nWh9AAAAIAAJ&pgis=1
- El Boustani, S., Yger, P., Frégnac, Y., & Destexhe, A. (2012). Stable learning in stochastic network states. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 32(1), 194–214.
- Legenstein, R., Naeger, C., & Maass, W. (2005). What can a neuron learn with spike-timing-dependent plasticity? Neural Computation, 17(11), 2337–2382.
- Masquelier, T., Guyonneau, R., & Thorpe, S. J. (2009). Competitive STDP-based spike pattern learning. Neural Computation, 21(5), 1259–1276.
- Pedregosa, F., & Varoquaux, G. (2011). Scikit-learn: Machine learning in Python. … of Machine Learning …, 12, 2825–2830.
- Sweatt, J. D. (2009). Mechanisms of Memory, Second Edition (p. 450). Academic Press. Retrieved from http://www.amazon.com/Mechanisms-Memory-Second-Edition-Sweatt/dp/0123749514
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.