Information theory in neuroscience
Information Theory started and, according to some, ended with Shannon’s seminal paper “A Mathematical Theory of Communication” (Shannon 1948). Because its significance and flexibility were quickly recognized, there were numerous attempts to apply it to diverse fields outside of its original scope. This prompted Shannon to write his famous essay “The Bandwagon” (Shannon 1956), warning against indiscriminate use of the new tool. Nevertheless, non-standard applications of Information Theory persisted.
Very soon after Shannon’s initial publication (Shannon 1948), several manuscripts provided the foundations of much of the current use of information theory in neuroscience. MacKay and McCulloch (1952) applied the concept of information to propose limits on the transmission capacity of a nerve cell. This work foreshadowed future work on what can be termed “Neural Information Flow”: how much information moves through the nervous system, and the constraints that information theory imposes on the capabilities of neural systems for communication, computation and behavior. A second set of manuscripts, by Attneave (1954) and Barlow (1961), discussed information as a constraint on neural system structure and function, proposing that neural structure in sensory systems is matched to the statistical structure of the sensory environment in a way that optimizes information transmission. This is the main idea behind the “Structure from Information” line of research that is still very active today. A third thread, “Reliable Computation with Noisy/Faulty Elements”, started both in the information-theoretic community (Shannon and McCarthy 1956) and in neuroscience (Winograd and Cowan 1963). With the advent of integrated circuits that were essentially faultless, interest began to wane. However, as IC technology continues to push towards smaller and faster computational elements (even at the expense of reliability), and as neuromorphic systems are developed with variability designed in (Merolla and Boahen 2006), this topic is gaining in popularity again in the electronics community, and neuroscientists may again have something to contribute to the discussion.
1 Subsequent developments
The theme that arguably has had the widest influence on the neuroscience community, and is most heavily represented in the current special issue of JCNS, is that of “Neural Information Flow”. The initial works of MacKay and McCulloch (1952), McCulloch (1952) and Rapoport and Horvath (1960) showed that neurons are in principle able to relay large quantities of information. This research led to the first attempts to characterize the information flow in specific neural systems (Werner and Mountcastle 1965), and also started the first major controversy in the field, which still resonates today: the debate about timing versus frequency codes (Stein 1967; Stein et al. 1972). A steady stream of articles followed, both discussing these hypotheses and attempting to clarify the type of information relayed by nerve cells (Abeles and Lass 1975; Eagles and Purple 1974; Eckhorn and Pöpel 1974; Eckhorn et al. 1976; Harvey 1978; Lass and Abeles 1975; Norwich 1977; Poussart 1971; Stark et al. 1969; Taylor 1975; Walloe 1970).
After the initial rise in interest, the application of Information Theory to neuroscience was extended to a few more systems and questions (Eckhorn and Pöpel 1981; Eckhorn and Querfurth 1985; Fuller and Williams 1983; Kjaer et al. 1994; Lestienne and Strehler 1987, 1988; Optican and Richmond 1987; Surmeier and Weinberg 1985; Tsukuda et al. 1984; Victor and Johannesma 1986), but did not spread widely. This was presumably because, despite strong theoretical advances in Information Theory, its applicability was hampered by the difficulty of measuring and interpreting information-theoretic quantities.
The work of de Ruyter van Steveninck and Bialek (1988) started what could be called the modern era of information-theoretic analysis in neuroscience, in which Information Theory is seeing more and more refined applications. Their work advanced the conceptual aspects of the application of information theory to neuroscience and, subsequently, provided a relatively straightforward way to estimate information-theoretic quantities (Strong et al. 1998). This work provided an approach to removing biases in information estimates due to finite sample size, but the scope of applicability of their approach was limited. The difficulties in obtaining unbiased estimates of information-theoretic quantities were noted early on by Carlton (1969) and Miller (1955) and brought to attention again by Treves and Panzeri (1995). However, it took the renewed interest generated by Strong et al. (1998) to spur the research that eventually resolved them. Almost simultaneously, several groups provided the neuroscience community with robust estimators valid under different conditions (Kennel et al. 2005; Nemenman et al. 2004; Paninski 2003; Victor 2002). This diversity was important, as Paninski (2003) proved an inconsistency theorem showing that most common estimation techniques can encounter conditions that lead to arbitrarily poor estimates.
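The finite-sample bias discussed above can be made concrete with a small numerical sketch. The example below is illustrative only and is not the method of any of the cited papers: it compares the naive (plug-in) entropy estimate with the classical first-order correction that adds (m − 1)/(2N) nats, where m is the number of distinct symbols observed and N is the number of samples. The function names, the uniform 16-symbol source, and the sample sizes are all assumptions chosen for the demonstration.

```python
import math
import random
from collections import Counter


def plugin_entropy_bits(samples):
    """Naive (plug-in) entropy estimate, in bits, of a list of discrete symbols."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def miller_madow_entropy_bits(samples):
    """Plug-in estimate plus the first-order bias correction (m - 1) / (2 N ln 2)."""
    n = len(samples)
    m = len(set(samples))  # number of symbols actually observed
    return plugin_entropy_bits(samples) + (m - 1) / (2 * n * math.log(2))


def demo(n_symbols=16, n_samples=40, n_trials=2000, seed=0):
    """Average both estimators over many small samples from a uniform source.

    Returns (true entropy, mean plug-in estimate, mean corrected estimate).
    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    true_h = math.log2(n_symbols)
    plug_total = 0.0
    corrected_total = 0.0
    for _ in range(n_trials):
        s = [rng.randrange(n_symbols) for _ in range(n_samples)]
        plug_total += plugin_entropy_bits(s)
        corrected_total += miller_madow_entropy_bits(s)
    return true_h, plug_total / n_trials, corrected_total / n_trials


if __name__ == "__main__":
    true_h, plug, corrected = demo()
    print(f"true: {true_h:.3f}  plug-in: {plug:.3f}  corrected: {corrected:.3f}")
```

With few samples per symbol the plug-in estimate falls systematically below the true entropy, and the first-order correction removes only part of that gap. This is one reason the more robust estimators cited above were needed.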
At the same time, several other branches of Information Theory saw application in a neuroscience context. The introduction of techniques stemming from work on quantization and lossy compression (Gersho and Gray 1991) provided lower bounds on information-theoretic quantities and ideas about inference based on them (Dimitrov and Miller 2001; Samengo 2002; Tishby et al. 1999). Furthermore, a large class of neuron models was characterized as samplers that, under appropriate conditions, faithfully encode sensory information in the spike train (time encoding machines, Lazar 2004). This class of neurons includes integrate-and-fire (Lazar and Pnevmatikakis 2008), threshold-and-fire (Lazar et al. 2010) and Hodgkin-Huxley neurons (Lazar 2010).
2 Current state
The work presented in the current issue builds on the developments in both information-theoretic and experimental techniques.
Several of the contributions apply a recent development in Information Theory, directed information, to clarifying the structure of biological neural networks from observations of their activity. This work originates from the ideas of Granger (1969) on causal interactions, and was placed in an information-theoretic perspective by Massey (1990), Massey and Massey (2005) and Schreiber (2000). Amblard and Michel (2011) here merge the two ideas and extract Granger causality graphs by using directed information measures. They show that such tools are needed to analyze the structure of systems with feedback in general, and neural systems specifically. The authors also provide practical approximations with which to estimate these structures. Quinn et al. (2011) present a novel, nonlinear, robust extension of the linear Granger tools and use it to infer the dynamics of neural ensembles based on physiological observations. In particular, the procedure uses point process models of neural spike trains, performs parameter and model order selection with minimal description length, and is applied to the analysis of interactions in neuronal assemblies in the primary motor cortex (MI) of macaque monkeys. Vicente et al. (2011) investigate transfer entropy (TE) as an alternative measure of effective connectivity for electrophysiological data, based on simulations and magnetoencephalography (MEG) recordings in a simple motor task. The authors demonstrate that TE improved the detectability of effective connectivity for non-linear interactions, and for sensor-level MEG signals where linear methods are hampered by signal cross-talk due to volume conduction. Using neocortical column neuronal network simulations, Neymotin et al. (2011) demonstrate that networks with greater internal connectivity reduce input/output correlations from excitatory synapses and decrease negative correlations from inhibitory synapses. These changes were measured by normalized TE. Lizier et al. (2011) take the TE idea further and apply it to functional imaging data to determine the direction of information flow between brain regions. As a proof of principle, they show that this approach enables the identification of a hierarchical (tiered) network of cerebellar and cortical regions involved in motor tasks, and how the functional interactions of these regions are modulated by task difficulty.
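To make the notion of transfer entropy more tangible, the following is a minimal sketch of a plug-in estimate for two discrete time series, using a history length of one sample. It follows the standard definition TE(Y→X) = Σ p(x_{t+1}, x_t, y_t) log₂ [p(x_{t+1} | x_t, y_t) / p(x_{t+1} | x_t)], and is not the estimator used in any of the contributions above, which work with continuous or point-process data and more careful estimation. The function names and the toy "delayed copy" system are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def transfer_entropy_bits(source, target):
    """Plug-in transfer entropy source -> target in bits, history length 1."""
    triples = Counter()  # counts of (x_next, x_now, y_now)
    for t in range(len(target) - 1):
        triples[(target[t + 1], target[t], source[t])] += 1
    n = sum(triples.values())

    # Marginal counts needed for the two conditional probabilities.
    pair_xy = Counter()  # (x_now, y_now)
    pair_xx = Counter()  # (x_next, x_now)
    single = Counter()   # x_now
    for (x_next, x_now, y_now), c in triples.items():
        pair_xy[(x_now, y_now)] += c
        pair_xx[(x_next, x_now)] += c
        single[x_now] += c

    te = 0.0
    for (x_next, x_now, y_now), c in triples.items():
        p_given_both = c / pair_xy[(x_now, y_now)]               # p(x_next | x_now, y_now)
        p_given_self = pair_xx[(x_next, x_now)] / single[x_now]  # p(x_next | x_now)
        te += (c / n) * math.log2(p_given_both / p_given_self)
    return te


def demo(n=5000, seed=1):
    """Toy system (hypothetical): y is random bits, x copies y with a one-step delay."""
    rng = random.Random(seed)
    y = [rng.randrange(2) for _ in range(n)]
    x = [0] + y[:-1]
    return transfer_entropy_bits(y, x), transfer_entropy_bits(x, y)


if __name__ == "__main__":
    te_y_to_x, te_x_to_y = demo()
    print(f"TE(y -> x) = {te_y_to_x:.3f} bits, TE(x -> y) = {te_x_to_y:.3f} bits")
```

In this toy system the estimate is close to one bit in the driving direction and close to zero in the reverse direction, which is the asymmetry that makes the measure useful for inferring the direction of information flow.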
As mentioned above, the “structure from information” theme is represented by a robust stream of investigations, centered on the notion that neural circuitry exploits the statistical features of the environment to enable efficient coding of sensory stimuli (Barlow 1961). The overwhelming majority of studies concern the implications of this principle for processing at a neuronal level (e.g., Atick and Redlich 1990). In this issue, however, Vanni and Rosenström (2011) consider a larger scale of organization. Using functional imaging techniques to assess the distribution of activity levels in neuronal populations in visual cortex, they show that context effects can be interpreted as a means to achieve decorrelation, and hence, efficient coding.
The use of information-theoretic tools to probe the structure of network firing patterns has recently been the subject of great interest, sparked by work from two laboratories (Shlens et al. 2006, 2009; Schneidman et al. 2006; for a review see 2007) which showed that retinal firing patterns are very well-described by a pairwise maximum-entropy model. Not surprisingly (given its complex local circuitry), the pairwise model fails in cortex (Ohiorhenuan et al. 2010); the present contribution from Ohiorhenuan and Victor (2011) shows that the deviations from the pairwise model are highly systematic, and how they can be characterized.
Several contributions also engage classic information-theoretic tools from channel and source coding. Kim et al. (2011) report for the first time that the responses of olfactory sensory neurons in fruit flies expressing the OR59b receptor are very precise and reproducible. The response of these neurons depends not only on the odorant concentration, but also on its rate of change. The authors demonstrate that a two-dimensional encoding manifold in a concentration-concentration gradient space provides a quantitative description of the neuron’s response. By defining a distance measure between spike trains, Gillespie and Houghton (2011) present a novel method for calculating the information channel capacity of spike train channels. As an example, they calculate the capacity of a data set recorded from auditory neurons in zebra finch. Dimitrov et al. (2011) present a twofold extension of their Information Distortion method (Dimitrov and Miller 2001) as applied to the problem of neural coding. On the theoretical side, they develop the idea of joint quantization, which provides optimal lossy compressions of the stimulus and response spaces simultaneously. The practical application to neural coding in the cricket cercal system introduces a family of estimators that make the method more flexible across different systems and experimental conditions.
And finally, Lewi et al. (2011) use Information Theory to approach the question “What are the best ways to study a biological system?”, when the goal is to characterize the system as efficiently as possible. Tzanakou et al. (1979) initially posed it in the form “Which is the ‘best’ stimulus for a sensory system?”. More recently, Machens (2002) rephrased the problem into “What is the best stimulus distribution that maximizes the information transmission in a sensory system?”. This paper takes the authors’ general formulation (Paninski 2005), “search for stimuli that maximize the information between system description and system observations,” extends it to continuous stimuli, and applies it to the analysis of the auditory midbrain of zebra finches.
In conclusion, Information Theory is thriving in the neuroscience community, and the long efforts are bearing fruit, as diverse research questions are being approached with more elaborate and refined tools. As demonstrated by this issue and the complementary compilation of the Information Theory Society (Milenkovic et al. 2010), Information Theory is firmly integrated in the fabric of neuroscience research, and a progressively wider range of biological research in general, and will continue to play an important role in these disciplines. Conversely, neuroscience is starting to serve as a driver for further research in Information Theory, opening interesting new directions of inquiry.
- Amblard, P.-O., & Michel, O. (2011). On directed information theory and Granger causality graphs. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0231-x.
- Barlow, H. B. (1961). Possible principles underlying the transformation of sensory messages. In W. A. Rosenblith (Ed.), Sensory communications. MIT Press, Cambridge, MA.
- Dimitrov, A. G., Cummins, G. I., Baker, A., & Aldworth, Z. N. (2011). Characterizing the fine structure of a neural sensory code through information distortion. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0261-4.
- Dimitrov, A. G., & Miller, J. P. (2001). Neural coding and decoding: Communication channels and quantization. Network: Computation in Neural Systems, 12, 441–472.
- Eckhorn, R., & Pöpel, B. (1974). Rigorous and extended application of information theory to the afferent visual system of the cat. I. Basic concepts. Biological Cybernetics, 16, 191–200.
- Gersho, A., & Gray, R. M. (1991). Vector Quantization and Signal Compression. Series in Engineering and Computer Science. Kluwer International.
- Gillespie, J. B., & Houghton, C. (2011). A metric space approach to the information channel capacity of spike trains. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0286-8.
- Kim, A. J., Lazar, A. A., & Slutskiy, Y. B. (2011). System identification of Drosophila olfactory sensory neurons. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0265-0.
- Lazar, A. A. (2010). Population encoding with Hodgkin-Huxley neurons. IEEE Transactions on Information Theory, Special Issue on Molecular Biology and Neuroscience, 56(2), 821–837.
- Lazar, A. A., Pnevmatikakis, E. A., & Zhou, Y. (2010). Encoding natural scenes with neural circuits with random thresholds. Vision Research, Special Issue on Mathematical Models of Visual Coding, 50(22), 2200–2212.
- Lewi, J., Schneider, D., Woolley, S., & Paninski, L. (2011). Automating the design of informative sequences of sensory stimuli. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0248-1.
- Lizier, J., Heinzle, J., Horstmann, A., Haynes, J.-D., & Prokopenko, M. (2011). Multivariate information-theoretic measures reveal directed information structure and task relevant changes in fMRI connectivity. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0271-2.
- Massey, J. (1990). Causality, feedback and directed information. In Proc. Intl. Symp. Information Theory Applications (ISITA-90) (pp. 303–305).
- Massey, J., & Massey, P. (2005). Conservation of mutual and directed information. In Proc. Intl. Symp. Information Theory (ISIT 2005) (pp. 157–158).
- McCulloch, W. S. (1952). An upper bound on the informational capacity of a synapse. In Proceedings of the 1952 ACM national meeting, Pittsburgh, Pennsylvania.
- Merolla, P. A., & Boahen, K. (2006). Dynamic computation in a recurrent network of heterogeneous silicon neurons. In IEEE International Symposium on Circuits and Systems (pp. 4539–4542). IEEE Press.
- Miller, G. A. (1955). Note on the bias of information estimates. In Information Theory in Psychology: Problems and Methods (Vol. II-B, pp. 95–100). Free Press.
- Nemenman, I., Bialek, W., & de Ruyter van Steveninck, R. (2004). Entropy and information in neural spike trains: Progress on the sampling problem. Physical Review E, 69, 056111.
- Neymotin, S. A., Jacobs, K. M., Fenton, A. A., & Lytton, W. W. (2011). Synaptic information transfer in computer models of neocortical columns. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0253-4.
- Ohiorhenuan, I. E., & Victor, J. D. (2011). Information geometric measure of 3-neuron firing patterns characterizes scale-dependence in cortical networks. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0257-0.
- Optican, L. M., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex. III. Information theoretic analysis. Journal of Neurophysiology, 57, 163–178.
- Quinn, C. J., Coleman, T. P., Kiyavash, N., & Hatsopoulos, N. G. (2011). Estimating the directed information to infer causal relationships in ensemble neural spike train recordings. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0247-2.
- Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 623–656.
- Shannon, C. E., & McCarthy, J. (Eds.) (1956). Automata studies. In Annals of mathematics studies (Vol. 34). Princeton University Press.
- Taylor, R. C. (1975). Integration in the crayfish antennal neuropile: Topographic representation and multiple-channel coding of mechanoreceptive submodalities. Developmental Neurobiology, 6(5), 475–499.
- Tishby, N., Pereira, F., & Bialek, W. (1999). The information bottleneck method. In Proceedings of The 37th annual Allerton conference on communication, control and computing. University of Illinois.
- Vanni, S., & Rosenström, T. (2011). Local non-linear interactions in the visual cortex may reflect global decorrelation. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0239-2.
- Vicente, R., Wibral, M., Lindner, M., & Pipa, G. (2011). Transfer entropy—A model-free measure of effective connectivity for the neurosciences. Journal of Computational Neuroscience. doi: 10.1007/s10827-010-0262-3.
- Werner, G., & Mountcastle, V. B. (1965). Neural activity in mechanoreceptive cutaneous afferents: Stimulus-response relations, Weber functions, and information transmission. Journal of Neurophysiology, 28, 359–397.
- Winograd, S., & Cowan, J. D. (1963). Reliable computation in the presence of noise. The MIT Press.