Abstract
Given the steady rise in the quantity and quality of data obtained from neural systems on all scales, information-theoretic analyses have become increasingly popular in the neurosciences over the last decades. Such analyses can provide deep insights into the functioning of these systems and can also help in the characterization and analysis of neural dysfunction, a topic that has recently come into the focus of research in computational neuroscience. This chapter gives a short introduction to the fundamentals of information theory, suited especially, though not only, to readers with a less firm background in mathematics and probability theory. Regarding applications, the focus is on neuroscientific topics. We start by reviewing fundamentals of probability theory such as the notion of probability, probability distributions, and random variables. We then discuss the concepts of information and entropy (in the sense of Shannon), mutual information, and transfer entropy (which can be expressed as a conditional mutual information). As these quantities cannot be computed exactly from measured data in practice, we discuss estimation techniques for information-theoretic quantities. We conclude with a discussion of applications of information theory in the field of neuroscience, including questions of possible medical applications and a short review of software packages that can be used for information-theoretic analyses of neural data.
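To make the estimation problem mentioned in the abstract concrete, the following is a minimal sketch of the simplest possible approach, the plug-in (maximum-likelihood) estimator, for discrete data. It is not taken from the chapter: the function names `plugin_entropy` and `plugin_mutual_information` and the toy stimulus and response sequences are illustrative assumptions. The sketch replaces the unknown probabilities by relative frequencies and uses the identity I(X;Y) = H(X) + H(Y) - H(X,Y).

```python
from collections import Counter
from math import log2


def plugin_entropy(samples):
    """Plug-in estimate of Shannon entropy in bits.

    Probabilities are replaced by relative frequencies of the observed symbols.
    """
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def plugin_mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y) in bits."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    joint = list(zip(xs, ys))  # paired observations form the joint variable
    return plugin_entropy(xs) + plugin_entropy(ys) - plugin_entropy(joint)


# Hypothetical toy data: a binary stimulus and a binary response.
stimulus = [0, 0, 1, 1, 0, 1, 0, 1]
copied_response = list(stimulus)                # response identical to the stimulus
unrelated_response = [0, 1, 0, 1, 1, 1, 0, 0]   # empirically independent of the stimulus

print(plugin_entropy(stimulus))
print(plugin_mutual_information(stimulus, copied_response))
print(plugin_mutual_information(stimulus, unrelated_response))
```

With few samples relative to the number of possible symbols, the plug-in entropy estimate is known to be biased downward, which is one reason why more refined estimation techniques are needed for real neural recordings.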
Notes
1. Shannon chose the letter H to denote entropy after Boltzmann’s H-theorem in classical statistical mechanics.
Acknowledgements
The author would like to thank Nihat Ay, Yuri Campbell, Aleena Garner, Jörg Lehnert, Timm Lochmann, Wiktor Młynarski, and Carolin Stier for their useful comments on the manuscript.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Effenberger, F. (2013). A Primer on Information Theory with Applications to Neuroscience. In: Rakocevic, G., Djukic, T., Filipovic, N., Milutinović, V. (eds) Computational Medicine in Data Mining and Modeling. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8785-2_5
DOI: https://doi.org/10.1007/978-1-4614-8785-2_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8784-5
Online ISBN: 978-1-4614-8785-2
eBook Packages: Computer Science, Computer Science (R0)