A Primer on Information Theory with Applications to Neuroscience

Effenberger, Felix

doi:10.1007/978-1-4614-8785-2_5

Abstract

Given the constant rise in the quantity and quality of data obtained from neural systems on all scales, information-theoretic analyses have become increasingly popular in the neurosciences over the last decades. Such analyses can provide deep insights into the functioning of these systems and can also help in the characterization and analysis of neural dysfunction, a topic that has recently come into the focus of research in computational neuroscience. This chapter gives a short introduction to the fundamentals of information theory that is suited especially, though not exclusively, to readers with a less firm background in mathematics and probability theory. Regarding applications, the focus is on neuroscientific topics. We start by reviewing fundamentals of probability theory such as the notion of probability, probability distributions, and random variables. We then discuss the concepts of information and entropy (in the sense of Shannon), mutual information, and transfer entropy (sometimes also referred to as conditional mutual information). As these quantities cannot be computed exactly from measured data in practice, we discuss estimation techniques for information-theoretic quantities. We conclude with a discussion of applications of information theory in the field of neuroscience, including questions of possible medical applications, and a short review of software packages that can be used for information-theoretic analyses of neural data.

Notes

1. Shannon chose the letter H to denote entropy after Boltzmann’s H-theorem in classical statistical mechanics.

Acknowledgements

The author would like to thank Nihat Ay, Yuri Campbell, Aleena Garner, Jörg Lehnert, Timm Lochmann, Wiktor Młynarski, and Carolin Stier for their useful comments on the manuscript.

Author information

Authors and Affiliations

Max-Planck-Institute for Mathematics in the Sciences, Inselstr. 22, 04103, Leipzig, Germany
Felix Effenberger

Corresponding author

Correspondence to Felix Effenberger.

Editor information

Editors and Affiliations

Mathematical Institute, Serbian Academy of Science and Arts, Belgrade, Serbia
Goran Rakocevic
Faculty of Engineering, University of Kragujevac, Kragujevac, Serbia
Tijana Djukic
Faculty of Engineering, University of Kragujevac, Kragujevac, Serbia
Nenad Filipovic
School of Electrical Engineering, University of Belgrade, Belgrade, Serbia
Veljko Milutinović

About this chapter

Cite this chapter

Effenberger, F. (2013). A Primer on Information Theory with Applications to Neuroscience. In: Rakocevic, G., Djukic, T., Filipovic, N., Milutinović, V. (eds) Computational Medicine in Data Mining and Modeling. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8785-2_5

DOI: https://doi.org/10.1007/978-1-4614-8785-2_5
Published: 19 September 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8784-5
Online ISBN: 978-1-4614-8785-2
eBook Packages: Computer Science, Computer Science (R0)
