Abstract
Advances in recording technologies have given neuroscience researchers access to large amounts of data, in particular, simultaneous, individual recordings of large groups of neurons in different parts of the brain. A variety of quantitative techniques have been utilized to analyze the spiking activities of the neurons to elucidate the functional connectivity of the recorded neurons. In the past, researchers have used correlative measures. More recently, to better capture the dynamic, complex relationships present in the data, neuroscientists have employed causal measures, most of which are variants of Granger causality, with limited success. This paper motivates the directed information, an information and control theoretic concept, as a modality-independent embodiment of Granger's original notion of causality. Key properties include: (a) it is nonzero if and only if one process causally influences another, and (b) its specific value can be interpreted as the strength of a causal relationship. We next describe how the causally conditioned directed information between two processes, given knowledge of others, provides a network version of causality: it is nonzero if and only if, in the presence of the present and past of other processes, one process causally influences another. This notion is shown to be able to differentiate between true direct causal influences, common inputs, and cascade effects in more than two processes. We next describe a procedure to estimate the directed information on neural spike trains using point process generalized linear models, maximum likelihood estimation, and information-theoretic model order selection. We demonstrate that, on a simulated network of neurons, it (a) correctly identifies all pairwise causal relationships and (b) correctly identifies network causal relationships. This procedure is then used to analyze ensemble spike train recordings in the primary motor cortex of an awake monkey performing target reaching tasks, uncovering causal relationships whose directionality is consistent with predictions made from the wave propagation of simultaneously recorded local field potentials.
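The estimation procedure summarized above can be illustrated in a few lines of code. The following is a minimal Python sketch, not the authors' implementation: it assumes the spike trains have already been binned into 0/1 sequences, stands in a Bernoulli (logistic) GLM over a fixed history of J bins for the point process GLM, fixes the model order rather than selecting it information-theoretically, and estimates the pairwise directed information rate as the per-bin log-likelihood gain from adding the putative cause's past. The bin size, history length, and toy two-neuron simulation are illustrative choices only.

```python
# Minimal sketch (illustrative, not the paper's code): plug-in estimate of the
# pairwise directed information rate between two binned spike trains, using a
# Bernoulli (logistic) GLM as a stand-in for a point process GLM.
import numpy as np
from sklearn.linear_model import LogisticRegression


def lagged_design(x, y, J):
    """Predictors for y[t]: the past J bins of y followed by the past J bins of x."""
    T = len(y)
    Z = np.array([np.concatenate([y[t - J:t], x[t - J:t]]) for t in range(J, T)])
    return Z, y[J:]


def cross_entropy_bits(model, Z, target):
    """Average Bernoulli negative log-likelihood of the target spikes, in bits/bin."""
    p = np.clip(model.predict_proba(Z)[:, 1], 1e-12, 1 - 1e-12)
    return -np.mean(target * np.log2(p) + (1 - target) * np.log2(1 - p))


def directed_info_rate(x, y, J=3):
    """Estimate I(X -> Y) per bin as H(Y || Y past) - H(Y || Y past, X past)."""
    Z_full, target = lagged_design(x, y, J)
    Z_self = Z_full[:, :J]  # Y's own history only
    fit = lambda Z: LogisticRegression(C=1e6, max_iter=1000).fit(Z, target)  # ~ML fit
    return cross_entropy_bits(fit(Z_self), Z_self, target) - \
           cross_entropy_bits(fit(Z_full), Z_full, target)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T = 20000
    x = rng.binomial(1, 0.1, T)                      # "presynaptic" binned spike train
    p_y = np.clip(0.05 + 0.4 * np.roll(x, 1), 0, 1)  # X excites Y with a one-bin lag
    y = rng.binomial(1, p_y)
    print("I(X -> Y) estimate (bits/bin):", directed_info_rate(x, y))
    print("I(Y -> X) estimate (bits/bin):", directed_info_rate(y, x))
```

On this toy network the X → Y estimate should come out clearly positive and the Y → X estimate close to zero, mirroring property (a) above; a full treatment would add information-theoretic model order selection and the causal conditioning on other recorded neurons described in the abstract.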
References
Abler, B., Roebroeck, A., Goebel, R., Höse, A., Schönfeldt-Lecuona, C., Hole, G., et al. (2006). Investigating directed influences between activated brain areas in a motor-response task using fMRI. Magnetic Resonance Imaging, 24(2), 181–185.
Akaike, H. (1976). An information criterion (AIC). Mathematical Scientist, 14(153), 5–9.
Al-khassaweneh, M., & Aviyente, S. (2008). The relationship between two directed information measures. IEEE Signal Processing Letters, 15, 801–804.
Amblard, P., & Michel, O. (2010). On directed information theory and Granger causality graphs. Arxiv preprint. arXiv:1002.1446.
Barron, A., & Cover, T. (1991). Minimum complexity density estimation. IEEE Transactions on Information Theory, 37(4), 1034–1054.
Bitan, T., Booth, J., Choy, J., Burman, D., Gitelman, D., & Mesulam, M. (2005). Shifts of effective connectivity within a language network during rhyming and spelling. Journal of Neuroscience, 25(22), 5397.
Bremaud, P. (1981). Point processes and queues: martingale dynamics. New York: Springer.
Brovelli, A., Ding, M., Ledberg, A., Chen, Y., Nakamura, R., & Bressler, S. (2004). Beta oscillations in a large-scale sensorimotor cortical network: Directional influences revealed by Granger causality. Proceedings of the National Academy of Sciences of the United States of America, 101(26), 9849.
Brown, E., Barbieri, R., Eden, U., & Frank, L. (2003). Likelihood methods for neural spike train data analysis. In Computational neuroscience: A comprehensive approach.
Brown, E., Barbieri, R., Ventura, V., Kass, R., & Frank, L. (2002). The time-rescaling theorem and its application to neural spike train data analysis. Neural Computation, 14(2), 325–346.
Cai, H., Kulkarni, S., & Verdú, S. (2004). Universal entropy estimation via block sorting. IEEE Transactions on Information Theory, 50(7), 1551–1561.
Cai, H., Kulkarni, S., & Verdu, S. (2006). An algorithm for universal lossless compression with side information. IEEE Transactions on Information Theory, 52(9), 4008–4016.
Casella, G., & Berger, R. (2002). Statistical inference. Pacific Grove: Duxbury.
Cesa-Bianchi, N., & Lugosi, G. (2006). Prediction, learning, and games. Cambridge: Cambridge University Press.
Chávez, M., Martinerie, J., & Le Van Quyen, M. (2003). Statistical assessment of nonlinear causality: Application to epileptic EEG signals. Journal of Neuroscience Methods, 124(2), 113–128.
Cover, T., & Thomas, J. (2006). Elements of information theory. New York: Wiley-Interscience.
Daley, D., & Vere-Jones, D. (1988). An introduction to the theory of point processes. New York: Springer.
David, O., Kiebel, S., Harrison, L., Mattout, J., Kilner, J., & Friston, K. (2006). Dynamic causal modeling of evoked responses in EEG and MEG. NeuroImage, 30(4), 1255–1272.
De Boer, P., Kroese, D., Mannor, S., & Rubinstein, R. (2005). A tutorial on the cross-entropy method. Annals of Operations Research, 134(1), 19–67.
Dhamala, M., Rangarajan, G., & Ding, M. (2008). Analyzing information flow in brain networks with nonparametric Granger causality. NeuroImage, 41(2), 354–362.
Diekman, C. O., Sastry, P., & Unnikrishnan, K. (2009). Statistical significance of sequential firing patterns in multi-neuronal spike trains. Journal of Neuroscience Methods, 182(2), 279–284.
Du, X., Ghosh, B., & Ulinski, P. (2005). Encoding and decoding target locations with waves in the turtle visual cortex. IEEE Transactions on Biomedical Engineering, 52(4), 566–577.
Eguiluz, V., Chialvo, D., Cecchi, G., Baliki, M., & Apkarian, A. (2005). Scale-free brain functional networks. Physical Review Letters, 94(1), 018102.
Elia, N. (2004). When bode meets Shannon: Control-oriented feedback communication schemes. IEEE Transactions on Automatic Control, 49(9), 1477–1488.
Ermentrout, G., & Kleinfeld, D. (2001). Traveling electrical waves in cortex insights from phase dynamics and speculation on a computational role. Neuron, 29(1), 33–44.
Friston, K., Harrison, L., & Penny, W. (2003). Dynamic causal modelling. NeuroImage, 19(4), 1273–1302.
Goebel, R., Roebroeck, A., Kim, D., & Formisano, E. (2003). Investigating directed cortical interactions in time-resolved fMRI data using vector autoregressive modeling and Granger causality mapping. Magnetic Resonance Imaging, 21(10), 1251–1261.
Gorantla, S., & Coleman, T. (2010). On reversible Markov chains and maximization of directed information. In IEEE international symposium on information theory (ISIT), Austin, TX (in press).
Gourevitch, B., & Eggermont, J. (2007). Evaluating information transfer between auditory cortical neurons. Journal of Neurophysiology, 97(3), 2533.
Granger, C. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3), 424–438.
Grefkes, C., Eickhoff, S., Nowak, D., Dafotakis, M., & Fink, G. (2008). Dynamic intra-and interhemispheric interactions during unilateral and bilateral hand movements assessed with fMRI and DCM. NeuroImage, 41(4), 1382–1394.
Grünwald, P., & Rissanen, J. (2007). The minimum description length principle. Cambridge: MIT.
Hamandi, K., Powell, H., Laufs, H., Symms, M., Barker, G., Parker, G., et al. (2008). Combined EEG-fMRI and tractography to visualise propagation of epileptic activity. British Medical Journal, 79(5), 594–597.
Hesse, W., Möller, E., Arnold, M., & Schack, B. (2003). The use of time-variant EEG Granger causality for inspecting directed interdependencies of neural assemblies. Journal of Neuroscience Methods, 124(1), 27–44.
Hu, J., Fu, M., & Marcus, S. (2007). A model reference adaptive search method for global optimization. Operations Research, 55(3), 549–568.
Iyengar, S., & Liao, Q. (1997). Modeling neural activity using the generalized inverse Gaussian distribution. Biological Cybernetics, 77(4), 289–295.
Kaminski, M., & Blinowska, K. (1991). A new method of the description of the information flow in the brain structures. Biological Cybernetics, 65(3), 203–210.
Kamiński, M., Ding, M., Truccolo, W., & Bressler, S. (2001). Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biological Cybernetics, 85(2), 145–157.
Kim, Y., Permuter, H., & Weissman, T. (2009). Directed information and causal estimation in continuous time. In IEEE international symposium on information theory (ISIT).
Korzeniewska, A., Mańczak, M., Kamiński, M., Blinowska, K., & Kasicki, S. (2003). Determination of information flow direction among brain structures by a modified directed transfer function (dDTF) method. Journal of Neuroscience Methods, 125(1–2), 195–207.
Kramer, G. (1998). Directed information for channels with feedback. Ph.D. thesis, Swiss Federal Institute of Technology (ETH), Zurich.
Kramer, M., Eden, U., Cash, S., & Kolaczyk, E. (2009). Network inference with confidence from multivariate time series. Physical Review E, 79(6), 61916.
Kraskov, A. (2008). Synchronization and interdependence measures and their application to the electroencephalogram of epilepsy patients and clustering of data. Report Nr.: NIC series; 24.
Lastras, L. (2002). An almost sure convergence proof of the sliding-window Lempel-Ziv algorithm. In Proceedings 2002 IEEE international symposium on information theory.
Marko, H. (1973). The bidirectional communication theory–A generalization of information theory. IEEE Transactions on Communications, 21(12), 1345–1351.
Martins, N., & Dahleh, M. (2008). Feedback control in the presence of noisy channels: “Bode-like” fundamental limitations of performance. IEEE Transactions on Automatic Control, 53(7), 1604 –1615.
Massey, J. (1990). Causality, feedback and directed information. In Proc. int. symp. information theory application (ISITA-90) (pp. 303–305).
Massey, J., & Massey, P. (2005). Conservation of mutual and directed information. In Proceedings international symposium on information theory, 2005. ISIT 2005 (pp. 157–158).
Mathai, P., Martins, N., & Shapiro, B. (2007). On the detection of gene network interconnections using directed mutual information. San Diego: ITA.
Meyn, S., & Tweedie, R. (2009). Markov chains and stochastic stability (p. 622). Cambridge: Cambridge Mathematical Library.
Okatan, M., Wilson, M., & Brown, E. (2005). Analyzing functional connectivity using a network likelihood model of ensemble neural spiking activity. Neural Computation, 17(9), 1927–1961.
Paninski, L. (2003). Estimation of entropy and mutual information. Neural Computation, 15(6), 1191–1253.
Paninski, L., Fellows, M., Hatsopoulos, N., & Donoghue, J. (2004). Spatiotemporal tuning of motor cortical neurons for hand position and velocity. Journal of Neurophysiology, 91(1), 515.
Pearl, J. (2009). Causality: Models, reasoning and inference. New York: Cambridge University Press.
Pereda, E., Quiroga, R., & Bhattacharya, J. (2005). Nonlinear multivariate analysis of neurophysiological signals. Progress in Neurobiology, 77(1–2), 1–37.
Perez-Cruz, F. (2008). Estimation of information theoretic measures for continuous random variables. NIPS.
Permuter, H., Kim, Y., & Weissman, T. (2008). On directed information and gambling. In IEEE international symposium on information theory, 2008. ISIT 2008 (pp. 1403–1407).
Permuter, H., Kim, Y., & Weissman, T. (2009a). Interpretations of directed information in portfolio theory, data compression, and hypothesis testing. Arxiv preprint. arXiv:0912.4872.
Permuter, H., Weissman, T., & Goldsmith, A. (2009b). Finite state channels with time-invariant deterministic feedback. IEEE Transactions on Information Theory, 55(2), 644–662.
Prechtl, J., Cohen, L., Pesaran, B., Mitra, P., & Kleinfeld, D. (1997). Visual stimuli induce waves of electrical activity in turtle cortex. Proceedings of the National Academy of Sciences of the United States of America, 94(14), 7621.
Ramnani, N., Behrens, T., Penny, W., & Matthews, P. (2004). New approaches for exploring anatomical and functional connectivity in the human brain. Biological Psychiatry, 56(9), 613–619.
Rao, A., Hero III, A., States, D., & Engel, J. (2006). Inference of biologically relevant gene influence networks using the directed information criterion. In Proceedings of IEEE international conference on acoustics, speech and signal processing (ICASSP) (Vol. 2, pp. 1028–1031).
Rao, A., Hero III, A., States, D.J., & Engel, J. D. (2007). Inferring time-varying network topologies from gene expression data. EURASIP Journal on Bioinformatics and System Biology-Special Issue on Gene Networks, 2007, 51947.
Rao, A., Hero III, A., David, J., & Engel, J. (2008). Using directed information to build biologically relevant influence networks. Journal of Bioinformatics and Computational Biology, 6(3), 493–519.
Rissanen, J., & Wax, M. (1987). Measures of mutual and causal dependence between two time series (Corresp.). IEEE Transactions on Information Theory, 33(4), 598–601.
Roebroeck, A., Formisano, E., & Goebel, R. (2005). Mapping directed influence over the brain using Granger causality and fMRI. NeuroImage, 25(1), 230–242.
Rogers, B., Morgan, V., Newton, A., & Gore, J. (2007). Assessing functional connectivity in the human brain by fMRI. Magnetic Resonance Imaging, 25(10), 1347–1357.
Rubino, D., Robbins, K., & Hatsopoulos, N. (2006). Propagating waves mediate information transfer in the motor cortex. Nature Neuroscience, 9(12), 1549–1557.
Salvador, R., Suckling, J., Schwarzbauer, C., & Bullmore, E. (2005). Undirected graphs of frequency-dependent functional connectivity in whole brain networks. Philosophical Transactions of the Royal Society B: Biological Sciences, 360(1457), 937–946.
Schreiber, T. (2000). Measuring information transfer. Physical Review Letters, 85(2), 461–464.
Schuyler, B., Ollinger, J., Oakes, T., Johnstone, T., & Davidson, R. (2009). Dynamic Causal Modeling applied to fMRI data shows high reliability. NeuroImage, 49, 603–611.
Seth, A., & Edelman, G. (2007). Distinguishing causal interactions in neural populations. Neural Computation, 19(4), 910–933.
Smith, V., Yu, J., Smulders, T., Hartemink, A., & Jarvis, E. (2006). Computational inference of neural information flow networks. PLoS Computational Biology, 2(11), e161.
Stephan, K., Kasper, L., Harrison, L., Daunizeau, J., den Ouden, H., Breakspear, M., et al. (2008). Nonlinear dynamic causal models for fMRI. NeuroImage, 42(2), 649–662.
Stevenson, I., Rebesco, J., Hatsopoulos, N., Haga, Z., Miller, L., & Körding, K. (2009). Bayesian inference of functional connectivity and network structure from spikes. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 17(3), 203.
Sundaresan, R., & Verdú, S. (2006). Capacity of queues via point-process channels. IEEE Transactions on Information Theory, 52(6), 2697–2709.
Tatikonda, S. (2000). Control under communication constraints. Ph.D. thesis, Massachusetts Institute of Technology.
Tatikonda, S., & Mitter, S. (2009). The capacity of channels with feedback. IEEE Transactions on Information Theory, 55(1), 323–349.
Truccolo, W., Eden, U., Fellows, M., Donoghue, J., & Brown, E. (2005). A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. Journal of Neurophysiology, 93(2), 1074–1089.
Uddin, L., Clare Kelly, A., Biswal, B., Xavier Castellanos, F., & Milham, M. (2009). Functional connectivity of default mode network components: Correlation, anticorrelation, and causality. Human Brain Mapping, 30(2), 625–637.
Venkataramanan, R., & Pradhan, S. (2007). Source coding with feed-forward: Rate-distortion theorems and error exponents for a general source. IEEE Transactions on Information Theory, 53(6), 2154–2179.
Vogels, T., & Abbott, L. (2005). Signal propagation and logic gating in networks of integrate-and-fire neurons. Journal of Neuroscience, 25(46), 10786.
Wang, X., Chen, Y., Bressler, S., & Ding, M. (2007). Granger causality between multiple interdependent neurobiological time series: Blockwise versus pairwise methods. International Journal of Neural Systems, 17(2), 71.
Wu, W., & Hatsopoulos, N. (2006). Evidence against a single coordinate system representation in the motor cortex. Experimental Brain Research, 175(2), 197–210.
Zhao, L., Permuter, H., Kim, Y., & Weissman, T. (2010). Universal estimation of directed information. In IEEE international symposium on information theory (ISIT), Austin, TX (in press).
Ziv, J., & Lempel, A. (1977). A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23(3), 337–343.
Additional information
Action Editor: Alexander G. Dimitrov
Appendices
Appendix A: Proof of Lemma 1
Proof
First, we prove that \(\text{H}(\mathcal{Y} || \mathcal{X})\) exists. This proof closely follows the proof for the unconditional entropy rate in Cover and Thomas (2006). An important theorem used in the proof is the Cesaro mean theorem (Cover and Thomas 2006): for sequences of real numbers \((a_1, \ldots, a_n)\) and \((b_1, \ldots, b_n)\), if \(\lim_{n \to \infty} a_n = a\) and \(b_n = \frac{1}{n} \sum_{i=1}^n a_i\), then \(\lim_{n \to \infty} b_n = a\).
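As a concrete instance (added for illustration, not part of the original proof): if \(a_n = 1 + \frac{1}{n}\), then \(a_n \to 1\), and the running averages satisfy
\[
b_n = \frac{1}{n}\sum_{i=1}^n \left(1 + \frac{1}{i}\right) = 1 + \frac{1}{n}\sum_{i=1}^n \frac{1}{i} \;\longrightarrow\; 1,
\]
since \(\frac{1}{n}\sum_{i=1}^n \frac{1}{i} = \frac{\log n + O(1)}{n} \to 0\).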
By definition, \( \text{H}(Y^n || X^n) = \sum_{i=1}^n \text{H}(Y_i | Y^{i-1}, X^i) \). Since conditioning reduces entropy, entropy is nonnegative, and the processes are jointly stationary, observe that
\[
\text{H}(Y_{i+1} | Y^{i}, X^{i+1}) \;\le\; \text{H}(Y_{i+1} | Y_2^{i}, X_2^{i+1}) \qquad (46)
\]
\[
\;=\; \text{H}(Y_{i} | Y^{i-1}, X^{i}), \qquad (47)
\]
where Eq. (46) uses the property that conditioning reduces entropy (removing conditioning variables cannot decrease entropy) and Eq. (47) uses stationarity. Thus the sequence of real numbers \(a_i \triangleq \text{H}(Y_{i} | Y^{i-1}, X^{i})\) (once the process is defined, that is, once the underlying probability distribution is specified, these entropies are deterministic numbers) is nonincreasing and bounded below by 0. Therefore, the limit of \(a_n\) as \(n \to \infty\) exists, and by the Cesaro mean theorem, \(\text{H}(\mathcal{Y} || \mathcal{X}) \triangleq \lim_{n \to \infty} \frac{1}{n}\text{H}(Y^n || X^n)\) exists.
Next, taking \(X^n\) to be a deterministic sequence and following the above argument, \(\text{H}(\mathcal{Y}) \triangleq \lim_{n \to \infty} \frac{1}{n}\text{H}(Y^n)\) exists. Taking the limit in Eq. (24), \(\text{I}(\mathcal{X} \to \mathcal{Y}) \triangleq \lim_{n \to \infty} \frac{1}{n}\text{I}(X^n \to Y^n)\) also exists.□
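To spell out the final step (a brief reconstruction; it assumes Eq. (24) is the standard decomposition \(\text{I}(X^n \to Y^n) = \text{H}(Y^n) - \text{H}(Y^n || X^n)\), which the argument above suggests but which is not reproduced in this excerpt):
\[
\frac{1}{n}\,\text{I}(X^n \to Y^n) = \frac{1}{n}\,\text{H}(Y^n) - \frac{1}{n}\,\text{H}(Y^n || X^n) \;\longrightarrow\; \text{H}(\mathcal{Y}) - \text{H}(\mathcal{Y} || \mathcal{X}),
\]
so the normalized directed information converges because both normalized entropy terms were just shown to converge.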
Appendix B: Proof of Lemma 2
Proof
The normalized causal entropy can be rewritten as
where Eq. (48) follows from the definition of causally conditioned entropy, Eq. (49) follows from the chain rule for entropy, Eq. (50) follows from the Markov assumption, and Eq. (51) follows from the stationarity assumption.□
Cite this article
Quinn, C.J., Coleman, T.P., Kiyavash, N. et al. Estimating the directed information to infer causal relationships in ensemble neural spike train recordings. J Comput Neurosci 30, 17–44 (2011). https://doi.org/10.1007/s10827-010-0247-2