
Synergy, redundancy, and multivariate information measures: an experimentalist’s perspective


Information theory has long been used to quantify interactions between two variables. With the rise of complex systems research, multivariate information measures have been increasingly used to investigate interactions among groups of three or more variables, often with an emphasis on so-called synergistic and redundant interactions. While bivariate information measures are commonly agreed upon, the multivariate information measures in use today have been developed by many different groups and differ in subtle yet significant ways. Here, we review these multivariate information measures, with particular emphasis on their relationship to synergy and redundancy, and we examine the differences between these measures by applying them to several simple model systems. In addition to these systems, we illustrate the usefulness of the information measures by analyzing neural spiking data from a dissociated culture through early stages of its development. Our aim is that this work will aid other researchers as they seek the best multivariate information measure for their specific research goals and system. Finally, we have made software available online which allows the user to calculate all of the information measures discussed within this paper.





  2. Throughout the paper we will use capital letters to refer to variables and lower case letters to refer to individual values of those variables. We will also work with discrete variables, though several of the information measures discussed can be directly extended to continuous variables. When working with a continuous variable, various techniques exist, such as kernel density estimation, which can be used to infer a discrete distribution from a continuous variable. Logarithms will be base 2 throughout in order to produce information values in units of bits.

  3. We will use S to refer to a set of n variables such that $S = \{X_1, X_2, \ldots, X_n\}$ throughout the paper.

  4. It should be noted that DeWeese and Meister refer to the expression in Eq. (29) as the specific surprise.
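The binning approach to discretization mentioned in footnote 2 can be sketched in a few lines of Python. The fixed-width histogram here is a simple stand-in for kernel density estimation, and the function name is ours, not part of the paper's released software:

```python
import numpy as np

def entropy_bits(values, n_bins=10):
    """Estimate the entropy (in bits) of a continuous sample by first
    binning it into a discrete distribution (a simple alternative to
    kernel density estimation)."""
    counts, _ = np.histogram(values, bins=n_bins)
    p = counts / counts.sum()
    p = p[p > 0]                       # skip empty bins (0 log 0 = 0)
    return -np.sum(p * np.log2(p))     # log base 2 gives units of bits

x = np.random.default_rng(0).normal(size=10_000)
print(entropy_bits(x))                 # bounded above by log2(n_bins)
```

The choice of bin count matters in practice: too few bins wash out structure, while too many leave bins undersampled and bias the entropy estimate.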


Acknowledgments


We would like to thank Paul Williams, Randy Beer, Alexander Murphy-Nakhnikian, Shinya Ito, Ben Nicholson, Emily Miller, Virgil Griffith, and Elizabeth Timme for providing useful comments. We would also like to thank the anonymous reviewers for their helpful comments on this paper. Their input during the revision process was invaluable.

Author information


Corresponding author

Correspondence to Nicholas Timme.

Additional information

Action Editor: Jonathan David Victor

Conflict of Interest

The authors declare that they have no conflict of interest.


Appendix A: Additional total correlation derivation

Equation (14) can be rewritten as Eq. (15) by adding and subtracting several joint entropy terms and then using Eq. (2). For instance, when n = 3, we have:

$$\begin{array}{rll} TC(S)&=&\left(\sum\limits_{X_i \in S}H(X_i)\right)-H(S)\notag\\ &=& H(X_1)+H(X_2)+H(X_3)-H(X_1,X_2,X_3)\notag\\ &=& H(X_1)+H(X_2)-H(X_1,X_2)+H(X_1,X_2)\notag\\ &&{\kern2pt}+H(X_3)-H(X_1,X_2,X_3)\notag\\ &=&I(X_1;X_2)+I(X_1,X_2;X_3) \end{array} $$

A similar substitution can be performed for n > 3.
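The identity above is easy to check numerically. The following sketch (our own, in Python, not the paper's released software) builds a random joint distribution over three binary variables and verifies that Eq. (14) and the decomposition into the two mutual information terms agree:

```python
import numpy as np

rng = np.random.default_rng(1)
p = rng.random((2, 2, 2))   # joint distribution over (X1, X2, X3)
p /= p.sum()

def H(*keep):
    """Joint entropy in bits of the variables on the listed axes."""
    drop = tuple(i for i in range(3) if i not in keep)
    m = p.sum(axis=drop)
    m = m[m > 0]
    return -(m * np.log2(m)).sum()

tc = H(0) + H(1) + H(2) - H(0, 1, 2)     # Eq. (14)
i12 = H(0) + H(1) - H(0, 1)              # I(X1; X2)
i12_3 = H(0, 1) + H(2) - H(0, 1, 2)      # I(X1, X2; X3)
assert np.isclose(tc, i12 + i12_3)       # Eq. (15) for n = 3
```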

Appendix B: Additional dual total correlation derivation

Equation (16) can be rewritten as Eq. (18) by substituting the expression for the total correlation in Eq. (14) and then applying Eq. (2).

$$\begin{array}{rll} DTC(S)&=&\left(\sum\limits_{X_i \in S}H(S/X_i)\right)-(n-1)H(S)\\ &=&\left(\sum\limits_{X_i \in S}H(S/X_i)+H(X_i)\right)-nH(S)-TC(S)\\ &=&\left(\sum\limits_{X_i \in S}I(S/X_i;X_i)\right)-TC(S) \end{array} $$
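As with the total correlation, this identity can be checked numerically. The sketch below (our own Python, for n = 3, writing S/X_i for the set S with X_i removed as in the derivation above) compares both expressions on a random joint distribution:

```python
import numpy as np

rng = np.random.default_rng(2)
p = rng.random((2, 2, 2))   # joint distribution over (X1, X2, X3)
p /= p.sum()
n = 3

def H(*keep):
    """Joint entropy in bits of the variables on the listed axes."""
    drop = tuple(i for i in range(n) if i not in keep)
    m = p.sum(axis=drop)
    m = m[m > 0]
    return -(m * np.log2(m)).sum()

others = lambda i: tuple(j for j in range(n) if j != i)
Hs = H(*range(n))                                          # H(S)

dtc = sum(H(*others(i)) for i in range(n)) - (n - 1) * Hs  # Eq. (16)
tc = sum(H(i) for i in range(n)) - Hs                      # Eq. (14)
mi = sum(H(*others(i)) + H(i) - Hs for i in range(n))      # sum of I(S/Xi; Xi)
assert np.isclose(dtc, mi - tc)                            # Eq. (18)
```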

Appendix C: Model network

Given values for $p_r$, $p_{1y}$, $p_{12}$, and $p_{2y}$, the relevant conditional probabilities can be calculated in the following way:

$$ p(x_1=1)=p_r $$
$$ p(x_1=0)=1-p_r $$
$$ p(x_2=1|x_1=1)=p_r+p_{12}-p_rp_{12} $$
$$ p(x_2=0|x_1=1)=1-p(x_2=1|x_1=1) $$
$$ p(x_2=1|x_1=0)=p_r $$
$$ p(x_2=0|x_1=0)= 1 - p_r $$
$$ p(y=1|x_1=0,x_2=0)=p_r $$
$$ p(y=0|x_1=0,x_2=0)= 1 - p_r $$
$$ p(y=1|x_1=1,x_2=0)=p_r + p_{1y} - p_rp_{1y} $$
$$ p(y=0|x_1=1,x_2=0)=1-p(y=1|x_1=1,x_2=0) $$
$$ p(y=1|x_1=0,x_2=1)=p_r + p_{2y} - p_rp_{2y} $$
$$ p(y=0|x_1=0,x_2=1)=1-p(y=1|x_1=0,x_2=1) $$
$$\begin{array}{rll} p(y=1|x_1=1,x_2=1)&=& p_r + p_{1y} + p_{2y} -p_rp_{1y}-p_rp_{2y} \\ &&- p_{1y}p_{2y} + p_rp_{1y}p_{2y} \end{array} $$
$$ p(y=0|x_1=1,x_2=1)=1-p(y=1|x_1=1,x_2=1) $$

Once these conditional probabilities are defined, the joint probabilities $p(y, x_1, x_2)$ can be calculated using the general relationship between joint and conditional probabilities:

$$ p(A=a,B=b) = p(A=a|B=b)p(B=b) $$

The joint probabilities for the examples discussed in the main text of the article are shown in Table 10.
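Each conditional above has noisy-OR form, i.e. the probability that at least one of several independent sources is active (for example, $p(x_2=1|x_1=1) = 1-(1-p_r)(1-p_{12}) = p_r + p_{12} - p_rp_{12}$). A minimal Python sketch of the construction, with function names of our own choosing rather than the paper's released software:

```python
def joint_probabilities(p_r, p_12, p_1y, p_2y):
    """Joint distribution p(y, x1, x2) for the model network, built from
    the conditional probabilities above via the chain rule."""
    def noisy_or(*sources):
        # Probability that at least one independent source is active.
        prob_none = 1.0
        for q in sources:
            prob_none *= 1.0 - q
        return 1.0 - prob_none

    joint = {}
    for x1 in (0, 1):
        p_x1 = p_r if x1 else 1.0 - p_r
        on2 = noisy_or(p_r, x1 * p_12)                 # p(x2 = 1 | x1)
        for x2 in (0, 1):
            p_x2 = on2 if x2 else 1.0 - on2
            ony = noisy_or(p_r, x1 * p_1y, x2 * p_2y)  # p(y = 1 | x1, x2)
            for y in (0, 1):
                p_y = ony if y else 1.0 - ony
                joint[(y, x1, x2)] = p_x1 * p_x2 * p_y  # chain rule
    return joint

probs = joint_probabilities(p_r=0.2, p_12=0.3, p_1y=0.4, p_2y=0.5)
assert abs(sum(probs.values()) - 1.0) < 1e-12
```

Multiplying an argument of `noisy_or` by the corresponding binary state simply drops that source when its input is off, which reproduces all eight conditionals listed above from a single expression.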

Table 10 Joint probabilities for examples 9 to 13

Cite this article

Timme, N., Alford, W., Flecker, B. et al. Synergy, redundancy, and multivariate information measures: an experimentalist’s perspective. J Comput Neurosci 36, 119–140 (2014).


Keywords

  • Information theory
  • Multivariate information measures
  • Complex systems
  • Neural coding
  • Dissociated neuronal cultures
  • Multielectrode array