Four simple axioms of dependence measures

Metrika

Abstract

Recently new methods for measuring and testing dependence have appeared in the literature. One way to evaluate and compare these measures with each other and with classical ones is to consider what are reasonable and natural axioms that should hold for any measure of dependence. We propose four natural axioms for dependence measures and establish which axioms hold or fail to hold for several widely applied methods. All of the proposed axioms are satisfied by distance correlation. We prove that if a dependence measure is defined for all bounded nonconstant real valued random variables and is invariant with respect to all one-to-one measurable transformations of the real line, then the dependence measure cannot be weakly continuous. This implies that the classical maximal correlation cannot be continuous and thus its application is problematic. The recently introduced maximal information coefficient has the same disadvantage. The lack of weak continuity means that as the sample size increases the empirical values of a dependence measure do not necessarily converge to the population value.

Author information

Corresponding author

Correspondence to Tamás F. Móri.

Additional information

T. F. Móri was supported by the Hungarian National Research, Development and Innovation Office NKFIH—Grant No. K125569. Part of this research was based on work supported by the National Science Foundation, while the second author was working at the Foundation. G. J. Székely is grateful for many interesting discussions with Yakir and David Reshef, Abram M. Kagan, and Gábor Tusnády.

Appendix

Proof of Proposition 1

Without loss of generality assume that \(E[X]=0\). Let Q denote the distribution of X on the Borel sets of \(\mathbb {R}\). We have to find a 1–1 function f such that \(\int xf(x)\,dQ=0\).

By assumption, there exist real numbers \(t_1<t_2<t_3\) such that each of the intervals \((-\infty ,t_1]\), \((t_1,t_2]\), \((t_2,t_3]\), \((t_3,+\infty )\) has positive measure (w.r.t. Q). Let \(\delta \) be a suitably small positive number (the meaning of “suitably” will be made clear later). One can find \(t_0<t_1\) and \(t_4>t_3\) such that both \(Q(-\infty ,t_0]\) and \(Q(t_4,+\infty )\) are less than \(\delta \) (possibly 0).

Let the intervals \((-\infty ,t_0]\), \((t_0,t_1]\), \((t_1,t_2]\), \((t_2,t_3]\), \((t_3,t_4]\), and \((t_4,+\infty )\) be denoted by \(A_0\), \(A_1\), \(A_2\), \(A_3\), \(A_4\), and \(A_5\), respectively. Introduce

$$\begin{aligned} \mu _i=\int _{A_i}x\,dQ,\quad \sigma _i^2=\int _{A_i}x^2dQ,\quad 0\le i\le 5. \end{aligned}$$

Then \(\mu _0+\cdots +\mu _5=0\).

It is not hard to see that there exist real constants \(a_1,a_2,a_3,a_4\), all different, such that

$$\begin{aligned} a_1(\mu _0+\mu _1)+a_2\mu _2+a_3\mu _3+a_4(\mu _4+\mu _5)=0. \end{aligned}\tag{1}$$

Indeed, consider the hyperplane \(\mathcal {L}\) of all vectors \((a_1,a_2,a_3,a_4) \in \mathbb {R}^4\) satisfying (1). \(\mathcal {L}\) cannot coincide with the hyperplane \(\mathcal {L}_{1,2}=\{a_1=a_2\}\), because \(\mathcal {L}_{1,2}\) is orthogonal to the vector \((1,-1,0,0)\), which is not parallel to \((\mu _0+\mu _1,\mu _2,\mu _3,\mu _4+\mu _5)\), since the latter can have at most one zero coordinate. Thus, \(\dim (\mathcal {L}\cap \mathcal {L}_{1,2})=2\). The same holds for \(\mathcal {L}_{i,j}\), the hyperplane defined by the equality \(a_i=a_j\) \((i\ne j)\). Since \(\mathcal {L}\) cannot be covered by six of its lower dimensional subspaces, the existence of a vector in \(\mathcal {L}\) with pairwise different coordinates follows.
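The covering argument above can be illustrated numerically. The following is a minimal sketch, not part of the original proof: the function name `distinct_vector_in_hyperplane`, the sampling range, and the retry limit are assumptions made for illustration. It projects random integer vectors onto the hyperplane orthogonal to a given vector `m` (standing in for \((\mu _0+\mu _1,\mu _2,\mu _3,\mu _4+\mu _5)\)) and returns the first projection whose coordinates are pairwise different. Exact rational arithmetic is used so that the orthogonality check is not affected by rounding.

```python
import random
from fractions import Fraction
from itertools import combinations


def distinct_vector_in_hyperplane(m, tries=100, seed=0):
    """Return a vector a with a . m == 0 and pairwise different coordinates.

    Illustrative helper (hypothetical name): random integer vectors are
    projected onto the hyperplane orthogonal to m until one projection has
    pairwise different coordinates.
    """
    rng = random.Random(seed)
    m = [Fraction(x) for x in m]
    norm2 = sum(x * x for x in m)
    if norm2 == 0:
        raise ValueError("m must be a nonzero vector")
    for _ in range(tries):
        v = [Fraction(rng.randint(-10, 10)) for _ in m]
        # Orthogonal projection of v onto the hyperplane {a : a . m = 0}.
        coef = sum(vi * mi for vi, mi in zip(v, m)) / norm2
        a = [vi - coef * mi for vi, mi in zip(v, m)]
        if all(x != y for x, y in combinations(a, 2)):
            return a
    raise RuntimeError("no vector with pairwise different coordinates found")


# Hypothetical input: coordinates sum to 0 and none of them vanishes.
m_example = [Fraction(-3, 4), Fraction(1, 4), Fraction(1, 8), Fraction(3, 8)]
a_example = distinct_vector_in_hyperplane(m_example)
```

If `m` is parallel to a vector such as \((1,-1,0,0)\), every projection has two equal coordinates and the sketch raises `RuntimeError`, which mirrors the case excluded in the proof.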

Let \(K>\max _{1\le i\le 4}|a_i|\). By continuity, if \(\delta \) is small enough, one can find constants \(b_1,b_2,b_3,b_4\) all different, such that \(\max _{1\le i\le 4}|b_i|<K\), and

$$\begin{aligned} -K\mu _0+b_1\mu _1+b_2\mu _2+b_3\mu _3+b_4\mu _4+K\mu _5=0. \end{aligned}$$
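One way to spell out the continuity step is sketched here; the perturbations \(\eta _i\) are introduced only for illustration and do not appear in the original proof. Write \(b_i=a_i+\eta _i\) for \(1\le i\le 4\). Subtracting (1) from the equation above leaves

$$\begin{aligned} \sum _{i=1}^4 \eta _i\mu _i=(K+a_1)\mu _0-(K-a_4)\mu _5. \end{aligned}$$

Since X is integrable, \(\mu _0\) and \(\mu _5\) tend to 0 as \(\delta \rightarrow 0\), so the right-hand side can be made arbitrarily small, while at least one of \(\mu _1,\dots ,\mu _4\) stays bounded away from 0, because \((\mu _0+\mu _1,\mu _2,\mu _3,\mu _4+\mu _5)\) has at most one zero coordinate. Hence the \(\eta _i\) can be taken so small that the \(b_i\) remain pairwise different and smaller than K in absolute value.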

Finally, choose \(c_0,c_1,\dots ,c_5\) in such a way that none of them is equal to 0, \(c_0\) and \(c_5\) are positive, and \(\sum _{i=0}^5 c_i\sigma _i^2=0\). This can be done, because at least three of the quantities \(\sigma _i^2\) are positive.

Now, let \(b_0=-K\), \(b_5=K\), and \(f(x)=b_i+\varepsilon c_i x\) if \(x\in A_i\), \(0\le i\le 5\). Then f is injective provided \(\varepsilon \) is a sufficiently small positive number, and

$$\begin{aligned} \int _{\mathbb {R}} xf(x)\,dQ=\sum _{i=0}^5 (b_i\mu _i+\varepsilon c_i \sigma _i^2)=0, \end{aligned}$$

as needed.
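The construction can be checked on a small example. The sketch below is an illustration under stated assumptions, not the authors' code: the six-point distribution `atoms`, the cut points `cuts`, and the constants `K`, `b`, `c`, `eps` are all hypothetical and were chosen by hand so that \(\sum b_i\mu _i=0\) and \(\sum c_i\sigma _i^2=0\). In this toy case \(Q(A_0)=Q(A_5)=1/8\), so \(\delta \) is not particularly small, yet suitable constants still exist. The code verifies the integral identity exactly and checks injectivity only on the support of Q.

```python
from bisect import bisect_left
from fractions import Fraction as F


def interval_index(x, cuts):
    """Index i of the interval A_i containing x, for cuts t_0 < ... < t_4.

    A_0 = (-inf, t_0], A_i = (t_{i-1}, t_i] for 1 <= i <= 4, A_5 = (t_4, +inf).
    """
    return bisect_left(cuts, x)


def f(x, cuts, b, c, eps):
    """Piecewise affine map f(x) = b_i + eps * c_i * x for x in A_i."""
    i = interval_index(x, cuts)
    return b[i] + eps * c[i] * x


def integral_x_f(atoms, cuts, b, c, eps):
    """Integral of x * f(x) with respect to a discrete Q given as {x: Q({x})}."""
    return sum(q * x * f(x, cuts, b, c, eps) for x, q in atoms.items())


# Hypothetical mean-zero distribution with one atom in each interval A_i.
atoms = {F(-3): F(1, 8), F(-2): F(1, 8), F(-1): F(1, 4),
         F(1): F(1, 4), F(2): F(1, 8), F(3): F(1, 8)}
cuts = [F(-5, 2), F(-3, 2), F(0), F(3, 2), F(5, 2)]  # t_0, ..., t_4

# Hand-picked constants: b_0 = -K, b_5 = K, |b_i| < K otherwise;
# c_0 and c_5 positive, no c_i equal to 0.
K = F(4)
b = [-K, F(7, 2), F(3), F(-5, 2), F(-3), K]
c = [F(1), F(-2), F(-1, 2), F(-1, 2), F(-2), F(1)]
eps = F(1, 100)

total = integral_x_f(atoms, cuts, b, c, eps)        # integral of x f(x) dQ
values = [f(x, cuts, b, c, eps) for x in atoms]     # f on the support of Q
```

Because both weighted sums vanish separately, the integral is zero for every choice of `eps`; the smallness of `eps` matters only for injectivity.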

Such an f cannot exist if X can take on exactly two values, because in that case uncorrelatedness is equivalent to independence.

When the distribution of X is concentrated on exactly 3 points, and X is supposed to have mean 0, then such an f exists if and only if zero is not among the possible values of X. (If \(E[X]=0\) is not supposed, the necessary and sufficient condition for f to exist is \(P(X=E[X])=0\).) Indeed, let \(x_1<x_2<x_3\) be the possible values of X, with probabilities \(q_1,q_2,q_3\), respectively. Then \(q_1x_1+q_2x_2+q_3x_3=0\), and \(x_1<0<x_3\). We are looking for pairwise different real numbers \(f_1,f_2,f_3\) such that \(q_1x_1f_1+q_2x_2f_2+ q_3x_3f_3=0\). If \(x_2=0\), then \(q_1x_1=-q_3x_3\ne 0\), so this can only be achieved with \(f_1=f_3\), and f cannot be injective. In the complementary case \(f_1=-1\), \(f_3=1\) and \(f_2=(q_1x_1-q_3x_3)/(q_2x_2)\) will do, because \(f_2=1\) would imply \(-q_1x_1+q_2x_2+q_3x_3=0\), hence \(q_1x_1=0\), which is not allowed, and similarly, \(f_2=-1\) would imply \(q_3x_3=0\). \(\square \)
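The three-point case lends itself to a direct check. The following minimal sketch uses exact rational arithmetic; the function name `three_point_f` and the example distribution are assumptions made for illustration. It returns the values \(f_1=-1\), \(f_2=(q_1x_1-q_3x_3)/(q_2x_2)\), \(f_3=1\) from the argument above and refuses the case \(x_2=0\).

```python
from fractions import Fraction


def three_point_f(xs, qs):
    """Values (f_1, f_2, f_3) with q_1 x_1 f_1 + q_2 x_2 f_2 + q_3 x_3 f_3 = 0.

    Illustrative helper (hypothetical name) for a mean-zero distribution on
    x_1 < x_2 < x_3 with probabilities q_1, q_2, q_3.
    """
    x1, x2, x3 = (Fraction(x) for x in xs)
    q1, q2, q3 = (Fraction(q) for q in qs)
    if not x1 < x2 < x3:
        raise ValueError("values must satisfy x_1 < x_2 < x_3")
    if q1 * x1 + q2 * x2 + q3 * x3 != 0:
        raise ValueError("the distribution must have mean 0")
    if x2 == 0:
        # Only f_1 = f_3 works here, so no injective f exists.
        raise ValueError("no injective f exists when 0 is a possible value")
    f1, f3 = Fraction(-1), Fraction(1)
    f2 = (q1 * x1 - q3 * x3) / (q2 * x2)
    return f1, f2, f3


# Hypothetical mean-zero example: values -2, 1, 3 with probabilities 4/9, 7/18, 1/6.
xs = (-2, 1, 3)
qs = (Fraction(4, 9), Fraction(7, 18), Fraction(1, 6))
fs = three_point_f(xs, qs)
weighted_sum = sum(q * x * fv for q, x, fv in zip(qs, xs, fs))
```

For this example \(f_2=-25/7\), which differs from both \(-1\) and \(1\), so the three values are pairwise different and the weighted sum vanishes.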

Cite this article

Móri, T.F., Székely, G.J. Four simple axioms of dependence measures. Metrika 82, 1–16 (2019). https://doi.org/10.1007/s00184-018-0670-3

  • DOI: https://doi.org/10.1007/s00184-018-0670-3
