Means and Covariance Functions for Geostatistical Compositional Data: an Axiomatic Approach

Abstract

This work focuses on the characterization of the central tendency of a sample of compositional data. It provides new results on the theoretical properties of means and covariance functions for compositional data, from an axiomatic perspective, that shed new light on the geostatistical modeling of such data. As a first result, it is shown that the weighted arithmetic mean is the only central tendency characteristic satisfying a small set of axioms, namely continuity, reflexivity, and marginal stability. Moreover, this set of axioms also implies that the weights must be identical for all parts of the composition. This result has deep consequences for the spatial multivariate covariance modeling of compositional data. In a geostatistical setting, it is shown as a second result that the proportional model of covariance functions (i.e., the product of a covariance matrix and a single correlation function) is the only model providing identical kriging weights for all components of the compositional data. As a consequence of these two results, the proportional model is the only covariance model compatible with reflexivity and marginal stability.
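As an illustration of the first result, the following minimal Python sketch (NumPy assumed; all names are hypothetical, not from the paper) checks that a weighted arithmetic mean using a single set of weights shared by all parts maps compositions back onto the simplex:

```python
import numpy as np

# Weighted arithmetic mean with one set of weights shared by all parts:
# the result is itself a composition (nonnegative parts summing to 1).
rng = np.random.default_rng(0)
n, p = 5, 3                              # n compositions with p parts
X = rng.dirichlet(np.ones(p), size=n)    # each row sums to 1

w = rng.dirichlet(np.ones(n))            # weights summing to 1, shared by all parts
m = w @ X                                # weighted arithmetic mean, part by part

print(m.sum())                           # 1.0 up to floating point error
```

Because the same weights are applied to every part, closure is preserved exactly; part-specific weights would in general break this property.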

References

  1. Aczél J (1966) Lectures on functional equations and their applications. Academic Press, New York

  2. Aczél J, Dhombres J (1989) Functional equations in several variables. Cambridge University Press, Cambridge

  3. Aitchison J (1982) The statistical analysis of compositional data. J R Stat Soc Ser B (Stat Methodol) 44(2):139–177

  4. Aitchison J (1986) The statistical analysis of compositional data. Chapman and Hall, London

  5. Aitchison J (1989) Measures of location of compositional data sets. Math Geol 24(4):365–379

  6. Arrow KJ (1950) A difficulty in the concept of social welfare. J Political Econ 58(4):328–346

  7. Billheimer D, Guttorp P, Fagan WF (2001) Statistical interpretation of species composition. J Am Stat Assoc 96:1205–1214

  8. Bogaert P (2002) Spatial prediction of categorical variables: the Bayesian maximum entropy approach. Stoch Environ Res Risk A 16:425–448

  9. Chilès JP, Delfiner P (2012) Geostatistics: modeling spatial uncertainty, 2nd edn. Wiley, London

  10. Cressie N (1993) Statistics for spatial data, Revised edn. Wiley, London

  11. Egozcue JJ, Pawlowsky-Glahn V (2011) Basic concepts and procedures. In: Pawlowsky-Glahn V, Buccianti A (eds) Compositional data analysis: theory and applications. Wiley, London

  12. Egozcue JJ, Pawlowsky-Glahn V, Mateu-Figueras G, Barceló-Vidal C (2003) Isometric logratio transformations for compositional data analysis. Math Geol 35(3):279–300

  13. Griva I, Nash SG, Sofer A (2009) Linear and nonlinear optimization, 2nd edn. SIAM, Philadelphia

  14. Helterbrand JD, Cressie N (1994) Universal cokriging under intrinsic coregionalization. Math Geol 26:205–226

  15. Kolmogorov A (1930) Sur la notion de la moyenne. Atti R Accad Naz Lincei Mem Cl Sci Fis Mat Natur Sez 12:323–343

  16. Liu RY, Parelius JM, Singh K (1999) Multivariate analysis by data depth: descriptive statistics, graphics and inference (with discussion and a rejoinder by Liu and Singh). Ann Stat 27(3):783–858

  17. Mateu-Figueras G, Pawlowsky-Glahn V, Egozcue JJ (2011) The principle of working on coordinates. In: Pawlowsky-Glahn V, Buccianti A (eds) Compositional data analysis: theory and applications. Wiley, London

  18. Matkowski J (2010) Generalized weighted quasi-arithmetic means. Aequ Math 79:203–212

  19. Miller WE (2002) Revisiting the geometry of a ternary diagram with the half-taxi metric. Math Geol 34(3):275–290

  20. Pawlowsky-Glahn V, Egozcue JJ (2002) BLU estimators and compositional data. Math Geol 34(3):259–274

  21. Pawlowsky-Glahn V, Olea RA (2004) Geostatistical analysis of compositional data. Oxford University Press, Oxford

  22. Scealy JL, Welsh AH (2014) Colours and cocktails: compositional data analysis 2013 Lancaster lecture. Aust NZ J Stat 56(2):145–169

  23. Sharp WE (2006) The graph median—a stable alternative measure of central tendency for compositional data sets. Math Geol 38:221–229

  24. Shurtz RF (2000) Comment on “Logratios and natural laws in compositional data analysis” by J Aitchison. Math Geol 32:645–647

  25. Tolosana-Delgado R, van den Boogaart K, Pawlowsky-Glahn V (2011) Geostatistics for compositions. In: Pawlowsky-Glahn V, Buccianti A (eds) Compositional data analysis: theory and applications. Wiley, London

  26. Vardi Y, Zhang CH (2000) The multivariate \(L_1\)-median and associated data depth. Proc Natl Acad Sci 97(4):1423–1426

  27. Wackernagel H (2013) Multivariate geostatistics: an introduction with applications. Springer, Berlin

  28. Walvoort DJJ, de Gruijter JJ (2001) Compositional kriging: a spatial interpolation method for compositional data. Math Geol 33(8):951–966

  29. Zuo Y, Serfling R (2000) General notions of statistical depth function. Ann Stat 28(2):461–482

Acknowledgements

We are truly indebted to one Advisory Editor and to the Editor-in-Chief for their very constructive comments, which helped to improve this paper.

Author information

Corresponding author

Correspondence to Denis Allard.

Additional information

Denis Allard and Thierry Marchant contributed equally to this work. They are listed alphabetically.

Appendices

Appendix A: Proof of Theorem 2

The simplex is a closed set; that is, some parts of the compositions are allowed to be equal to 0. While this might be a problem for the definition of log-ratios, it is not a problem here. Clearly, the conditions are necessary: it is simple to check that (5) satisfies conditions C1–C3 for part A and C1–C2 for part B.

For the sufficiency, assume conditions C1–C3. By marginal stability, one has

$$\begin{aligned} M^{k}(\mathbf {x}_{1}, \ldots , \mathbf {x}_{n}) = F_{k}\left( x_{1}^{k}, \ldots , x_{n}^{k}\right) \end{aligned}$$

for some function \(F_{k}: \ [0,1]^{n} \rightarrow \mathbb {R}\), for any \(k \in \{ 1, \ldots , p \}\). If continuity is assumed, as in part A, then, by the extreme value theorem, \(F_{k}\) is bounded since \(x_{i}^k \in [0,1]\) for all \(i=1,\dots ,n\). If, as in part B, one assumes \(\mathbf {M}(\mathbf {x}_1,\dots ,\mathbf {x}_n) \in \mathbb {S}^{p-1}\), then \(F_{k}\) is also bounded. Choose any distinct \(l,l',l'' \in \{1, \ldots , p\}\) and suppose \(x_{i}^{k}=0\) for all \(k \ne l,l',l''\) and all \(i \in \{1, \ldots , n\}\). Since \(\sum _{k=1}^{p} M^{k}(\mathbf {x}_{1}, \ldots , \mathbf {x}_{n}) =1\) and \(x_{i}^{l''} = 1 - x_{i}^{l} - x_{i}^{l'}\) for all \(i \in \{1, \ldots , n\}\), it is the case that

$$\begin{aligned} F_{l}\left( x_{1}^{l}, \ldots , x_{n}^{l}\right) + F_{l'}\left( x_{1}^{l'}, \ldots , x_{n}^{l'}\right) + F_{l''}\left( 1 - x_{1}^{l} - x_{1}^{l'}, \ldots , 1 - x_{n}^{l} - x_{n}^{l'}\right) =1. \end{aligned}$$
(9)

Let us define the mapping \(G: \ [0,1]^{n} \rightarrow \mathbb {R}\) by \(G(u_{1}, \ldots , u_{n}) = 1-F_{l''}(1-u_{1}, \ldots , 1-u_{n})\). Equation (9) then becomes

$$\begin{aligned} F_{l}\left( x_{1}^{l}, \ldots , x_{n}^{l}\right) + F_{l'}\left( x_{1}^{l'}, \ldots , x_{n}^{l'}\right) = G\left( x_{1}^{l} + x_{1}^{l'}, \ldots , x_{n}^{l} + x_{n}^{l'}\right) , \end{aligned}$$
(10)

for all \(x_{i}^{l}, x_{i}^{l'} \in [0,1]\) such that \(x_{i}^{l} + x_{i}^{l'} \le 1\), \(i = 1, \ldots , n\). In particular, it holds for all \(x_{i}^{l}, x_{i}^{l'} \in [0,1/2]\). Equation (10) is a generalized Pexider equation and, because \(F_l\), \(F_{l'}\), and \(G\) are bounded, its unique solution is

$$\begin{aligned} F_{l}(u_{1}, \ldots , u_{n})&= \lambda _{1} u_{1} + \cdots + \lambda _{n} u_{n} + \gamma _{l},&\forall u_{1}, \ldots , u_{n} \in [0,1/2],\\ F_{l'}(u_{1}, \ldots , u_{n})&= \lambda _{1} u_{1} + \cdots + \lambda _{n} u_{n} + \gamma _{l'},&\forall u_{1}, \ldots , u_{n} \in [0,1/2],\\ G( u_{1}, \ldots , u_{n} )&= \lambda _{1} u_{1} + \cdots + \lambda _{n} u_{n} + \gamma _{l} + \gamma _{l'},&\forall u_{1}, \ldots , u_{n} \in [0,1], \end{aligned}$$

for some real numbers \(\lambda _{1}, \ldots , \lambda _{n}, \gamma _{l}\), and \(\gamma _{l'}\) (Aczél 1966). The expression for G yields \(F_{l''}( u_{1}, \ldots , u_{n} ) = 1- G(1-u_{1}, \ldots , 1- u_{n}) = \lambda _{1} u_{1} + \cdots + \lambda _{n} u_{n} + \beta _{l''}\) for all \(u_{1}, \ldots , u_{n} \in [0,1]\) and for some real \(\beta _{l''}\).

Since the choice of the components l, \(l'\), and \(l''\) in the above reasoning is arbitrary, one obtains \(M^{k}(\mathbf {x}_{1}, \ldots , \mathbf {x}_{n}) =F_{k}(x_{1}^{k}, \ldots , x_{n}^{k}) = \sum _{i=1}^{n} \lambda _{i} x_{i}^{k} + \beta _{k}\), for all \(k = 1, \ldots , p\) and all \(x_{1}^{k}, \ldots , x_{n}^{k}\) in [0, 1]. By reflexivity, \(F_{k}(u, \ldots , u) = \sum _{i=1}^{n} \lambda _{i} u + \beta _{k} = u\), for all \(u \in [0,1]\) and for all \(k = 1, \ldots , p\). This is possible only if \(\beta _{k}=0\) for all \(k = 1, \ldots , p\) and \(\sum _{i=1}^{n} \lambda _{i} = 1\). \(\square \)
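The affine solution of the generalized Pexider equation can be checked numerically; the following minimal Python sketch (NumPy assumed, all names hypothetical) verifies that the functions \(F_{l}\), \(F_{l'}\), and \(G\) given above indeed satisfy Eq. (10):

```python
import numpy as np

# Numerical sanity check (not part of the proof): the affine functions
# F_l, F_l' and G solve the generalized Pexider equation (10).
rng = np.random.default_rng(1)
n = 4
lam = rng.dirichlet(np.ones(n))          # lambda_1, ..., lambda_n, summing to 1
gam_l, gam_lp = 0.1, -0.2                # arbitrary constants gamma_l, gamma_l'

F_l  = lambda u: lam @ u + gam_l
F_lp = lambda u: lam @ u + gam_lp
G    = lambda u: lam @ u + gam_l + gam_lp

x_l  = rng.uniform(0, 0.5, n)            # x_i^l in [0, 1/2]
x_lp = rng.uniform(0, 0.5, n)            # x_i^{l'} in [0, 1/2]

lhs = F_l(x_l) + F_lp(x_lp)
rhs = G(x_l + x_lp)
print(abs(lhs - rhs))                    # ~0 up to floating point error
```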

Appendix B: Proof of Theorem 3

In the following, \(\mathbf {1}_{n}\) denotes a vector of ones of length n, \(\mathbf {I}_n\) denotes the identity matrix of dimension n, and \(\mathbf {0}_{p,q}\) denotes a \(p \times q\) matrix of zeros. If \(\mathbf {A}\) is an \(m \times n\) matrix and \(\mathbf {B}\) is a \(p \times q\) matrix, the Kronecker product \(\mathbf {A}\otimes \mathbf {B}\) is the \(mp \times nq\) block matrix

$$\begin{aligned} \mathbf {A}\otimes \mathbf {B}= \left( \begin{array}{c@{\quad }c@{\quad }c} a_{11} \mathbf {B}&{} \cdots &{} a_{1n} \mathbf {B}\\ \vdots &{} \ddots &{} \vdots \\ a_{m1} \mathbf {B}&{} \cdots &{} a_{mn} \mathbf {B}\end{array} \right) . \end{aligned}$$

In particular, the matrix \(\mathbf {J}= \mathbf {I}_p \otimes \mathbf {1}_n\) is the \(np \times p\) matrix

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c} \mathbf {1}_{n} &{} \cdots &{} \mathbf {0}_{n,1} \\ \vdots &{} \ddots &{} \vdots \\ \mathbf {0}_{n,1} &{} \cdots &{} \mathbf {1}_{n} \end{array} \right) . \end{aligned}$$
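The block matrices used below can be assembled directly with NumPy's Kronecker product; a minimal sketch (names hypothetical):

```python
import numpy as np

# The constraint matrix J = I_p (x) 1_n: an np x p block matrix whose
# k-th column has ones exactly in the k-th block of n rows.
n, p = 3, 2
J = np.kron(np.eye(p), np.ones((n, 1)))

print(J.shape)   # (6, 2)
```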
  1. (A)

    Unconstrained kriging is first considered. For each variable \(k=1,\dots ,p\), the kriging of the mean, \(\hat{m}_k\), is a linear combination of the data

    $$\begin{aligned} \hat{m}_k = \sum _{l=1}^p \mathbf {X}_l^\top \pmb {\lambda }^k_l, \end{aligned}$$

    where \(\mathbf {X}_l = (x_{1,l},\dots ,x_{n,l})^\top \) and \(\pmb {\lambda }^k_l = (\lambda ^k_{1,l},\dots ,\lambda ^k_{n,l})^\top \). Unbiasedness conditions impose

    $$\begin{aligned} \mathbf {1}_{n}^\top \pmb {\lambda }^k_k=1\ \ \hbox {and} \ \ \mathbf {1}_{n}^\top \pmb {\lambda }^k_l=0,\ \hbox {for}\ l \ne k. \end{aligned}$$

    Let \(\pmb {\lambda }^k\) be the stacked np-vector \(\pmb {\lambda }^k = ( (\pmb {\lambda }^k_1)^\top ,\dots ,(\pmb {\lambda }^k_p)^\top )^\top \) and let \(\pmb {\Lambda }=(\pmb {\lambda }^1,\dots ,\pmb {\lambda }^p)\) be the \((np \times p)\) matrix of kriging weights. When solved simultaneously, the p kriging equations are, in matrix notation,

    $$\begin{aligned} \left( \begin{array}{c@{\quad }c} \mathbf {C}&{} \mathbf {J}\\ \mathbf {J}^t &{} \mathbf {0}_{p,p} \end{array} \right) \left( \begin{array}{c} \pmb {\Lambda }\\ -\pmb {\mu }\end{array} \right) = \left( \begin{array}{c} \mathbf {0}_{np,p} \\ \mathbf {I}_p \end{array} \right) , \end{aligned}$$
    (11)

    where \(\pmb {\mu }\) is the \(p \times p\) matrix of Lagrange multipliers. The matrix \(\mathbf {C}\) arises from a valid model of covariance functions; provided no two data points share the same location, it is invertible. The general solution for \(\pmb {\Lambda }\) therefore satisfies

    $$\begin{aligned} \mathbf {C}\pmb {\Lambda }= \mathbf {J}\pmb {\mu }, \end{aligned}$$
    (12)

    where \(\pmb {\mu }=(\mathbf {J}^\top \mathbf {C}^{-1} \mathbf {J})^{-1}\) with elements \(\mu _{kl}\), \(1 \le k,l \le p\).

    For the kriging weights to be identical for all components, \(\pmb {\lambda }^k\) must, for each \(k=1,\dots ,p\), be a vector of zeros except at the coordinates corresponding to the k-th variable, where the weights are equal to a common vector \(\pmb {\lambda }\), i.e., \(\pmb {\lambda }^k = (\mathbf {0}_{1,n(k-1)}, \pmb {\lambda }^\top , \mathbf {0}_{1,n(p-k)})^\top \). Hence, \(\pmb {\Lambda }= \mathbf {I}_p \otimes \pmb {\lambda }\). Thus, Eq. (12) becomes

    $$\begin{aligned} \mathbf {C}(\mathbf {I}_p \otimes \pmb {\lambda }) = \mathbf {J}\pmb {\mu }. \end{aligned}$$
    (13)

    Then, Eq. (13) is equivalent to

    $$\begin{aligned} \mathbf {C}_{kl} \pmb {\lambda }= \mu _{kl} \mathbf {1}_n, \quad \forall \ 1 \le k,l \le p. \end{aligned}$$
    (14)

    Since \(\mathbf {C}\) is invertible, \(\mathbf {C}_{kk}\) is invertible and \(\mu _{kk} \ne 0\), for all \(k=1,\dots ,p\). Hence, plugging \(\mu _{kk} \mathbf {C}_{kk}^{-1} \mathbf {1}_n = \pmb {\lambda }\) into Eq. (14) leads to

    $$\begin{aligned} \frac{\mu _{kk}}{\mu _{ll}} \mathbf {C}_{kk}^{-1} \mathbf {C}_{ll} = \mathbf {I}_n, \ \forall \ 1 \le k,l \le p. \end{aligned}$$

    This condition shows that \(\mathbf {C}_{kk} = a_{kk} \mathbf {R}\), where \(\mathbf {R}\) is a correlation matrix and \(a_{kk}>0\). With a similar argument, one can show that \(\mathbf {C}_{kl} = a_{kl} \mathbf {R}\) when \(k \ne l\). In conclusion, a single correlation matrix describes all direct and cross-covariance matrices

    $$\begin{aligned} \mathbf {C}= \pmb {\Sigma }\otimes \mathbf {R}. \end{aligned}$$

    In other words, the model is proportional.

  2. (B)

    Nonnegativity of the kriging weights is now imposed. This constrained kriging is the solution of the quadratic system

    $$\begin{aligned}&\mathop {\min }\nolimits _{\pmb {\lambda }} \quad \sum _{k=1}^p \pmb {\lambda }^\top \mathbf {C}_{kk} \pmb {\lambda }\\&\hbox {s.t.} \,\,\quad \quad \pmb {\lambda }^\top \mathbf {1}_n = 1 \\&\qquad \qquad \pmb {\lambda }\ge \mathbf {0}_n, \end{aligned}$$

    where the last inequality must be satisfied componentwise. The Kuhn–Tucker stationary conditions (Griva et al. 2009) corresponding to this system are

    $$\begin{aligned} \mathbf {C}_{kk} \pmb {\lambda }- \mu _k \mathbf {1}_n - \pmb {\alpha }= & {} 0 \qquad k=1,\dots ,p\\ \pmb {\lambda }^\top \mathbf {1}_n= & {} 1\\ \alpha _i\ge & {} 0 \qquad i=1,\dots ,n\\ \alpha _i \lambda _i= & {} 0 \qquad i=1,\dots ,n, \end{aligned}$$

    where \(\pmb {\alpha }= (\alpha _1,\dots ,\alpha _n)^\top \) and \(\pmb {\mu }= (\mu _1,\dots ,\mu _p)^\top \) are the \(n+p\) Lagrange multipliers. A nonnegativity constraint is said to be active if \(\alpha _i>0\) and nonactive when \(\alpha _i=0\).

    The Lagrange multipliers \(\pmb {\alpha }\), and hence the set of active constraints, are identical for all \(k=1,\dots ,p\). Let us reorder the dataset so that the first m elements, with \(1 < m \le n\), correspond to inactive constraints, that is, \(\alpha _i=0\) and \(\lambda _i > 0\) for \(i=1,\dots ,m\). Let us denote by \(\pmb {\lambda }_m\) the corresponding vector of nonnull kriging weights and by \(\mathbf {C}_{kk}^{m,m}\) the \(m\times m\) matrix built from the first m rows and columns of \(\mathbf {C}_{kk}\). Let us also denote by \(\mathbf {C}_{kk}^{m,n-m}\) the \(m\times (n-m)\) matrix with elements from the first m rows and the last \(n-m\) columns. Then, \(\pmb {\lambda }_m\) is the solution of the kriging system

    $$\begin{aligned} \mathbf {C}_{kk}^{m,m} \pmb {\lambda }_m - \mu _k \mathbf {1}_m= & {} 0 \qquad k=1,\dots ,p \end{aligned}$$
    (15)
    $$\begin{aligned} \pmb {\lambda }_m^\top \mathbf {1}_m= & {} 1 \nonumber \\ \left( \mathbf {C}_{kk}^{m,n-m}\right) ^\top \pmb {\lambda }_m - \mu _k \mathbf {1}_{n-m} - \pmb {\alpha }_{n-m}= & {} 0 \qquad k=1,\dots ,p \nonumber \\ \alpha _i\ge & {} 0 \qquad i=m+1,\dots ,n. \end{aligned}$$
    (16)

    Following arguments similar to those in part A, Eqs. (15) and (16) imply that all covariance matrices \(\mathbf {C}_{kk}^{m,m}\) must be proportional to each other, that is, \(\mathbf {C}_{kk}^{m,m} = a_{kk} \mathbf {R}^{m,m}\). This condition is satisfied for all m if and only if \(\mathbf {C}_{kk} = a_{kk} \mathbf {R}\), that is, if and only if \(\mathbf {C}= \pmb {\Sigma }\otimes \mathbf {R}\).\(\square \)
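The key observation of part B can also be illustrated numerically: when \(\mathbf {C}_{kk} = a_{kk} \mathbf {R}\) with \(a_{kk}>0\), the p objective functions of the constrained problem differ only by a positive factor, so they share the same minimizer and therefore the same active set. A minimal sketch (NumPy assumed; the correlation model and all names are illustrative):

```python
import numpy as np

# With C_kk = a_kk * R (a_kk > 0), the p quadratic objectives of the
# constrained kriging problem are positive multiples of one another,
# so every component yields the same nonnegative weight vector.
rng = np.random.default_rng(3)
n, p = 5, 3
s = rng.uniform(0, 10, n)
R = np.exp(-np.abs(s[:, None] - s[None, :]))   # shared correlation matrix
a = rng.uniform(0.5, 2.0, p)                   # a_11, ..., a_pp > 0

lam = rng.dirichlet(np.ones(n))                # any feasible weight vector
objs = np.array([lam @ (a[k] * R) @ lam for k in range(p)])
print(objs / objs[0])                          # equals a / a[0]
```

Since rescaling a quadratic objective by a positive constant changes neither its minimizer over the feasible set nor which nonnegativity constraints bind at the optimum, this proportionality is what makes the active sets identical across components.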

Cite this article

Allard, D., Marchant, T. Means and Covariance Functions for Geostatistical Compositional Data: an Axiomatic Approach. Math Geosci 50, 299–315 (2018). https://doi.org/10.1007/s11004-017-9713-y

Keywords

  • Aitchison geometry
  • Central tendency
  • Functional equation
  • Geostatistics
  • Multivariate covariance function model