Weighting of Parts in Compositional Data Analysis: Advances and Applications

Mathematical Geosciences

Abstract

In practice it is often sensible to give different weights to the variables involved in a multivariate data analysis, and the same holds for compositional data as multivariate observations carrying relative information. Weights can be applied to better accommodate differences in the quality of the measurements, the occurrence of zeros and missing values, or more generally to highlight specific features of compositional parts. The characterisation of compositional data as elements of a Bayes space, which is a natural generalisation of the ordinary Aitchison geometry, enables the definition of a formal framework to implement weighting schemes for the parts of a composition. This is formally achieved by considering a reference measure in the Bayes space alternative to the common uniform measure via the well-known chain rule. Unweighted centred logratio (clr) coefficients and isometric logratio (ilr) coordinates then allow us to express compositions in real space equipped with (unweighted) Euclidean geometry. The resulting elements of real space generated by the clr coefficients or ilr coordinates are invariant to the scale of the original compositions, but the actual scale of the weights matters. In this work, these formal developments are presented and used to introduce a general approach for weighting parts in compositional data analysis. The practical use is demonstrated on simulated and real-world data sets in the context of the earth sciences.
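The two scale properties stated in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes that the weighted clr coefficient of part i is ln(x_i/p_i) minus the p-weighted mean of these log-densities, and that the unweighted clr coefficients are obtained by multiplying by sqrt(p_i); neither formula is given in the abstract. The function names and the example values of x and p are hypothetical.

```python
import numpy as np


def weighted_clr(x, p):
    """Weighted clr coefficients of composition x under part weights p.

    Assumed form: ln(x_i / p_i) minus the p-weighted mean of ln(x_j / p_j).
    """
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    if x.shape != p.shape or np.any(x <= 0) or np.any(p <= 0):
        raise ValueError("x and p must be strictly positive and of equal length")
    # Log-density of x relative to the weighted reference measure (chain rule).
    log_density = np.log(x / p)
    return log_density - np.sum(p * log_density) / np.sum(p)


def unweighted_clr(x, p):
    """Rescale weighted clr by sqrt(p) so the ordinary Euclidean metric applies.

    This rescaling is an assumption of the sketch, not quoted from the abstract.
    """
    return np.sqrt(np.asarray(p, dtype=float)) * weighted_clr(x, p)


x = np.array([0.2, 0.3, 0.5])  # hypothetical 3-part composition
p = np.array([1.0, 2.0, 0.5])  # hypothetical part weights

u = unweighted_clr(x, p)
u_rescaled_x = unweighted_clr(10.0 * x, p)  # same composition, different scale
u_rescaled_p = unweighted_clr(x, 4.0 * p)   # same weight ratios, larger scale

print(np.allclose(u, u_rescaled_x))        # scale of the composition is irrelevant
print(np.allclose(u_rescaled_p, 2.0 * u))  # scale of the weights is not
```

Under these assumptions, multiplying the composition by a constant leaves the coefficients unchanged, whereas multiplying all weights by a constant c rescales them by sqrt(c). With all weights equal to one, the sketch reduces to the ordinary clr coefficients.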

Availability of data and materials

Data are available from the authors on request.

Author information

Contributions

KH and AM conceived this research; JPA and PF designed the experiments and provided the data sets and interpretations; KH wrote the first draft of the paper; and KH, AM, JPA, PF, RT and JJE all participated in revising it.

Corresponding author

Correspondence to Karel Hron.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

K.H., P.F. and R.T. gratefully acknowledge the support by the Czech Science Foundation (GACR), GA 19-01768S; J. P-A was partly supported by the Scottish Government’s Rural and Environment Science and Analytical Services Division and the Spanish Ministry of Economy and Competitiveness [Ref: RTI2018-095518-B-C21].

About this article

Cite this article

Hron, K., Menafoglio, A., Palarea-Albaladejo, J. et al. Weighting of Parts in Compositional Data Analysis: Advances and Applications. Math Geosci 54, 71–93 (2022). https://doi.org/10.1007/s11004-021-09952-y

  • DOI: https://doi.org/10.1007/s11004-021-09952-y
