Abstract
Multivariate observations that carry exclusively relative information are known as compositional data, and they have very specific geometrical properties. To represent them in the usual Euclidean geometry, they need to be expressed in orthonormal coordinates before any further statistical processing. Because it is not possible to construct Cartesian coordinates for compositions that would assign one coordinate to each part separately, the choice of interpretable orthonormal coordinates is of particular interest. Recent experience shows clear advantages of coordinates in which the first coordinate aggregates information from logratios with a particular compositional part of interest; their usefulness is limited, however, if the involved parts are affected by distortions such as rounding errors or other data problems. The aim of the paper is thus to introduce a “robust” version of these coordinates, in which the remaining parts (with respect to the part of interest) are weighted according to their relevance for the purpose of the statistical analysis. Theoretical considerations are accompanied by examples with data sets from chemistry and geochemistry, pointing out the role of robust estimation in the context of regression with compositional covariates.
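To make the starting point concrete, the following is a minimal sketch of the standard, unweighted orthonormal coordinates in which the first coordinate aggregates all logratios with the part of interest (often called pivot coordinates in the literature). It is not the weighted, robust version introduced in the chapter, whose weights are not specified in the abstract. The function names `pivot_coordinates` and `first_pivot_from_logratios` are hypothetical, and the composition is assumed to have strictly positive parts with the part of interest placed first.

```python
import math


def pivot_coordinates(x):
    """Unweighted orthonormal (pivot) coordinates of a composition.

    x: sequence of D strictly positive parts, part of interest first.
    Returns D - 1 coordinates; coordinate i contrasts part i with the
    geometric mean of the parts that follow it.
    """
    logs = [math.log(v) for v in x]
    z = []
    for i in range(len(x) - 1):
        rest = logs[i + 1:]
        k = len(rest)  # number of remaining parts
        # sqrt(k / (k + 1)) * ln(x_i / geometric mean of remaining parts)
        z.append(math.sqrt(k / (k + 1)) * (logs[i] - sum(rest) / k))
    return z


def first_pivot_from_logratios(x):
    """First coordinate written as a scaled sum of pairwise logratios."""
    D = len(x)
    return sum(math.log(x[0] / xj) for xj in x[1:]) / math.sqrt(D * (D - 1))


if __name__ == "__main__":
    composition = [0.1, 0.2, 0.3, 0.4]  # illustrative values only
    print(pivot_coordinates(composition))
    print(first_pivot_from_logratios(composition))
```

Because every remaining part enters the first coordinate with equal weight, a single distorted part affects the coordinate of interest directly; this is the limitation that motivates weighting the remaining parts.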
Acknowledgements
The GEMAS project is a cooperative project of the EuroGeoSurveys Geochemistry Expert Group with a number of outside organizations (e.g., Alterra, The Netherlands; Norwegian Forest and Landscape Institute; Research Group Swiss Soil Monitoring Network, Swiss Research Station Agroscope Reckenholz-Tänikon, several Ministries of the Environment and University Departments of Geosciences, Chemistry and Mathematics in a number of European countries and New Zealand; ARCHE Consulting in Belgium; CSIRO Land and Water in Adelaide, Australia). The analytical work was co-financed by the following industry organizations: Eurometaux, European Borates Association, European Copper Institute, European Precious Metals Federation, International Antimony Association, International Lead Association-Europe, International Manganese Institute, International Molybdenum Association, International Tin Research Institute, International Zinc Association, The Cobalt Development Institute, The Nickel Institute, The (REACH) Selenium and Tellurium Consortium and The (REACH) Vanadium Consortium. The Directors of the European Geological Surveys, and the additional participating organizations, are thanked for making sampling of almost all of Europe in a tight time schedule possible. The Federal Institute for Geosciences and Natural Resources (BGR), the Geological Survey of Norway and SGS (Canada) are thanked for special analytical input to the project.
The authors gratefully acknowledge the support of the grant COST Action CRoNoS IC1408 and the grant IGA_PrF_2015_013 Mathematical Models of the Internal Grant Agency of the Palacky University in Olomouc.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Filzmoser, P., Hron, K. (2015). Robust Coordinates for Compositional Data Using Weighted Balances. In: Nordhausen, K., Taskinen, S. (eds) Modern Nonparametric, Robust and Multivariate Methods. Springer, Cham. https://doi.org/10.1007/978-3-319-22404-6_10
DOI: https://doi.org/10.1007/978-3-319-22404-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22403-9
Online ISBN: 978-3-319-22404-6
eBook Packages: Mathematics and Statistics (R0)