Abstract
Many practical data sets in the environmental sciences, official statistics and other disciplines are in fact compositional data, because only the ratios between the variables are informative. Compositional data are represented in the Aitchison geometry on the simplex, and they must be transformed before statistical methods designed for Euclidean geometry can be applied. The isometric logratio (ilr) transformation has the best geometrical properties, and it avoids the singularity problem introduced by the centered logratio (clr) transformation. Robust multivariate methods that are based on robust covariance estimation can thus only be used with ilr-transformed data. However, the results are usually difficult to interpret because the ilr coordinates are formed by non-linear combinations of the original variables. We show for different multivariate methods how robustness can be managed for compositional data, and we provide algorithms for the computation.
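The singularity issue mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`clr`, `ilr_basis`, `ilr`), the simulated data, and the particular orthonormal basis are assumptions made for illustration, and other orthonormal bases give equally valid ilr coordinates. The sketch compares the rank of the covariance matrix of clr and ilr coordinates for the same compositions.

```python
import numpy as np

def clr(x):
    """Centered logratio: log of each part minus the row-wise mean log."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def ilr_basis(d):
    """Return a d x (d-1) matrix with orthonormal columns that each sum to zero.

    Assumption: this Helmert-type basis is one common choice, not
    necessarily the one used in the paper.
    """
    v = np.zeros((d, d - 1))
    for i in range(d - 1):
        k = d - i - 1  # number of parts that follow part i
        v[i, i] = np.sqrt(k / (k + 1))
        v[i + 1:, i] = -1.0 / np.sqrt(k * (k + 1))
    return v

def ilr(x):
    """Isometric logratio coordinates: clr scores projected onto the basis."""
    return clr(x) @ ilr_basis(x.shape[1])

# Simulated 4-part compositions (hypothetical data, rows closed to sum 1).
rng = np.random.default_rng(0)
raw = rng.lognormal(size=(50, 4))
comp = raw / raw.sum(axis=1, keepdims=True)

clr_cov = np.cov(clr(comp), rowvar=False)  # 4 x 4, rows of clr sum to zero
ilr_cov = np.cov(ilr(comp), rowvar=False)  # 3 x 3

clr_rank = np.linalg.matrix_rank(clr_cov, tol=1e-10)
ilr_rank = np.linalg.matrix_rank(ilr_cov, tol=1e-10)
print("clr covariance:", clr_cov.shape, "rank", clr_rank)
print("ilr covariance:", ilr_cov.shape, "rank", ilr_rank)
```

Because every row of clr coordinates sums to zero, the clr covariance matrix is rank deficient and cannot be inverted, whereas the ilr covariance matrix has full rank. This is why covariance-based robust estimators are applied to ilr coordinates.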
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Filzmoser, P., Hron, K. (2010). Robust Methods for Compositional Data. In: Lechevallier, Y., Saporta, G. (eds) Proceedings of COMPSTAT'2010. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2604-3_7
DOI: https://doi.org/10.1007/978-3-7908-2604-3_7
Publisher Name: Physica-Verlag HD
Print ISBN: 978-3-7908-2603-6
Online ISBN: 978-3-7908-2604-3
eBook Packages: Mathematics and Statistics (R0)