Robust Methods for Compositional Data

  • Conference paper
Proceedings of COMPSTAT'2010

Abstract

Many practical data sets in environmental sciences, official statistics and various other disciplines are in fact compositional data because only the ratios between the variables are informative. Compositional data are represented in the Aitchison geometry on the simplex, and they must be transformed before statistical methods designed for Euclidean geometry can be applied. The isometric logratio (ilr) transformation has the best geometrical properties, and it avoids the singularity problem introduced by the centered logratio (clr) transformation. Robust multivariate methods that are based on robust covariance estimation can thus only be used with ilr-transformed data. However, the results are usually difficult to interpret because the ilr coordinates are formed by non-linear combinations of the original variables. We show for different multivariate methods how robustness can be managed for compositional data, and provide algorithms for the computation.
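The difference between the clr and ilr transformations mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `clr`, `ilr` and `euclid` are hypothetical, and the particular orthonormal basis used for the ilr coordinates (each coordinate contrasts the log-mean of the first j parts with part j+1) is only one of many valid choices and an assumption here. The sketch shows that clr coordinates always sum to zero, which is the source of the singular covariance matrix, while ilr yields D-1 unconstrained coordinates and preserves distances.

```python
import math

def clr(x):
    """Centered logratio: log of each part minus the mean of all logs."""
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

def ilr(x):
    """Isometric logratio coordinates for one assumed orthonormal basis.

    Coordinate j (j = 1..D-1) is
        sqrt(j / (j + 1)) * (mean(log x_1..x_j) - log x_{j+1}).
    """
    logs = [math.log(v) for v in x]
    z = []
    for j in range(1, len(x)):
        mean_first = sum(logs[:j]) / j
        z.append(math.sqrt(j / (j + 1)) * (mean_first - logs[j]))
    return z

def euclid(a, b):
    """Ordinary Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Two illustrative 3-part compositions (made-up values).
x = [0.2, 0.3, 0.5]
y = [0.6, 0.3, 0.1]

print(sum(clr(x)))                    # numerically zero: clr parts are linearly dependent
print(len(ilr(x)))                    # D - 1 coordinates, no sum constraint
print(euclid(clr(x), clr(y)))         # Aitchison distance via clr
print(euclid(ilr(x), ilr(y)))         # same distance via ilr (isometry)
```

Because the ilr coordinates carry no sum constraint, their covariance matrix is generally of full rank, so a robust covariance estimator that requires a non-singular input can be applied to them. The estimator itself is not shown here.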

Author information

Corresponding author

Correspondence to Peter Filzmoser.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Filzmoser, P., Hron, K. (2010). Robust Methods for Compositional Data. In: Lechevallier, Y., Saporta, G. (eds) Proceedings of COMPSTAT'2010. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2604-3_7
