Outlier Detection for Compositional Data Using Robust Methods

Mathematical Geosciences

Abstract

Outlier detection based on the Mahalanobis distance (MD) requires an appropriate transformation when it is applied to compositional data. For the family of logratio transformations (additive, centered, and isometric logratio transformations), it is shown that the MDs based on classical estimates are invariant to these transformations, and that the MDs based on affine equivariant estimators of location and covariance are the same for the additive and isometric logratio transformations. Moreover, for 3-dimensional compositions, the data structure can be visualized by contour lines. In higher dimensions, the MDs of closed and opened data give an impression of the multivariate data behavior.
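The invariance stated in the abstract can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes NumPy, simulated 4-part compositions, classical estimates (arithmetic mean and sample covariance), and a Helmert-type orthonormal basis for the isometric logratio. Because the covariance of centered-logratio data is singular, the sketch assumes a Moore-Penrose pseudoinverse for that case. All function names (`closure`, `alr`, `clr`, `ilr`, `mahalanobis_sq`) are illustrative.

```python
import numpy as np

def closure(x):
    # Rescale each row so that its parts sum to 1 (closed data).
    return x / x.sum(axis=1, keepdims=True)

def alr(x):
    # Additive logratio: log of each part relative to the last part.
    return np.log(x[:, :-1] / x[:, -1:])

def clr(x):
    # Centered logratio: log parts minus the row mean of the logs.
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def ilr_basis(D):
    # Helmert-type D x (D-1) matrix with orthonormal columns,
    # each column orthogonal to the vector of ones (an assumed basis choice).
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0 / j
        V[j, j - 1] = -1.0
        V[:, j - 1] *= np.sqrt(j / (j + 1.0))
    return V

def ilr(x):
    # Isometric logratio: clr coordinates expressed in the orthonormal basis.
    return clr(x) @ ilr_basis(x.shape[1])

def mahalanobis_sq(z, singular=False):
    # Squared MDs from the arithmetic mean and the sample covariance.
    d = z - z.mean(axis=0)
    cov = np.cov(z, rowvar=False)
    # Pseudoinverse for the rank-deficient clr covariance, plain inverse otherwise.
    prec = np.linalg.pinv(cov, rcond=1e-10) if singular else np.linalg.inv(cov)
    return np.einsum("ij,jk,ik->i", d, prec, d)

rng = np.random.default_rng(0)
x = closure(rng.lognormal(size=(50, 4)))  # simulated 4-part compositions

md_alr = mahalanobis_sq(alr(x))
md_ilr = mahalanobis_sq(ilr(x))
md_clr = mahalanobis_sq(clr(x), singular=True)

print(np.allclose(md_alr, md_ilr), np.allclose(md_clr, md_ilr))
```

The alr and ilr coordinates differ only by a nonsingular linear map, which is why any affine equivariant pair of location and covariance estimators, not only the classical one used here, would give matching distances for those two transformations.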



Author information

Corresponding author

Correspondence to Peter Filzmoser.

About this article

Cite this article

Filzmoser, P., Hron, K. Outlier Detection for Compositional Data Using Robust Methods. Math Geosci 40, 233–248 (2008). https://doi.org/10.1007/s11004-007-9141-5

  • DOI: https://doi.org/10.1007/s11004-007-9141-5