Environmental and Ecological Statistics, Volume 23, Issue 3, pp 421–434

Sparse PCA and investigation of multi-elements compositional repositories: theory and applications

  • Michele Gallo
  • Nickolay T. Trendafilov
  • Antonella Buccianti
Article

Abstract

The geochemistry of floodplain sediments is fundamental for monitoring environmental changes and for quantifying their contribution to natural and anthropic processes. The composition of a floodplain sediment is a vector of positive elements that sum to a fixed constant. The analysis of high-dimensional compositions requires methods that produce results involving only a small portion of the original variables. At the same time, the analysis must take into account the additional constraints specific to compositions. To address these problems, a new procedure for sparse PCA is proposed and applied to European floodplain sediment samples.

Keywords

CoDa, Logratios, Steepest descent, Stiefel manifold
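
The sum constraint mentioned in the abstract can be illustrated with a short sketch. This is not the authors' procedure; it only shows, under stated assumptions, how a vector of positive parts is rescaled to a fixed total and then mapped to centred logratios, a standard logratio transformation in compositional data analysis. The function names `closure` and `clr` and the sample values are hypothetical and chosen for illustration.

```python
import numpy as np


def closure(x, total=1.0):
    """Rescale strictly positive parts so that each row sums to `total`."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("all parts must be strictly positive")
    return total * x / x.sum(axis=-1, keepdims=True)


def clr(x):
    """Centred logratio: log of each part minus the mean log of its row."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)


# Hypothetical three-part composition (illustrative values, not from the paper).
sample = np.array([[10.0, 30.0, 60.0]])
coords = clr(sample)
```

Each row of `coords` sums to zero, and rescaling `sample` by any positive factor leaves `coords` unchanged, which is why logratio coordinates are a common starting point before applying PCA-type methods to compositions.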

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Michele Gallo (1)
  • Nickolay T. Trendafilov (2)
  • Antonella Buccianti (3)

  1. Department of Human and Social Sciences, University of Naples “L’Orientale”, Napoli, Italy
  2. Department of Mathematics and Statistics, The Open University, Milton Keynes, UK
  3. Department of Earth Sciences, University of Florence, Florence, Italy
