Environmental and Ecological Statistics, Volume 21, Issue 3, pp 565–581

Three-way compositional analysis of water quality monitoring data

  • Mark A. Engle
  • Michele Gallo
  • Karl T. Schroeder
  • Nicholas J. Geboy
  • John W. Zupancic


Water quality monitoring data typically consist of \(J\) parameters and constituents measured at \(I\) static locations on \(K\) seasonal sampling occasions. The resulting \(I \times J \times K\) three-way array can be difficult to interpret. Additionally, the constituent portion of the dataset (e.g., major ion and trace element concentrations, pH) is compositional in that it sums to a constant (e.g., 1 kg/L) and is mathematically confined to the simplex, the sample space for compositional data. Here we apply a Tucker3 model to centered log-ratio data to find a low-dimensional representation of latent variables, as a means to simplify data processing and interpretation of three years of seasonal compositional groundwater chemistry data for 14 wells at a study site in Wyoming, USA. The study site has been amended with treated coalbed methane produced water, using a subsurface drip irrigation system, to allow for irrigation of forage crops. Results from three-way compositional data analysis indicate that the primary controls on water quality at the study site include solute concentration by evapotranspiration, cation exchange, and dissolution of native salts. These findings agree well with results from more detailed investigations of the site. In addition, the model identified Ba uptake during gypsum precipitation in some portions of the site during the final 6–9 months of investigation, a process for which the timing and extent had not previously been identified. These results suggest that multi-way compositional analyses hold promise as a means to more easily interpret water quality monitoring data.
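As a rough illustration of the workflow summarized above, the following sketch applies a centered log-ratio (clr) transform to a synthetic \(I \times J \times K\) array and then computes a Tucker3-style decomposition. Several things here are assumptions rather than details taken from the study: the data are random, the number of constituents (6) and sampling occasions (12) are hypothetical, the centering across wells is one possible preprocessing choice, and a truncated higher-order SVD stands in for whatever fitting algorithm the authors actually used (Tucker3 models are often fitted by alternating least squares instead). The function names `clr`, `unfold`, and `tucker3_hosvd` are illustrative only.

```python
import numpy as np

def clr(x, axis=1):
    """Centered log-ratio: log of each part minus the mean log over all parts."""
    logx = np.log(x)
    return logx - logx.mean(axis=axis, keepdims=True)

def unfold(t, mode):
    """Matricize a three-way array so that `mode` indexes the rows."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def tucker3_hosvd(t, ranks):
    """Truncated higher-order SVD, a simple non-iterative Tucker3 approximation.

    This is a stand-in for an iterative least-squares fit, not the study's method.
    """
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(t, mode), full_matrices=False)
        factors.append(u[:, :r])  # leading r left singular vectors of this mode
    a, b, c = factors
    # Core array: project the data onto the three sets of component vectors.
    core = np.einsum("ijk,ip,jq,kr->pqr", t, a, b, c)
    return core, factors

# Synthetic data: 14 wells (I), 6 hypothetical constituents (J),
# 12 hypothetical seasonal sampling occasions (K).
rng = np.random.default_rng(0)
I, J, K = 14, 6, 12
raw = rng.lognormal(size=(I, J, K))
comp = raw / raw.sum(axis=1, keepdims=True)  # close each composition to sum to 1

z = clr(comp, axis=1)                  # clr over the constituent mode
z = z - z.mean(axis=0, keepdims=True)  # center across wells (assumed choice)

core, (a, b, c) = tucker3_hosvd(z, ranks=(2, 2, 2))

# Reconstruct the array from the model and compute the proportion of
# total sum of squares that the low-dimensional model reproduces.
approx = np.einsum("pqr,ip,jq,kr->ijk", core, a, b, c)
fit = 1.0 - np.sum((z - approx) ** 2) / np.sum(z ** 2)
print(f"core shape: {core.shape}, fit: {fit:.3f}")
```

In a sketch like this, the columns of `a`, `b`, and `c` play the role of components for wells, constituents, and sampling occasions respectively, and the entries of `core` indicate how strongly combinations of those components interact. Because the synthetic data are random, the fit value carries no geochemical meaning.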


Keywords: Coalbed natural gas, Log-ratio, Multi-mode analysis, Powder River Basin, Produced waters, Tucker3



Copyright information

© Springer Science+Business Media New York (outside the USA) 2013

Authors and Affiliations

  • Mark A. Engle (1, 2)
  • Michele Gallo (3)
  • Karl T. Schroeder (4)
  • Nicholas J. Geboy (1)
  • John W. Zupancic (5)

  1. US Geological Survey, VA, USA
  2. Department of Geological Sciences, University of Texas, El Paso, USA
  3. Department of Human and Social Sciences, University of Naples “L’Orientale”, Naples, Italy
  4. National Energy Technology Laboratory, US Department of Energy, Pittsburgh, USA
  5. BeneTerra, LLC, Sheridan, USA
