Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

  • Research Article
  • Frontiers of Earth Science

Abstract

Cluster analysis is a well-established technique for analyzing many types of data. In this study, cluster analysis is applied to geochemical data from 1444 stream sediment samples collected in northwestern Xinjiang at a sample spacing of approximately 2 km. Three algorithms (hierarchical, k-means, and fuzzy c-means) and six data transformation methods (z-score standardization, ZST; logarithmic transformation, LT; additive log-ratio transformation, ALT; centered log-ratio transformation, CLT; isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of geochemical compositional data. The results show that the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on them. For row- or observation-based (Q-type) cluster analysis, the results obtained after applying NT and the ZST are relatively poor, whereas the results improve after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means algorithms are more reliable than the hierarchical algorithm for clustering the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and find a good correlation between the potential zones of Au mineralization and the results obtained by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms. We therefore suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool for identifying potential zones of mineralization from geochemical data.
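The three log-ratio transformations compared in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the choice of the last part as the ALT divisor, the Helmert-type basis used for the ILT, and the three-part sample values are all assumptions made for illustration only.

```python
import numpy as np

def closure(x):
    """Rescale each row of strictly positive parts so that it sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)

def alr(x):
    """Additive log-ratio: log of each part over the last part (D-1 columns).
    Using the last part as divisor is an assumption of this sketch."""
    x = closure(x)
    return np.log(x[:, :-1] / x[:, -1:])

def clr(x):
    """Centered log-ratio: log of each part over the row geometric mean."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=1, keepdims=True)

def helmert_basis(d):
    """One possible orthonormal basis ((d-1) x d) of the zero-sum hyperplane."""
    v = np.zeros((d - 1, d))
    for i in range(1, d):
        v[i - 1, :i] = 1.0 / i
        v[i - 1, i] = -1.0
        v[i - 1] *= np.sqrt(i / (i + 1.0))
    return v

def ilr(x):
    """Isometric log-ratio: clr coordinates projected onto an orthonormal basis."""
    x = np.asarray(x, dtype=float)
    return clr(x) @ helmert_basis(x.shape[1]).T

# Hypothetical three-part compositions (rows = samples, columns = elements).
samples = np.array([[12.0, 30.0, 58.0],
                    [10.0, 35.0, 55.0],
                    [40.0, 20.0, 40.0]])

clr_coords = clr(samples)
ilr_coords = ilr(samples)
alr_coords = alr(samples)

# Euclidean distance between the first two samples in clr and ilr coordinates.
d_clr = np.linalg.norm(clr_coords[0] - clr_coords[1])
d_ilr = np.linalg.norm(ilr_coords[0] - ilr_coords[1])
```

In this sketch the CLT rows sum to zero and the ILT preserves the Euclidean distances of the CLT coordinates, so a distance-based algorithm such as k-means sees the same sample geometry under either transformation, while the ALT coordinates depend on which part is chosen as the divisor.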

Acknowledgements

The authors thank Ratheesh Kumar R.T and Rustam Orozbaev for their assistance in revising the language of the manuscript before submission, and are grateful to the anonymous reviewers for their constructive comments and suggestions. This study was funded by the National Natural Science Foundation of China (Grant Nos. U1503291 and 41402296) and a Major Project in Xinjiang Uygur Autonomous Region (201330121-3).

Author information

Corresponding author

Correspondence to Shuguang Zhou.

Cite this article

Zhou, S., Zhou, K., Wang, J. et al. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies. Front. Earth Sci. 12, 491–505 (2018). https://doi.org/10.1007/s11707-017-0682-8

  • DOI: https://doi.org/10.1007/s11707-017-0682-8
