A “Weighted” Geochemical Variable Classification Method Based on Latent Variables

Abstract

Clustering of variables relies on the relationships among them, and the strength of those relationships is generally measured by the correlation coefficients between pairs of variables. This paper proposes correlation coefficients weighted by a specified variable and, taking the clustering around latent variables (CLV) approach as an example, transforms a common clustering method into a “weighted” clustering method. The aim is to eliminate factors unrelated to the variable adopted for weighting, so that the cluster centers are sufficiently different from one another and correlate well with that variable. A log-transformed dataset was used to evaluate the proposed method. Under the restriction of the element As, three clusters were obtained; they represent three ore-controlling factors related to the Goldenville Formation, namely geologic features such as formation, fault contacts, and granitoid intrusions. The new cluster centers not only accounted for most of the variability related to the weighting element (As) but also showed significant differences in their spatial distributions.
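As a rough illustration of the unweighted starting point, the sketch below implements a basic CLV-style alternating scheme: each cluster center is taken as the first principal component of the variables in the cluster, and each variable is reassigned to the center with which it has the largest squared correlation. It is a minimal sketch under stated assumptions, not the method of the paper: the weighting by a specified variable is not defined in this abstract and is not reproduced here, and the function names (`standardize`, `cluster_centers`, `clv`), the deterministic initial partition, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np


def standardize(X):
    """Center each column and scale it to unit standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)


def cluster_centers(Z, labels, k):
    """Latent variable of each cluster: first principal component scores
    of the member variables, scaled to unit norm (zeros if the cluster is empty)."""
    n = Z.shape[0]
    centers = np.zeros((n, k))
    for g in range(k):
        members = Z[:, labels == g]
        if members.shape[1] > 0:
            U, _, _ = np.linalg.svd(members, full_matrices=False)
            centers[:, g] = U[:, 0]
    return centers


def clv(X, k, max_iter=100):
    """Alternate between updating the centers and reassigning variables
    until the partition of the variables stops changing."""
    Z = standardize(X)
    n, p = Z.shape
    labels = np.arange(p) % k  # simple deterministic start (an assumption)
    for _ in range(max_iter):
        centers = cluster_centers(Z, labels, k)
        # Columns of Z have norm sqrt(n) and centers have unit norm,
        # so this product is the Pearson correlation, shape (p, k).
        r = Z.T @ centers / np.sqrt(n)
        new_labels = np.argmax(r ** 2, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, cluster_centers(Z, labels, k)


# Synthetic demonstration: six variables driven by two independent factors.
rng = np.random.default_rng(0)
n = 500
a, b = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([a, a, a, b, b, b]) + 0.3 * rng.normal(size=(n, 6))
labels, centers = clv(X, k=2)
print(labels)
```

For real concentration data one would presumably apply a log transform (for example `np.log`) before calling `clv`, in line with the log-transformed dataset mentioned above. The correlation of each column of `centers` with a chosen element could then be inspected with `np.corrcoef` to judge how strongly a cluster center relates to that element.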

Acknowledgments

This study was funded by the Foreign Aid Project of the Ministry of Commerce of the People’s Republic of China (2021-28) and the China National Major Water Conservancy Project Construction Fund (0001212012AC50001).

Author information

Corresponding author

Correspondence to Yusen Dong.

Ethics declarations

Conflict of Interest

The authors declare that they have no known competing financial interests or personal relationships that could affect the work reported in this article.

Data Availability

The data that support the findings of this study are openly available from the Department of Natural Resources and Renewables, Nova Scotia, Canada, at https://novascotia.ca/natr/meb/geoscience-online/geochemistry.asp.

About this article

Cite this article

Liu, J., Cheng, Q., Wang, JG. et al. A “Weighted” Geochemical Variable Classification Method Based on Latent Variables. Nat Resour Res 31, 1925–1941 (2022). https://doi.org/10.1007/s11053-022-10061-8

  • DOI: https://doi.org/10.1007/s11053-022-10061-8

Keywords

  • Variable clustering
  • Clustering around latent variables (CLV)
  • Weighted clustering
  • Geochemical factor extraction