Abstract
Variable clustering relies on the relationships among variables, and the strength of those relationships is generally measured by the correlation coefficient between each pair of variables. This paper proposes correlation coefficients weighted by a specified variable and uses the clustering around latent variables (CLV) approach as an example of how a common clustering method can be transformed into a “weighted” clustering method. The aim is to eliminate factors unrelated to the weighting variable, so that the cluster centers are sufficiently distinct from one another and correlate well with that variable. The proposed method was evaluated on a log-transformed dataset. With As as the weighting element, three clusters were obtained; they represent three ore-controlling factors related to the Goldenville Formation, namely geologic features such as formation, fault contacts, and granitoid intrusions. The new cluster centers not only accounted for most of the variability related to As but also showed significantly different spatial distributions.
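To make the idea of a variable-weighted correlation concrete, the following is a minimal sketch and not the paper's formulation: the abstract does not give the weighting scheme, so the function names, the synthetic data, and the choice of weights (the log of the specified variable rescaled to [0, 1]) are all assumptions made for illustration. It computes a standard weighted Pearson correlation matrix in which samples with higher values of the specified variable contribute more.

```python
import numpy as np

def weighted_corr(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative sample weights w."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    w = w / w.sum()                          # normalize weights to sum to 1
    dx = x - np.sum(w * x)                   # deviations from weighted means
    dy = y - np.sum(w * y)
    cov = np.sum(w * dx * dy)                # weighted covariance
    return cov / np.sqrt(np.sum(w * dx**2) * np.sum(w * dy**2))

def weighted_corr_matrix(X, w):
    """Pairwise weighted correlations between the columns of X."""
    n_vars = X.shape[1]
    R = np.eye(n_vars)
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            R[i, j] = R[j, i] = weighted_corr(X[:, i], X[:, j], w)
    return R

# Synthetic stand-in for element concentrations (NOT the study's data).
rng = np.random.default_rng(0)
X = rng.lognormal(size=(200, 4))
logX = np.log(X)                             # log transform, as in the abstract

# Hypothetical weights: column 0 plays the role of the specified variable.
z = logX[:, 0]
w = (z - z.min()) / (z.max() - z.min())

R = weighted_corr_matrix(logX, w)
```

With equal weights the function reduces to the ordinary Pearson correlation matrix, which gives a quick sanity check. A matrix such as `R` could then stand in for the unweighted correlation matrix in a correlation-based variable clustering procedure such as CLV.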
Acknowledgments
This study was funded by the Foreign Aid Project of the Ministry of Commerce of the People’s Republic of China (2021-28) and the China National Major Water Conservancy Project Construction Fund (0001212012AC50001).
Ethics declarations
Conflict of Interest
The authors declare that they have no known competing financial interests or personal relationships that could affect the work reported in this article.
Data Availability
The data that support the findings of this study are openly available from the Department of Natural Resources and Renewables, Nova Scotia, Canada, at https://novascotia.ca/natr/meb/geoscience-online/geochemistry.asp.
About this article
Cite this article
Liu, J., Cheng, Q., Wang, JG. et al. A “Weighted” Geochemical Variable Classification Method Based on Latent Variables. Nat Resour Res 31, 1925–1941 (2022). https://doi.org/10.1007/s11053-022-10061-8
DOI: https://doi.org/10.1007/s11053-022-10061-8