Abstract
In geoscience applications of data mining, spatial databases are built from geophysical and geochemical data measured in the field. Mining these databases can reveal critical knowledge, such as the spatial distribution of geological targets, their geophysical and geochemical characteristics, the differences among them, and the relationships among the geophysical and geochemical data. Because such data are complex, traditional cluster analysis and association analysis methods have limitations in processing them. This paper presents a clustering algorithm based on density and adaptive density-reachability that can handle clusters of arbitrary shapes, sizes, and densities. For association analysis, mining continuous attributes may reveal useful and interesting insights about the data objects in geoscientific applications, so the paper also presents an approach for distance-based quantitative association analysis. Experiments and applications indicate that the algorithm and the approach are effective in real-world applications.
Cite this article
Meng, HD., Song, YC., Song, FY. et al. Research and application of cluster and association analysis in geochemical data processing. Comput Geosci 15, 87–98 (2011). https://doi.org/10.1007/s10596-010-9199-x
DOI: https://doi.org/10.1007/s10596-010-9199-x