Abstract
Analysis of temporal climate data is an active research area. Advanced data mining methods designed especially for such temporal data support domain experts in understanding phenomena such as climate change, which is crucial for a sustainable world. Important solutions for mining temporal data are cluster tracing approaches, which mine the temporal evolution of clusters. Generally, clusters represent groups of objects with similar values. In a temporal context like tracing, similar values correspond to similar behavior in one snapshot in time. Each cluster can be interpreted as a behavior type, and cluster tracing corresponds to tracking similar behaviors over time. Existing tracing approaches are designed for datasets that satisfy two specific conditions: the clusters appear in all attributes, i.e., they are fullspace clusters, and the data objects have unique identifiers. These identifiers are used for tracking clusters by measuring the number of objects two clusters have in common, i.e., clusters are traced based on similar object sets. These conditions, however, are strict. First, in complex data, clusters are often hidden in individual subsets of the dimensions. Second, mapping clusters based on similar object sets does not reflect the idea of tracing similar behavior types over time, because similar behavior can be represented even by clusters that have no objects in common. A tracing method based on similar object values is needed. In this paper, we introduce a novel approach that traces subspace clusters based on object value similarity. Neither subspace tracing nor tracing by object value similarity has been done before.
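The difference between the two tracing criteria can be illustrated with a minimal sketch. This is not the method introduced in the paper; the function names, the Jaccard overlap, and the centroid-based similarity on shared dimensions are assumptions chosen only to show why two clusters with no objects in common can still represent the same behavior type.

```python
import math

def jaccard_overlap(ids_a, ids_b):
    """Object-set criterion: fraction of identifiers two clusters share."""
    a, b = set(ids_a), set(ids_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def centroid(points, dims):
    """Mean value of a cluster in each of the given dimensions."""
    return [sum(p[d] for p in points) / len(points) for d in dims]

def value_similarity(points_a, dims_a, points_b, dims_b):
    """Hypothetical value criterion: compare cluster centroids in the
    dimensions that are relevant to both subspace clusters."""
    shared = sorted(set(dims_a) & set(dims_b))
    if not shared:
        return 0.0  # no common relevant dimensions, nothing to compare
    ca = centroid(points_a, shared)
    cb = centroid(points_b, shared)
    return 1.0 / (1.0 + math.dist(ca, cb))  # 1.0 means identical centroids

# Cluster at time t: objects 1-3, relevant dimensions 0 and 1.
ids_t = [1, 2, 3]
pts_t = [(20.0, 35.0, 1.0), (21.0, 35.5, 9.0), (19.0, 34.5, 5.0)]
dims_t = (0, 1)

# Cluster at time t+1: entirely different objects, similar values.
ids_t1 = [7, 8]
pts_t1 = [(20.5, 35.1, 2.0), (19.7, 35.0, 7.0)]
dims_t1 = (0, 1)

print(jaccard_overlap(ids_t, ids_t1))
print(value_similarity(pts_t, dims_t, pts_t1, dims_t1))
```

In this toy example the object-set criterion finds no connection between the two clusters, while the value criterion rates them as close, which is the situation the abstract describes.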
Acknowledgments
We thank the Alfred Wegener Institute for Polar and Marine Research for providing the Oceanographic Grid Data. This article has been supported by the UMIC Research Centre, RWTH Aachen University.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License, which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Additional information
Responsible editor: Katharina Morik, Kanishka Bhaduri and Hillol Kargupta.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Günnemann, S., Kremer, H., Laufkötter, C. et al. Tracing Evolving Subspace Clusters in Temporal Climate Data. Data Min Knowl Disc 24, 387–410 (2012). https://doi.org/10.1007/s10618-011-0237-7
DOI: https://doi.org/10.1007/s10618-011-0237-7