Abstract
Knowledge discovery in Trajectory Databases (TD) is an emerging field that has recently attracted great interest. However, the inherent uncertainty in TD (e.g., due to GPS errors) has not yet been taken into account during the mining process. In this paper, we study the effect of uncertainty on TD clustering and introduce a three-step approach to deal with it. First, we propose an intuitionistic point vector representation of trajectories that encompasses the underlying uncertainty, and introduce an effective distance metric to cope with it. Second, we devise CenTra, a novel algorithm that tackles the problem of discovering the Centroid Trajectory of a group of movements by taking advantage of the local similarity between portions of trajectories. Third, we propose a variant of the Fuzzy C-Means (FCM) clustering algorithm that embodies CenTra in its update procedure. Finally, we relax the vector representation of the Centroid Trajectories by introducing an algorithm that post-processes them, thereby providing these mobility patterns to the analyst in a more intuitive representation. The experimental evaluation over synthetic and real-world TD demonstrates the efficiency and effectiveness of our approach.
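For readers unfamiliar with FCM, the following is a minimal sketch of the standard (textbook) Fuzzy C-Means loop on plain points. It is not the variant proposed in the paper: here the update step is a membership-weighted mean and the distance is Euclidean, whereas the abstract states that the proposed variant uses CenTra in the update procedure and a distance metric designed for uncertain trajectories. The function names (`fcm_memberships`, `fcm_centroids`, `fcm`), the fuzzifier default `m=2.0`, and the fixed iteration count are illustrative assumptions.

```python
import math


def fcm_memberships(points, centroids, m=2.0):
    """Standard FCM membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centroids:
            # split full membership evenly among those centroids.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        memberships.append(
            [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]
        )
    return memberships


def fcm_centroids(points, memberships, m=2.0):
    """Standard FCM update step: mean of the points weighted by membership^m.

    Assumption: this is the step the paper's variant replaces with CenTra.
    """
    n_clusters = len(memberships[0])
    dim = len(points[0])
    centroids = []
    for j in range(n_clusters):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centroids.append(
            tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            )
        )
    return centroids


def fcm(points, init_centroids, m=2.0, iterations=50):
    """Alternate membership and centroid updates for a fixed number of rounds."""
    centroids = list(init_centroids)
    for _ in range(iterations):
        memberships = fcm_memberships(points, centroids, m)
        centroids = fcm_centroids(points, memberships, m)
    return centroids, fcm_memberships(points, centroids, m)
```

Calling `fcm([(0, 0), (0, 1), (10, 10), (10, 11)], [(1, 1), (9, 9)])` returns two centroids and, for each point, a row of memberships that sums to 1. The soft memberships are what distinguish FCM from hard-assignment schemes such as k-means.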
Cite this article
Pelekis, N., Kopanakis, I., Kotsifakos, E.E. et al. Clustering uncertain trajectories. Knowl Inf Syst 28, 117–147 (2011). https://doi.org/10.1007/s10115-010-0316-x
DOI: https://doi.org/10.1007/s10115-010-0316-x