Abstract
Trajectory clustering has played a crucial role in data analysis since it reveals underlying trends of moving objects. Due to their sequential nature, trajectory data are often received incrementally, e.g., continuous new points reported by GPS system. However, since existing trajectory clustering algorithms are developed for static datasets, they are not suitable for incremental clustering with the following two requirements. First, clustering should be processed efficiently since it can be frequently requested. Second, huge amounts of trajectory data must be accommodated, as they will accumulate constantly.
An incremental clustering framework for trajectories is proposed in this paper. It contains two parts: online micro-cluster maintenance and offline macro-cluster creation. For online part, when a new bunch of trajectories arrives, each trajectory is simplified into a set of directed line segments in order to find clusters of trajectory subparts. Micro-clusters are used to store compact summaries of similar trajectory line segments, which take much smaller space than raw trajectories. When new data are added, micro-clusters are updated incrementally to reflect the changes. For offline part, when a user requests to see current clustering result, macro-clustering is performed on the set of micro-clusters rather than on all trajectories over the whole time span. Since the number of micro-clusters is smaller than that of original trajectories, macro-clusters are generated efficiently to show clustering result of trajectories. Experimental results on both synthetic and real data sets show that our framework achieves high efficiency as well as high clustering quality.
Keywords
- Line Segment
- Trajectory Data
- Incremental Data
- Trajectory Cluster
- Incremental Cluster
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
The work was supported in part by the U.S. National Science Foundation grants IIS-08-42769 and IIS-09-05215, and a grant from the Boeing company. Any opinions, findings, and conclusions expressed here are those of the authors and do not necessarily reflect the views of the funding agencies.
This is a preview of subscription content, access via your institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Aggarwal, C.C., Han, J., Wang, J., Yu, P.S.: A framework for clustering evolving data streams. In: VLDB 2003 (2003)
Ankerst, M., Breunig, M., Kriegel, H.-P., Sander, J.: OPTICS: Ordering points to identify the clustering structure. In: SIGMOD 1999 (1999)
Breunig, M.M., Kriegel, H.-P., Kröger, P., Sander, J.: Data bubbles: Quality preserving performance boosting for hierarchical clustering. In: SIGMOD 2001 (2001)
Cadez, I.V., Gaffney, S., Smyth, P.: A general probabilistic framework for clustering individuals and objects. In: KDD 2000 (2000)
Douglas, D., Peucker, T.: Algorithms for the reduction of the number of points required to represent a line or its character. In: The Ameican Cartographer (1973)
Ester, M., Kriegel, H.P., Sander, J., Wimmer, M., Xu, X.: Incremental clustering for mining in data warehousing environment. In: VLDB 1998 (1998)
Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases. In: KDD 1996 (1996)
Gaffney, S., Robertson, A., Smyth, P., Camargo, S., Ghil, M.: Probabilistic clustering of extratropical cyclones using regression mixture models. Technical Report UCI-ICS 06-02, University of California, Irvine (January 2006)
Gaffney, S., Smyth, P.: Trajectory clustering with mixtures of regression models. In: KDD 1999 (1999)
Chen, M.K.L.J., Gao, Y.: Noisy logo recognition using line segment hausdorff distance. Pattern Recognition (2002)
Lee, J.-G., Han, J., Whang, K.-Y.: Trajectory clustering: A partition-and-group framework. In: SIGMOD 2007 (2007)
MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proc. 5th Berkeley Symp. Math. Statist., Prob., vol. 1, pp. 281–297 (1967)
Sacharidis, D., Patroumpas, K., Terrovitis, M., Kantere, V., Potamias, M., Mouratidis, K., Sellis, T.: On-line discovery of hot motion paths. In: EDBT 2008 (2008)
Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: an efficient data clustering method for very large databases. In: SIGMOD 1996 (1996)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Z., Lee, JG., Li, X., Han, J. (2010). Incremental Clustering for Trajectories. In: Kitagawa, H., Ishikawa, Y., Li, Q., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 5982. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12098-5_3
Download citation
DOI: https://doi.org/10.1007/978-3-642-12098-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12097-8
Online ISBN: 978-3-642-12098-5
eBook Packages: Computer ScienceComputer Science (R0)