Abstract
Nowadays, almost every kind of electronic device (e.g. smartphones, GPS devices) leaves traces of its movements. The huge number of these “tiny” data sources thus generates massive streams of geo-referenced data. Analyzing such amounts of data effectively is challenging, yet extracting useful information from this peculiar kind of data is crucial in many application scenarios, such as vehicle traffic management, hand-off in cellular networks, and supply chain management. Moreover, managing spatial data streams poses new challenges for both their proper definition and their acquisition, making the overall process harder than for classical point data. In particular, we address the problem of effectively clustering trajectory data streams, which proved intriguing because we deal with sequential data that must be handled with respect for their ordering. We propose a framework that pre-processes the data in order to make the mining step more effective. As for every data mining tool, experimental evaluation is crucial; we therefore performed several tests on real-world datasets, which confirmed the efficiency and effectiveness of the proposed approach.
Notes
For the sake of completeness we recall that \(L^2(\mathcal{R})\) is the metric space of square-integrable functions, i.e. the measurable functions for which the integral of the square of the absolute value is finite.
It is always possible to find a basis that allows this representation for the search space.
Note that here we need to compute modulo two polynomials to ensure that the dimension of A is finite.
Both available at http://www.rtreeportal.org.
Available at http://weather.unisys.com/hurricane/atlantic/.
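As an illustration of the first note, and assuming that \(\mathcal{R}\) denotes the real line and that the integral is taken in the Lebesgue sense (the usual convention, not stated explicitly in the note), the square-integrability condition and the distance that makes \(L^2(\mathcal{R})\) a metric space can be written as:

\[
f \in L^2(\mathcal{R}) \iff \int_{\mathcal{R}} |f(x)|^2 \, dx < \infty,
\qquad
d(f, g) = \left( \int_{\mathcal{R}} |f(x) - g(x)|^2 \, dx \right)^{1/2}.
\]

For instance, \(f(x) = e^{-|x|}\) belongs to \(L^2(\mathcal{R})\), since \(\int_{\mathcal{R}} e^{-2|x|} \, dx = 1\), whereas the constant function \(f(x) = 1\) does not, because the integral of its square diverges.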
Acknowledgements
The authors would like to thank both the anonymous reviewers and the JIIS associate editor who assisted our submission for their invaluable suggestions and insightful comments, which helped us improve the paper significantly.
Cite this article
Costa, G., Manco, G. & Masciari, E. Dealing with trajectory streams by clustering and mathematical transforms. J Intell Inf Syst 42, 155–177 (2014). https://doi.org/10.1007/s10844-013-0267-2
DOI: https://doi.org/10.1007/s10844-013-0267-2