Online clustering of streaming trajectories

Abstract

With the increasing availability of modern mobile devices and location acquisition technologies, massive volumes of trajectory data from moving objects are collected continuously as streams. Clustering streaming trajectories helps find the representative paths or common movement trends shared by different objects in real time. Although data stream clustering has been studied extensively in the past decade, little effort has been devoted to streaming trajectories. The main challenge lies in the strict space and time constraints on processing continuously arriving trajectory data, combined with the difficulty of handling concept drift. To address this challenge, we present two novel synopsis structures that extract the clustering characteristics of trajectories, and develop an incremental algorithm for the online clustering of streaming trajectories, called OCluST. It contains a micro-clustering component, which clusters and summarizes the most recent sets of trajectory line segments at each time instant, and a macro-clustering component, which builds large macro-clusters from the micro-clusters over a specified time horizon. Finally, we conduct extensive experiments on four real data sets to evaluate the effectiveness and efficiency of OCluST, and compare it with other comparable algorithms. Experimental results show that OCluST achieves superior performance in clustering streaming trajectories.
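The two-level design described in the abstract can be illustrated with a minimal sketch. It is not OCluST itself: the abstract does not define the two synopsis structures, so the `MicroCluster` summary below (a count, midpoint sums, and a last-update time), the midpoint distance, and the parameters `radius`, `horizon`, and `merge_dist` are all hypothetical choices made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Segment = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


@dataclass
class MicroCluster:
    """Hypothetical additive summary of line segments (not the paper's synopsis)."""
    n: int = 0
    sum_cx: float = 0.0
    sum_cy: float = 0.0
    last_update: int = 0

    def add(self, seg: Segment, t: int) -> None:
        x1, y1, x2, y2 = seg
        self.n += 1
        self.sum_cx += (x1 + x2) / 2
        self.sum_cy += (y1 + y2) / 2
        self.last_update = t

    def center(self) -> Tuple[float, float]:
        return (self.sum_cx / self.n, self.sum_cy / self.n)


def _dist(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def micro_cluster_step(clusters: List[MicroCluster], segments: List[Segment],
                       t: int, radius: float) -> None:
    """Absorb each new segment into the nearest micro-cluster within
    `radius` of its midpoint, or start a new micro-cluster."""
    for seg in segments:
        mid = ((seg[0] + seg[2]) / 2, (seg[1] + seg[3]) / 2)
        best, best_d = None, radius
        for mc in clusters:
            d = _dist(mid, mc.center())
            if d <= best_d:
                best, best_d = mc, d
        if best is None:
            best = MicroCluster()
            clusters.append(best)
        best.add(seg, t)


def macro_clusters(clusters: List[MicroCluster], t_now: int, horizon: int,
                   merge_dist: float) -> List[List[MicroCluster]]:
    """Group micro-clusters updated within the last `horizon` time units,
    linking any two whose centers lie within `merge_dist`."""
    recent = [mc for mc in clusters if t_now - mc.last_update < horizon]
    groups: List[List[MicroCluster]] = []
    for mc in recent:
        merged, rest = [mc], []
        for g in groups:
            if any(_dist(mc.center(), o.center()) <= merge_dist for o in g):
                merged.extend(g)
            else:
                rest.append(g)
        groups = rest + [merged]
    return groups
```

Under these assumptions, `micro_cluster_step` would be called once per time instant with the newly arrived segments, and `macro_clusters` would be called on demand. Because the summary is additive, each micro-cluster uses constant space regardless of how many segments it has absorbed.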

Acknowledgements

Our research was supported by the National Key Research and Development Program of China (2016YFB1000905), the National Natural Science Foundation of China (NSFC) (Grant Nos. 61702423, 61370101, 61532021, U1501252, U1401256 and 61402180), Natural Science Foundation of the Education Department of Sichuan Province (17ZA0381 and 13ZA0015), China West Normal University Special Foundation of National Programme Cultivation (16C005), and Meritocracy Research Funds of China West Normal University (17YC158).

Author information

Corresponding author

Correspondence to Jiali Mao.

Additional information

Jiali Mao is an associate professor at China West Normal University, China. She is currently working toward the PhD degree in the School of Data Science and Engineering, East China Normal University, China. Her current research interests include big data analysis and location-based services.

Qiuge Song received her bachelor’s degree in computer science and technology from Nankai University, China in 2014. She is a graduate student in the School of Software Engineering, East China Normal University, China. Her current research interests include data mining and location-based services.

Zhigang Zhang is currently working toward the PhD degree at the School of Data Science and Engineering, East China Normal University, China. His research interests include location-based services, spatio-temporal data management, and distributed computing.

Aoying Zhou is a professor of computer science at East China Normal University (ECNU), China, as well as the dean of the School of Data Science and Engineering (DaSE), ECNU. His research interests include web data management, data management for data-intensive computing, in-memory cluster computing, benchmarking for big data, and performance.

About this article

Cite this article

Mao, J., Song, Q., Jin, C. et al. Online clustering of streaming trajectories. Front. Comput. Sci. 12, 245–263 (2018). https://doi.org/10.1007/s11704-017-6325-0

Keywords

  • streaming trajectory
  • synopsis data structure
  • concept drift
  • sliding window