
Efficient and robust data augmentation for trajectory analytics: a similarity-based approach


Trajectories between the same origin and destination (OD) offer valuable information for understanding the diversity of moving behaviours and the intrinsic relationships between moving objects and specific locations. However, because trajectory data are sparse, there are often too few trajectories for a given OD pair to run mining algorithms, e.g., classification and clustering, that discover the intrinsic properties of OD mobility. In this work, we propose an efficient and robust trajectory augmentation approach that constructs a sizeable set of qualified trajectories from existing data to address the sparsity issue. The high-level idea is to concatenate existing trajectories to reconstruct a sufficient number of trajectories that represent the ones going across the OD pair directly. To achieve this goal, we first propose a transition graph that supports efficient sub-trajectory concatenation. In addition, we develop a novel similarity metric that measures the similarity between two sets of trajectories, so as to validate whether the reconstructed trajectory set represents the original traces well. Empirical studies on a large real trajectory dataset show that our proposed solutions are efficient and robust.
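The concatenation idea can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes each trajectory has already been mapped to a sequence of discrete location identifiers (for example grid cells or road intersections), and the names `build_transition_graph`, `reconstruct`, and `max_len` are hypothetical. The graph records which location was observed directly after which, and a bounded depth-first search stitches observed transitions into candidate OD trajectories.

```python
from collections import defaultdict


def build_transition_graph(trajectories):
    """Record every observed consecutive transition a -> b.

    Assumption: a trajectory is a list of hashable location ids.
    """
    graph = defaultdict(set)
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            graph[a].add(b)
    return graph


def reconstruct(graph, origin, destination, max_len):
    """Enumerate loop-free paths from origin to destination.

    Each path uses only observed transitions and visits at most
    max_len locations (a hypothetical bound that keeps the search small).
    """
    results = []
    stack = [(origin, [origin])]
    while stack:
        node, path = stack.pop()
        if node == destination:
            results.append(path)
            continue
        if len(path) >= max_len:
            continue
        for nxt in sorted(graph.get(node, ())):
            if nxt not in path:  # avoid revisiting a location
                stack.append((nxt, path + [nxt]))
    return sorted(results)


# No input trajectory travels from "A" to "E" directly,
# but two candidates can be stitched from sub-trajectories.
trajs = [["A", "B", "C"], ["B", "D", "E"], ["C", "E"]]
graph = build_transition_graph(trajs)
for candidate in reconstruct(graph, "A", "E", max_len=5):
    print(candidate)
```

In this toy input the search yields two candidates, A-B-C-E and A-B-D-E. Whether such candidates represent real OD traffic is the question the trajectory set similarity metric is meant to answer; that metric is not sketched here because the abstract does not define it.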





Sibo Wang was supported by CUHK Direct Grant No. 4055114. He was also supported by the CUHK University Startup Grant No. 4930911 and No. 5501570.

Author information

Correspondence to Sibo Wang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

He, D., Wang, S., Ruan, B. et al. Efficient and robust data augmentation for trajectory analytics: a similarity-based approach. World Wide Web (2019).



  • Trajectory sparsity
  • Trajectory concatenation
  • Trajectory augmentation
  • Trajectory set similarity