Continuous k nearest neighbor queries over large multi-attribute trajectories: a systematic approach

Abstract

We study multi-attribute trajectories, which combine standard trajectories (i.e., sequences of timestamped locations) with descriptive attributes. We propose a new form of continuous k nearest neighbor query that integrates attributes into the evaluation. To improve query performance, we develop a hybrid and flexible index that manages both spatio-temporal data and attribute values. The index comprises a 3D R-tree and a composite structure that can be generalized to work with any R-tree-based or grid-based index. We establish an efficient mechanism to update the index and define a cost model to estimate the number of I/Os. We propose query algorithms, in particular an efficient method to determine the subtrees that contain the query attributes. Using synthetic and real datasets, we carry out comprehensive experiments in a prototype database system to evaluate efficiency, scalability and generality. On 1.8 million trajectories with hundreds of attribute values, our approach achieves more than an order of magnitude speedup over three alternative approaches. The update performance is also evaluated, and the cost model is validated.

Notes

  1. Trajectories containing only a single unit are treated as dirty data and are removed from the dataset. A major deviation between two consecutive GPS records is rare and implausible in practice.

Acknowledgments

We sincerely thank Fabio Valdés, Thomas Behr and Sara Betkas for their helpful comments, which improved the preliminary version. This work is supported by the National Key Research and Development Plan of China (2018YFB1003902) and the Fundamental Research Funds for the Central Universities (No. NS2017073).

Author information

Corresponding author

Correspondence to Jianqiu Xu.

Appendix

Unique attribute values by composite numbers

Given a point (x, y), its Z-order value is denoted by z-val(x, y), and its binary representation is the bit array z[2 ⋅ m] with z[2 ⋅ i] = x[i] and z[2 ⋅ i + 1] = y[i] for i ∈ [0, m - 1], where x[m] and y[m] are the bit arrays of the binary representations of x and y, respectively, and m is the number of bits used to represent a coordinate.

Lemma 6

Let a1 ∈ dom(\(A_{d_{1}}\)) and a2 ∈ dom(\(A_{d_{2}}\)) be attribute values from two different domains, i.e., d1 ≠ d2. Then z-val(d1, a1) ≠ z-val(d2, a2).

Proof

Let z1[2 ⋅ m] and z2[2 ⋅ m] be the binary representations of z-val(d1, a1) and z-val(d2, a2), respectively. Because d1 ≠ d2, the arrays x1[m] and x2[m] are not equal. Hence, after the interleaving, there exists an even bit position i ∈ [0, 2 ⋅ m - 1] such that z1[i] ≠ z2[i]. As a result, z1[2 ⋅ m] ≠ z2[2 ⋅ m]. The condition holds regardless of a1 and a2. □
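
The argument of Lemma 6 can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the function names z_val and differ_on_even_bit are hypothetical, and the bit layout is an assumption (bit i of the first coordinate is placed at even position 2 ⋅ i, bit i of the second coordinate at odd position 2 ⋅ i + 1, with position 0 the least significant bit). Under that layout, the domain identifier d alone determines the even bits, so two keys with different domain identifiers can never coincide.

```python
def z_val(x: int, y: int, m: int) -> int:
    """Interleave two m-bit non-negative integers into a 2*m-bit Z-order value.

    Assumed layout: bit i of x goes to position 2*i (even),
    bit i of y goes to position 2*i + 1 (odd).
    """
    if not (0 <= x < (1 << m) and 0 <= y < (1 << m)):
        raise ValueError("x and y must fit in m bits")
    z = 0
    for i in range(m):
        z |= ((x >> i) & 1) << (2 * i)       # x occupies the even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # y occupies the odd positions
    return z


def differ_on_even_bit(z1: int, z2: int, m: int) -> bool:
    """Return True if z1 and z2 differ in at least one even bit position."""
    even_mask = sum(1 << (2 * i) for i in range(m))
    return ((z1 ^ z2) & even_mask) != 0


if __name__ == "__main__":
    m = 3  # number of bits per coordinate (illustrative choice)
    # Exhaustively check the lemma for all m-bit domain ids and attribute values.
    for d1 in range(1 << m):
        for d2 in range(1 << m):
            if d1 == d2:
                continue
            for a1 in range(1 << m):
                for a2 in range(1 << m):
                    k1, k2 = z_val(d1, a1, m), z_val(d2, a2, m)
                    assert differ_on_even_bit(k1, k2, m)
                    assert k1 != k2
    print("Lemma 6 holds for all", m, "-bit inputs")
```

Here the pair (d, a) plays the role of the point (x, y): the domain identifier is the first coordinate and the attribute value the second, which is why the composite number stays unique across domains even when two domains share the same raw attribute value.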

Cite this article

Xu, J., Güting, R.H. & Gao, Y. Continuous k nearest neighbor queries over large multi-attribute trajectories: a systematic approach. Geoinformatica 22, 723–766 (2018). https://doi.org/10.1007/s10707-018-0326-5

Keywords

  • Trajectories
  • Multi-attribute
  • Continuous queries
  • Nearest neighbors
  • Index structure
  • Update