SharkDB: an in-memory column-oriented storage for trajectory analysis

Abstract

The last decade has witnessed the prevalence of sensor and GPS technologies that produce a high volume of trajectory data representing the motion history of moving objects. However, some characteristics of trajectories, such as variable lengths and asynchronous sampling rates, make them difficult to fit into traditional database systems that are disk-based and tuple-oriented. Motivated by the success of column stores and the recent development of in-memory databases, we explore the potential of boosting the performance of trajectory data processing by designing a novel trajectory storage within main memory. In contrast to most existing trajectory indexing methods, which keep consecutive samples of the same trajectory in the same disk page, we partition the database into frames in which the positions of all moving objects at the same time instant are stored together and aligned in main memory. We found this column-wise storage to be surprisingly well suited for in-memory computing, since most frames can be stored in highly compressed form, which is pivotal for increasing memory throughput and reducing CPU cache misses. The independence between frames also makes them natural working units when parallelizing data processing in a multi-core environment. Lastly, we run a variety of common trajectory queries on both real and synthetic datasets to demonstrate the advantages and study the limitations of the proposed storage.
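The frame layout described above can be illustrated with a minimal Python sketch. It is not the SharkDB implementation: the function names (`build_frames`, `delta_encode`, `window_query`, `query_all_frames`), the integer time instants, and the simple "store only changed slots" compression are all assumptions made for illustration, since the abstract does not specify the actual encoding or query interface.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Frame = List[Optional[Point]]  # slot i holds the position of the i-th object


def build_frames(trajectories: Dict[int, Dict[int, Point]],
                 num_instants: int) -> Tuple[List[int], List[Frame]]:
    """Turn per-object trajectories into one aligned frame per time instant.

    `trajectories` maps object id -> {time instant -> position}; an object
    with no sample at an instant gets None in that frame (hypothetical
    convention, chosen here for simplicity).
    """
    object_ids = sorted(trajectories)
    frames = [[trajectories[oid].get(t) for oid in object_ids]
              for t in range(num_instants)]
    return object_ids, frames


def delta_encode(prev: Frame, curr: Frame) -> Dict[int, Optional[Point]]:
    """Assumed compression: keep only slots that differ from the previous frame."""
    return {i: p for i, (q, p) in enumerate(zip(prev, curr)) if p != q}


def window_query(frame: Frame, x_min: float, y_min: float,
                 x_max: float, y_max: float) -> List[int]:
    """Return the slots whose position lies inside the rectangle."""
    return [i for i, p in enumerate(frame)
            if p is not None and x_min <= p[0] <= x_max and y_min <= p[1] <= y_max]


def query_all_frames(frames: List[Frame],
                     box: Tuple[float, float, float, float]) -> List[List[int]]:
    """Frames are independent, so each one can be scanned by a separate worker."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda f: window_query(f, *box), frames))


# Two objects with different sampling rates: object 9 has no sample at t = 1.
trajectories = {
    7: {0: (0.0, 0.0), 1: (0.0, 0.0), 2: (1.0, 1.0)},
    9: {0: (5.0, 5.0), 2: (5.0, 5.0)},
}
object_ids, frames = build_frames(trajectories, num_instants=3)
changes = delta_encode(frames[0], frames[1])          # only slot 1 changed
hits = query_all_frames(frames, (0.0, 0.0, 2.0, 2.0))  # one result list per frame
```

In this toy layout, a stationary object contributes nothing to the delta of the next frame, which hints at why frames may compress well when most objects move little between instants. Threads are used only to show that frames can be processed independently; a real system would likely rely on native multi-core execution.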

Notes

  1. Many database management systems, such as Oracle, MySQL, and PostgreSQL, offer additional components or extensions that support spatial types and operators.

Acknowledgements

This work is partially supported by the Natural Science Foundation of China (Nos. 61502324 and 61532018).

Corresponding author

Correspondence to Kai Zheng.

About this article

Cite this article

Zheng, B., Wang, H., Zheng, K. et al. SharkDB: an in-memory column-oriented storage for trajectory analysis. World Wide Web 21, 455–485 (2018). https://doi.org/10.1007/s11280-017-0466-9

Keywords

  • Spatial database
  • Trajectory
  • In-memory
  • Storage