TrajSpark: A Scalable and Efficient In-Memory Management System for Big Trajectory Data

Zhang, Zhigang; Jin, Cheqing; Mao, Jiali; Yang, Xiaolin; Zhou, Aoying

doi:10.1007/978-3-319-63579-8_2

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10366)

Included in the following conference series:

Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data

Abstract

The widespread use of mobile positioning devices has generated big trajectory data. Existing disk-based trajectory management systems can no longer provide scalable, low-latency query services. In view of that, we present TrajSpark, a distributed in-memory system that offers consistently efficient management of trajectory data. TrajSpark introduces a new abstraction called IndexTRDD to manage trajectory segments, and exploits a global and local indexing mechanism to accelerate trajectory queries. Furthermore, to alleviate the inherent partitioning overhead, it adopts a time-decay model to monitor changes in the data distribution and updates the data-partition structure adaptively. This model avoids repartitioning existing data when a new batch of data arrives. Extensive experiments with three types of trajectory queries on both real and synthetic datasets demonstrate that TrajSpark outperforms state-of-the-art systems.
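
The abstract does not say which decay function or change measure the time-decay model uses, so the following is only a minimal sketch of the general idea, not TrajSpark's implementation. It assumes exponential decay per batch and swaps in Jensen-Shannon divergence as one common way to compare distributions; the names DecayedCellHistogram, js_divergence and needs_new_partitioner, the decay factor and the threshold are all hypothetical.

```python
from collections import Counter
from math import log2


class DecayedCellHistogram:
    """Per-cell point counts in which older batches weigh less.

    Assumption: exponential decay by a fixed factor per batch.
    """

    def __init__(self, decay=0.5):
        self.decay = decay        # hypothetical decay factor, 0 < decay <= 1
        self.weights = Counter()  # spatial cell id -> decayed count

    def add_batch(self, cells):
        # Age the existing counts, then add the new batch at full weight.
        for cell in self.weights:
            self.weights[cell] *= self.decay
        self.weights.update(cells)

    def distribution(self):
        total = sum(self.weights.values())
        if not total:
            return {}
        return {cell: w / total for cell, w in self.weights.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, range [0, 1]) of two cell distributions."""
    div = 0.0
    for cell in set(p) | set(q):
        pi, qi = p.get(cell, 0.0), q.get(cell, 0.0)
        mi = 0.5 * (pi + qi)
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


def needs_new_partitioner(reference, current, threshold=0.1):
    """True when the decayed distribution has drifted past the (assumed) threshold."""
    return js_divergence(reference, current) > threshold


# Usage: the reference distribution is the one the current partitioner was built from.
hist = DecayedCellHistogram(decay=0.5)
hist.add_batch(["a", "a", "b", "b"])
reference = hist.distribution()
hist.add_batch(["c"] * 8)  # a new batch concentrated in a different cell
print(needs_new_partitioner(reference, hist.distribution()))
```

In this sketch a drift above the threshold would only change how future batches are partitioned; data that is already partitioned stays where it is, which matches the abstract's claim that existing data is not repartitioned.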

Acknowledgement

This work is supported by the National Key Research and Development Program of China (2016YFB1000905), NSFC (61370101, 61532021, U1501252, U1401256 and 61402180), and the Shanghai Knowledge Service Platform Project (No. ZF1213).

Author information

Authors and Affiliations

School of Data Science and Engineering, East China Normal University, Shanghai, China
Zhigang Zhang, Cheqing Jin, Jiali Mao, Xiaolin Yang & Aoying Zhou

Corresponding author

Correspondence to Cheqing Jin.

Editor information

Editors and Affiliations

Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China
Lei Chen
Computer Science, Aarhus University, Aarhus N, Denmark
Christian S. Jensen
Computer Science, University of Southern California, Los Angeles, California, USA
Cyrus Shahabi
Northeastern University, Shenyang, China
Xiaochun Yang
Kent State University, Kent, Ohio, USA
Xiang Lian

About this paper

Cite this paper

Zhang, Z., Jin, C., Mao, J., Yang, X., Zhou, A. (2017). TrajSpark: A Scalable and Efficient In-Memory Management System for Big Trajectory Data. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10366. Springer, Cham. https://doi.org/10.1007/978-3-319-63579-8_2

DOI: https://doi.org/10.1007/978-3-319-63579-8_2
Published: 03 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63578-1
Online ISBN: 978-3-319-63579-8
eBook Packages: Computer Science, Computer Science (R0)
