Distributed and parallel processing for real-time and dynamic spatio-temporal graph

  • Junhua Fang (email author)
  • Jiafeng Ding
  • Pengpeng Zhao
  • Jiajie Xu
  • An Liu
  • Zhixu Li
Part of the following topical collections:
  1. Special Issue on Graph Data Management in Online Social Networks


Graph data, a non-linear structure of nodes and edges, arise in many different domains. Real-world applications built on such data are typically time-sensitive: the value of graph data tends to decrease over time. Applications based on spatio-temporal graphs are typical examples, since time is an inherent dimension of spatio-temporal data. A Distributed Stream Processing Engine (DSPE), in which a stream is commonly partitioned and processed concurrently by a number of threads to maximize throughput, appears to be a natural fit for this requirement. However, a traditional DSPE cannot handle this task directly. In this paper, we propose a computational model for handling spatio-temporal graphs in a DSPE by reconstructing the DSPE's parallel processing slots. Specifically, our proposal includes (1) a general processing framework for the data structure of the spatio-temporal graph, (2) a state information compensation mechanism that ensures the correctness of such stateful operations in a DSPE, and (3) a lightweight method for calculating summary information that preserves system performance. Empirical studies on real-world stream applications validate the usefulness of our proposals and show a considerable advantage of our approaches over state-of-the-art solutions in the literature.
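To illustrate why plain key-based partitioning is awkward for graph streams, the following minimal sketch routes timestamped edges to parallel slots by hashing the source vertex and collects the edges whose endpoints land in different slots. It is not the model proposed in the paper; the names `slot_of` and `route_edges`, the `(src, dst, timestamp)` edge layout, and the CRC32 hash are all assumptions made for illustration only.

```python
from collections import defaultdict
from zlib import crc32


def slot_of(vertex_id: str, num_slots: int) -> int:
    """Key-based routing: a stable hash of the vertex id picks the slot."""
    return crc32(vertex_id.encode("utf-8")) % num_slots


def route_edges(edges, num_slots):
    """Route each (src, dst, timestamp) edge to the slot owning its source.

    Returns the per-slot edge lists and the list of edges whose two
    endpoints are owned by different slots (hypothetical bookkeeping,
    used here only to make the cross-slot dependency visible).
    """
    slots = defaultdict(list)
    cross_slot = []
    for src, dst, ts in edges:
        owner = slot_of(src, num_slots)
        slots[owner].append((src, dst, ts))
        # The destination vertex's state lives elsewhere, so processing
        # this edge locally would miss part of the graph state.
        if slot_of(dst, num_slots) != owner:
            cross_slot.append((src, dst, ts))
    return dict(slots), cross_slot


if __name__ == "__main__":
    stream = [("a", "b", 1), ("b", "c", 2), ("c", "a", 3)]
    per_slot, crossing = route_edges(stream, num_slots=4)
    print(per_slot)
    print(crossing)
```

Any edge reported in the second list touches state held by another slot, which is the kind of stateful dependency that a record-at-a-time partitioning scheme does not resolve on its own.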


Distributed and parallel computing; Real-time processing; Spatio-temporal graph



This work is partially supported by NSFC (No. 61802273), the Postdoctoral Science Foundation of China (Grant No. 2017M621813), the Postdoctoral Science Foundation of Jiangsu Province of China (Grant No. 2018K029C), and the Natural Science Foundation for Colleges and Universities in Jiangsu Province of China (Grant No. 18KJB520044).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Junhua Fang (1), email author
  • Jiafeng Ding (1)
  • Pengpeng Zhao (1)
  • Jiajie Xu (1)
  • An Liu (1)
  • Zhixu Li (1)
  1. Institute of Artificial Intelligence, School of Computer Science and Technology, Soochow University, Suzhou, China
