
Dolha - an efficient and exact data structure for streaming graphs

  • Fan Zhang
  • Lei Zou
  • Li Zeng
  • Xiangyang Gou
Article
Part of the following topical collections:
  1. Special Issue on Graph Data Management in Online Social Networks

Abstract

A streaming graph is a graph formed by a sequence of incoming edges with time stamps. Unlike a static graph, a streaming graph is highly dynamic and time-dependent. Real-world streaming graphs, such as internet traffic, social network communication, and financial transactions, arrive at high volume and velocity, which makes them challenging for classic graph data structures. Traditional graph storage models such as the adjacency matrix and the adjacency list are no longer sufficient for the large amount of data and the high frequency of updates, and most existing streaming graph structures support only specific graph algorithms. Here, a new data structure is presented to meet this challenge: a double orthogonal list in hash table (Dolha), a graph structure with high speed and high memory efficiency. Dolha has constant time cost for processing a single edge and near-linear space cost. Moreover, the time cost of neighborhood queries in Dolha is linear, which enables it to support most graph algorithms without extra cost. A persistent structure based on Dolha is also presented to handle sliding window updates and time-related queries.
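
The general idea of pairing an orthogonal list with hash tables can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names `Edge` and `OrthogonalListGraph`, the field layout, the choice to merge duplicate edges by summing weights, and the use of Python dictionaries as the hash tables are all assumptions made for illustration. Each edge sits in two doubly linked lists at once (the outgoing list of its source and the incoming list of its destination), so insertion and deletion take expected constant time and a neighborhood query is linear in the number of neighbors.

```python
class Edge:
    """One edge, linked into two doubly linked lists (hypothetical layout)."""
    __slots__ = ("src", "dst", "weight", "ts",
                 "prev_out", "next_out", "prev_in", "next_in")

    def __init__(self, src, dst, weight, ts):
        self.src, self.dst, self.weight, self.ts = src, dst, weight, ts
        self.prev_out = self.next_out = None  # neighbors in src's outgoing list
        self.prev_in = self.next_in = None    # neighbors in dst's incoming list


class OrthogonalListGraph:
    """Illustrative orthogonal list indexed by hash tables (Python dicts)."""

    def __init__(self):
        self.out_head = {}  # vertex -> first outgoing Edge
        self.in_head = {}   # vertex -> first incoming Edge
        self.edges = {}     # (src, dst) -> Edge

    def add_edge(self, src, dst, weight=1, ts=0):
        """Insert or update an edge in expected O(1) time."""
        e = self.edges.get((src, dst))
        if e is not None:           # assumption: repeated edges are merged
            e.weight += weight
            e.ts = ts
            return e
        e = Edge(src, dst, weight, ts)
        head = self.out_head.get(src)   # push onto front of outgoing list
        e.next_out = head
        if head is not None:
            head.prev_out = e
        self.out_head[src] = e
        head = self.in_head.get(dst)    # push onto front of incoming list
        e.next_in = head
        if head is not None:
            head.prev_in = e
        self.in_head[dst] = e
        self.edges[(src, dst)] = e
        return e

    def remove_edge(self, src, dst):
        """Unlink an edge from both lists in expected O(1) time."""
        e = self.edges.pop((src, dst), None)
        if e is None:
            return False
        if e.prev_out is not None:
            e.prev_out.next_out = e.next_out
        elif e.next_out is not None:
            self.out_head[src] = e.next_out
        else:
            del self.out_head[src]
        if e.next_out is not None:
            e.next_out.prev_out = e.prev_out
        if e.prev_in is not None:
            e.prev_in.next_in = e.next_in
        elif e.next_in is not None:
            self.in_head[dst] = e.next_in
        else:
            del self.in_head[dst]
        if e.next_in is not None:
            e.next_in.prev_in = e.prev_in
        return True

    def successors(self, v):
        """Yield out-neighbors of v; linear in the out-degree."""
        e = self.out_head.get(v)
        while e is not None:
            yield e.dst
            e = e.next_out

    def predecessors(self, v):
        """Yield in-neighbors of v; linear in the in-degree."""
        e = self.in_head.get(v)
        while e is not None:
            yield e.src
            e = e.next_in


# Usage: feed a small edge stream, then query and delete.
g = OrthogonalListGraph()
for ts, (u, v) in enumerate([("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]):
    g.add_edge(u, v, ts=ts)
print(sorted(g.successors("a")), sorted(g.predecessors("c")))
g.remove_edge("a", "c")
print(sorted(g.successors("a")), sorted(g.predecessors("c")))
```

Keeping both lists doubly linked is what lets an expired edge be unlinked without scanning, which is the kind of operation a sliding window update would need; how the actual Dolha structure organizes its tables and time ordering is described in the full article, not in this sketch.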

Keywords

Streaming graph, Data structure, Efficient and exact, Graph algorithms

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  1. Peking University, Beijing, China
