The VLDB Journal, Volume 25, Issue 2, pp 269–290

Toward continuous pattern detection over evolving large graph with snapshot isolation

Regular Paper

Abstract

This paper studies continuous pattern detection over large evolving graphs, which plays an important role in monitoring applications. The problem is challenging because the graphs are large and updated dynamically, the search space of pattern detection is massive, and query results on a changing graph can be inconsistent. First, we introduce a snapshot isolation requirement, which ensures that query results come from a consistent graph snapshot rather than a mixture of partially evolved graphs. Second, we propose an SSD (single-sink directed acyclic graph) plan that is friendly to vertex-centric distributed graph processing frameworks. The SSD plan guides message transformation and transfer among graph vertices, and determines whether the pattern is satisfied on graph vertices for the sink vertex. Third, we devise strategies for the major steps in SSD evaluation: locating valid messages to achieve snapshot isolation, an AO-List to determine whether a transition rule is satisfied over the dynamic graph, and a message-on-change policy to reduce outgoing messages. Experiments on billion-edge graphs using Giraph, an open-source implementation of Pregel, illustrate the efficiency and effectiveness of our method.
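The abstract names a message-on-change policy but does not spell out its mechanics. As a rough illustration only, the sketch below assumes the policy means that a vertex suppresses a message to a neighbor when the value is identical to the one it last sent there. The `Vertex` class, its fields, and the `outgoing` method are hypothetical names introduced for this example; they are not taken from the paper or from Giraph.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, List, Tuple

# Sentinel meaning "nothing has been sent to this neighbor yet" (hypothetical helper).
_UNSENT = object()


@dataclass
class Vertex:
    """Hypothetical vertex that remembers the last value sent to each neighbor."""
    vid: int
    neighbors: List[int]
    last_sent: Dict[int, Hashable] = field(default_factory=dict)

    def outgoing(self, state: Hashable) -> List[Tuple[int, Hashable]]:
        """Return (target, state) messages only for neighbors whose
        last-sent value differs from the current state."""
        messages = []
        for target in self.neighbors:
            if self.last_sent.get(target, _UNSENT) != state:
                messages.append((target, state))
                self.last_sent[target] = state  # record what was sent
        return messages


if __name__ == "__main__":
    v = Vertex(vid=1, neighbors=[2, 3])
    print(v.outgoing("A"))  # first superstep: both neighbors are notified
    print(v.outgoing("A"))  # unchanged state: no messages are produced
    print(v.outgoing("B"))  # changed state: both neighbors are notified again
```

Under this assumption, the number of messages per superstep scales with the number of vertices whose state changed rather than with the total number of edges, which is one plausible way such a policy could reduce outgoing traffic.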

Keywords

Dynamic graph · Pattern detection · Graph streaming · Snapshot isolation

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Key Laboratory of High Confidence Software Technologies, EECS, Peking University, Beijing, China
  2. Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong, Shatin, China
