LMCC: Lazy Message and Centralized Cache for Asynchronous Graph Computing

  • Ruini Xue
  • Zhibin Dong
  • Wei Su
  • Xiaofang Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11335)

Abstract

Graphs are widely used to model complex network applications, and the asynchronous graph processing model is superseding the BSP model because it converges faster. However, the asynchronous GAS model proposed by PowerGraph usually produces irregular and unpredictable communication patterns as well as vertex-scale barriers, which makes it difficult for programmers to optimize their code. To address these challenges, we propose LMCC, an improved message management approach that combines a lazy pull-message model with a vertex-oriented centralized cache. LMCC reduces communication cost in terms of message quantity, which in turn reduces the number of computation iterations, without compromising the accuracy of application results. Based on a deep investigation of the GAS phases, LMCC is designed to be totally transparent to user applications. Experimental results show that LMCC delivers speedups ranging from 129% to 271% across various types of graph computing benchmarks.

Keywords

  • Graph processing
  • Communication optimization
  • Message combination
  • Centralized cache

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. University of Electronic Science and Technology of China, Chengdu, China
