A Multi-layer Framework for Graph Processing via Overlay Composition

  • Alessandro Lulli
  • Patrizio Dazzi
  • Laura Ricci
  • Emanuele Carlini
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9523)

Abstract

The processing of graphs in a parallel and distributed fashion is a constantly rising trend, due to the size of today’s graphs. This paper proposes a multi-layer graph overlay approach to support the orchestration of distributed, vertex-centric computations targeting large graphs. Our approach takes inspiration from overlay networks, a widely exploited approach for information dissemination, aggregation and computing orchestration in massively distributed systems. We propose Telos, an environment that supports the definition of multi-layer graph overlays and provides each vertex with a layered, vertex-centric view of the graph. Telos is defined on top of Apache Spark and has been evaluated by considering two well-known graph problems. We present a set of experimental results showing the effectiveness of our approach.
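To make the idea of a layered, vertex-centric view concrete, the following is a minimal single-machine sketch in plain Python. It is not the Telos API and does not use Apache Spark: the `Vertex` class, the `views` mapping, the layer names `"input"` and `"overlay"`, and the min-label propagation rule are all hypothetical choices made for illustration only. Each vertex holds one neighbour set per layer, and a synchronous step lets every vertex update its value using only the neighbours it sees on a chosen layer.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Vertex:
    """A vertex with one neighbour set per overlay layer (hypothetical model)."""
    vid: int
    value: int
    # layer name -> ids of the neighbours visible on that layer
    views: Dict[str, Set[int]] = field(default_factory=dict)


def superstep(vertices: Dict[int, Vertex], layer: str) -> bool:
    """One synchronous vertex-centric step on a single layer.

    Every vertex adopts the minimum value among itself and its neighbours
    on `layer`. Returns True if any vertex changed its value.
    """
    # Compute all new values from the old state first (synchronous update).
    new_values = {}
    for v in vertices.values():
        neighbours = v.views.get(layer, set())
        candidates = [v.value] + [vertices[n].value for n in neighbours]
        new_values[v.vid] = min(candidates)

    changed = False
    for vid, val in new_values.items():
        if vertices[vid].value != val:
            vertices[vid].value = val
            changed = True
    return changed


def run(vertices: Dict[int, Vertex], layer: str) -> int:
    """Repeat supersteps until no value changes; return the number of changing steps."""
    steps = 0
    while superstep(vertices, layer):
        steps += 1
    return steps


def build_example() -> Dict[int, Vertex]:
    """Five vertices with two layers: the input graph and an extra overlay link."""
    vertices = {
        i: Vertex(vid=i, value=i, views={"input": set(), "overlay": set()})
        for i in range(1, 6)
    }
    layer_edges = {
        "input": [(1, 2), (2, 3), (4, 5)],   # two components: {1,2,3} and {4,5}
        "overlay": [(3, 4)],                 # illustrative extra layer
    }
    for layer, edges in layer_edges.items():
        for a, b in edges:
            vertices[a].views[layer].add(b)
            vertices[b].views[layer].add(a)
    return vertices


if __name__ == "__main__":
    vs = build_example()
    run(vs, "input")
    print({v.vid: v.value for v in vs.values()})
```

In this sketch the same vertex sees different neighbours depending on the layer it consults, which is the property the abstract attributes to multi-layer overlays. A distributed implementation would replace the in-memory dictionary with partitioned data and message exchange.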

Keywords

Ranking function, Large graph, Gossip protocol, Resilient distributed dataset, MapReduce paradigm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Alessandro Lulli (2)
  • Patrizio Dazzi (1)
  • Laura Ricci (2)
  • Emanuele Carlini (1)

  1. Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo”, Consiglio Nazionale delle Ricerche (ISTI-CNR), Pisa, Italy
  2. Dipartimento di Informatica, Università di Pisa, Pisa, Italy