Towards Systematic Parallelization of Graph Transformations Over Pregel

International Journal of Parallel Programming

Abstract

Graphs can model many kinds of data, from traditional datasets to social networks and semi-structured data. Many systems have been proposed for processing large graphs; among them, the Pregel programming model is popular thanks to its scalability. Although Pregel is simple to understand and use, it is a low-level programming model: developers must write programs that are hard to maintain and need careful optimization. Structural recursion, on the other hand, is a powerful tool for systematically constructing efficient parallel programs on lists, arrays, and trees, but it has not yet been applied to graphs. In this paper, we propose an efficient method for the parallel evaluation of structural recursion on graphs that is suitable for Pregel. We design and implement a high-level parallel programming framework that provides a domain-specific language (DSL) to ease the programming task. Specifications written in the DSL are automatically compiled into Pregel programs that scale to large graphs. Experimental results show that our framework outperforms the original evaluation of structural recursion and achieves good scalability and speedup on real datasets.
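
To make the low-level character of Pregel-style programming concrete, the following is a minimal single-machine sketch of the superstep loop, used here to compute hop counts from a source vertex. It is an illustration only, not the framework or DSL described in this paper: the function names (`pregel`, `vertex_program`, `send_message`, `combine`) and the toy graph are assumptions introduced for this example.

```python
import math


def pregel(vertices, edges, initial_msg, vertex_program, send_message, combine,
           max_supersteps=50):
    """Toy simulation of Pregel-style supersteps (hypothetical helper)."""
    # Superstep 0: every vertex processes the initial message.
    state = {v: vertex_program(val, initial_msg) for v, val in vertices.items()}
    active = set(state)
    for _ in range(max_supersteps):
        inbox = {}
        # Only vertices that were active in the previous superstep may send.
        for src, dst in edges:
            if src not in active:
                continue
            msg = send_message(state[src], state[dst])
            if msg is None:
                continue
            # Messages addressed to the same vertex are merged by the combiner.
            inbox[dst] = msg if dst not in inbox else combine(inbox[dst], msg)
        if not inbox:
            break  # no messages in flight: the computation halts
        # Synchronization barrier: apply all updates after all sends.
        for v, msg in inbox.items():
            state[v] = vertex_program(state[v], msg)
        active = set(inbox)
    return state


# Assumed toy graph: directed edges, vertex "e" is unreachable from "a".
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
vertices = {"a": 0, "b": math.inf, "c": math.inf, "d": math.inf, "e": math.inf}

hops = pregel(
    vertices,
    edges,
    initial_msg=math.inf,
    vertex_program=lambda value, msg: min(value, msg),
    # Send only when the message would improve the destination's value.
    send_message=lambda s, d: s + 1 if s + 1 < d else None,
    combine=min,
)
```

Even for this small task, the programmer has to spell out message generation, message merging, and the halting condition by hand, which is the kind of detail a higher-level specification is meant to hide.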

Notes

  1. source code: http://www.prg.nii.ac.jp/members/tungld/gito-graphxApr1.tar.gz.

  2. http://arnetminer.org/billboard/citation, dataset: citation-network V1.

  3. http://netsg.cs.sfu.ca/youtubedata/, dataset: 0222, Feb. 22nd, 2007.

Author information

Correspondence to Le-Duc Tung.

Cite this article

Tung, LD., Hu, Z. Towards Systematic Parallelization of Graph Transformations Over Pregel. Int J Parallel Prog 45, 320–339 (2017). https://doi.org/10.1007/s10766-016-0418-5

  • DOI: https://doi.org/10.1007/s10766-016-0418-5