Towards Systematic Parallelization of Graph Transformations Over Pregel

Tung, Le-Duc; Hu, Zhenjiang

doi:10.1007/s10766-016-0418-5

Published: 28 March 2016

International Journal of Parallel Programming, Volume 45, pages 320–339 (2017)

Le-Duc Tung¹ &
Zhenjiang Hu²

Abstract

Graphs can model many kinds of data, from traditional datasets to social networks and semi-structured datasets. Many systems have been proposed for processing large graphs, and the Pregel programming model is popular among them thanks to its scalability. Although Pregel is simple to understand and use, it is low-level: developers must write programs that are hard to maintain and need careful optimization. Structural recursion, on the other hand, is a powerful tool for systematically constructing efficient parallel programs on lists, arrays, and trees, but it has not yet been applied to graphs. In this paper, we propose an efficient method for parallel evaluation of structural recursion on graphs that is suitable for Pregel. We design and implement a high-level parallel programming framework that provides a domain-specific language (DSL) to ease the programming task. Specifications written in the DSL are automatically compiled into Pregel programs that scale to large graphs. Experimental results show that our framework outperforms the original evaluation of structural recursion and achieves good scalability and speedup on real datasets.
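
To make the "low-level" character of Pregel concrete, the following is a minimal single-process sketch of a Pregel-style computation in Python. It is an illustration only, not the framework or DSL described in the abstract and not the API of any real Pregel implementation. The names `pregel`, `compute`, and `propagate_max` are hypothetical, and the example task (propagating the maximum vertex value) is chosen here for brevity. The sketch assumes the usual vertex-centric conventions: computation proceeds in synchronized supersteps, messages sent in one superstep are read in the next, and a vertex that sends nothing stays idle until a message reactivates it.

```python
from collections import defaultdict


def pregel(vertices, edges, compute, max_supersteps=50):
    """Toy, single-process Pregel-style loop (illustrative names only).

    vertices: dict mapping vertex id -> initial value
    edges:    dict mapping vertex id -> list of out-neighbour ids
    compute:  function (superstep, value, messages) -> (new_value, message or None)
    """
    values = dict(vertices)
    inbox = {v: [] for v in values}
    active = set(values)  # every vertex is active in superstep 0
    for superstep in range(max_supersteps):
        if not active:
            break
        outbox = defaultdict(list)
        for v in active:
            new_value, message = compute(superstep, values[v], inbox[v])
            values[v] = new_value
            if message is not None:
                for target in edges.get(v, []):
                    outbox[target].append(message)
        # Barrier: messages become visible only in the next superstep.
        inbox = {v: outbox.get(v, []) for v in values}
        # A halted vertex is reactivated only by an incoming message.
        active = {v for v in values if inbox[v]}
    return values


def propagate_max(superstep, value, messages):
    """Vertex program: adopt the largest value seen so far."""
    best = max([value] + messages)
    # Send in superstep 0 or whenever the value grew; otherwise vote to halt.
    if superstep == 0 or best > value:
        return best, best
    return value, None


# Directed ring a -> b -> c -> a with initial values 3, 6, 1.
ring_values = {"a": 3, "b": 6, "c": 1}
ring_edges = {"a": ["b"], "b": ["c"], "c": ["a"]}
result = pregel(ring_values, ring_edges, propagate_max)
```

On this ring every vertex ends with the value 6. Even for such a small task, the programmer has to manage message passing and halting by hand, which is the kind of bookkeeping a higher-level specification language aims to hide.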

Notes

source code: http://www.prg.nii.ac.jp/members/tungld/gito-graphxApr1.tar.gz.
http://arnetminer.org/billboard/citation, dataset: citation-network V1.
http://netsg.cs.sfu.ca/youtubedata/, dataset: 0222, Feb. 22nd, 2007.

Author information

Authors and Affiliations

SOKENDAI (The Graduate University for Advanced Studies), Shonan Village, Hayama, Kanagawa, 240-0193, Japan
Le-Duc Tung
SOKENDAI/National Institute of Informatics (NII), 2-1-2 Hitotsubashi, Chiyoda, Tokyo, 101-8430, Japan
Zhenjiang Hu

Corresponding author

Correspondence to Le-Duc Tung.

About this article

Cite this article

Tung, LD., Hu, Z. Towards Systematic Parallelization of Graph Transformations Over Pregel. Int J Parallel Prog 45, 320–339 (2017). https://doi.org/10.1007/s10766-016-0418-5

Received: 03 September 2015
Accepted: 21 March 2016
Published: 28 March 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s10766-016-0418-5
