Application-driven graph partitioning

  • Regular Paper
The VLDB Journal


Graph partitioning is crucial to parallel computations on large graphs, and the choice of partitioning strategy has a strong impact on the performance of graph algorithms. For a given algorithm of interest, which partitioning strategy fits it best and improves its parallel execution? Is it possible to provide a uniform partition for a batch of algorithms that run on the same graph simultaneously, and speed up each of them? This paper aims to answer these questions. We propose an application-driven hybrid partitioning strategy that, given a graph algorithm \(\mathcal{A}\), learns a cost model for \(\mathcal{A}\) as a polynomial regression. We develop partitioners that, given the learned cost model, refine an edge-cut or vertex-cut partition into a hybrid partition and reduce the parallel cost of \(\mathcal{A}\). Moreover, we extend the cost-driven strategy to support multiple algorithms at the same time and reduce the parallel cost of each of them. Using real-life and synthetic graphs, we experimentally verify that our partitioning strategy improves the performance of a variety of graph algorithms by up to \(22.5\times\).
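To make the idea of a learned cost model concrete, the following is a minimal sketch of fitting a polynomial regression to per-vertex running costs and using it to compare two partitions. It is an illustration under stated assumptions and not the method of the paper: the features (in-degree and out-degree), the synthetic cost function, the helper names, and the definition of parallel cost as the largest summed cost over fragments are all hypothetical. Vertex replication in hybrid partitions is not modelled.

```python
from itertools import combinations_with_replacement

def poly_features(x, degree):
    """Expand a feature vector into all monomials up to `degree`, with a bias term."""
    feats = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            term = 1.0
            for i in idx:
                term *= x[i]
            feats.append(term)
    return feats

def solve(a, b):
    """Solve the linear system a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w

def fit(samples, costs, degree=2):
    """Least-squares fit via the normal equations (X^T X) w = X^T y."""
    X = [poly_features(x, degree) for x in samples]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, costs)) for i in range(k)]
    return solve(xtx, xty)

def predict(w, x, degree=2):
    """Predicted cost of one vertex with feature vector x."""
    return sum(wi * fi for wi, fi in zip(w, poly_features(x, degree)))

def parallel_cost(w, fragments, degree=2):
    """Assumed parallel cost: the slowest fragment, i.e. the largest summed vertex cost."""
    return max(sum(predict(w, v, degree) for v in frag) for frag in fragments)

# Synthetic training data (hypothetical): cost = 2 + 0.5 * d_in * d_out + d_out.
samples = [(a, b) for a in range(1, 6) for b in range(1, 6)]
costs = [2 + 0.5 * a * b + b for a, b in samples]
w = fit(samples, costs)

# Two ways of placing the same three vertices on two workers.
skewed = [[(5, 5), (4, 4)], [(1, 1)]]
balanced = [[(5, 5), (1, 1)], [(4, 4)]]
print(parallel_cost(w, skewed), parallel_cost(w, balanced))
```

In this toy setting the balanced placement has the lower predicted parallel cost, which is the kind of signal a cost-driven partitioner could use when deciding how to refine a partition.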


  1. We do not include the result of \(\mathsf{CN}\), since there is no official implementation of \(\mathsf{CN}\) in Gunrock.



Author information


Corresponding author

Correspondence to Qiang Yin.

Appendix: More experimental study

1.1 Impact of different phases

We evaluated the effectiveness of each phase of \(\mathsf{ParE2H}\) and \(\mathsf{ParV2H}\). Denote by \(\mathsf{ParE2H}_{k}\) (resp. \(\mathsf{ParV2H}_{k}\)), for \(1\le k \le 3\), the partitioner that runs only the first \(k\) phases of \(\mathsf{ParE2H}\) (resp. \(\mathsf{ParV2H}\)). We assessed the speedup gain of the \(k\)th phase of \(\mathsf{ParE2H}\) by comparing \(\mathsf{ParE2H}_{k-1}\) with \(\mathsf{ParE2H}_{k}\), and similarly for \(\mathsf{ParV2H}\). Figures 11a and 11b report the normalized speedup ratios over \(\mathsf{Twitter}\) with \(n=96\) for \(\mathsf{HxtraPuLP}\) and \(\mathsf{HGrid}\), respectively. The results over \(\mathsf{liveJournal}\) and \(\mathsf{UKWeb}\), and for the other hybrid partitioners, are consistent (not shown). We find the following.

  1. \(\mathsf{ParE2H}\). (a) Phase \(\mathsf{EMigrate}\) accounts for 67.5%, 26.3%, 83.5%, 74.4% and 89.2% of the total speedup of \(\mathsf{CN}\), \(\mathsf{TC}\), \(\mathsf{WCC}\), \(\mathsf{PR}\) and \(\mathsf{SSSP}\), respectively. (b) \(\mathsf{ESplit}\) alone improves \(\mathsf{CN}\) and \(\mathsf{TC}\) by 1.1 and 2.7 times, respectively. Its impact on \(\mathsf{WCC}\), \(\mathsf{PR}\) and \(\mathsf{SSSP}\) is smaller, since \(\mathsf{CN}\) and \(\mathsf{TC}\) are more sensitive to workload imbalance. The impact of \(\mathsf{ESplit}\) on \(\mathsf{CN}\) over \(\mathsf{Twitter}\) is small because we filtered large-degree vertices for \(\mathsf{CN}\); without filtering, \(\mathsf{ESplit}\) improves \(\mathsf{CN}\) over \(\mathsf{liveJournal}\) by 1.9 times. (c) \(\mathsf{MAssign}\) accounts for another 22.3%, 30.1%, 13.8%, 21.9% and 6.3% of the speedup of \(\mathsf{CN}\), \(\mathsf{TC}\), \(\mathsf{WCC}\), \(\mathsf{PR}\) and \(\mathsf{SSSP}\), respectively.

  2. \(\mathsf{ParV2H}\). (a) Phase \(\mathsf{VMigrate}\) contributes the most to the speedup of \(\mathsf{CN}\), \(\mathsf{TC}\), \(\mathsf{WCC}\), \(\mathsf{PR}\) and \(\mathsf{SSSP}\), accounting for about 71.2%, 81.2%, 87.1%, 78.2% and 96.7% of the total speedup, respectively. (b) By merging v-cut nodes into e-cut nodes, \(\mathsf{VMerge}\) contributes 16.5%, 5.8%, 2.6%, 7.1% and 1.2% of the total speedup for the five algorithms tested, respectively. (c) Phase \(\mathsf{MAssign}\) contributes 9.9% on average.

Fig. 11 Phase decomposition of \(\mathsf{ParE2H}\) and \(\mathsf{ParV2H}\)
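The per-phase attribution reported above can be illustrated with a short calculation. The sketch below assumes, since the text does not state the exact definition, that the share of phase \(k\) is its gain in speedup divided by the total gain, with speedup measured against the input partition before any phase runs. The running times are made-up values and not measurements from the paper. The second part derives the share left for \(\mathsf{ESplit}\) from the reported \(\mathsf{EMigrate}\) and \(\mathsf{MAssign}\) percentages, under the further assumption that the three shares of \(\mathsf{ParE2H}\) sum to 100%.

```python
def phase_shares(times):
    """Share (%) of the total speedup gain contributed by each phase.

    times[k] is the running time under the partition produced by the first k
    phases; times[0] is the input partition (assumed definition, see lead-in).
    """
    base = times[0]
    speedups = [base / t for t in times]
    total_gain = speedups[-1] - speedups[0]
    return [100.0 * (speedups[k] - speedups[k - 1]) / total_gain
            for k in range(1, len(times))]

# Hypothetical running times in seconds after 0, 1, 2 and 3 phases.
times = [120.0, 60.0, 50.0, 40.0]
shares = phase_shares(times)
print([round(s, 1) for s in shares])

# Percentages reported in the text for ParE2H.
algorithms = ["CN", "TC", "WCC", "PR", "SSSP"]
emigrate = [67.5, 26.3, 83.5, 74.4, 89.2]
massign = [22.3, 30.1, 13.8, 21.9, 6.3]

# Residual share for ESplit, assuming the three phase shares sum to 100%.
esplit_residual = {
    a: round(100.0 - e - m, 1)
    for a, e, m in zip(algorithms, emigrate, massign)
}
print(esplit_residual)
```

Under these assumptions the residual share of \(\mathsf{ESplit}\) is largest for \(\mathsf{TC}\) and small for \(\mathsf{WCC}\), \(\mathsf{PR}\) and \(\mathsf{SSSP}\), which agrees with the qualitative statement in item (b) for \(\mathsf{ParE2H}\).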

Cite this article

Fan, W., Xu, R., Yin, Q. et al. Application-driven graph partitioning. The VLDB Journal 32, 149–172 (2023).
