Abstract
Graph processing has become an integral part of big data analytics. As graphs keep growing, they must be partitioned into smaller clusters that can be managed and processed more easily on multiple machines in a distributed fashion. While numerous solutions exist for edge-cut partitioning of graphs, very little effort has gone into vertex-cut partitioning. This is despite the fact that vertex-cuts have been shown to be significantly more effective than edge-cuts for processing most real-world graphs. In this paper we present Ja-be-Ja-vc, a parallel and distributed algorithm for vertex-cut partitioning of large graphs. In a nutshell, Ja-be-Ja-vc is a local search algorithm that iteratively improves upon an initial random assignment of edges to partitions. We propose several heuristics for this optimization and study their impact on the final partitioning. Moreover, we employ simulated annealing to escape local optima. We evaluate our solution on various graphs and with a variety of settings, and compare it against two state-of-the-art solutions. We show that Ja-be-Ja-vc outperforms the existing solutions in that it not only creates partitions of any requested size, but also achieves a vertex-cut that is better than that of its counterparts and more than 70% better than that of random partitioning.
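To make the idea of local search over an edge-to-partition assignment concrete, the following is a minimal, centralized sketch and not the Ja-be-Ja-vc algorithm itself. It assumes a vertex-cut cost equal to the number of extra vertex replicas, swaps the partitions of two randomly chosen edges (so partition sizes never change), and uses a standard Metropolis acceptance rule with linear cooling as a stand-in for the paper's annealing scheme. The names `vertex_cut`, `partition_edges`, `rounds`, `t0` and `delta` are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import defaultdict


def vertex_cut(edges, part):
    """Count extra vertex replicas: for each vertex, the number of
    distinct partitions among its incident edges, minus one."""
    seen = defaultdict(set)
    for (u, v), p in zip(edges, part):
        seen[u].add(p)
        seen[v].add(p)
    return sum(len(s) - 1 for s in seen.values())


def partition_edges(edges, k, rounds=2000, t0=2.0, delta=0.001, seed=0):
    """Assign each edge to one of k partitions (illustrative sketch only)."""
    rng = random.Random(seed)
    # Initial assignment: random, with partition sizes differing by at most one.
    part = [i % k for i in range(len(edges))]
    rng.shuffle(part)
    cost = vertex_cut(edges, part)
    best_part, best_cost = part[:], cost
    temp = t0
    for _ in range(rounds):
        i = rng.randrange(len(edges))
        j = rng.randrange(len(edges))
        if part[i] == part[j]:
            continue  # swapping within one partition changes nothing
        # Tentatively swap the partitions of edges i and j.
        part[i], part[j] = part[j], part[i]
        new_cost = vertex_cut(edges, part)  # full recount: simple, not fast
        diff = new_cost - cost
        # Assumed Metropolis rule: always accept improvements, and accept a
        # worse swap with a probability that shrinks as the system cools.
        if diff <= 0 or rng.random() < math.exp(-diff / temp):
            cost = new_cost
            if cost < best_cost:
                best_part, best_cost = part[:], cost
        else:
            part[i], part[j] = part[j], part[i]  # undo the swap
        temp = max(temp - delta, 1e-3)  # linear cooling with a small floor
    return best_part, best_cost


if __name__ == "__main__":
    # Two 4-cliques joined by a single bridge edge (13 edges in total).
    clique_a = [(a, b) for a in range(4) for b in range(a + 1, 4)]
    clique_b = [(a, b) for a in range(4, 8) for b in range(a + 1, 8)]
    edges = clique_a + clique_b + [(3, 4)]
    assignment, cut = partition_edges(edges, k=2)
    print(assignment, cut)
```

Because the only move is a swap between two edges, the partition sizes fixed by the initial assignment are preserved throughout. Recomputing the full cost on every step keeps the sketch short; a practical implementation would update the cost incrementally and, in a distributed setting, evaluate swaps using only local information.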
Copyright information
© 2014 IFIP International Federation for Information Processing
About this paper
Cite this paper
Rahimian, F., Payberah, A.H., Girdzijauskas, S., Haridi, S. (2014). Distributed Vertex-Cut Partitioning. In: Magoutis, K., Pietzuch, P. (eds) Distributed Applications and Interoperable Systems. DAIS 2014. Lecture Notes in Computer Science, vol 8460. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43352-2_15
DOI: https://doi.org/10.1007/978-3-662-43352-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43351-5
Online ISBN: 978-3-662-43352-2
eBook Packages: Computer Science, Computer Science (R0)