Distributed Applications and Interoperable Systems

Volume 8460 of the series Lecture Notes in Computer Science, pp. 186-200

Distributed Vertex-Cut Partitioning

  • Fatemeh Rahimian (Dept. Computer Sciences, Lancaster University; Swedish Institute of Computer Science; KTH - Royal Institute of Technology)
  • Amir H. Payberah (Dept. Computer Sciences, Lancaster University; Swedish Institute of Computer Science)
  • Sarunas Girdzijauskas (Dept. Computer Sciences, Lancaster University; KTH - Royal Institute of Technology)
  • Seif Haridi (Dept. Computer Sciences, Lancaster University; Swedish Institute of Computer Science)

Abstract

Graph processing has become an integral part of big data analytics. With the ever-increasing size of graphs, one needs to partition them into smaller clusters, which can be managed and processed more easily on multiple machines in a distributed fashion. While numerous solutions exist for edge-cut partitioning of graphs, very little effort has been devoted to vertex-cut partitioning, even though vertex-cuts have been shown to be significantly more effective than edge-cuts for processing most real-world graphs. In this paper we present Ja-be-Ja-vc, a parallel and distributed algorithm for vertex-cut partitioning of large graphs. In a nutshell, Ja-be-Ja-vc is a local search algorithm that iteratively improves upon an initial random assignment of edges to partitions. We propose several heuristics for this optimization and study their impact on the final partitioning. Moreover, we employ simulated annealing to escape local optima. We evaluate our solution on various graphs and with a variety of settings, and compare it against two state-of-the-art solutions. We show that Ja-be-Ja-vc outperforms the existing solutions in that it not only creates partitions of any requested size, but also produces a vertex-cut that is better than that of its counterparts and more than 70% better than that of random partitioning.
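To make the idea in the abstract concrete, the following is a minimal, centralized sketch of local search with simulated annealing over an edge-to-partition assignment. It is not the Ja-be-Ja-vc algorithm itself: the abstract does not specify its heuristics, so the random edge-pair swap, the cost function (number of extra vertex replicas), and the parameters `rounds`, `t0`, and `cooling` are all assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


def vertex_cut_cost(edges, assignment):
    """Number of extra vertex replicas: for each vertex, the number of
    distinct partitions holding one of its edges, minus one."""
    parts = defaultdict(set)
    for (u, v), p in zip(edges, assignment):
        parts[u].add(p)
        parts[v].add(p)
    return sum(len(s) - 1 for s in parts.values())


def partition_edges(edges, k, rounds=2000, t0=2.0, cooling=0.995, seed=0):
    """Illustrative local search (assumed heuristic, not the paper's).

    Starts from a balanced random assignment of edges to k partitions and
    repeatedly tries to swap the partitions of two random edges. Swapping
    keeps every partition's size fixed. Worse swaps are sometimes accepted,
    with a probability that shrinks as the temperature cools.
    """
    rng = random.Random(seed)
    assignment = [i % k for i in range(len(edges))]
    rng.shuffle(assignment)

    cost = vertex_cut_cost(edges, assignment)
    best, best_cost = assignment[:], cost
    t = t0
    for _ in range(rounds):
        i, j = rng.sample(range(len(edges)), 2)
        if assignment[i] != assignment[j]:
            # Tentatively swap, then recompute the cost from scratch
            # (simple but slow; a real system would update it locally).
            assignment[i], assignment[j] = assignment[j], assignment[i]
            new_cost = vertex_cut_cost(edges, assignment)
            delta = new_cost - cost
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = assignment[:], cost
            else:
                # Reject: undo the swap.
                assignment[i], assignment[j] = assignment[j], assignment[i]
        t = max(t * cooling, 1e-3)
    return best, best_cost


if __name__ == "__main__":
    # Two disjoint triangles; a good 2-way vertex-cut keeps each together.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
    assignment, cost = partition_edges(edges, k=2)
    print(assignment, cost)
```

Swapping the partitions of two edges, rather than moving a single edge, is one simple way to keep partition sizes unchanged throughout the search, which loosely mirrors the abstract's point about producing partitions of a requested size.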