Datenbank-Spektrum, Volume 14, Issue 2, pp 107–117

Iterative Computation of Connected Graph Components with MapReduce

FOCUS ARTICLE

DOI: 10.1007/s13222-014-0154-1

Cite this article as:
Kolb, L., Sehili, Z. & Rahm, E. Datenbank Spektrum (2014) 14: 107. doi:10.1007/s13222-014-0154-1

Abstract

The use of the MapReduce framework for iterative graph algorithms is challenging. To achieve high performance, it is critical to limit both the amount of intermediate results and the number of iterations. We address these issues for the important problem of finding connected components in large graphs. We analyze an existing MapReduce algorithm, CC-MR, and present techniques to improve its performance, including a memory-based connection of subgraphs in the map phase. Our evaluation with several large graph datasets shows that the improvements reduce the amount of generated data by up to a factor of 8.8 and the runtime by up to a factor of 3.5.
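To make the two cost factors named above concrete, the following is a minimal single-process sketch of an iterative connected-components computation written as repeated map and reduce rounds. It is not CC-MR and not the improved algorithm of this article; it uses plain minimum-label propagation as a stand-in, and the names `map_phase`, `reduce_phase`, and `connected_components` are hypothetical. It only illustrates why the number of rounds and the number of emitted intermediate pairs are the quantities worth reducing.

```python
from collections import defaultdict


def map_phase(labels, edges):
    """Emit (vertex, candidate_label) pairs.

    Each vertex re-emits its own label, and each undirected edge
    offers the label of one endpoint to the other endpoint.
    """
    for vertex, label in labels.items():
        yield vertex, label
    for u, v in edges:
        yield v, labels[u]
        yield u, labels[v]


def reduce_phase(pairs):
    """Group candidates by vertex and keep the smallest label."""
    grouped = defaultdict(list)
    for vertex, label in pairs:
        grouped[vertex].append(label)
    return {vertex: min(candidates) for vertex, candidates in grouped.items()}


def connected_components(vertices, edges):
    """Run map/reduce rounds until no label changes.

    Returns the final labels, the number of rounds executed, and the
    total number of intermediate pairs emitted (assumed cost proxies).
    """
    labels = {v: v for v in vertices}
    rounds = 0
    emitted = 0
    while True:
        pairs = list(map_phase(labels, edges))
        rounds += 1
        emitted += len(pairs)
        new_labels = reduce_phase(pairs)
        if new_labels == labels:
            return labels, rounds, emitted
        labels = new_labels


if __name__ == "__main__":
    # Toy graph: a chain 1-2-3, an edge 4-5, and an isolated vertex 6.
    labels, rounds, emitted = connected_components(
        [1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)]
    )
    print(labels, rounds, emitted)
```

In this naive scheme every round re-emits a pair for every vertex and two pairs for every edge, and the number of rounds grows with the length of the longest chain in a component. Both effects are the kind of overhead the abstract refers to.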

Keywords

MapReduce, Hadoop, Connected graph components, Transitive closure

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Institut für Informatik, Universität Leipzig, Leipzig, Germany
