Algorithmica, Volume 64, Issue 3, pp 329–361

Beyond Good Partition Shapes: An Analysis of Diffusive Graph Partitioning

Abstract

In this paper we study the prevalent problem of graph partitioning by analyzing the diffusion-based partitioning heuristic Bubble-FOS/C, a key component of a graph partitioner that has proven successful in practice (Meyerhenke et al. in J. Parallel Distrib. Comput. 69(9):750–761, 2009).

We begin by studying the disturbed diffusion scheme FOS/C, which computes the similarity measure used in Bubble-FOS/C and is therefore its most crucial component. By relating FOS/C to random walks, we obtain precise characterizations of the behavior of FOS/C on tori and hypercubes. Besides yielding new knowledge about FOS/C (and therefore also about Bubble-FOS/C), these characterizations have recently been used for the analysis of load balancing algorithms (Berenbrink et al. in Proceedings of the 22nd Annual Symposium on Discrete Algorithms, pp. 429–439, 2011).
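
The abstract does not restate how FOS/C is defined. As a rough illustration only, the sketch below assumes the formulation usually given in the cited disturbed-diffusion literature: loads are updated as w ← M w + d with M = I − αL, where L is the graph Laplacian and the drain vector d adds δ·n/|S| − δ at each source vertex and subtracts δ everywhere else. The function names `fosc` and `cycle`, the parameters `delta`, `alpha`, and `iters`, and the 8-vertex cycle (a one-dimensional torus) are all choices made for this example and are not taken from the paper.

```python
def fosc(adj, sources, delta=1.0, alpha=None, iters=2000):
    """Iterate w <- M w + d with M = I - alpha * L (assumed FOS/C form).

    adj is an adjacency list of a connected undirected graph;
    sources is the set S of source vertices.
    """
    n = len(adj)
    if alpha is None:
        # Assumed step size: 1 / (1 + maximum degree).
        alpha = 1.0 / (1 + max(len(nb) for nb in adj))
    src = set(sources)
    # Drain vector: sources gain delta*n/|S| - delta, all others lose delta,
    # so the entries of d sum to zero and the total load is conserved.
    d = [delta * n / len(src) - delta if v in src else -delta
         for v in range(n)]
    w = [0.0] * n
    for _ in range(iters):
        # One diffusion step followed by the constant disturbance d.
        w = [w[v] - alpha * sum(w[v] - w[u] for u in adj[v]) + d[v]
             for v in range(n)]
    return w


def cycle(n):
    """Adjacency list of the cycle on n vertices (a 1-D torus)."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]


w = fosc(cycle(8), sources=[0])
for v, load in enumerate(w):
    print(v, round(load, 3))
```

Under these assumptions the load is largest at the source and falls off with distance from it, which is the sense in which the converged loads can serve as a similarity measure between a vertex and the source set.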

We then turn to Bubble-FOS/C, which has been shown in previous experiments to produce solutions with good partition shapes and other favorable properties. In this paper we prove that it computes a relaxed solution to an edge-cut-minimizing binary quadratic program (BQP). This result provides the first substantial theoretical insight into why Bubble-FOS/C yields good experimental results in terms of graph partitioning metrics. Moreover, we show that in bisections computed by Bubble-FOS/C, at least one of the two parts is connected. Using the aforementioned relation between FOS/C and random walks, we prove that in vertex-transitive graphs both parts must be connected components.

Keywords

Diffusive graph partitioning; Relaxed cut optimization; Disturbed diffusion; Random walks

References

  1. Alon, N., Spencer, J.H.: The Probabilistic Method, 2nd edn. Wiley, New York (2000)
  2. Andersen, R., Chung, F.R.K., Lang, K.J.: Local graph partitioning using pagerank vectors. In: Proceedings of the 47th Annual Symposium on Foundations of Computer Science (FOCS’06), pp. 475–486 (2006)
  3. Andersen, R., Peres, Y.: Finding sparse cuts locally using evolving sets. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC’09), pp. 235–244. ACM, New York (2009)
  4. Andreev, K., Räcke, H.: Balanced graph partitioning. Theory Comput. Syst. 39(6), 929–939 (2006)
  5. Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming. Theory and Algorithms, 2nd edn. Wiley, New York (1993)
  6. Berenbrink, P., Cooper, C., Friedetzky, T., Friedrich, T., Sauerwald, T.: Randomized diffusion for indivisible loads. In: Proceedings of the 22nd Annual Symposium on Discrete Algorithms (SODA’11), pp. 429–439 (2011)
  7. Biggs, N.: Algebraic Graph Theory. Cambridge University Press, Cambridge (1993)
  8. Chevalier, C., Pellegrini, F.: PT-Scotch: A tool for efficient parallel graph ordering. Parallel Comput. 34(6–8), 318–331 (2008)
  9. Coifman, R.R., Lafon, S., Lee, A.B., Maggioni, M., Nadler, B., Warner, F., Zucker, S.W.: Geometric diffusions as a tool for harmonic analysis and structure definition of data. Parts I and II. Proc. Natl. Acad. Sci. USA 102(21), 7426–7437 (2005)
  10. Cybenko, G.: Dynamic load balancing for distributed memory multiprocessors. J. Parallel Distrib. Comput. 7, 279–301 (1989)
  11. Dhillon, I.S., Guan, Y., Kulis, B.: Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1944–1957 (2007)
  12. Diaconis, P., Graham, R.L., Morrison, J.A.: Asymptotic analysis of a random walk on a hypercube with many dimensions. Random Struct. Algorithms 1(1), 51–72 (1990)
  13. Diekmann, R., Frommer, A., Monien, B.: Efficient schemes for nearest neighbor load balancing. Parallel Comput. 25(7), 789–812 (1999)
  14. Doyle, P.G., Snell, J.L.: Random Walks and Electric Networks. Math. Assoc. of America, Washington (1984)
  15. Feldmann, A.E., Foschini, L.: Balanced partitions of trees and applications. In: Proceedings of the 29th International Symposium on Theoretical Aspects of Computer Science, STACS 2012, pp. 100–111 (2012)
  16. Fiedler, M.: A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory. Czechoslov. Math. J. 25, 619–633 (1975)
  17. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979)
  18. Godsil, C., Royle, G.: Algebraic Graph Theory. Springer, Berlin (2001)
  19. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins Univ. Press, Baltimore (1996)
  20. Grady, L.: Space-variant computer vision: a graph-theoretic approach. PhD thesis, Boston University, Boston, MA (2004)
  21. Grimmett, G.R., Stirzaker, D.R.: Probability and Random Processes, 3rd edn. Oxford University Press, Oxford (2001)
  22. Hendrickson, B., Leland, R.: An improved spectral graph partitioning algorithm for mapping parallel computations. SIAM J. Sci. Comput. 16(2), 452–469 (1995)
  23. Karypis, G., Kumar, V.: Multilevel k-way partitioning scheme for irregular graphs. J. Parallel Distrib. Comput. 48(1), 96–129 (1998)
  24. Kaufmann, H., Pape, H.: Clusteranalyse. In: Fahrmeir, L., Hamerle, A., Tutz, G. (eds.) Multivariate statistische Verfahren, 2nd edn. de Gruyter, Berlin (1996)
  25. Kemeny, J.G., Snell, J.L.: Finite Markov Chains. Springer, Berlin (1976)
  26. Kernighan, B.W., Lin, S.: An efficient heuristic for partitioning graphs. Bell Syst. Tech. J. 49, 291–308 (1970)
  27. Leighton, F.T.: Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann, San Mateo (1992)
  28. Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–136 (1982)
  29. Lovász, L.: Random walks on graphs: a survey. Combinatorics 2, 1–46 (1993)
  30. Meila, M., Shi, J.: A random walks view of spectral segmentation. In: 8th International Workshop on Artificial Intelligence and Statistics (AISTATS) (2001)
  31. Meyerhenke, H.: Disturbed diffusive processes for solving partitioning problems on graphs. PhD thesis, Universität Paderborn (2008)
  32. Meyerhenke, H.: Beyond good shapes: Diffusion-based graph partitioning is relaxed cut optimization. In: Proceedings of the 21st International Symposium on Algorithms and Computation (ISAAC’10), Part II. Lecture Notes in Computer Science, vol. 6507, pp. 387–398. Springer, Berlin (2010)
  33. Meyerhenke, H., Monien, B., Sauerwald, T.: A new diffusion-based multilevel algorithm for computing graph partitions. J. Parallel Distrib. Comput. 69(9), 750–761 (2009). Best Paper Awards and Panel Summary: IPDPS 2008
  34. Meyerhenke, H., Monien, B., Schamberger, S.: Graph partitioning and disturbed diffusion. Parallel Comput. 35(10–11), 544–569 (2009)
  35. Meyerhenke, H., Sauerwald, T.: Analyzing disturbed diffusion on networks. In: Proceedings of the 17th International Symposium on Algorithms and Computation (ISAAC’06), pp. 429–438. Springer, Berlin (2006)
  36. Nadler, B., Lafon, S., Coifman, R.R., Kevrekidis, I.G.: Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators. In: Proceedings of Advances in Neural Information Processing Systems 18 (NIPS’05) (2005)
  37. Pellegrini, F.: A parallelisable multi-level banded diffusion scheme for computing balanced partitions with smooth boundaries. In: Proceedings of the 13th International Euro-Par Conference (EURO-PAR’07). Lecture Notes in Computer Science, vol. 4641, pp. 195–204. Springer, Berlin (2007)
  38. Rabani, Y., Sinclair, A., Wanka, R.: Local divergence of Markov chains and the analysis of iterative load balancing schemes. In: Proceedings of the 39th Annual Symposium on Foundations of Computer Science (FOCS’98), pp. 694–705 (1998)
  39. Räcke, H.: Optimal hierarchical decompositions for congestion minimization in networks. In: Proc. 40th Annual ACM Symposium on Theory of Computing, Victoria, British Columbia, Canada, May 17–20, 2008, pp. 255–264 (2008)
  40. Saerens, M., Fouss, F., Yen, L., Dupont, P.: The principal components analysis of a graph, and its relationship to spectral clustering. In: Proceedings of the 15th European Conference on Machine Learning (ECML’04), pp. 371–383 (2004)
  41. Schaeffer, S.E.: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007)
  42. Schloegel, K., Karypis, G., Kumar, V.: Graph partitioning for high performance scientific simulations. In: The Sourcebook of Parallel Computing, pp. 491–541. Morgan Kaufmann, San Mateo (2003)
  43. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
  44. The BlueGene/L Team: An overview of the BlueGene/L supercomputer. In: Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, pp. 1–22. ACM, New York (2002)
  45. Tishby, N., Slonim, N.: Data clustering by Markovian relaxation and the information bottleneck method. In: Proceedings of Advances in Neural Information Processing Systems 13 (NIPS), pp. 640–646 (2000)
  46. Trefethen, L.N., Bau, D.: Numerical Linear Algebra. SIAM, Philadelphia (1997)
  47. Trottenberg, U., Oosterlee, C.W., Schüller, A.: Multigrid. Academic Press, San Diego (2000)
  48. van Dongen, S.: Graph clustering by flow simulation. PhD thesis, University of Utrecht (2000)
  49. Walshaw, C.: The graph partitioning archive. http://staffweb.cms.gre.ac.uk/~c.walshaw/partition/ (2010). Last access: 31 May 2012
  50. Xu, C., Lau, F.C.M.: Load Balancing in Parallel Computers. Kluwer, Dordrecht (1997)
  51. Yen, L., Vanvyve, D., Wouters, F., Fouss, F., Verleysen, M., Saerens, M.: Clustering using a random-walk based distance measure. In: Proceedings of the 13th European Symposium on Artificial Neural Networks (ESANN’05), pp. 317–324 (2005)
  52. Zha, H., He, X., Ding, C.H.Q., Gu, M., Simon, H.D.: Spectral relaxation for k-means clustering. In: Proceedings of Advances in Neural Information Processing Systems 14 (NIPS), pp. 1057–1064. MIT Press, Cambridge (2001)

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Institute of Theoretical Informatics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
  2. Department 1: Algorithms & Complexity, Max-Planck Institute for Computer Science, Saarbrücken, Germany