Algorithmica, Volume 60, Issue 2, pp 316–334

Exact Algorithms for Cluster Editing: Evaluation and Experiments

  • Sebastian Böcker
  • Sebastian Briesemeister
  • Gunnar W. Klau

Abstract

The Cluster Editing problem is defined as follows: Given an undirected, loopless graph, we want to find a set of edge modifications (insertions and deletions) of minimum cardinality, such that the modified graph consists of disjoint cliques.
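To make the problem definition concrete, the following is a minimal brute-force sketch, not the method evaluated in the paper. It relies on the standard fact that a graph is a disjoint union of cliques exactly when no three vertices induce a path (a "conflict triple"). The function names `is_cluster_graph` and `min_cluster_editing` are hypothetical, and the search is exponential, so it is only meant for graphs with a handful of vertices.

```python
from itertools import combinations


def is_cluster_graph(n, edges):
    """Return True if the graph on vertices 0..n-1 is a disjoint union of cliques.

    Equivalent test: no vertex triple has exactly two of its three
    possible edges present (no induced path on three vertices).
    """
    adj = {frozenset(e) for e in edges}
    for u, v, w in combinations(range(n), 3):
        present = sum(frozenset(p) in adj for p in ((u, v), (u, w), (v, w)))
        if present == 2:  # conflict triple found
            return False
    return True


def min_cluster_editing(n, edges):
    """Brute force: try edit sets of increasing size k.

    Returns (k, edits), where edits is a list of vertex pairs whose
    edge status is flipped (present edge deleted, absent edge inserted).
    """
    edge_set = {frozenset(e) for e in edges}
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    for k in range(len(pairs) + 1):
        for flips in combinations(pairs, k):
            modified = edge_set.symmetric_difference(flips)
            if is_cluster_graph(n, modified):
                return k, sorted(tuple(sorted(p)) for p in flips)


if __name__ == "__main__":
    # Path 0-1-2: one modification suffices.
    print(min_cluster_editing(3, [(0, 1), (1, 2)]))
```

For the path on three vertices a single modification is enough, whereas a cycle on four vertices needs two (for example, inserting both diagonals).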

We present empirical results for this problem using exact methods from fixed-parameter algorithmics and linear programming. We investigate parameter-independent data reduction methods and find that effective preprocessing is possible if the number of edge modifications k is smaller than some multiple of \(\lvert V\rvert\), where V is the vertex set of the input graph. In particular, combining parameter-dependent data reduction with lower and upper bounds, we can effectively reduce graphs satisfying \(k\leq25\lvert V\rvert\).

In addition to the fastest known fixed-parameter branching strategy for the problem, we investigate an integer linear program (ILP) formulation of the problem using a cutting plane approach. Our results indicate that both approaches are capable of solving large graphs with 1000 vertices and several thousand edge modifications. For the first time, complex and very large graphs such as biological instances allow for an exact solution, using a combination of the above techniques. (A preliminary version of this paper appeared under the title “Exact algorithms for cluster editing: Evaluation and experiments” in the Proceedings of the 7th Workshop on Experimental Algorithms, WEA 2008, in: LNCS, vol. 5038, Springer, pp. 289–302.)
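For orientation, a commonly used ILP formulation of Cluster Editing on a graph \(G=(V,E)\) is sketched below. The abstract does not spell out the model, so treating this as the paper's exact formulation is an assumption. The binary variable \(x_{uv}\) (a symbol introduced here) equals 1 if vertices u and v end up in the same clique.

```latex
\begin{align*}
\min \quad & \sum_{\{u,v\} \in E} (1 - x_{uv}) \;+\; \sum_{\{u,v\} \notin E} x_{uv} \\
\text{s.t.} \quad & x_{uv} + x_{vw} - x_{uw} \le 1
    && \text{for all pairwise distinct } u, v, w \in V, \\
& x_{uv} \in \{0, 1\}
    && \text{for all } \{u,v\} \subseteq V,\ u \neq v .
\end{align*}
```

The first sum counts deleted edges and the second counts inserted edges, so the objective equals the number of edge modifications. The triangle inequalities enforce transitivity, which is what forces the solution to be a disjoint union of cliques. Since the number of these inequalities is cubic in \(\lvert V\rvert\), a cutting plane approach would typically start without them and add only violated ones on demand.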

Keywords

Cluster editing, Algorithm engineering, Computer experiments, NP-complete problem, Fixed-parameter tractability, FPT, Integer linear programming, ILP, Branch-and-cut algorithm

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Sebastian Böcker (1, 2)
  • Sebastian Briesemeister (3)
  • Gunnar W. Klau (4)
  1. Institut für Informatik, Friedrich-Schiller-Universität Jena, Jena, Germany
  2. Jena Centre for Bioinformatics, Jena, Germany
  3. Div. for Simulation of Biological Systems, ZBIT/WSI, Eberhard Karls Universität Tübingen, Tübingen, Germany
  4. CWI, Amsterdam, Netherlands
