Exact Algorithms for Cluster Editing: Evaluation and Experiments

  • Sebastian Böcker
  • Sebastian Briesemeister
  • Gunnar W. Klau
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5038)

Abstract

We present empirical results for the Cluster Editing problem using exact methods from fixed-parameter algorithmics and linear programming. We investigate parameter-independent data reduction methods and find that effective preprocessing is possible if the number of edge modifications k is smaller than some multiple of \(|V|\), where V is the vertex set of the input graph. In particular, by combining parameter-dependent data reduction with lower and upper bounds, we can effectively reduce graphs satisfying \(k \leq 25\,|V|\).

In addition to the fastest known fixed-parameter branching strategy for the problem, we investigate an integer linear program (ILP) formulation of the problem using a cutting plane approach. Our results indicate that both approaches are capable of solving large graphs with 1000 vertices and several thousand edge modifications. Using a combination of the above techniques, complex and very large graphs such as biological instances can be solved exactly for the first time.
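To make the fixed-parameter idea concrete, the following is a minimal sketch of the textbook conflict-triple search tree for unweighted Cluster Editing. It is not the branching strategy evaluated in the paper, which is considerably more refined, and it applies no data reduction. The function names and the graph representation (a set of `frozenset` vertex pairs) are assumptions made for illustration only.

```python
from itertools import combinations


def find_conflict_triple(vertices, edges):
    """Return (u, v, w) where uv and vw are edges but uw is not, or None."""
    adj = {v: set() for v in vertices}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    for v in vertices:
        for u, w in combinations(sorted(adj[v]), 2):
            if w not in adj[u]:
                return u, v, w
    return None


def cluster_editing(vertices, edges, k):
    """Return a set of at most k vertex pairs whose toggling yields a
    disjoint union of cliques, or None if no such set exists."""
    triple = find_conflict_triple(vertices, edges)
    if triple is None:
        return set()          # already a cluster graph
    if k == 0:
        return None           # conflict remains but budget is exhausted
    u, v, w = triple
    # Any solution must modify one of the three pairs: branch on each.
    for pair in (frozenset((u, v)), frozenset((v, w)), frozenset((u, w))):
        sub = cluster_editing(vertices, edges ^ {pair}, k - 1)
        if sub is not None:
            # Symmetric difference keeps only the net modifications.
            return sub ^ {pair}
    return None


def min_cluster_editing(vertices, edges):
    """Try k = 0, 1, 2, ... until a solution is found (hypothetical helper)."""
    k = 0
    while True:
        result = cluster_editing(vertices, edges, k)
        if result is not None:
            return result
        k += 1


if __name__ == "__main__":
    path = {frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 3))}
    print(min_cluster_editing([0, 1, 2, 3], path))
```

The search tree has at most 3^k leaves, so this sketch is only practical for very small k; handling instances with thousands of modifications, as reported above, requires the data reduction, bounds, and refined branching or ILP techniques the paper studies.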

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Sebastian Böcker (1, 2)
  • Sebastian Briesemeister (3)
  • Gunnar W. Klau (4, 5)

  1. Institut für Informatik, Friedrich-Schiller-Universität Jena, Germany
  2. Jena Centre for Bioinformatics, Jena, Germany
  3. Div. for Simulation of Biological Systems, ZBIT/WSI, Eberhard Karls Universität Tübingen, Germany
  4. Department of Mathematics and Computer Science, Freie Universität Berlin, Germany
  5. DFG Research Center Matheon, Berlin, Germany