Journal of Combinatorial Optimization, Volume 31, Issue 1, pp 347–371

Efficient algorithms for cluster editing

  • Lucas Bastos
  • Luiz Satoru Ochi
  • Fábio Protti
  • Anand Subramanian
  • Ivan César Martins
  • Rian Gabriel S. Pinheiro


The cluster editing problem consists of transforming an input graph \(G\) into a cluster graph (a disjoint union of complete graphs) by performing a minimum number of edge editing operations. Each edge editing operation either adds a new edge or removes an existing edge. In this paper we propose new theoretical results on data reduction and instance generation for the cluster editing problem, as well as two algorithms that couple an exact method with a GRASP heuristic and an ILS heuristic, respectively. Experimental results show that the proposed algorithms are able to find high-quality solutions in practical runtime.
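To make the objective concrete, the following is a minimal sketch of how the cost of one candidate clustering could be evaluated. It is an illustration only: the function name `editing_cost`, the 0-based vertex numbering, and the input layout are assumptions made here, not taken from the paper, and the code does not implement the data reduction rules or the GRASP and ILS hybrids.

```python
from itertools import combinations


def editing_cost(n, edges, clusters):
    """Count the edge edits needed to turn a graph into the cluster graph
    induced by `clusters`.

    n        -- number of vertices, numbered 0..n-1 (assumed layout)
    edges    -- iterable of undirected edges (u, v)
    clusters -- partition of the vertices, one collection per cluster
    """
    edge_set = {frozenset(e) for e in edges}

    # Map every vertex to the index of the cluster that contains it.
    label = {}
    for idx, cluster in enumerate(clusters):
        for v in cluster:
            label[v] = idx

    cost = 0
    for u, v in combinations(range(n), 2):
        same_cluster = label[u] == label[v]
        has_edge = frozenset((u, v)) in edge_set
        # Missing edge inside a cluster -> one addition;
        # existing edge between two clusters -> one removal.
        if same_cluster != has_edge:
            cost += 1
    return cost


# Path on three vertices: 0 - 1 - 2
path = [(0, 1), (1, 2)]
one_clique = editing_cost(3, path, [[0, 1, 2]])       # add edge (0, 2)
split_last = editing_cost(3, path, [[0, 1], [2]])     # remove edge (1, 2)
singletons = editing_cost(3, path, [[0], [1], [2]])   # remove both edges
```

On this three-vertex path, either merging everything into one clique or cutting off an endpoint needs a single edit, while isolating every vertex needs two. The cluster editing problem asks for a partition that minimizes this count over all possible partitions.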


Combinatorial optimization; Cluster editing; Data reduction; Metaheuristics; Exact methods; Hybrid algorithms



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Lucas Bastos (1)
  • Luiz Satoru Ochi (2)
  • Fábio Protti (2)
  • Anand Subramanian (3)
  • Ivan César Martins (2)
  • Rian Gabriel S. Pinheiro (2)

  1. Financiadora de Estudos e Projetos (FINEP), Praia do Flamengo 200 - 1º andar, Rio de Janeiro, Brazil
  2. Universidade Federal Fluminense, Instituto de Computação, Niterói, Brazil
  3. Universidade Federal da Paraíba, Departamento de Engenharia de Produção, João Pessoa, Brazil
